Comparative Evaluation for Effectiveness Analysis of Policy Based Deep Reinforcement Learning Approaches

Creative Commons License

Tan Z., Karaköse M.

International Journal of Computer and Information Technology, vol.10, no.3, pp.1-15, 2021 (Peer-Reviewed Journal)


Deep Reinforcement Learning (DRL) has proven to be a very strong technique with results in various applications in recent years. Especially the achievements in the studies in the field of robotics show that much more progress will be made in this field. Undoubtedly, policy choices and parameter settings play an active role in the success of DRL. In this study, an analysis has been made on the policies used by examining the DRL studies conducted in recent years. Policies used in the literature are grouped under three different headings: value-based, policy-based and actor-critic. However, the problem of moving a common target using Newton's law of motion of collaborative agents is presented. Trainings are carried out in a frictionless environment with two agents and one object using four different policies. Agents try to force an object in the environment by colliding it and try to move it out of the area it is in. Two-dimensional surface is used during the training phase. As a result of the training, each policy is reported separately and its success is observed. Test results are discussed in section 5. Thus, policies are tested together with an application by providing information about the policies used in deep reinforcement learning approaches.