AI in EE

AI IN DIVISIONS

AI in Communication Division

Giseung Park, Whiyoung Jung, Seungyul Han, Sungho Choi and Youngchul Sung*, "Adaptive multi-model fusion learning for sparse-reward reinforcement learning," submitted to IEEE Transactions on Systems, Man, and Cybernetics: Systems, Apr. 2021.

In sparse-reward reinforcement learning, extrinsic rewards from the environment are given to the agent only sparsely. In this work, we consider intrinsic reward generation based on model prediction error. In a typical model-prediction-error-based scheme, the agent maintains a learning model of a target value or distribution, and the intrinsic reward is defined as the error between the model's prediction and the target, exploiting the fact that the learned model yields larger prediction errors for rarely visited or unvisited state-action pairs. This paper generalizes this model-prediction-error-based intrinsic reward generation method to the case of multiple prediction models. We propose a new adaptive fusion method for the multiple-model case, which learns an optimal fusion of the prediction errors over the course of training to enhance overall learning performance. Numerical results show that, on representative locomotion tasks, the proposed intrinsic reward generation method outperforms most previous methods, with significant gains on some tasks.
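The paper specifies the exact fusion rule and model architectures; as a rough illustration of the general idea only, the following minimal NumPy sketch maintains K random-network-distillation-style predictor/target pairs, fuses their per-model prediction errors into a single intrinsic reward with softmax weights, and nudges the weights toward models with larger current error. The predictor/target structure, the softmax fusion, and the weight-update rule here are all illustrative assumptions, not the algorithm proposed in the paper.

import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, FEAT_DIM, K = 8, 16, 3  # illustrative sizes, not from the paper

# K fixed random "target" maps and K trainable linear predictors
# (a random-network-distillation-style stand-in for the learning models).
targets = [rng.normal(size=(STATE_DIM, FEAT_DIM)) for _ in range(K)]
predictors = [rng.normal(size=(STATE_DIM, FEAT_DIM)) * 0.1 for _ in range(K)]
log_w = np.zeros(K)  # logits of the fusion weights

def prediction_errors(s):
    """Per-model squared prediction error for state s."""
    return np.array([np.mean((s @ predictors[k] - s @ targets[k]) ** 2)
                     for k in range(K)])

def intrinsic_reward(s):
    """Fuse the K errors with softmax weights (illustrative fusion rule)."""
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return float(w @ prediction_errors(s))

def update(s, lr=1e-2):
    """One gradient step: each predictor regresses onto its own target,
    and the fusion logits drift toward models with larger error
    (a placeholder for the paper's learned fusion)."""
    global log_w
    for k in range(K):
        err = s @ predictors[k] - s @ targets[k]                # (FEAT_DIM,)
        predictors[k] -= lr * (2.0 / FEAT_DIM) * np.outer(s, err)
    errs = prediction_errors(s)
    log_w += lr * (errs - errs.mean())

s = rng.normal(size=STATE_DIM)
for step in range(5):
    print(f"step {step}: intrinsic reward = {intrinsic_reward(s):.4f}")
    update(s)

As the predictors fit their targets for frequently seen states, the fused error (and hence the intrinsic reward) for those states decays, which is the behavior that drives exploration toward less-visited states.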

 


Fig. 1. Adaptive fusion of $K$ prediction errors from multiple models.

 


Fig. 2. Mean number of visited states in the continuous 4-room maze environment over ten random seeds (the blue curves represent the performance of the proposed method).

 


Fig. 3. Performance comparison in delayed MuJoCo environments over ten random seeds (the blue curves represent the performance of the proposed method).