{"id":118084,"date":"2021-02-16T15:55:32","date_gmt":"2021-02-16T06:55:32","guid":{"rendered":"http:\/\/175.125.95.178\/ai-in-communication\/18084\/"},"modified":"2026-04-13T05:14:08","modified_gmt":"2026-04-12T20:14:08","slug":"18084","status":"publish","type":"ai-in-communication","link":"http:\/\/ee.presscat.kr\/en\/ai-in-communication\/18084\/","title":{"rendered":"Communication in Multi-Agent Reinforcement Learning: Intention Sharing"},"content":{"rendered":"<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span><b>Title: Communication in Multi-Agent Reinforcement Learning: Intention Sharing<\/b><\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span><b>Authors: Woojun Kim, Jongeui Park and Youngchul Sung<\/b><\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span><b>To be presented at the International Conference on Learning Representations (ICLR) 2021<\/b><\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\">&nbsp;<\/p>\n<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span>Communication is one of the core components for learning coordinated behavior in multi-agent systems. In this work, W. Kim et al. proposed a new communication scheme named Intention Sharing (IS) for multi-agent reinforcement learning to enhance coordination among agents. In the proposed scheme, each agent generates an imagined trajectory by modeling the environment dynamics and other agents\u2019 actions. The imagined trajectory is a simulated future trajectory of the agent, based on its learned models of the environment dynamics and of the other agents, and it represents the agent\u2019s future action plan. 
Each agent then compresses this imagined trajectory into an intention message for communication, applying an attention mechanism that learns the relative importance of the components of the imagined trajectory based on the messages received from other agents. Numerical results show that the proposed IS scheme significantly outperforms other communication schemes in multi-agent reinforcement learning.<\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\">&nbsp;<\/p>\n<p style=\"text-align:justify;margin-bottom:10px\">\n<div class=\"\"><img decoding=\"async\" class=\"\" src=\"\/wp-content\/uploads\/drupal\/Figure 1_\uc131\uc601\ucca0.jpg\" alt=\"\" title=\"\"><\/div>\n<\/p>\n<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span>Fig. 1. The overall structure of the proposed IS scheme from the perspective of Agent i.<\/span><\/span><\/span><\/p>\n<p style=\"margin-left:53px;text-indent:-40.0pt;text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span><\/p>\n<div class=\"\"><img decoding=\"async\" class=\"\" src=\"\/wp-content\/uploads\/drupal\/Figure 2_\uc131\uc601\ucca0.jpg\" alt=\"\" title=\"\"><\/div>\n<p>\nFig. 2. Performance: MADDPG (blue), DIAL (green), TarMAC (red), Comm-OA (purple), ATOC (cyan) and the proposed IS method (black). (PP: Predator-and-Prey, CN: Cooperative Navigation, TJ: Traffic Junction)<\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\">\n<div class=\"\"><img decoding=\"async\" class=\"\" src=\"\/wp-content\/uploads\/drupal\/Figure 3_\uc131\uc601\ucca0.jpg\" alt=\"\" title=\"\"><\/div>\n<\/p>\n<p style=\"text-align:justify;margin-bottom:10px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span>Fig. 3. 
Imagined trajectories and attention weights of each agent on PP (N=3): 1st row &#8211; Agent 1 (red), 2nd row &#8211; Agent 2 (green), and 3rd row &#8211; Agent 3 (blue). Black squares, the circle inside the times icon, and the other circles denote the prey, the current position, and the estimated future positions, respectively. The brightness of each circle is proportional to its attention weight.<\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:10px\">&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>1173<\/p>\n","protected":false},"featured_media":126889,"template":"","class_list":["post-118084","ai-in-communication","type-ai-in-communication","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-communication\/118084","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-communication"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/ai-in-communication"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media\/126889"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=118084"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}