{"id":169033,"date":"2024-08-08T17:50:01","date_gmt":"2024-08-08T08:50:01","guid":{"rendered":"http:\/\/ee.presscat.kr\/?post_type=research-achieve&#038;p=169033"},"modified":"2026-04-13T09:22:01","modified_gmt":"2026-04-13T00:22:01","slug":"ph-d-candidate-hee-suk-yoon-prof-chang-d-yoo-wins-excellent-paper-award","status":"publish","type":"research-achieve","link":"http:\/\/ee.presscat.kr\/en\/research-achieve\/ph-d-candidate-hee-suk-yoon-prof-chang-d-yoo-wins-excellent-paper-award\/","title":{"rendered":"Ph.D. candidate Hee Suk Yoon (Prof. Chang D. Yoo) wins excellent paper award"},"content":{"rendered":"<p><span style=\"font-size: 14pt\"><strong><span style=\"color: #000000\">Ph.D. candidate Hee Suk Yoon (Prof. Chang D. Yoo) wins excellent paper award<\/span><\/strong><\/span><\/p>\n<p><span style=\"color: #000000\"><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-full wp-image-169034\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/08\/\uc0ac\uc9c4.png\" alt=\"\" width=\"368\" height=\"155\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\">&lt;(From left) Professor Chang D. Yoo, integrated Ph.D. candidate Hee Suk Yoon&gt;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">The Korean Artificial Intelligence Association holds conferences quarterly, and this year\u2019s summer conference is scheduled to take place from August 15 to 17 at BEXCO in Busan. 
<\/span><\/p>\n<p><span style=\"color: #000000\">Hee Suk Yoon, a Ph.D. candidate, has been recognized for the excellence of his paper titled \u201cBI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation\u201d and has been selected as an award recipient.<\/span><\/p>\n<p><span style=\"color: #000000\">Moreover, the findings will be presented at the &#8216;<strong>European Conference on Computer Vision (ECCV) 2024&#8217;<\/strong>, one of the top international conferences in the field of computer vision, to be held in Milan, Italy, in September this year.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">The detailed information is as follows:<\/span><br \/>\n<span style=\"color: #000000\">* Conference Name: 2024 Summer Conference of the Korean Artificial Intelligence Association<\/span><br \/>\n<span style=\"color: #000000\">* Period: August 15 to 17, 2024<\/span><br \/>\n<span style=\"color: #000000\">* Award Name: Excellent Paper Award<\/span><br \/>\n<span style=\"color: #000000\">* Authors: Hee Suk Yoon, Eunseop Yoon, Chang D. 
Yoo (Supervising Professor)<\/span><br \/>\n<span style=\"color: #000000\">* Paper Title: BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">This research is regarded as an innovative breakthrough that overcomes a key limitation of existing large multimodal dialogue models such as ChatGPT by maintaining consistency in the images generated within a multimodal dialogue.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img decoding=\"async\" class=\"alignnone size-full wp-image-169036\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/08\/chatgpt.png\" alt=\"\" width=\"803\" height=\"308\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\"><strong>Figure 1: Image Response of ChatGPT and BI-MDRG (ours)<\/strong><\/span><\/p>\n<p><span style=\"color: #000000\">Traditional multimodal dialogue models first generate a textual description of the intended image and then create the image using a text-to-image model. <\/span><\/p>\n<p><span style=\"color: #000000\">This approach often fails to sufficiently reflect the visual information from earlier turns of the dialogue, leading to inconsistent image responses. 
<\/span><\/p>\n<p><span style=\"color: #000000\">However, Professor Yoo\u2019s BI-MDRG minimizes this loss of image information through a direct image-referencing technique, enabling consistent image response generation.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img decoding=\"async\" class=\"alignnone  wp-image-169038\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/08\/240709.png\" alt=\"\" width=\"753\" height=\"373\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\"><strong>Figure 2: Framework of a previous multimodal dialogue system and our proposed BI-MDRG<\/strong><\/span><\/p>\n<p><span style=\"color: #000000\">BI-MDRG is a new system designed to solve the problem of image information loss in existing multimodal dialogue models through two proposed components: Attention Mask Modulation and a Citation Module. <\/span><\/p>\n<p><span style=\"color: #000000\">Attention Mask Modulation lets the model attend directly to the images in the dialogue rather than to their textual descriptions, while the Citation Module ensures consistent responses by citation-tagging recurring objects in the conversation, so that objects that should be preserved in image responses are referenced directly. <\/span><\/p>\n<p><span style=\"color: #000000\">The research team validated BI-MDRG across various multimodal dialogue benchmarks, demonstrating strong dialogue performance and image consistency.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-169040\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/08\/training_overall5.jpg\" alt=\"\" width=\"854\" height=\"367\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\"><strong>Figure 3: Overall framework of BI-MDRG<\/strong><\/span><\/p>\n<p><span style=\"color: #000000\">BI-MDRG offers practical solutions in various multimodal application fields. 
<\/span><\/p>\n<p><span style=\"color: #000000\">For instance, in customer service, it can enhance user satisfaction by providing accurate images based on the conversation content. <\/span><\/p>\n<p><span style=\"color: #000000\">In education, it can improve understanding by consistently providing relevant images and text in response to learners\u2019 questions. Additionally, in entertainment, it can enable natural and immersive interactions in interactive games.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>425<\/p>\n","protected":false},"featured_media":169056,"template":"","research_category":[],"class_list":["post-169033","research-achieve","type-research-achieve","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research-achieve\/169033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research-achieve"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/research-achieve"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media\/169056"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=169033"}],"wp:term":[{"taxonomy":"research_category","embeddable":true,"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research_category?post=169033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}