{"id":118507,"date":"2021-10-31T23:51:53","date_gmt":"2021-10-31T14:51:53","guid":{"rendered":"http:\/\/175.125.95.178\/ai-in-signal\/18507\/"},"modified":"2026-04-27T16:11:21","modified_gmt":"2026-04-27T07:11:21","slug":"18507","status":"publish","type":"ai-in-signal","link":"http:\/\/ee.presscat.kr\/en\/ai-in-signal\/18507\/","title":{"rendered":"Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency  (Prof. In-So Kweon)"},"content":{"rendered":"<p style=\"text-align:justify;margin-bottom:11px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span>Conference\/Journal, Year: AAAI 2021<\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:11px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span>We present an end-to-end joint training framework that explicitly models the 6-DoF motion of multiple dynamic objects, ego-motion, and depth in a monocular camera setup without supervision. Our technical contributions are three-fold. First, we highlight the fundamental difference between inverse and forward projection when modeling the individual motion of each rigid object, and propose a geometrically correct projection pipeline using a neural forward projection module. Second, we design a unified instance-aware photometric and geometric consistency loss that holistically imposes self-supervisory signals on every background and object region. Lastly, we introduce a general-purpose auto-annotation scheme that uses any off-the-shelf instance segmentation and optical flow models to produce video instance segmentation maps, which serve as input to our training pipeline. These proposed elements are validated in a detailed ablation study. Through extensive experiments conducted on the KITTI and Cityscapes datasets, our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
Our code, dataset, and models are available at <a href=\"https:\/\/github.com\/SeokjuLee\/Insta-DM\" rel=\"nofollow noopener\">https:\/\/github.com\/SeokjuLee\/Insta-DM<\/a>.<\/span><\/span><\/span><\/p>\n<p style=\"text-align:justify;margin-bottom:11px\"><span style=\"font-size:10pt\"><span style=\"line-height:107%\"><span><\/p>\n<div class=\"\"><img decoding=\"async\" class=\"\" src=\"\/wp-content\/uploads\/drupal\/\uad8c\uc778\uc18c\uad50\uc218\ub2d826.png\" alt=\"\" title=\"\"><\/div>\n<p><\/span><\/span><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>675<\/p>\n","protected":false},"featured_media":0,"template":"","class_list":["post-118507","ai-in-signal","type-ai-in-signal","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-signal\/118507","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-signal"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/ai-in-signal"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=118507"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}