{"id":170556,"date":"2024-09-02T21:21:28","date_gmt":"2024-09-02T12:21:28","guid":{"rendered":"http:\/\/ee.presscat.kr\/?post_type=research-achieve&#038;p=170556"},"modified":"2026-04-13T10:38:06","modified_gmt":"2026-04-13T01:38:06","slug":"ee-professor-dongsu-hans-research-team-develops-technology-to-accelerate-ai-model-training-in-distributed-environments-using-consumer-grade-gpus","status":"publish","type":"research-achieve","link":"http:\/\/ee.presscat.kr\/en\/research-achieve\/ee-professor-dongsu-hans-research-team-develops-technology-to-accelerate-ai-model-training-in-distributed-environments-using-consumer-grade-gpus\/","title":{"rendered":"EE Professor Dongsu Han&#8217;s Research Team Develops Technology to Accelerate AI Model Training in Distributed Environments Using Consumer-Grade GPUs"},"content":{"rendered":"<p><span style=\"font-size: 14pt;color: #000000\"><strong> EE Professor Dongsu Han&#8217;s Research Team Develops Technology to Accelerate AI Model Training in Distributed Environments Using Consumer-Grade GPUs<\/strong><\/span><\/p>\n<p><span style=\"color: #000000\"><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone  wp-image-170565\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/09\/\ucea1\ucc98-2024-09-02-211619.jpg\" alt=\"\" width=\"502\" height=\"218\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\">&lt;(from left) Professor Dongsu Han, Dr. Hwijoon Iim, Ph.D. 
Candidate Juncheol Ye&gt;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Professor Dongsu Han&#8217;s research team of the KAIST Department of Electrical Engineering has developed a groundbreaking technology that accelerates AI model training in distributed environments with limited network bandwidth using consumer-grade GPUs.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Training the latest AI models typically requires expensive infrastructure, such as high-performance GPUs costing tens of millions of won and high-speed dedicated networks. <\/span><\/p>\n<p><span style=\"color: #000000\">As a result, most researchers in academia and small to medium-sized enterprises have to rely on cheaper, consumer-grade GPUs for model training. <\/span><\/p>\n<p><span style=\"color: #000000\">However, limited network bandwidth makes it difficult for them to train models efficiently.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img decoding=\"async\" class=\"alignnone  wp-image-170557\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/09\/Inline-image-2024-09-02-14.59.01.205.png\" alt=\"\" width=\"851\" height=\"294\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\">&lt;Figure 1. Problems in Conventional Low-Cost Distributed Deep Learning Environments&gt;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">To address these issues, Professor Han&#8217;s team developed a distributed learning framework called StellaTrain. <\/span><\/p>\n<p><span style=\"color: #000000\">StellaTrain accelerates model training on low-cost GPUs by integrating a pipeline that utilizes both CPUs and GPUs. 
It dynamically adjusts batch sizes and compression rates according to the network environment, enabling fast model training in multi-cluster and multi-node environments without the need for high-speed dedicated networks.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">To maximize GPU utilization, StellaTrain optimizes the learning pipeline by offloading gradient compression and optimizer computation to the CPU. The team developed and applied a new sparse optimization technique and cache-aware gradient compression technology that work efficiently on CPUs.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">This implementation creates a seamless learning pipeline in which CPU tasks overlap with GPU computations. Furthermore, dynamic optimization technology adjusts batch sizes and compression rates in real time according to network conditions, achieving high GPU utilization even in limited network environments.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img decoding=\"async\" class=\"alignnone  wp-image-170559\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/09\/Inline-image-2024-09-02-14.59.01.206.png\" alt=\"\" width=\"764\" height=\"459\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\">&lt;Figure 2. Overview of the StellaTrain Learning Pipeline&gt;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Through these innovations, StellaTrain significantly improves the speed of distributed model training in low-cost multi-cloud environments, achieving up to a 104-fold speedup over standard PyTorch DDP.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Professor Han&#8217;s research team has paved the way for efficient AI model training without the need for expensive data center-grade GPUs and high-speed networks. 
This breakthrough is expected to greatly aid AI research and development in resource-constrained environments, such as academia and small to medium-sized enterprises.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Professor Han emphasized, &#8220;KAIST is demonstrating leadership in the AI systems field in South Korea.&#8221; He added, &#8220;We will continue active research to implement large language model (LLM) training, previously considered the domain of major IT companies, in more affordable computing environments. We hope this research will serve as a critical stepping stone toward that goal.&#8221;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">The research team included Dr. Hwijoon Iim and Ph.D. candidate Juncheol Ye from KAIST, as well as Professor Sangeetha Abdu Jyothi from UC Irvine. The findings were presented at ACM SIGCOMM 2024, the premier international conference in the field of computer networking, held from August 4 to 8 in Sydney, Australia (Paper title: Accelerating Model Training in Multi-cluster Environments with Consumer-grade GPUs).<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">Meanwhile, Professor Han&#8217;s team has also made continued advances in the AI systems field, presenting a framework called ES-MoE, which accelerates Mixture of Experts (MoE) model training, at ICML 2024 in Vienna, Austria.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">By overcoming GPU memory limitations, they significantly enhanced the scalability and efficiency of large-scale MoE model training, enabling fine-tuning of a 15-billion-parameter language model using only four GPUs. 
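<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">The memory-saving principle can be caricatured in a few lines (a hypothetical plain-Python sketch, not the ES-MoE framework itself): because a sparsely gated MoE layer activates only a few experts per token, expert parameters can stay in host (CPU) memory, with just the experts actually routed to paged into a small device-side cache:<\/span><\/p>

```python
# Hypothetical sketch: experts live in host memory; the "GPU" holds only a
# small LRU cache of experts, fetched on demand when the gate selects them.
from collections import OrderedDict

NUM_EXPERTS = 8
DEVICE_SLOTS = 2                  # the "GPU" can hold only 2 experts at once

host_experts = {e: 1.0 + e for e in range(NUM_EXPERTS)}  # weights = a scale
device_cache = OrderedDict()      # expert id -> weights, in LRU order
transfers = 0                     # host-to-device copies actually performed

def fetch_expert(e):
    """Bring expert e onto the 'device', evicting the LRU expert if full."""
    global transfers
    if e not in device_cache:
        if len(device_cache) >= DEVICE_SLOTS:
            device_cache.popitem(last=False)   # evict least recently used
        device_cache[e] = host_experts[e]
        transfers += 1
    device_cache.move_to_end(e)
    return device_cache[e]

def moe_layer(token):
    e = token % NUM_EXPERTS       # stand-in for a learned top-1 gating network
    return token * fetch_expert(e)

outs = [moe_layer(t) for t in [3, 11, 3, 5]]
print(transfers, outs)  # → 2 [12.0, 44.0, 12.0, 30.0]
```

<p><span style=\"color: #000000\">Only two of the eight experts ever occupy device memory in this run, which is the sense in which such offloading lets a model far larger than GPU memory be trained on a handful of GPUs.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\">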
This achievement opens up the possibility of effectively training large-scale AI models with limited computing resources.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-170561\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/09\/Inline-image-2024-09-02-14.59.01.207-1.png\" alt=\"\" width=\"813\" height=\"378\" title=\"\"><\/span><\/p>\n<p><span style=\"color: #000000\">&lt;Figure 3. Overview of the ES-MoE Framework&gt;<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #000000\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-170563\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2024\/09\/Inline-image-2024-09-02-14.59.01.207.png\" alt=\"\" width=\"789\" height=\"201\" title=\"\"><\/span><\/p>\n<div>\n<p><span style=\"color: #000000\">&lt;Figure 4. Professor Dongsu Han&#8217;s research team has enabled AI model training in low-cost computing environments, even with limited or no high-performance GPUs, through their research on StellaTrain and 
ES-MoE.&gt;<\/span><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>659<\/p>\n","protected":false},"featured_media":170570,"template":"","research_category":[],"class_list":["post-170556","research-achieve","type-research-achieve","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research-achieve\/170556","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research-achieve"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/research-achieve"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media\/170570"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=170556"}],"wp:term":[{"taxonomy":"research_category","embeddable":true,"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/research_category?post=170556"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}