{"id":132004,"date":"2022-07-25T18:49:25","date_gmt":"2022-07-25T09:49:25","guid":{"rendered":"http:\/\/192.249.19.202\/?post_type=ai-in-circuit&#038;p=132004"},"modified":"2026-04-05T18:09:30","modified_gmt":"2026-04-05T09:09:30","slug":"a-framework-for-area-efficient-multi-task-bert-execution-on-reram-based-accelerators","status":"publish","type":"ai-in-circuit","link":"http:\/\/ee.presscat.kr\/en\/ai-in-circuit\/a-framework-for-area-efficient-multi-task-bert-execution-on-reram-based-accelerators\/","title":{"rendered":"A Framework for Area-efficient Multi-task BERT Execution on ReRAM-based Accelerators"},"content":{"rendered":"<p>Title : A Framework for Area-efficient Multi-task BERT Execution on ReRAM-based Accelerators<\/p>\n<p>&nbsp;<\/p>\n<p>Author : Myeonggu Kang, Hyein Shin, Jaekang Shin, Lee-Sup Kim<\/p>\n<p>&nbsp;<\/p>\n<p>Conference : IEEE\/ACM International Conference On Computer Aided Design 2021<\/p>\n<p>&nbsp;<\/p>\n<p>Abstract : With its superior algorithmic performance, BERT has become the de facto standard model for various NLP tasks. Accordingly, multiple BERT models are often deployed on a single system, a setting also called multi-task BERT. Although the ReRAM-based accelerator shows sufficient potential to execute a single BERT model by adopting in-memory computation, processing multi-task BERT on the ReRAM-based accelerator drastically increases the overall area because each task requires its own fine-tuned model. In this paper, we propose a framework for area-efficient multi-task BERT execution on the ReRAM-based accelerator. First, we decompose the fine-tuned model of each task by utilizing the base model. Then, we propose a two-stage weight compressor, which shrinks the decomposed models by analyzing the properties of the ReRAM-based accelerator. We also present a profiler to generate hyper-parameters for the proposed compressor. 
By sharing the base model and compressing the decomposed models, the proposed framework successfully reduces the total area of the ReRAM-based accelerator without any additional training procedure. It achieves 0.26x the area of the baseline while maintaining algorithmic performance.<img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-full wp-image-132005\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_1.png\" alt=\"\" width=\"636\" height=\"422\" title=\"\" srcset=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_1.png 636w, http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_1-300x199.png 300w\" sizes=\"(max-width: 636px) 100vw, 636px\" \/><img decoding=\"async\" class=\"alignnone size-full wp-image-132007\" src=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_2.png\" alt=\"\" width=\"638\" height=\"200\" title=\"\" srcset=\"http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_2.png 638w, http:\/\/ee.presscat.kr\/wp-content\/uploads\/2022\/07\/\uae40\uc774\uc12d4_2-300x94.png 300w\" sizes=\"(max-width: 638px) 100vw, 638px\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>877<\/p>\n","protected":false},"featured_media":0,"template":"","class_list":["post-132004","ai-in-circuit","type-ai-in-circuit","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-circuit\/132004","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-circuit"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/ai-in-circuit"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=132004"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}