"DPIM: A 19.36 TOPS/W 2T1C eDRAM Transformer-in-Memory Chip with Sparsity-Aware Quantization and Heterogeneous Dense-Sparse Core," IEEE European Solid-State Electronics Research Conference, Sep. 2024 Accept (김주영 교수 연구실)

Donghyuk Kim, Jae-Young Kim, Hyunjun Cho, Seungjae Yoo, Sukjin Lee, Sungwoong Yune, Hoichang Jeong, Keonhee Park, Ki-Soo Lee, Jongchan Lee, Chanheum Han, Gunmo Koo, Yuli Han, Jaejin Kim, Jaemin Kim, Kyuho Lee, Joo-Hyung Chae, Kunhee Cho, and Joo-Young Kim, “DPIM: A 19.36 TOPS/W 2T1C eDRAM Transformer-in-Memory Chip with Sparsity-Aware Quantization and Heterogeneous Dense-Sparse Core,” IEEE European Solid-State Electronics Research Conference, Sep. 2024

Abstract: This paper presents DPIM, the first 2T1C eDRAM Transformer-in-memory chip. Its high-density eDRAM cell supports large-capacity processing-in-memory (PIM) macros of 1.38 Mb/mm², reducing external memory access. DPIM applies a sparsity-aware quantization scheme to all layers of the Transformer, quantizing the model to 8-bit integer (INT8) with a minimal accuracy drop of 2% for the BERT-large model on the GLUE dataset while increasing the bit-slice sparsity ratio of both weights and activations from dense matrices to 83.3% and 88.4%, respectively. Its heterogeneous PIM macro supports both intensive dense matrix multiplications and sparse matrix multiplications ranging from extreme to moderate sparsity, with a peak throughput of 3.03-12.12 TOPS, raising energy efficiency to 4.84-19.36 TOPS/W.
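To make the term "bit-slice sparsity ratio" concrete, the following minimal Python sketch counts the fraction of all-zero bit slices in a list of INT8 values. It is an illustration only, not the DPIM implementation: the 4-bit slice width, the two's-complement encoding, and the function names `bit_slices` and `bit_slice_sparsity` are assumptions that the abstract does not specify.

```python
def bit_slices(value, slice_bits=4, total_bits=8):
    """Split a signed integer into unsigned bit slices, least significant first.

    Assumption: two's-complement encoding and a 4-bit slice width.
    The abstract does not state which encoding or slice width DPIM uses.
    """
    low, high = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in INT{total_bits}")
    unsigned = value & ((1 << total_bits) - 1)  # two's-complement bit pattern
    mask = (1 << slice_bits) - 1
    return [(unsigned >> shift) & mask for shift in range(0, total_bits, slice_bits)]


def bit_slice_sparsity(values, slice_bits=4, total_bits=8):
    """Return the fraction of bit slices that are entirely zero."""
    slices = [s for v in values for s in bit_slices(v, slice_bits, total_bits)]
    if not slices:
        raise ValueError("values must not be empty")
    return sum(1 for s in slices if s == 0) / len(slices)


if __name__ == "__main__":
    # Hypothetical INT8 values: small magnitudes leave the upper slice at zero.
    sample = [0, 1, 16, 17]
    print(bit_slice_sparsity(sample))  # 0.5 (4 of the 8 slices are zero)
```

Under this two's-complement assumption, small negative values have an all-ones upper slice rather than a zero one, so the measured ratio depends on the number format as well as on the value distribution.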

