{"id":122899,"date":"2022-02-15T22:51:02","date_gmt":"2022-02-15T13:51:02","guid":{"rendered":"http:\/\/175.125.95.178\/ai-in-circuit\/22899\/"},"modified":"2026-04-05T23:34:22","modified_gmt":"2026-04-05T14:34:22","slug":"22899","status":"publish","type":"ai-in-circuit","link":"http:\/\/ee.presscat.kr\/en\/ai-in-circuit\/22899\/","title":{"rendered":"Z-PIM: A Sparsity-Aware Processing-In-Memory Architecture with Fully-Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks"},"content":{"rendered":"<p class=\"MsoNoSpacing\" style=\"text-align:justify\">\n<div class=\"\"><img decoding=\"async\" class=\"\" src=\"\/wp-content\/uploads\/drupal\/\uae40\uc8fc\uc6011.png\" alt=\"\" title=\"\"><\/div>\n<\/p>\n<p class=\"MsoNoSpacing\" style=\"text-align:justify\"><span style=\"font-size:10pt\"><span><span lang=\"EN-US\" style=\",sans-serif\">We present an energy-efficient processing-in-memory (PIM) architecture named Z-PIM that supports both sparsity handling and fully-variable weight bit-precision for energy-efficient deep neural networks. Z-PIM adopts bit-serial arithmetic, which performs a multiplication bit by bit over multiple cycles to reduce the complexity of each cycle's operation and to provide flexibility in bit-precision. To this end, it employs a zero-skipping convolution SRAM, which performs in-memory AND operations based on custom 8T-SRAM cells along with channel-wise accumulation, and a diagonal accumulation SRAM, which performs bit-wise and spatial-wise accumulation on the channel-wise results using diagonal logic and adders to produce the final convolution outputs. We propose a hierarchical bit-line structure for energy-efficient weight bit pre-charging and computational read-out that reduces the parasitic capacitances of the bit-lines. Its charge-reuse scheme reduces the switching rate by 95.42% for the convolution layers of the VGG-16 model. 
In addition, Z-PIM&#8217;s channel-wise data mapping enables sparsity handling by skip-reading input channels with zero weights. Its read-operation pipelining, enabled by read-sequence scheduling, improves throughput by 66.1%. The Z-PIM chip is fabricated in a 65-nm CMOS process on a 7.568-mm² die and consumes 5.294 mW on average at 1.0 V and 200 MHz. It achieves 0.31-49.12 TOPS\/W energy efficiency for convolution operations as the weight sparsity and bit-precision vary from 0.1 to 0.9 and from 1 bit to 16 bits, respectively. For a figure of merit considering input bit-width, weight bit-width, and energy efficiency, Z-PIM shows more than a 2.1-times improvement over state-of-the-art PIM implementations.<\/span><\/span><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>756<\/p>\n","protected":false},"featured_media":0,"template":"","class_list":["post-122899","ai-in-circuit","type-ai-in-circuit","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-circuit\/122899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/ai-in-circuit"}],"about":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/types\/ai-in-circuit"}],"wp:attachment":[{"href":"http:\/\/ee.presscat.kr\/en\/wp-json\/wp\/v2\/media?parent=122899"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}