TY - JOUR
T1 - S-FLASH
T2 - A NAND Flash-Based Deep Neural Network Accelerator Exploiting Bit-Level Sparsity
AU - Kang, Myeonggu
AU - Kim, Hyeonuk
AU - Shin, Hyein
AU - Sim, Jaehyeong
AU - Kim, Kyeonghan
AU - Kim, Lee Sup
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2022/6/1
Y1 - 2022/6/1
AB - The processing-in-memory (PIM) approach, which combines memory and processing, is a promising solution to the memory wall problem. NAND flash memory, widely adopted in edge devices, is one of the most promising platforms for PIM thanks to its high density and its intrinsic ability to perform analog vector-matrix multiplication. Despite this potential, the domain conversion process, which converts an analog current into a digital value, accounts for most of the energy consumption of a NAND flash-based accelerator and restricts the use of NAND flash memory for PIM compared to other platforms. In this article, we propose a NAND flash-based DNN accelerator that achieves both high memory density and high energy efficiency. Since NAND flash memory already offers higher memory density than other memory platforms, we focus on enhancing energy efficiency by reducing the burden of the domain conversion process. First, we optimize the bit width of partial multiplications while accounting for the available analog-to-digital converter (ADC) resources. For further optimization, we propose a methodology that exploits the many zero-valued partial multiplication results to improve both energy efficiency and throughput. The proposed work successfully exploits the bit-level sparsity of DNNs, achieving up to 8.6×/8.2× higher energy efficiency/throughput than the provisioned baseline.
KW - Deep neural network
KW - bit-level sparsity
KW - processing-in-memory
UR - http://www.scopus.com/inward/record.url?scp=85106763799&partnerID=8YFLogxK
U2 - 10.1109/TC.2021.3082003
DO - 10.1109/TC.2021.3082003
M3 - Article
AN - SCOPUS:85106763799
SN - 0018-9340
VL - 71
SP - 1291
EP - 1304
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 6
ER -