S-FLASH: A NAND Flash-Based Deep Neural Network Accelerator Exploiting Bit-Level Sparsity

Myeonggu Kang, Hyeonuk Kim, Hyein Shin, Jaehyeong Sim, Kyeonghan Kim, Lee Sup Kim

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

The processing-in-memory (PIM) approach, which combines memory and processor, appears to address the memory wall problem. NAND flash memory, widely adopted in edge devices, is a promising PIM platform because of its high density and its intrinsic ability to perform analog vector-matrix multiplication. Despite this potential, the domain conversion process, which converts an analog current to a digital value, accounts for most of the energy consumption in a NAND flash-based accelerator. This restricts the use of NAND flash memory for PIM compared with other platforms. In this article, we propose a NAND flash-based deep neural network (DNN) accelerator that achieves both large memory density and high energy efficiency among various platforms. Because NAND flash memory already offers higher memory density than other memory platforms, we aim to enhance energy efficiency by reducing the burden of the domain conversion process. First, we optimize the bit width of partial multiplication by considering the analog-to-digital converter (ADC) resource. For further optimization, we propose a methodology that exploits the many zero-valued partial multiplication results to enhance both energy efficiency and throughput. The proposed work successfully exploits the bit-level sparsity of DNNs, achieving up to 8.6× higher energy efficiency and 8.2× higher throughput than the provisioned baseline.
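The bit-level sparsity idea in the abstract can be illustrated with a small software sketch. This is not the authors' architecture. It is a minimal model that assumes unsigned integer activations are fed one bit plane at a time, that each bit plane yields one partial sum needing one analog-to-digital conversion, and that an all-zero bit plane can be detected beforehand and skipped. The names `bit_planes` and `bit_serial_dot` are hypothetical.

```python
def bit_planes(values, bits):
    """Split unsigned integers into per-bit planes, least significant bit first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def bit_serial_dot(activations, weights, bits=8):
    """Dot product as a sum of shifted partial sums, one per activation bit plane.

    Returns (result, conversions_done, conversions_skipped). A "conversion"
    stands in for one ADC operation in this simplified model (an assumption,
    not a detail taken from the paper).
    """
    total = 0
    done = 0
    skipped = 0
    for b, plane in enumerate(bit_planes(activations, bits)):
        if not any(plane):
            # Every input bit in this plane is zero, so the partial sum is
            # known to be zero and no conversion is needed.
            skipped += 1
            continue
        partial = sum(bit * w for bit, w in zip(plane, weights))
        total += partial << b  # weight the partial sum by its bit position
        done += 1
    return total, done, skipped


if __name__ == "__main__":
    acts = [3, 0, 1, 2]      # small values leave the upper bit planes empty
    wts = [5, 7, 2, 4]
    print(bit_serial_dot(acts, wts, bits=8))
```

In this toy input only the two lowest bit planes contain a one, so six of the eight conversions are skipped while the result still matches the ordinary dot product.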

Original language: English
Pages (from-to): 1291-1304
Number of pages: 14
Journal: IEEE Transactions on Computers
Volume: 71
Issue number: 6
DOIs
State: Published - 1 Jun 2022

Bibliographical note

Publisher Copyright:
© 1968-2012 IEEE.

Keywords

  • Deep neural network
  • bit-level sparsity
  • processing-in-memory
