The processing-in-memory (PIM) approach, which combines memory and processor, is a promising way to address the memory wall problem. NAND flash memory, which is widely adopted in edge devices, is one of the promising platforms for PIM because of its high density and its intrinsic ability to perform analog vector-matrix multiplication. Despite this potential, the domain conversion process, which converts an analog current to a digital value, accounts for most of the energy consumption of a NAND flash-based accelerator. This overhead restricts the use of NAND flash memory for PIM compared with other platforms. In this article, we propose a NAND flash-based DNN accelerator that achieves both high memory density and high energy efficiency relative to other platforms. As NAND flash memory already offers higher memory density than other memory platforms, we aim to improve energy efficiency by reducing the burden of the domain conversion process. First, we optimize the bit width of partial multiplication by taking the analog-to-digital converter (ADC) resource into account. For further optimization, we propose a methodology that exploits the many zero-valued partial multiplication results to improve both energy efficiency and throughput. The proposed work successfully exploits the bit-level sparsity of DNNs, achieving up to 8.6×/8.2× higher energy efficiency/throughput over the provisioned baseline.
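To make the idea of bit-level sparsity concrete, the following is a minimal sketch, not the accelerator's actual dataflow. It assumes unsigned integer inputs and weights, 1-bit by 1-bit partial multiplications, and models each ADC conversion as a simple counter; the function name `bit_serial_dot` and the 4-bit widths are hypothetical choices made for illustration. It shows that a dot product can be rebuilt from bit-plane partial sums, and that partial sums equal to zero contribute nothing and are therefore candidates for skipping.

```python
def bit_serial_dot(inputs, weights, in_bits=4, w_bits=4):
    """Dot product built from 1-bit x 1-bit partial sums.

    Returns (total, conversions, skipped), where `conversions` counts
    the nonzero partial sums (each modeled as one ADC conversion) and
    `skipped` counts the zero partial sums that need no conversion.
    Illustrative assumption: values are unsigned and fit in the given
    bit widths.
    """
    total = 0
    conversions = 0
    skipped = 0
    for i in range(in_bits):          # bit position of the inputs
        for j in range(w_bits):       # bit position of the weights
            # Stand-in for the accumulated analog current: the number of
            # positions where input bit i and weight bit j are both 1.
            partial = sum(((x >> i) & 1) & ((w >> j) & 1)
                          for x, w in zip(inputs, weights))
            if partial == 0:
                skipped += 1          # zero result: nothing to convert
                continue
            conversions += 1
            total += partial << (i + j)   # shift-and-add recombination
    return total, conversions, skipped


result = bit_serial_dot([3, 0, 5, 1], [2, 7, 0, 4])
print(result)  # (10, 3, 13)
```

In this small example only 3 of the 16 bit-plane partial sums are nonzero, yet the recombined total matches the ordinary dot product. How zero results are detected in hardware and how the partial-multiplication bit width is chosen are outside the scope of this sketch.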
Bibliographical note: Publisher Copyright © 1968-2012 IEEE.
- Deep neural network
- bit-level sparsity