AI acceleration has recently become critical for hardware systems ranging from mobile/edge devices to high-performance data servers. For on-device AI, many studies have reduced hardware numerical precision to lower hardware complexity, given the limited hardware resources of mobile/edge devices [1-2]. However, if taken too far, aggressive quantization at low precision can degrade model accuracy, which is the fundamental measure of deep learning quality. Layer-wise mixed-precision networks, in which a different precision scheme is applied to each layer, can reduce off-chip memory bandwidth requirements as well as hardware complexity while retaining model accuracy. However, determining the optimal precision scheme for each layer takes a long time because the network must be retrained repeatedly under different precision configurations. Moreover, if only the accuracy of the target neural network is considered when determining the mixed-precision configuration, it is difficult to sufficiently reduce computational complexity, which matters most on mobile/edge devices.
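The layer-wise mixed-precision idea described above can be illustrated with a minimal sketch: each layer's weights are quantized to its own bit-width, trading per-layer accuracy against memory footprint. The `quantize` helper, the layer shapes, and the `bit_config` assignment below are all hypothetical illustrations, not the paper's method.

```python
import numpy as np

def quantize(weights, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8-bit
    scale = np.max(np.abs(weights)) / qmax     # per-tensor scale factor
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale                           # dequantized approximation

rng = np.random.default_rng(0)
# Hypothetical 3-layer network with a layer-wise bit-width assignment.
layers = {"conv1": rng.standard_normal((16, 3, 3, 3)),
          "conv2": rng.standard_normal((32, 16, 3, 3)),
          "fc":    rng.standard_normal((10, 512))}
bit_config = {"conv1": 8, "conv2": 4, "fc": 6}  # mixed precision per layer

for name, w in layers.items():
    w_q = quantize(w, bit_config[name])
    err = np.abs(w - w_q).mean()
    print(f"{name}: {bit_config[name]}-bit, mean abs quantization error {err:.4f}")

# Weight-memory footprint relative to a uniform 32-bit baseline:
total_params = sum(w.size for w in layers.values())
mixed_bits = sum(layers[n].size * b for n, b in bit_config.items())
print(f"weight memory: {mixed_bits / (32 * total_params):.1%} of FP32")
```

Finding a good `bit_config` is exactly the expensive search the abstract refers to: each candidate assignment must be evaluated (typically by retraining) before its accuracy/complexity trade-off is known.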
Title of host publication: Proceedings - IEEE International Symposium on Circuits and Systems (ISCAS 2022)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Published: 2022
Event: 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022 - Austin, United States, 27 May 2022 → 1 Jun 2022
Bibliographical note: Publisher Copyright © 2022 IEEE.