Characterizing Sources of Ineffectual Computations in Deep Learning Networks

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Nikolic, M 
Mahmoud, M 
Moshovos, A 
Zhao, Y 

Abstract

Hardware accelerators for inference with neural networks can take advantage of the properties of the data they process. Performance gains and reduced memory bandwidth during inference have been demonstrated by using narrower data types [1], [2] and by exploiting the ability to skip and compress values that are zero [3]-[6]. Similarly useful properties have been identified at a lower level, such as varying precision requirements [7] and bit-level sparsity [8], [9]. To date, the analysis of these potential sources of superfluous computation and communication has been constrained to a small number of older Convolutional Neural Networks (CNNs) used for image classification, and it is an open question whether they exist more broadly. This paper aims to determine whether these properties persist in: (1) more recent and thus more accurate and better-performing image classification networks, (2) models for image applications other than classification, such as image segmentation and low-level computational imaging, (3) Long Short-Term Memory (LSTM) models for non-image applications such as natural language processing, and (4) quantized image classification models. We demonstrate that such properties persist and discuss the implications and opportunities for future accelerator designs.
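
As an illustrative sketch only (not the paper's methodology; the function names and the 8-bit quantization step are assumptions made for this example), two of the properties mentioned in the abstract, value-level sparsity and bit-level sparsity, could be estimated for an activation tensor as follows in Python:

import numpy as np

def value_sparsity(tensor):
    """Fraction of elements that are exactly zero (i.e. skippable values)."""
    return float(np.mean(tensor == 0))

def bit_sparsity(tensor, bits=8):
    """Fraction of zero bits after quantizing magnitudes to an unsigned
    fixed-point representation of the given width (ineffectual bits)."""
    scale = float(np.max(np.abs(tensor)))
    if scale == 0.0:
        return 1.0  # all-zero tensor: every bit is ineffectual
    q = np.round(np.abs(tensor) / scale * (2 ** bits - 1)).astype(np.uint64)
    ones = sum(bin(int(v)).count("1") for v in q.ravel())
    return 1.0 - ones / (q.size * bits)

if __name__ == "__main__":
    # ReLU-style activations: roughly half the values are zero by construction.
    acts = np.maximum(np.random.randn(1, 64, 32, 32), 0)
    print(f"value sparsity:       {value_sparsity(acts):.2%}")
    print(f"bit sparsity (8-bit): {bit_sparsity(acts):.2%}")

Counts of this kind motivate accelerator features such as zero skipping and bit-serial arithmetic, since zero values and zero bits contribute nothing to the final result.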

Description

Keywords

46 Information and Computing Sciences, 4611 Machine Learning

Journal Title

Proceedings - 2019 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2019

Conference Name

2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Journal ISSN

Volume Title

Publisher

IEEE

Rights

All rights reserved