1D-FALCON: Accelerating Deep Convolutional Neural Network Inference by Co-optimization of Models and Underlying Arithmetic Implementation

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Maji, PP 

Abstract

Deep convolutional neural networks (CNNs), which are at the heart of many emerging applications, achieve remarkable performance in audio and visual recognition tasks at the expense of high computational complexity, limiting their deployability. In modern CNNs, convolutional layers typically consume around 90% of the processing time during a forward inference, and accelerating these layers is of great research and commercial interest. In this paper, we examine the effects of co-optimizing the internal structures of convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a large impact on the overall speed of a CNN, achieving a tenfold speed-up over the baseline. We also introduce a new class of fast 1-D convolutions for CNNs based on the Toom-Cook algorithm. We show that our proposed scheme is mathematically well grounded and robust, does not require any time-consuming retraining, and achieves its speed-ups solely from the convolutional layers with no loss in baseline accuracy.
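To illustrate the idea behind Toom-Cook fast convolution, the sketch below shows the well-known F(2,3) case: computing two outputs of a 3-tap 1-D correlation with 4 multiplications instead of the 6 required by the direct method. This is a minimal illustrative example, not the paper's actual implementation; the function names and tile size are assumptions for the sketch.

```python
# Illustrative sketch (hypothetical, not the paper's code): Toom-Cook /
# Winograd F(2,3) fast 1-D convolution. Two outputs of a 3-tap filter
# are computed from a 4-element input tile using 4 multiplications
# rather than the 6 needed by the direct method.

def direct_f23(d, g):
    # Direct 1-D correlation: 6 multiplications for two outputs.
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]

def toom_cook_f23(d, g):
    # Filter transform (can be precomputed once per filter, so its
    # cost amortizes across all input tiles).
    G0 = g[0]
    G1 = (g[0] + g[1] + g[2]) / 2.0
    G2 = (g[0] - g[1] + g[2]) / 2.0
    G3 = g[2]
    # Input transform followed by 4 element-wise multiplications.
    m0 = (d[0] - d[2]) * G0
    m1 = (d[1] + d[2]) * G1
    m2 = (d[2] - d[1]) * G2
    m3 = (d[1] - d[3]) * G3
    # Output (inverse) transform recovers the two convolution outputs.
    return [m0 + m1 + m2, m1 - m2 - m3]

d = [1.0, 2.0, 3.0, 4.0]   # input tile
g = [0.5, -1.0, 2.0]       # 3-tap filter
print(direct_f23(d, g))     # [4.5, 6.0]
print(toom_cook_f23(d, g))  # [4.5, 6.0] -- identical result, fewer multiplies
```

A full convolutional layer would slide such tiles across the input with a stride of 2 outputs per tile, reusing the transformed filter throughout.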

Keywords

convolutional neural network, deep learning, computational optimization, hardware implementation

Journal Title

Artificial Neural Networks and Machine Learning – ICANN 2017

Conference Name

The 26th International Conference on Artificial Neural Networks 2017

Journal ISSN

0302-9743
1611-3349

Publisher

Springer