Field | Value | Language
dc.contributor.author | Maji, Partha | en
dc.contributor.author | Mullins, Robert | en
dc.date.accessioned | 2017-07-03T09:32:33Z |
dc.date.available | 2017-07-03T09:32:33Z |
dc.date.issued | 2017-10-17 | en
dc.identifier.uri | https://www.repository.cam.ac.uk/handle/1810/265122 |
dc.description.abstract | Breakthroughs from the field of deep learning are radically changing how sensor data are interpreted to extract important information to help advance healthcare, make our cities smarter, and innovate in smart home technology. Deep convolutional neural networks, which are at the heart of many emerging Internet-of-Things (IoT) applications, achieve remarkable performance in audio and visual recognition tasks, at the expense of high computational complexity in convolutional layers, limiting their deployability. In this paper, we present an easy-to-implement acceleration scheme, named ADaPT, which can be applied to already available pre-trained networks. Our proposed technique exploits redundancy present in the convolutional layers to reduce computation and storage requirements. Additionally, we also decompose each convolution layer into two consecutive one-dimensional stages to make full use of the approximate model. This technique can easily be applied to existing low power processors, GPUs or new accelerators. We evaluated this technique using four diverse and widely used benchmarks, on hardware ranging from embedded CPUs to server GPUs. Our experiments show an average 3-5x speed-up in all deep models and a maximum 8-9x speed-up on many individual convolutional layers. We demonstrate that unlike iterative pruning based methodology, our approximation technique is mathematically well grounded, robust, does not require any time-consuming retraining, and still achieves speed-ups solely from convolutional layers with no loss in baseline accuracy. |
dc.language.iso | en | en
dc.publisher | Association for Computing Machinery |
dc.subject | Convolutional Neural Networks | en
dc.subject | Deep Learning | en
dc.subject | Internet of Things (IoT) | en
dc.subject | Sensors and Hardware Programming | en
dc.title | ADaPT: optimizing CNN inference on IoT and mobile devices using approximately separable 1-D kernels | en
dc.type | Conference Object |
prism.number | Article 43 | en
prism.publicationDate | 2017 | en
prism.publicationName | IML '17 Proceedings of the 1st International Conference on Internet of Things and Machine Learning | en
dc.identifier.doi | 10.17863/CAM.10692 |
dcterms.dateAccepted | 2017-04-20 | en
rioxxterms.versionofrecord | 10.1145/3109761.3109804 | en
rioxxterms.version | AM | en
rioxxterms.licenseref.uri | http://www.rioxx.net/licenses/all-rights-reserved | en
rioxxterms.licenseref.startdate | 2017-10-17 | en
dc.contributor.orcid | Maji, Partha [0000-0002-1919-1228] |
dc.contributor.orcid | Mullins, Robert [0000-0002-8393-2748] |
rioxxterms.type | Conference Paper/Proceeding/Abstract | en
pubs.conference-name | ACM International Conference on Internet of Things and Machine Learning | en
pubs.conference-start-date | 2017-10-17 | en
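
Illustrative note on the abstract: the idea of replacing a 2-D convolution with two consecutive 1-D stages can be sketched as follows. This is a minimal pure-Python sketch of a generic rank-1 (separable) kernel approximation, not the ADaPT procedure from the paper, whose details are not given in this record. The function names (rank1_factors, conv2d, separable_conv2d), the power-iteration factorisation, the Sobel test kernel, and the synthetic image are all assumptions chosen for illustration.

```python
# Hypothetical sketch: approximate a 2-D kernel by the outer product of a
# column vector and a row vector, then convolve in two 1-D passes.
# Not the paper's implementation; all names are illustrative.

def rank1_factors(kernel, iters=50):
    """Return (col, row) with kernel ~= outer(col, row), via power iteration.

    Assumes a non-zero kernel. The start vector is the kernel row with the
    largest norm, which is a heuristic choice for this sketch.
    """
    n_rows, n_cols = len(kernel), len(kernel[0])
    v = list(max(kernel, key=lambda r: sum(x * x for x in r)))
    for _ in range(iters):
        # u = K v
        u = [sum(kernel[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # v = K^T u, normalised to unit length
        v = [sum(kernel[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    col = [sum(kernel[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    return col, v


def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation, as used in CNN convolution layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def separable_conv2d(image, col, row):
    """Two consecutive 1-D stages: a vertical pass, then a horizontal pass."""
    vertical = conv2d(image, [[c] for c in col])  # kh x 1 kernel
    return conv2d(vertical, [row])                # 1 x kw kernel


# Sobel is exactly rank 1, so the two 1-D stages should reproduce the
# full 2-D result up to floating-point rounding.
sobel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
image = [[(3 * y + 5 * x * x) % 7 for x in range(6)] for y in range(5)]

col, row = rank1_factors(sobel)
full = conv2d(image, sobel)
fast = separable_conv2d(image, col, row)
max_err = max(abs(a - b) for ra, rb in zip(full, fast) for a, b in zip(ra, rb))
print(max_err)
```

For a kh x kw kernel, the direct form costs kh * kw multiplications per output value, while the two 1-D stages cost kh + kw (9 versus 6 for a 3 x 3 kernel). For kernels that are only approximately rank 1, the two-stage result carries an approximation error that this sketch does not attempt to bound.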