Enabling On-Device Smartphone GPU based Training: Lessons Learned

Das, A; Kwon, YD; Chauhan, J; Mascolo, C

Enabling On-Device Smartphone GPU based Training: Lessons Learned

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/338356

Repository DOI

https://doi.org/10.17863/CAM.85765

Files

Accepted version (365.77 KB)

Type

Article

Authors

Das, A

Kwon, YD

Chauhan, J

Mascolo, C

Abstract

Deep Learning (DL) has shown impressive performance in many mobile applications. Most existing works have focused on reducing the computational and resource overheads of running Deep Neural Networks (DNN) inference on resource-constrained mobile devices. However, the other aspect of DNN operations, i.e. training (forward and backward passes) on smartphone GPUs, has received little attention thus far. To this end, we conduct an initial analysis to examine the feasibility of on-device training on smartphones using mobile GPUs. We first employ the open-source mobile DL framework (MNN) and its OpenCL backend for running compute kernels on GPUs. Next, we observed that training on CPUs is much faster than on GPUs and identified two possible bottlenecks related to this observation: (i) computation and (ii) memory bottlenecks. To solve the computation bottleneck, we optimize the OpenCL backend's kernels, showing 2x improvements (40-70 GFLOPs) over CPUs (15-30 GFLOPs) on the Snapdragon 8 series processors. However, we find that the full DNN training is still much slower on GPUs than on CPUs, indicating that memory bottleneck plays a significant role in the lower performance of GPU over CPU. The data movement takes almost 91% of training time due to the low bandwidth. Lastly, based on the findings and failures during our investigation, we present limitations and practical guidelines for future directions.

Keywords

GPU, Training, Smartphones, OpenCL

Journal Title

2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events, PerCom Workshops 2022

Conference Name

2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)

Volume Title

00

Publisher

IEEE

Publisher DOI

https://doi.org/10.1109/PerComWorkshops53856.2022.9767442

Rights

Collections

Cambridge University Research Outputs