Distributed Video Processing Using Deep Learning on Networked Devices

Deep Learning, Edge Computing

The widespread adoption of camera-equipped mobile devices has greatly accelerated the creation and distribution of videos. Videos are a rich source of information and can be exploited for on-demand information retrieval. Deep learning with Convolutional Neural Networks (CNNs) is the state-of-the-art computer vision technique for such retrieval. However, because CNN-based video processing is computationally expensive, processing all videos at a centralized entity is either infeasible or too costly, given the large video collections that are common in the big data era. The aim of this research is therefore to provide distributed video processing platforms for on-demand information retrieval using deep learning on big video data, with a focus on designing and implementing systems for different application scenarios.

NetVision

First, we consider on-demand information retrieval from videos stored across a wireless network, where network devices may have different computation capabilities; e.g., computers with a much more powerful GPU can significantly accelerate video processing using deep learning. An example application scenario is a wireless surveillance system, where camera devices and more computationally capable computers are deployed to record, store, and process videos.

We have designed and built NetVision, a distributed video processing system using deep learning in a wireless network. NetVision takes information queries, filters stored videos based on metadata, performs CNN-based video processing, and balances the computing workload among network devices through computation offload (i.e., videos can be uploaded to more computationally capable devices). We designed an algorithm that determines a near-optimal video allocation to minimize the overall video processing time, considering the limitations of wireless channels. We have implemented and evaluated NetVision on a small testbed and built an emulation environment of both real and virtual devices for large-scale experiments.
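To make the allocation problem concrete, the following is a minimal sketch of a greedy longest-first heuristic. It is not the NetVision algorithm. The `Device` and `Video` types, the `fps` processing rate, and the single `bandwidth_mbps` value are all hypothetical names introduced for illustration. The model is deliberately simplified: a transfer delays only the receiving device, and contention on the shared wireless channel is ignored.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    fps: float  # hypothetical: frames the device's CNN can process per second


@dataclass
class Video:
    name: str
    frames: int
    size_mb: float
    source: str  # name of the device that currently stores the video


def allocate(videos, devices, bandwidth_mbps):
    """Greedily assign each video to the device that would finish it earliest.

    Assumption: uploading a video costs size / bandwidth and is charged to the
    receiving device's timeline; channel sharing is not modeled.
    Returns (plan, makespan), where plan maps video name -> device name.
    """
    finish = {d.name: 0.0 for d in devices}  # current finish time per device
    plan = {}
    # Place the most expensive videos first (longest-first heuristic).
    for v in sorted(videos, key=lambda v: v.frames, reverse=True):
        best_name, best_time = None, float("inf")
        for d in devices:
            # No transfer cost if the video is already stored on this device.
            transfer = 0.0 if d.name == v.source else v.size_mb * 8 / bandwidth_mbps
            t = finish[d.name] + transfer + v.frames / d.fps
            if t < best_time:
                best_name, best_time = d.name, t
        finish[best_name] = best_time
        plan[v.name] = best_name
    return plan, max(finish.values())


# Illustrative usage with made-up numbers (not measurements from the papers).
devices = [Device("camera", fps=5.0), Device("server", fps=50.0)]
videos = [
    Video("a", frames=3000, size_mb=100.0, source="camera"),
    Video("b", frames=300, size_mb=10.0, source="camera"),
]
plan, makespan = allocate(videos, devices, bandwidth_mbps=80.0)
print(plan, makespan)
```

In this toy setup the long video is worth uploading to the faster device, while the short one is processed where it is stored because the faster device is already busy. A fuller treatment would have to account for the limited capacity of the wireless channel, which this sketch leaves out.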

CrowdVision

Next, we consider an application scenario involving large-scale video collections, e.g., municipal agencies asking the public to help identify terrorists or criminals by scanning their own videos that may have captured suspects, as the FBI did after the Boston Marathon bombing. We propose a more automated way to handle this using deep learning. We also allow individuals to process some of the data themselves rather than having a centralized entity process everything, which we call crowdprocessing.

We designed and built CrowdVision, a system that enables smartphones to crowdprocess videos using the deep learning framework Caffe in a distributed and energy-efficient way, leveraging cloud offload under different network connections. CrowdVision is built as a computing platform to quickly and efficiently retrieve information from big video data. It breaks down the workflow of video processing using Caffe, characterizes the computation of CNNs, considers the data and energy usage constraints imposed by users, and determines whether and at which step to offload computation so as to optimize performance. We have implemented CrowdVision as an Android app and evaluated it on an off-the-shelf smartphone to confirm the performance gains for video crowdprocessing.
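The offload decision described above can be illustrated with a small sketch. This is an assumption-laden simplification, not CrowdVision's actual decision logic: it supposes that each candidate strategy already has an estimated completion time, energy cost, and data usage, and picks the fastest strategy that stays within the user's budgets. The `Option` type, the strategy names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Option:
    name: str        # hypothetical strategy label
    time_s: float    # estimated completion time
    energy_j: float  # estimated energy drawn from the phone's battery
    data_mb: float   # estimated data sent over the network


def choose_offload(
    options: Iterable[Option], energy_budget_j: float, data_budget_mb: float
) -> Optional[Option]:
    """Return the fastest option that respects both user-imposed budgets.

    Returns None when no option is feasible.
    """
    feasible = [
        o
        for o in options
        if o.energy_j <= energy_budget_j and o.data_mb <= data_budget_mb
    ]
    return min(feasible, key=lambda o: o.time_s, default=None)


# Illustrative usage; the estimates below are made up, not measured.
options = [
    Option("process_locally", time_s=120.0, energy_j=300.0, data_mb=0.0),
    Option("extract_frames_then_offload", time_s=60.0, energy_j=150.0, data_mb=40.0),
    Option("offload_whole_video", time_s=45.0, energy_j=90.0, data_mb=200.0),
]
choice = choose_offload(options, energy_budget_j=200.0, data_budget_mb=50.0)
print(choice.name if choice else "no feasible option")
```

With a tight data budget, uploading the whole video is ruled out even though it is the fastest option, so offloading after an intermediate step wins. In practice the estimates would depend on the network connection and on the device.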

Publications

[ICNP'16] On-demand Video Processing in Wireless Networks

Zongqing Lu, Kevin Chan, Rahul Urgaonkar, and Thomas La Porta

In Proceedings of IEEE International Conference on Network Protocols (ICNP), November 8-11, 2016.

(Acceptance Rate: 20%=⁴⁶⁄₂₂₉)

[MM'17] Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices

Zongqing Lu, Swati Rallapalli, Kevin Chan, and Thomas La Porta

In Proceedings of ACM International Conference on Multimedia (MM), October 23-27, 2017.

(Acceptance Rate: 28%=¹⁹¹⁄₆₇₅)

[INFOCOM'18] A Computing Platform for Video Crowdprocessing Using Deep Learning

Zongqing Lu, Kevin Chan, and Thomas La Porta

In Proceedings of IEEE International Conference on Computer Communications (INFOCOM), April 15-19, 2018.

(Acceptance Rate: 19%=³⁰⁹⁄₁₆₀₆)

[TMC] CrowdVision: A Computing Platform for Video Crowdprocessing Using Deep Learning

Zongqing Lu, Kevin Chan, Shiliang Pu, and Thomas La Porta

IEEE Transactions on Mobile Computing, vol. 18, no. 7, pp. 1513-1526, 2019.

[TON] NetVision: On-demand Video Processing in Wireless Networks

Zongqing Lu, Kevin Chan, Rahul Urgaonkar, Shiliang Pu, and Thomas La Porta

IEEE/ACM Transactions on Networking, vol. 28, no. 1, pp. 196-209, 2020.

[TMC] Augur: Modeling the Resource Requirements of ConvNets on Mobile Devices

Zongqing Lu, Swati Rallapalli, Kevin Chan, Shiliang Pu, and Thomas La Porta

IEEE Transactions on Mobile Computing, vol. 20, no. 2, pp. 352-365, 2021.

[TMC] PicSys: Energy-Efficient Fast Image Search on Distributed Mobile Networks

Noor Felemban, Fidan Mehmeti, Hana Khamfroush, Zongqing Lu, Swati Rallapalli, Kevin Chan, and Thomas La Porta

IEEE Transactions on Mobile Computing, vol. 20, no. 4, pp. 1574-1589, 2021.