Delivering Deep Learning to Mobile Devices via Offloading

Similar documents
DeepDecision: A Mobile Deep Learning Framework for Edge Video Analytics

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Fog/Edge Computing Platform : Enabling Low-Latency Application in Next Generation Network

Yiqi Yan. May 10, 2017

YOLO9000: Better, Faster, Stronger

ECOS: Practical Mobile Application Offloading for Enterprises. Aaron Gember, Chris Dragga, Aditya Akella University of Wisconsin-Madison

Faculté Polytechnique

Deep Learning with Tensorflow AlexNet

Buying New. Devices. Tech & Tea Seminar By CommuniTech

Project Details. Jiasi Chen CS 179i: Project in Computer Science (Networks) Lectures: Monday 3:10-4pm in Spieth 1307

Blazer Pro V2.1 Client Requirements & Hardware Performance

Characterization and Benchmarking of Deep Learning. Natalia Vassilieva, PhD Sr. Research Manager

Cloud Based Framework for Rich Mobile Application

Unified, real-time object detection

The Impact of Mobile Multimedia Applications on Data Center Consolidation

Spatial Localization and Detection. Lecture 8-1

Accelerating Reinforcement Learning in Engineering Systems

Learning Semantic Video Captioning using Data Generated with Grand Theft Auto

CS 4510/9010 Applied Machine Learning. Deep Learning. Paula Matuszek Fall copyright Paula Matuszek 2016

POMAC: Properly Offloading Mobile Applications to Clouds

Video. What's New? Cisco WebEx Event Center Release Notes (version WBS29.8) 1

Lecture 5: Object Detection

An outlook on INTELLIGENCE in 2024

CS230: Lecture 3 Various Deep Learning Topics

Lecture 12: Model Serving. CSE599W: Spring 2018

CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas

Scene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

ImageNet Classification with Deep Convolutional Neural Networks

Architectures for Scalable Media Object Search

Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Apple store iphone 4 8gb unlocked

Contour Detection on Mobile Platforms

ARCHOS Junior: a range made for children starting at 79.99

Deep Neural Network Enhanced VSLAM Landmark Selection

All Opportunities Are Not Equal: Enabling Energy Efficient App Syncs In Diverse Networks

Using Facebook Messenger Bots for Market Research

視覚情報処理論. (Visual Information Processing ) 開講所属 : 学際情報学府水 (Wed)5 [16:50-18:35]

Face Detection and Tracking Control with Omni Car

EPUB // SAMSUNG GALAXY 7500 ONLINE MANUAL DOWNLOAD

GTC Interaction Simplified. Gesture Recognition Everywhere: Gesture Solutions on Tegra

Center for Networked Computing

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou

Machine Learning 13. week

Introduction. Do you have any difficulty in choosing an ideal mobile phone?

An introduction to Machine Learning silicon

arxiv: v1 [cs.cv] 30 Apr 2018

Object Detection Based on Deep Learning

Video. What's New. Cisco WebEx Training Center Release Notes (version WBS29.13) 1

Small is the New Big: Data Analytics on the Edge

How to Build Optimized ML Applications with Arm Software

Facial Recognition Using Neural Networks over GPGPU

International Journal of Computer Engineering and Applications, Volume XII, Special Issue, September 18,

Towards Pervasive and Mobile Gaming with Distributed Cloud Infrastructure. Teemu Kämäräinen, Matti Siekkinen, Yu Xiao, Antti Ylä-Jääski

i3allsync Wireless Presentation System User Manual

Deep Learning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D

GTC 2013 March San Jose, CA The Smartest People. The Best Ideas. The Biggest Opportunities. Opportunities for Participation:

! References: ! Computer eyesight gets a lot more accurate, NY Times. ! Stanford CS 231n. ! Christopher Olah s blog. ! Take ECS 174!

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

CSci 8980 Mobile Cloud Computing. Outsourcing: Components I

Photos, Photos. What to do with All Those Photos? Presented by Phil Goff Area 16 Computers and Technology August 17, 2017

BlueDBM: An Appliance for Big Data Analytics*

Pedestrian Detection based on Deep Fusion Network using Feature Correlation

Won Woo Ro, Ph.D. School of Electrical and Electronic Engineering

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016

The Potential for Edge Computing in the Commercial Building

Application of Deep Learning Techniques in Satellite Telemetry Analysis.

IBM Mid-Peninsular PC Club Meeting

Deep learning in MATLAB From Concept to CUDA Code

Open Mobile Platforms. EE 392I, Lecture-6 May 4 th, 2010

Semantic Segmentation

Real-Time Image Processing and Analytics Using SAS Event Stream Processing

OBJECT DETECTION HYUNG IL KOO

Face Detection using GPU-based Convolutional Neural Networks

CS231N Project Final Report - Fast Mixed Style Transfer

Neural Nets & Deep Learning

Windows Windows 7, Windows 8, Windows 8.1, Windows 10 Browsers

Real-time Object Detection CS 229 Course Project

2015 The MathWorks, Inc. 1

TR An Overview of NVIDIA Tegra K1 Architecture. Ang Li, Radu Serban, Dan Negrut

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION

Data Driven Networks

Polytechnic University of Tirana

How to Build Optimized ML Applications with Arm Software

We are all connected. The Networked Society

HikCentral V1.2 Software Requirements & Hardware Performance

Demystifying Deep Learning

Deep Learning Based Real-time Object Recognition System with Image Web Crawler

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

Next-Generation Cloud Platform

A Hardware-Friendly Bilateral Solver for Real-Time Virtual-Reality Video

Deep Learning for Computer Vision with MATLAB By Jon Cherrie

Voice Control. Setting with Google and Amazon

CNN optimization. Rassadin A

A First Look at Mobile Intelligence: Architecture, Experimentation and Challenges

How to setup Google Assistant and Amazon Alexa. Make sure your Gateway has Firmware version 1.81 or higher.

Transcription:

Delivering Deep Learning to Mobile Devices via Offloading Xukan Ran*, Haoliang Chen*, Zhenming Liu 1, Jiasi Chen* *University of California, Riverside 1 College of William and Mary

Deep learning on mobile devices Augmented reality (AR) is the next killer app Pokemon Go Google Translate (text processing) Snapchat filters (face detection) Fast object recognition is key for general AR applications Deep learning is a popular technique for object recognition 2

Problem Current approaches for deep learning on mobile devices 1. Local-only processing Apple Photos, Google Translate GPU speedup [1] Slow! (~600 ms/frame) 2. Remote-only processing Apple Siri, Amazon Alexa Doesn t work when network is bad Goal: Develop a framework to intelligently offload to nearby edge devices for real-time video analysis using deep learning. Cannot use general offloading techniques. Need to specifically account for: Characteristics of the video Characteristics of the deep learning models Application requirements [1] L. Huynh, Y. Lee, R. Balan, DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications, ACM MobiSys, 2016. 3

Design space Degrees of freedom Video characteristics Frame rate Resolution Bit rate Deep learning characteristics Model size Model latency / energy Model accuracy Constraints App requirements Latency Accuracy Metrics Accuracy Frame rate Energy Complex interactions between these degrees of freedom and metrics e.g., high bit rate when offloading à high accuracy, high energy e.g., small deep learning model à high frame rate, low accuracy How to decide? 4

Decision framework Degrees of freedom: video resolution Constraints: Current network conditions Application requirements Metrics: frame rate Optimize decision neural net model size detection accuracy offloading decision energy consumption Relation between the degrees of freedom on the metrics cannot be analytically understood à need measurements!

System design Measurements: Front-end device Server - - - Battery function s Latency function Accuracy function Offline performance characterization Big CNN Camera feed Online decision engine? Small CNN car 0.9 Output display 6

Experimental setup Deep learning model: YOLO built on Tensorflow [2] tiny-yolo: 9 convolutional layers big-yolo: 22 convolutional layers Local processing: OnePlus 3T Android phone with quad-core CPU, 6 GB RAM Remote processing: Server with quad-core CPU, 8 GB RAM, NVIDIA GeForce GTX970 graphics card with 4GB of RAM Developed app to implement offloading: Video frame Bounding box [2] Joseph Redmon, Ali Farhadi, YOLO9000: Better, Faster, Stronger, CVPR, 2017. 7

Remote-only processing Local-only processing 8

How do latency and energy change with resolution? Encode a video frame at different resolutions Measure the processing time and energy usage in Android on the smartphone Energy and latency increase with pixels 2 9

How does accuracy change with bit rate and resolution? Encode 20 videos at different bit rate and resolutions Measure the accuracy (IoU) relative to the big-yolo + raw video big-yolo tiny-yolo Accuracy increases with larger model, higher resolution, higher bit rate 10

How fast is deep learning, end-to-end? Measure # processed frames per second, under controlled network conditions Caveat: stop-and-wait for each processed frame 2 1 0 Frames per second 3 offload to server run locally on phone 0 200 400 Added Network Latency (ms) Increased bandwidth à higher frame rate When bandwidth > 5Mbps, should offload Increased latency à lower frame rate When latency < 100ms, should offload 11

How much time is spent for communication? Record timestamps as frame travels from phone to server and back When offloading, majority of time is spent on network 12

How much battery is used from offloading deep learning? Measure the battery drop after 30 seconds of continuous usage Higher bandwidth à more battery Prefer to run locally to save battery 13

How well does offloading do in the wild? Perform 5 trials in public locations over LTE and WiFi Coffee shop 1: Different city from server Coffee shop 2: Same city, same subnet as server Apartment 1: Different city than server Apartment 2: Same city as server Performance over LTE sometimes > WiFi Higher frame rates over LTE at the 14 expense of data cost

Key Take-Aways Real-time video analysis using deep learning is slow (~600 ms/frame on smartphones) Offloading can be beneficial (up to 2x frame rate), but optimal decision is unclear In the wild, LTE sometimes > public WiFi 15