Advanced Imaging Applications on Smart-phones Convergence of General-purpose computing, Graphics acceleration, and Sensors

Similar documents
INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES. Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp.

Enabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager

The Benefits of GPU Compute on ARM Mali GPUs

Unleashing the benefits of GPU Computing with ARM Mali TM Practical applications and use-cases. Steve Steele, ARM

Fusing Sensors into Mobile Operating Systems & Innovative Use Cases

GTC 2013 March San Jose, CA The Smartest People. The Best Ideas. The Biggest Opportunities. Opportunities for Participation:

GTC Interaction Simplified. Gesture Recognition Everywhere: Gesture Solutions on Tegra

Mobile AR Hardware Futures

EMBEDDED VISION AND 3D SENSORS: WHAT IT MEANS TO BE SMART

Open Standard APIs for Embedded Vision Processing

Multimedia in Mobile Phones. Architectures and Trends Lund

GPGPU on ARM. Tom Gall, Gil Pitney, 30 th Oct 2013

AR Standards Update Austin, March 2012

Vision Acceleration. Launch Briefing October Neil Trevett Vice President Mobile Ecosystem, NVIDIA President, Khronos Group

MediaTek Video Face Beautify

Our Technology Expertise for Software Engineering Services. AceThought Services Your Partner in Innovation

Embedded real-time stereo estimation via Semi-Global Matching on the GPU

Live Metric 3D Reconstruction on Mobile Phones ICCV 2013

Lecture 19: Depth Cameras. Visual Computing Systems CMU , Fall 2013

Khronos and the Mobile Ecosystem

Real-time image processing and object recognition for robotics applications. Adrian Stratulat

! Readings! ! Room-level, on-chip! vs.!

Adding Advanced Shader Features and Handling Fragmentation

IEEE Consumer Electronics Society Calibrating a VR Camera. Adam Rowell CTO, Lucid VR

GPGPU on Mobile Devices

Application questions. Theoretical questions

TEGRA K1 AND THE AUTOMOTIVE INDUSTRY. Gernot Ziegler, Timo Stich

HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA

Accessing the GPU & the GPUImage Library

Take GPU Processing Power Beyond Graphics with Mali GPU Computing

Open Standard APIs for Augmented Reality

Open API Standards for Mobile Graphics, Compute and Vision Processing GTC, March 2014

Accessing the GPU & the GPUImage Library

CS664 Lecture #19: Layers, RANSAC, panoramas, epipolar geometry

Static Scene Reconstruction

Next Generation Visual Computing

Step-by-Step Model Buidling

Mobile Graphics Ecosystem. Tom Olson OpenGL ES working group chair

Rectification and Disparity

PART IV: RS & the Kinect

Module Introduction. Content 15 pages 2 questions. Learning Time 25 minutes

New Technologies for UAV/UGV

Bringing it all together: The challenge in delivering a complete graphics system architecture. Chris Porthouse

EE795: Computer Vision and Intelligent Systems

Artec Leo. A smart professional 3D scanner for a next-generation user experience

Bifrost - The GPU architecture for next five billion

360 Full View Spherical Mosaic

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

GPGPU. Peter Laurens 1st-year PhD Student, NSC

Spring 2011 Prof. Hyesoon Kim

Multisensory Embedded Pose Estimation

ARK. SDP18 Team 21 MDR 12/07/2017

2010 PROFIT. Research & Education. Tim Cheng ( 鄭光廷 )

TIOVX TI s OpenVX Implementation

Accurate 3D Face and Body Modeling from a Single Fixed Kinect

Artec Leo. A smart professional 3D scanner for a next-generation user experience

Artec Leo. A smart professional 3D scanner for a next-generation user experience

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth

DEPTH, STEREO AND FOCUS WITH LIGHT-FIELD CAMERAS

Multi-stable Perception. Necker Cube

Game Application Using Orientation Sensor

Stereo Rig Final Report

The Bifrost GPU architecture and the ARM Mali-G71 GPU

Introduction to Shutter Speed in Digital Photography. Read more:

WebGL Meetup GDC Copyright Khronos Group, Page 1

Utilizing commercial graphics processors in the real-time geo-registration of streaming high-resolution imagery

Whiz-Bang Graphics and Media Performance for Java Platform, Micro Edition (JavaME)

Overview. Augmented reality and applications Marker-based augmented reality. Camera model. Binary markers Textured planar markers

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

HTC ONE (M8) DUO CAMERA WHITE PAPERS

Emerging Vision Technologies: Enabling a New Era of Intelligent Devices

Baback Elmieh, Software Lead James Ritts, Profiler Lead Qualcomm Incorporated Advanced Content Group

Fast Natural Feature Tracking for Mobile Augmented Reality Applications

3D Time-of-Flight Image Sensor Solutions for Mobile Devices

Fast Image Registration via Joint Gradient Maximization: Application to Multi-Modal Data

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation

Spring 2009 Prof. Hyesoon Kim

Homographies and RANSAC

Inside VR on Mobile. Sam Martin Graphics Architect GDC 2016

EE795: Computer Vision and Intelligent Systems

Distributed Vision Processing in Smart Camera Networks

Marker Based Localization of a Quadrotor. Akshat Agarwal & Siddharth Tanwar

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1

Today s lecture. Image Alignment and Stitching. Readings. Motion models

The Most User-Friendly 3D scanner

SMARTPHONE HARDWARE: ANATOMY OF A HANDSET. Mainak Chaudhuri Indian Institute of Technology Kanpur Commonwealth of Learning Vancouver

Artec Leo. A smart professional 3D scanner for a next-generation user experience

Building Ultra-Low Power Wearable SoCs

An overview of mobile and embedded platforms

Integrating CPU and GPU, The ARM Methodology. Edvard Sørgård, Senior Principal Graphics Architect, ARM Ian Rickards, Senior Product Manager, ARM

Mobile Performance Tools and GPU Performance Tuning. Lars M. Bishop, NVIDIA Handheld DevTech Jason Allen, NVIDIA Handheld DevTools

MAD Gaze x HKCS. Best Smart Glass App Competition Developer Guidelines VERSION 1.0.0

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

NVIDIA DESIGNWORKS Ankit Patel - Prerna Dogra -

3D Editing System for Captured Real Scenes

Graphics Hardware. Graphics Processing Unit (GPU) is a Subsidiary hardware. With massively multi-threaded many-core. Dedicated to 2D and 3D graphics

Computer Vision. Exercise 3 Panorama Stitching 09/12/2013. Compute Vision : Exercise 3 Panorama Stitching

CSE 252B: Computer Vision II

EMBEDDED SYSTEMS AND MOBILE SYSTEMS

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Transcription:

Advanced Imaging Applications on Smart-phones Convergence of General-purpose computing, Graphics acceleration, and Sensors Sriram Sethuraman Technologist & DMTS, Ittiam 1

Overview Imaging on Smart-phones GPP, Graphics acceleration & Sensors Advanced Imaging Image registration Image warping Camera movement & sensor readings Advanced imaging Application needs Samples Advanced HDR 2D to 3D Conclusions 2

Imaging on Smart-phones Standard on most Smart-phones today Aim to be replacements for DSCs ~8 Mpixel CMOS sensors Most use a hardware imaging pipeline to get a reasonably good looking picture Of late, include features such as automatic face detection in h/w Some models have flashes Many app. store applications 3

Sensors in Smart-phones Other than CMOS sensor MEMS based Accelerometers Originally added to detect orientation (portrait/landscape) Sometimes used to take shake free pictures MEMS based 3-axis accelerometer + 3-axis magnetometer MEMS based 3-axis gyroscopes 3D-TOF sensors IR sensors 4

General-purpose Compute Power in Smart-phones ARMv7 architecture Super-scalar A8, A9, A15 Vector Floating Point + NEON SIMD Suitable for geometry calculations or pixel processing 1-2 cores is standard (with options of up to 4 cores) Each core is clocked in the 600MHz - 1.5GHz range with talks of up to 2GHz 5

Graphics Acceleration in Smart-phones Heavily driven by user-interfaces and gaming Multiple options in h/w 40-50 Million triangles per second IMG (MBX/SGX), ARM (Mali), nvidia (GeForce), Qualcomm (Adreno), etc. OpenGL ES 2.0 compliant 6

Advanced Imaging Multiple possibilities Bridging quality gaps with DSCs Providing new imaging modes obtained by moving the sensor in a particular manner Panorama 3D stereoscopic 3D Panorama Augmented reality Gestural interfaces Gaming Most of them involve computer vision (scene understanding) algorithms 7

Building Blocks for Scene Understanding Features or invariants detection Low level image processing such as corner detection Correspondence establishment Identify corresponding feature points Used in a multitude of applications Model parameter estimation Learn underlying scene motion Fundamental matrix Outliers object motion Object Segmentation Morphological processing on spatio-temporal data Object motion tracking E.g. Hand motion Object motion trajectory classification Gesture for UI Co-ordinate system mapping OpenCV has a good collection of reference algorithms for these 8

9 Advanced Imaging Application Needs Typical needs Driven by ease of use Human factors All the advanced processing needs to be completed within a short time due to the following needs Shot to shot delay is reasonably low for still image capture Real-time processing for augmented reality, gestural interfaces, and gaming Preferable to include most processing at capture time to allow use of standard viewing options (to allow sharing with others who will have entirely different setups)

Speeding up the algorithms - 1 Use of GPUs General-purpose computing on GPUs Stream processing (massively parallel simple operations) Huge body of implementations exist Some of the feature detection steps can be done Morphological filtering can be done Warping Some motion model estimation methods require iterative registration requiring warping of images Advanced imaging methods require temporal alignment before stitching or combining Rectification methods in 3D require applying transformations such as rotation correction 10

Speeding up the algorithms - 2 Use of available sensors Conventional method of global motion analysis Compute features, perform correspondence, and then solve using least squares method for the rotation matrix part of the Fundamental matrix Computationally intensive Fails in the presence of noisy data De-generate cases are possible based on scene content With smart sensors (such as 3-axis Acccelerometer + 3-axis magnetometer or 3-axis Gyro) Rotation matrix can be obtained Most smartphone OS frameworks (such as ios, Android) provide APIs to get the rotation matrix Depending on the sensitivity of the sensors used, accuracy can be different Local magnetic fields can cause big errors Frequency of update may be limited 11

Speeding up the algorithms - 3 After off-loading suitable algorithms to GPU or obviating the need for some algorithms using sensor data, Optimize any core pixel processing modules using NEON SIMD Optimize any floating point operations using VFP instructions 12

Advanced HDR Imaging HDR High Dynamic Range Involves taking pictures at multiple exposures and combining them to get the impression of HDR Normally under-exposed parts get resolved well Areas that get saturated at higher exposure retain their details from the lower exposure However, conventional HDR works well only for cameras on tripod and scenes that are highly static to facilitate such combining Advanced HDR involves getting HDR even when slight camera or object movement is there Involves registering the multiple exposure pictures and then combining them robustly 13

Advanced HDR Imaging acceleration Sensor data can be used to compensate for rotation Graphics accelerator h/w can be used to warp the images after computing the global transformation Robust combining can then be done using Generalpurpose computing on the application processor and is highly SIMD friendly All the above processing can be done in the time that it takes to capture the multiple exposure pictures + a few milliseconds by using the sensor + GPU + CPU 14

3D using 2D Sensor Simple iphone applications exist that require a user to move the camera by a specified amount to take a stereo image pair Cumbersome Improperly taken pairs cannot be fused very well Automated 3D from 2D Found on some Sony DSCs as 3D sweep and 3D sweep panorama Requires moving the camera parallel to the scene and capturing multiple pictures, analyze them and create one or more stereo pairs 15

16 3D from 2D - Acceleration Sensor can be used to obtain rotation matrix Resizer, if any, available in h/w can be used to decimate pictures for faster analysis Graphics h/w can be used to perform rotation adjustment General purpose processing can be used to select optimum stereo pairs based on disparity analysis Final chosen pairs at the high resolution can be rectified using graphics h/w Fast 3D capture that is intuitive and simple for users to capture can be realized speeds can be similar or better than on DSCs Demo!

17 Conclusions Advanced Imaging Applications require a lot of image processing and scene understanding algorithms Smart-phones are increasingly equipped with multiple sensors, graphics acceleration, and high general-purpose compute power on the application processor Good user experience can be realized for Advanced Imaging Apps by leveraging the above to accelerate the processing