GPGPU on Mobile Devices

Introduction
- Addressing GPGPU for very mobile devices
  - Tablets
  - Smartphones

Introduction
- Why dedicated GPUs in mobile devices?
  - Gaming
  - Physics simulation for realistic effects
  - 3D GUI / compositor effects

Introduction
- Why GPGPU on mobile devices?
  - Computational photography
    - Image enhancement
    - Image editing
  - Computer vision
    - Visual/image recognition
    - Geo-localization
    - Token recognition
    - Augmented reality
  - Easier support for new media codecs

Hardware
- Dedicated GPUs
  - PowerVR's GPU series
    - E.g. SGX540/543, used in the Nexus S / iPad
    - 4-8 cores at ~200 MHz
    - 20-35 MTriangles/s, 1000 MPixel/s fill rate
  - Nvidia's Tegra series
    - CPU/GPU combination (usually an ARM CPU core)
    - Used mostly in tablets and cars
    - ULP (ultra-low power) GeForce GPU
    - ~300-400 MHz core clock speed
    - 4 pixel and 4 vertex shader processors
    - Not a unified architecture!

Hardware
- Nvidia's Tegra Development Kit

Graphics/GPU APIs
- OpenGL ES is the embedded 3D graphics standard
  - OpenGL for Embedded Systems
  - http://www.khronos.org/opengles/
- OpenVG is the 2D rendering standard
  - Open Vector Graphics
  - http://www.khronos.org/openvg
- EGL
  - Embedded-System Graphics Library
  - Interface to the window system
  - Mobile version of WGL and GLX

OpenGL ES 2.0
- Mostly programmable pipeline
  - No more fixed-function pipeline
  - No glBegin() / glEnd()
  - Drawing only via vertex arrays
- OpenGL ES Shading Language
  - Similar to GLSL
  - [Figure: sample from the OpenGL ES Quick Reference Card]
- Framebuffer objects available
- Depth test, stencil test, etc.

OpenCL Embedded Profile

OpenCL Embedded Profile
- Stripped-down version of OpenCL
  - Minimum requirements can be smaller
  - No 64-bit integers
  - Reduced floating-point accuracy
  - Nearest sampling for float texture images
  - 2D/3D image support is optional
- Might run on a DSP, not on the GPU!
  - Or even in a mixed CPU/GPU/DSP environment

OpenCL Embedded Profile
- "OpenCL Embedded Profile Prototype in Mobile Device" (Nokia)
- http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5336267

OpenCL Embedded Profile
- Querying available profiles
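The code shown on this slide did not survive transcription. As a minimal sketch, assuming the profile string has already been obtained from the OpenCL runtime (in the C API, via `clGetPlatformInfo` with `CL_PLATFORM_PROFILE` or `clGetDeviceInfo` with `CL_DEVICE_PROFILE`), the returned value is either `FULL_PROFILE` or `EMBEDDED_PROFILE`. The helper names and the `(name, profile)` pairs below are hypothetical and only illustrate how the result might be interpreted.

```python
# Profile strings defined by the OpenCL specification for
# CL_PLATFORM_PROFILE / CL_DEVICE_PROFILE queries.
FULL_PROFILE = "FULL_PROFILE"
EMBEDDED_PROFILE = "EMBEDDED_PROFILE"


def is_embedded_profile(profile: str) -> bool:
    """Return True if a queried profile string names the embedded profile."""
    # Buffers filled by the C API may carry a trailing NUL or whitespace.
    return profile.strip("\x00 \t\r\n") == EMBEDDED_PROFILE


def split_by_profile(devices):
    """Group hypothetical (name, profile) pairs into (full, embedded) name lists."""
    full, embedded = [], []
    for name, profile in devices:
        (embedded if is_embedded_profile(profile) else full).append(name)
    return full, embedded
```

An application would typically run this check once at start-up and fall back to reduced-precision or image-free code paths on embedded-profile devices.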

Application 1
- "Accelerating image recognition on mobile devices using GPGPU", SPIE 2011
  - Miguel Bordallo López et al., University of Oulu & Nokia Research
- Goal: face tracking on mobile devices
  - Local binary patterns as features
  - AdaBoost for classification
- Hardware platform
  - TI OMAP3530
    - ARM Cortex-A8
    - PowerVR SGX530
  - E.g. Nokia N900

Mobile Face Tracking
- Most steps on the GPU
- Linear classifier on the CPU
  - $c(f_1, \ldots, f_n; x) = \operatorname{sgn}\left(\sum_{i=1}^{n} w_i f_i(x)\right)$
- Courtesy of Miguel Bordallo López
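The linear classifier on this slide is a sign applied to a weighted sum of feature responses. A minimal sketch, assuming features are plain callables and that a sum of exactly zero maps to +1 (the slide does not say how ties are handled); the function names are illustrative only.

```python
def sgn(value):
    """Sign function; mapping 0 to +1 is an assumption, not from the slides."""
    return 1 if value >= 0 else -1


def classify(weights, features, x):
    """Evaluate c(f_1..f_n; x) = sgn(sum_i w_i * f_i(x)).

    weights  -- sequence of numbers w_i
    features -- sequence of callables f_i, each mapping a sample x to a number
    x        -- the sample to classify
    """
    total = sum(w * f(x) for w, f in zip(weights, features))
    return sgn(total)
```

Because the sum is a short scalar reduction, it costs little on the CPU once the GPU has produced the feature responses.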

Mobile Face Tracking
- Optional preprocessing
  - Convert to grayscale
  - One quarter of the image per color channel
  - Better utilization of vec4 shader units
- Courtesy of Miguel Bordallo López
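Packing one image quarter per color channel lets a single vec4 operation process four grayscale pixels at once. The sketch below is one possible layout, not necessarily the authors': it assumes Rec. 601 luma weights for the grayscale conversion and assigns the four quadrants (top-left, top-right, bottom-left, bottom-right) to R, G, B and A. It also assumes even image dimensions.

```python
def to_gray(r, g, b):
    """Grayscale conversion with Rec. 601 luma weights (an assumption)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def pack_quarters(gray):
    """Pack a grayscale image (list of rows) into a half-size 4-channel image.

    Each output texel holds (top-left, top-right, bottom-left, bottom-right)
    samples taken at the same offset inside each quadrant.
    """
    h, w = len(gray), len(gray[0])
    hh, hw = h // 2, w // 2
    packed = []
    for y in range(hh):
        row = []
        for x in range(hw):
            row.append((gray[y][x], gray[y][x + hw],
                        gray[y + hh][x], gray[y + hh][x + hw]))
        packed.append(row)
    return packed
```

The packed texture has a quarter of the texels, so a fragment shader runs a quarter as often while doing the same arithmetic on all four components.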

Mobile Face Tracking
- Local binary patterns (LBP)
  - A.k.a. census transform
  - Ojala et al., 1994
  - Zabih & Woodfill, 1994
- Texture-based features
- Robust to illumination changes
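A basic 3x3 LBP code compares each of the eight neighbors with the center pixel and stores one bit per comparison. The sketch below is a plain CPU reference, not the GPU shader from the paper; the clockwise bit order starting at the top-left neighbor and the `>=` comparison are assumptions, since conventions vary.

```python
# Neighbour offsets (dy, dx), clockwise from top-left. The bit order is an
# assumption; any fixed order gives an equally valid descriptor.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(gray, y, x):
    """8-bit LBP code of the interior pixel at (y, x)."""
    center = gray[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if gray[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_image(gray):
    """LBP codes for all interior pixels (the one-pixel border is skipped)."""
    h, w = len(gray), len(gray[0])
    return [[lbp_code(gray, y, x) for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```

Since only the ordering of neighbor and center values matters, adding a constant brightness offset to every pixel leaves the codes unchanged, which is the illumination robustness mentioned above.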

Mobile Face Tracking
- LBP extraction run-time
- Rescaling & image preprocessing
- Courtesy of Miguel Bordallo López

Mobile Face Tracking
- Both preprocessing & LBP extraction
- Power consumption
- Courtesy of Miguel Bordallo López

Application 2
- OpenCL for image processing, Nokia
  - "OpenCL embedded profile prototype in mobile device," J. Leskelä et al., IEEE Workshop on Signal Processing Systems, 2009.
- Geometric distortion + blurring + color transformation
- Based on OpenCL, not OpenGL ES
- Leskelä et al., 2009

OpenCL for Image Processing
- Kernel header
- Leskelä et al., 2009

OpenCL for Image Processing
- Local declarations
- Leskelä et al., 2009

OpenCL for Image Processing
- Distort geometry (texture lookup)
- Leskelä et al., 2009

OpenCL for Image Processing
- Fetch neighboring pixels & blur
- Leskelä et al., 2009

OpenCL for Image Processing
- Color transformation and write to destination
- Leskelä et al., 2009
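The kernel listings on these slides were not preserved in the transcript. The following is a rough CPU-side sketch of the same three per-pixel stages (distort, blur, color transform), not the kernel from Leskelä et al. The specific choices are all assumptions made for illustration: a horizontal mirror as the distortion, a 3x3 box blur with clamp-to-edge lookups, channel inversion as the color transform, and RGB rather than RGBA pixels.

```python
def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def distort(x, y, width, height):
    """Hypothetical geometric distortion: horizontal mirror of the source."""
    return width - 1 - x, y


def fetch(src, x, y):
    """Clamp-to-edge lookup, standing in for a texture read."""
    h, w = len(src), len(src[0])
    return src[clamp(y, 0, h - 1)][clamp(x, 0, w - 1)]


def blur(src, x, y):
    """3x3 box blur around (x, y), computed per channel."""
    acc = [0.0, 0.0, 0.0]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            pixel = fetch(src, x + dx, y + dy)
            for c in range(3):
                acc[c] += pixel[c]
    return tuple(a / 9.0 for a in acc)


def color_transform(rgb):
    """Hypothetical color transform: invert each channel in [0, 255]."""
    return tuple(255.0 - c for c in rgb)


def process(src):
    """Run all three stages for every destination pixel."""
    h, w = len(src), len(src[0])
    dst = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = distort(x, y, w, h)
            dst[y][x] = color_transform(blur(src, sx, sy))
    return dst
```

In an OpenCL kernel the two loops in `process` disappear: each work-item handles one `(x, y)` and the runtime supplies the indices.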

OpenCL for Image Processing
- Run-time comparison
  - CPU: ARM Cortex-A8, 550 MHz
  - GPU: PowerVR SGX530, 110 MHz
  - 3 MPixel RGBA images
- CPU only: 8.6 s/image
  - Slow FPU
  - Will work faster with fixed-point arithmetic
- GPU only: 2.4 s/image
- CPU+GPU: 2.5 s/image
  - Bad scheduling
- CPU+GPU improved: 2.02 s/image
- Leskelä et al., 2009
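The run times above translate directly into speedups over the CPU-only baseline and into pixel throughput. A small sketch of that arithmetic, using only the figures quoted on the slide; the function and dictionary names are illustrative.

```python
# Seconds per 3 MPixel image, as quoted from Leskelä et al., 2009.
TIMES = {
    "CPU only": 8.6,
    "GPU only": 2.4,
    "CPU+GPU": 2.5,
    "CPU+GPU improved": 2.02,
}
IMAGE_MPIXELS = 3.0


def speedup(baseline_s, time_s):
    """How many times faster than the baseline a configuration runs."""
    return baseline_s / time_s


def throughput(mpixels, time_s):
    """Processed megapixels per second."""
    return mpixels / time_s


def report():
    """Return (configuration, speedup vs. CPU only, MPixel/s) rows."""
    base = TIMES["CPU only"]
    return [(name, speedup(base, t), throughput(IMAGE_MPIXELS, t))
            for name, t in TIMES.items()]
```

The GPU alone comes out roughly 3.6 times faster than the CPU, and the improved CPU+GPU schedule roughly 4.3 times faster, while the naive CPU+GPU split is slightly slower than the GPU alone.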

OpenCL for Image Processing
- Energy consumption
  - CPU: 3.93 J/frame
  - GPU: 0.56 J/frame (~14% of the CPU figure)
  - 0.26 J/frame due to CPU-GPU data transfer
- When power consumption matters
  - High parallelism at low clock frequencies (110 MHz) is better than low parallelism at high clock frequencies (550 MHz)
  - Dissipation increases super-linearly with frequency
- Leskelä et al., 2009
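The super-linear claim can be motivated with the usual first-order CMOS dynamic-power model. This is a textbook approximation, not a derivation from Leskelä et al.; the symbols are assumptions introduced here: activity factor $\alpha$, switched capacitance $C$, supply voltage $V$, clock frequency $f$, and $N$ clock cycles of work per frame.

```latex
\[
  P_{\text{dyn}} \approx \alpha\, C\, V^{2} f
  \qquad\Longrightarrow\qquad
  P_{\text{dyn}} \propto f^{3} \quad \text{if } V \propto f
\]
\[
  E_{\text{frame}} = P_{\text{dyn}} \cdot \frac{N}{f} \approx \alpha\, C\, V^{2} N
\]
\[
  \frac{E_{\text{GPU}}}{E_{\text{CPU}}}
  = \frac{0.56\ \text{J/frame}}{3.93\ \text{J/frame}} \approx 0.14
\]
```

Under this model the energy per frame depends on the voltage rather than directly on the clock, so a wide, slow design that can run at a lower voltage spends less energy on the same work, which is consistent with the measured ratio of about 14%.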