Intelligent Video Analytics for Urban Management
- Claud Norton
- 6 years ago
Transcription
1 Smart-I: Intelligent Video Analytics for Urban Management
Gabriele Randelli, Founder & CTO
2 Company Overview (Enel Lab Review)
Gabriele Randelli, Founder & CTO, Smart-I (Feel Interactive)
- Short Company Overview
- Growth Plan
- 1st year targets
- Activity & Results
- Towards the 2nd year
5 SmartEye Platform
- NVIDIA Tegra T30 with quad ARM Cortex-A9; embedded NVIDIA ULP GeForce GPU (OpenGL)
- 64-bit ARM A57 CPUs; 1 TFLOP/s, 256-core GPU with NVIDIA Maxwell architecture
6 Adaptive traffic lights
- Adaptive traffic lights "go with the flow" by measuring vehicle inflow and outflow through each intersection.
- Traffic lights are usually controlled according to an optimal cycle that maximizes the expected flow of traffic.
- U.S. drivers lost $124 billion in 2013 due to the cost of fuel and the value of time wasted in traffic (CEBR).
7 More than 200 traffic lights are currently optimized and controlled with SmartEye (1200 in May 2017).
8 Traffic Lights Control - Overall Architecture & T30 Algorithms
Top-level blocks (in slide order):
- Raw Images
- Intelligent Video Analytics
- Traffic Statistics
- Routing Planner
- Traffic Light Phase Profile
- Traffic Light Control Profile Application
Video analytics blocks (in slide order):
- ROI Crop
- Median Blur
- Equalization
- Background Subtraction
- Blob Association & Matching
- Extended Kalman Filter
- Lucas Kanade Tracking
- Vehicle Classification
- Homography
- Video Streaming
Legend: CPU Only; CPU with NEON; GPU (OpenGL); On-board
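The Homography block is only named on the slide. As a minimal sketch of what such a step computes, assuming it maps image pixels onto a road-plane coordinate system, a 3x3 homography can be applied to a point as follows. The matrix values and the function name `apply_homography` are hypothetical and not taken from the talk.

```python
def apply_homography(h, point):
    """Map an image point (u, v) through a 3x3 homography h (nested lists)."""
    u, v = point
    # Homogeneous multiplication: [x, y, w]^T = H * [u, v, 1]^T
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Perspective division brings the result back to 2D coordinates
    return (x / w, y / w)


# Hypothetical matrix with a mild perspective term in the last row
H = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]
road_point = apply_homography(H, (100.0, 200.0))
```

In practice the matrix would come from a calibration step (for example, four known ground points seen in the image); that step is not described in the slides.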
9 Major Issues
- Computational power vs. analytics power
- 5 fps implies problems for blob tracking and the Kalman filter
- Optimizing with NEON takes a long time (inline assembly)
- Maximum number of concurrent moving objects set to 32
- Weak classifiers only
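To illustrate why a low frame rate hurts tracking, here is a minimal sketch of the prediction step of a linear, 1D constant-velocity Kalman filter. The slides use an Extended Kalman Filter whose state and noise models are not given, so this is a simplification, and the speed, initial covariance, and acceleration variance are hypothetical values.

```python
def predict(state, cov, dt, accel_var):
    """Prediction step of a 1D constant-velocity Kalman filter.

    state: (position, velocity); cov: 2x2 covariance as nested lists;
    dt: time between frames in seconds; accel_var: acceleration noise variance.
    """
    x, v = state
    (p00, p01), (p10, p11) = cov
    # State transition with F = [[1, dt], [0, 1]]
    x_pred = x + v * dt
    # Covariance propagation: F P F^T
    n00 = p00 + dt * (p01 + p10) + dt * dt * p11
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11
    # Process noise of a random-acceleration model
    q00 = 0.25 * dt ** 4 * accel_var
    q01 = 0.5 * dt ** 3 * accel_var
    q11 = dt ** 2 * accel_var
    return (x_pred, v), [[n00 + q00, n01 + q01], [n10 + q01, n11 + q11]]


start = (0.0, 14.0)                      # hypothetical: 14 m/s, about 50 km/h
start_cov = [[1.0, 0.0], [0.0, 1.0]]     # hypothetical initial uncertainty
state_5fps, cov_5fps = predict(start, start_cov, 1.0 / 5.0, 4.0)
state_15fps, cov_15fps = predict(start, start_cov, 1.0 / 15.0, 4.0)
```

Between two frames at 5 fps the vehicle moves three times as far as at 15 fps, and the predicted position variance grows faster, so associating blobs with tracks becomes more ambiguous.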
10 Traffic Lights Control - GPU-based Architecture
Top-level blocks (in slide order):
- Raw Images
- Intelligent Video Analytics
- Traffic Statistics
- Routing Planner
- Traffic Light Phase Profile
- Traffic Light Control Profile Application
Video analytics blocks (in slide order):
- ROI Crop
- Median Blur
- Equalization
- Background Subtraction
- Blob Association & Matching
- Extended Kalman Filter
- Lucas Kanade Tracking
- Vehicle Classification
- Homography
- Video Streaming
Additional blocks on this slide:
- HCM
- Traffic Light Synchronization
- Webster Cycle Length
Legend: CPU Only; CPU with NEON; GPU; On-board
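The slide names the Webster cycle length without giving a formula. A minimal sketch, assuming the textbook form C0 = (1.5 L + 5) / (1 - Y), where L is the total lost time per cycle in seconds and Y is the sum of the critical flow ratios (measured flow divided by saturation flow) of the phases. The function names and the example numbers are hypothetical, and the talk does not say which variant SmartEye uses.

```python
def _total_flow_ratio(flow_ratios):
    """Sum the critical flow ratios and check that the junction is not saturated."""
    y_total = sum(flow_ratios)
    if not 0.0 < y_total < 1.0:
        raise ValueError("sum of critical flow ratios must lie in (0, 1)")
    return y_total


def webster_cycle_length(lost_time_s, flow_ratios):
    """Webster's optimal cycle length in seconds."""
    y_total = _total_flow_ratio(flow_ratios)
    return (1.5 * lost_time_s + 5.0) / (1.0 - y_total)


def green_splits(cycle_s, lost_time_s, flow_ratios):
    """Share the effective green time among phases in proportion to their flow ratios."""
    y_total = _total_flow_ratio(flow_ratios)
    effective_green = cycle_s - lost_time_s
    return [effective_green * y / y_total for y in flow_ratios]


# Hypothetical two-phase intersection with 12 s of lost time per cycle
ratios = [0.30, 0.25]
cycle = webster_cycle_length(12.0, ratios)
greens = green_splits(cycle, 12.0, ratios)
```

With flow ratios derived from the on-board traffic statistics, the cycle length can be recomputed whenever the measured demand changes.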
11 SmartEye + Jetson TX1 - Computer Vision Optimizations
General considerations:
- We rely on VisionWorks
- The computer vision pipeline becomes a graph-based execution
- Many VisionWorks vision functions adopted (LUT, arithmetic operations, color convert, ...)
VisionWorks functions per stage:
- ROI Crop: vxAccessImagePatch
- Median Blur: vxMedian3x3Node
- Equalization: vxEqualizeHistNode
- Lucas Kanade Tracking: vxOpticalFlowPyrLKNode, nvxFastTrackNode
- Homography: vxWarpPerspectiveNode
- Extended Kalman Filter: ad-hoc implementation
Results:
- 4 new algorithms on GPU
- 4 implementations already available
- About 30% of the code re-engineered
- 15 fps
- Routing planning on-board
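"Graph-based execution" means the processing stages are declared up front as nodes with explicit inputs, and the framework decides the execution order when the graph is processed. The following is a conceptual sketch only: it is not the VisionWorks or OpenVX API, the `Graph` class and the two toy stages are hypothetical stand-ins, and it needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


class Graph:
    """Toy processing graph: nodes are functions with named inputs."""

    def __init__(self):
        self._nodes = {}  # node name -> (function, tuple of input names)

    def add_node(self, name, func, inputs=()):
        self._nodes[name] = (func, tuple(inputs))

    def process(self, **sources):
        """Run every node once, in an order that respects the dependencies."""
        deps = {name: set(inputs) for name, (_, inputs) in self._nodes.items()}
        results = dict(sources)
        for name in TopologicalSorter(deps).static_order():
            if name in results:  # externally supplied data such as the input frame
                continue
            func, inputs = self._nodes[name]
            results[name] = func(*(results[i] for i in inputs))
        return results


def roi_crop(pixels):
    # Stand-in for an ROI crop: keep the middle of a 1D row of pixels
    return pixels[1:4]


def stretch(pixels):
    # Stand-in for equalization: linear contrast stretch to 0..255
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0 for _ in pixels]
    return [(p - lo) * 255 // (hi - lo) for p in pixels]


graph = Graph()
# Nodes can be declared in any order; the graph resolves the execution order
graph.add_node("stretched", stretch, inputs=["cropped"])
graph.add_node("cropped", roi_crop, inputs=["frame"])
out = graph.process(frame=[10, 20, 30, 40, 50])
```

Because the whole pipeline is known before it runs, a real framework can place each node on the CPU or GPU and avoid unnecessary copies between stages.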
14
- Real-time
- In-the-loop control
- No cloud
16 Smart Lighting - Overall Architecture & T30 Algorithms
Top-level blocks (in slide order):
- Raw Images
- Intelligent Video Analytics (same as Traffic Light)
- Traffic Statistics
- LED Dimming Prediction
- Plant Profile Prediction
- LED Plant Control Profile Application
Nonlinear Autoregressive Exogenous Model (NARX), on cloud (training & prediction):
- Traffic Volume Time Series
- Traffic Volume Prediction
- Lighting Category Prediction
- Plant Profile Prediction
17 Major Issues
- Machine learning takes time
- No closed-loop control (we need the cloud)
- Adaptive, but no lighting on demand
18 Smart Lighting - GPU-based Architecture
Top-level blocks (in slide order):
- Raw Images
- Intelligent Video Analytics (same as Traffic Light)
- Traffic Statistics
- LED Dimming Prediction
- Plant Profile Prediction
- LED Plant Control Profile Application
Nonlinear Autoregressive Exogenous Model (NARX), on board (prediction):
- Traffic Volume Time Series
- Traffic Volume Prediction
- Lighting Category Prediction
- Plant Profile Prediction
Model Training: on cloud (training)
On-board prediction enables a lighting-on-demand profile.
19 SmartEye + Jetson TX1 - Time Series Prediction Optimization for Closed-loop Lamp Control
General considerations:
- cuDNN does not support NARX models
- Ad-hoc implementation with the CUDA Toolkit
- Only network execution; training is still on the cloud
Design considerations:
- Neurons on the same network level are completely isolated (parallelization)
- No synchronized write access during NARX prediction
- Every network level computation is a CUDA function
Results:
- 37% faster
- Closed-loop control
- Lighting on demand
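A NARX model predicts the next value of a series from its own recent values and from recent values of an external (exogenous) input. The sketch below shows a forward pass with one hidden layer in plain Python, organized one function call per network level to mirror the design considerations above. The network size, the weights, and the function names are hypothetical, and this is not the CUDA implementation from the talk.

```python
import math


def layer(inputs, weights, biases, activation):
    """Compute one network level.

    Each neuron reads only `inputs` and writes only its own output, so the
    neurons of a level are independent and could be evaluated in parallel.
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def narx_predict(y_hist, u_hist, hidden_w, hidden_b, out_w, out_b):
    """One-step NARX prediction from past outputs y_hist and past inputs u_hist."""
    features = list(y_hist) + list(u_hist)
    hidden = layer(features, hidden_w, hidden_b, math.tanh)
    return layer(hidden, out_w, out_b, lambda z: z)[0]


# Hypothetical example: two past traffic volumes and one exogenous value
prediction = narx_predict(
    y_hist=[1.0, 2.0],
    u_hist=[3.0],
    hidden_w=[[0.1, 0.1, 0.1]],
    hidden_b=[0.0],
    out_w=[[2.0]],
    out_b=[1.0],
)
```

On a GPU, each call to `layer` would correspond to one kernel launch with one thread per neuron; the weights would come from the training that stays on the cloud.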
20 Conclusions
- In Smart Cities, real-time closed-loop control is a significant business advantage
- Cloud is fine, but most of the processing has to be deployed locally
- Offloading algorithms to GPUs takes less time than using NEON/SSE or relying on FPGAs
- VisionWorks on Jetson TX1 already supports many algorithms and easily interacts with OpenCV (very low development effort)
- cuDNN still needs to support more advanced NN models
- More than 2x speed-up on two relevant application fields
21 We live in the big data world; it is now time to reason on top of these data to integrate smart control of the environment where we live.
Gabriele Randelli, Founder & CTO, Smart-I