GPU Computing with Tegra K1
Transcription
1 GPU COMPUTING WITH TEGRA K1 (COMPUTE WITH TEGRA K1): 馬路徹, Senior Solutions Architect, NVIDIA
2 AGENDA: Introducing Tegra K1. Tegra K1 compute software capabilities: OpenGL GLSL, OpenCL, CUDA/Unified Memory, Google Renderscript.
3 TEGRA K1: Kepler architecture, ISA-compatible with GeForce, Quadro, and Tesla. Tesla in supercomputers, Quadro in workstations, GeForce in PCs, mobile Kepler in Tegra. [Diagram: four Cortex-A15 cores plus one low-power A15; 192 CUDA cores; 64 kB L1 cache and shared memory; 128 kB L2 cache]
4 TEGRA K1 DEVELOPMENT PLATFORMS: Coming to Android K1 devices soon. JETSON TK1: GigE, USB 3.0, HDMI, running Linux4Tegra. JETSON X3 (TK1 PRO): GigE, USB 3.0, HDMI, 8 x cameras, CAN bus, running Vibrante Linux; automotive grade.
5 SOFTWARE FOR COMPUTE: Tegra K1 can accelerate Renderscript; OpenGL / OpenGL ES with compute shaders; NPP; cuFFT, cuBLAS, cuSPARSE; OpenCV; OpenCL full profile; CUDA; and a whole list of libraries enabling compute on the GPU.
6 TEGRA K1 FOR OPENGL/GLSL [Diagram: Kepler architecture, 192 CUDA cores; Cortex-A15 4-Plus-1; shared physical memory; 2D engine / ISP]
7 COMPUTE SHADERS: Standard OpenGL API. Execute algorithmically general-purpose GLSL shaders. Operate on buffers, images, and textures. Process graphics data in the context of the graphics pipeline. Easier than interoperating with a compute API for graphics apps. Standard part of all OpenGL 4.3+ implementations, and now OpenGL ES 3.1! Example uses: image processing, AI simulation, ray tracing, wave simulation, global illumination.
8 OPENGL COMPUTE SHADERS [Diagram: OpenGL pipeline, with data flowing from the application. Graphics stages: Vertex Puller, Vertex Shader, Tessellation Control Shader, Tessellation Primitive Generator, Tessellation Evaluation Shader, Geometry Shader, Transform Feedback, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer. Pixel path: Pixel Unpack, Pixel Operations, Pixel Assembly, Pixel Pack. The Compute Shader is a separate programmable stage driven by Dispatch. Resources: Element Array Buffer, Draw Indirect Buffer, Dispatch Indirect Buffer, Vertex Buffer Object, Image Load / Store, Atomic Counter, Shader Storage, Texture Fetch, Uniform Block, Transform Feedback Buffer, Pixel Unpack Buffer, Pixel Pack Buffer, Texture Image. Legend: fixed-function stage, programmable stage; b = buffer binding, t = texture binding; arrows indicate data flow.]
9 TEGRA K1 FOR COMPUTE [Diagram: Kepler architecture, 192 CUDA cores; Cortex-A15 4-Plus-1; shared physical memory; 2D engine / ISP]
10 TEGRA K1 FOR OPENCL: OpenCL 1.2 full profile support (OpenCL and OpenCL Embedded). True portability from desktop. Higher precision, higher limits. Awesome performance. Related Session: US S Real-Time Facial Motion Capture and Animation on Mobile, Emiliano Gambaretto.
11 TEGRA K1 CUDA 6 AND SHARED PHYSICAL MEMORY [Diagram: Kepler architecture, 192 CUDA cores; Cortex-A15 4-Plus-1; shared physical memory; 2D engine / ISP]
12 CUDA REQUIRES MEMORY COPY: Programmers are forced to do additional work: allocate memory on both the host and the device, and copy data from host to device and back. Conventional discrete GPU, with host memory (CPU memory) and device memory (GPU memory) connected over PCIe:
    __global__ void saxpy(int n, float a, float *x, float *y)
    {
        int i = blockIdx.x*blockDim.x + threadIdx.x;
        if (i < n) y[i] = a*x[i] + y[i];
    }

    int N = 1<<20;
    // d_x, d_y: device buffers allocated with cudaMalloc (not shown on the slide)
    // (1) Copy the inputs from host memory to device memory
    cudaMemcpy(d_x, x, N*sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, N*sizeof(float), cudaMemcpyHostToDevice);
    // Perform SAXPY on 1M elements
    saxpy<<<4096,256>>>(N, 2.0f, d_x, d_y);
    // (2) Copy the result from device memory back to host memory
    cudaMemcpy(y, d_y, N*sizeof(float), cudaMemcpyDeviceToHost);
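The slide shows only an excerpt. A minimal sketch of the complete explicit-copy flow might look like the following. The function name `saxpy_explicit_copy_max_error`, the input values, and the error-return convention are illustrative assumptions, not part of the original deck.

```cuda
#include <cuda_runtime.h>
#include <math.h>
#include <stdlib.h>

// SAXPY kernel: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Hypothetical helper: runs SAXPY on n elements with explicit host/device
// buffers and returns the largest absolute error, or -1.0f on failure.
float saxpy_explicit_copy_max_error(int n)
{
    size_t bytes = n * sizeof(float);
    float *x = (float *)malloc(bytes);
    float *y = (float *)malloc(bytes);
    float *d_x = nullptr, *d_y = nullptr;
    float max_err = -1.0f;

    // Extra work #1: allocate a second copy of every buffer on the device.
    if (x && y &&
        cudaMalloc(&d_x, bytes) == cudaSuccess &&
        cudaMalloc(&d_y, bytes) == cudaSuccess) {
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // Extra work #2: copy the inputs from host memory to device memory.
        cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);

        // Extra work #3: copy the result back. This blocking copy also
        // waits for the kernel to finish.
        if (cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost) == cudaSuccess) {
            max_err = 0.0f;
            for (int i = 0; i < n; ++i)
                max_err = fmaxf(max_err, fabsf(y[i] - 4.0f)); // 2*1 + 2 = 4
        }
    }

    cudaFree(d_x);
    cudaFree(d_y);
    free(x);
    free(y);
    return max_err;
}

// Usage (from main): float err = saxpy_explicit_copy_max_error(1 << 20);
```

Every buffer exists twice, once on each side of the bus, and the program has to keep the two copies consistent by hand.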
13 CUDA UNIFIED MEMORY (FROM CUDA 6): Developer view today: separate system memory and GPU memory. Developer view with Unified Memory: a single unified memory. Dramatically lower developer effort. Faster performance on SoC.
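As a rough illustration of the lower developer effort, here is a sketch of SAXPY using `cudaMallocManaged`, the managed-allocation call introduced in CUDA 6. The function name `saxpy_managed_max_error` and the input values are assumptions made for this example.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// SAXPY kernel: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Hypothetical helper: runs SAXPY on n elements using managed memory and
// returns the largest absolute error, or -1.0f on failure.
float saxpy_managed_max_error(int n)
{
    float *x = nullptr, *y = nullptr;

    // One allocation per buffer, visible to both the CPU and the GPU.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }

    // The host initializes the data through the same pointers.
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);

    // The host must not touch managed data until the kernel has finished.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    float max_err = -1.0f;
    if (err == cudaSuccess) {
        max_err = 0.0f;
        for (int i = 0; i < n; ++i)
            max_err = fmaxf(max_err, fabsf(y[i] - 4.0f)); // 2*1 + 2 = 4
    }

    cudaFree(x);
    cudaFree(y);
    return max_err;
}

// Usage (from main): float err = saxpy_managed_max_error(1 << 20);
```

There are no `cudaMemcpy` calls and no duplicate buffers. The only new obligation is the synchronization before the host reads the result.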
14 WHAT IS HAPPENING IN THE BACKGROUND
15 TEGRA K1 PHYSICALLY SHARED MEMORY: Physically unified memory without the need for data migration. [Block diagram: CPU (quad Cortex-A15 + shadow LP Cortex-A15), GPU (Kepler, 192 CUDA cores), audio processor (ARM7), HD video processor, image processor, security engine, 64-bit DDR3 controller. I/O: SATA2 x1, USB 2.0 x3, USB 3.0 x2, PCIe G2 x4 + x1, UART x4, I2C x5, SPI x4, SDIO/MMC x4, Display x2, HDMI, eDP/LVDS, CSI x4 + x4, NOR flash, DAP x5 (I2S/TDM)]
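One way to check from software whether a device shares physical memory with the host is to query its properties through the CUDA runtime. The sketch below uses the real `cudaDeviceProp` fields `integrated`, `canMapHostMemory`, `unifiedAddressing`, and `managedMemory`. The helper name `print_memory_properties` is an assumption for illustration, and the values reported depend on the device and CUDA version.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical helper: prints the memory-related properties of one device.
// Returns false if the device cannot be queried.
bool print_memory_properties(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;

    printf("Device %d: %s\n", device, prop.name);
    // integrated == 1 means the GPU shares physical memory with the host
    // (an SoC) rather than owning separate memory across PCIe.
    printf("  integrated GPU:        %d\n", prop.integrated);
    printf("  can map host memory:   %d\n", prop.canMapHostMemory);
    printf("  unified addressing:    %d\n", prop.unifiedAddressing);
    printf("  managed memory:        %d\n", prop.managedMemory);
    return true;
}

// Usage (from main): print_memory_properties(0);
```

On an SoC with an integrated GPU, one would expect the `integrated` flag to be set, which is consistent with the slide's point that no data migration between separate memories is needed.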
16 CUDA UNIFIED MEMORY (FROM CUDA 6): Developer view today: separate system memory and GPU memory. Developer view with Unified Memory: a single unified memory. Dramatically lower developer effort. Faster performance on SoC.
17 TEGRA K1 FOR RENDERSCRIPT [Diagram: Kepler architecture, 192 CUDA cores; Cortex-A15 4-Plus-1; shared physical memory; 2D engine / ISP]
18 RENDERSCRIPT: C99-based kernel language. Easy programmability with host and device portability. Portable across a wide range of devices; fastest on Tegra K1 GPU (+ CPU) and more. Renderscript API 19 support.
19 RENDERSCRIPT ON THE SOC: Acceleration of Renderscript scripts on the GPU: ScriptC, not just ScriptIntrinsics. Huge gains in performance and performance/watt. Runtime capable of scheduling work across units: CPU, GPU, 2D engine/ISP. Related Session: US S Efficient Parallel Computation on Android, Jason Sams, Tim Murray.
20 INTRODUCTION (REF S4885)
21 CONSTRAINTS #1, #2 (Ref S4885)
22 CONSTRAINTS #3 (Ref S4885)
23 GPU OR CPU? (REF S4885)
24 DESKTOP PERFORMANCE TODAY (REF S4885)
25 MOBILE PERFORMANCE TODAY (SHIPPING) (REF S4885)
26 ARCHITECTURAL DIVERSITY? (REF S4885)
27 GOAL OF RENDERSCRIPT (REF S4885)
28 WHAT IS RENDERSCRIPT? (REF S4885)
29 RENDERSCRIPT INTRINSICS (REF S4885)
30 TEGRA K1? (REF S4885)
31 TEGRA K1? (REF S4885)
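32 SUMMARY: The Kepler GPU built into Tegra K1 is a scalable GPU that shares its architecture with Tesla, Quadro, and GeForce. As a result, the software assets and development environments for CUDA, OpenCL, and the OpenGL Shading Language that have matured on Tesla, Quadro, and GeForce can be used on it. In addition, Renderscript, which targets mobile devices at the opposite end of the spectrum from HPC, workstations, and PCs, also delivers excellent performance by making use of the GPU.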
33 THANK YOU