GPU. Study of Automatic GPU Offloading Technology for Open IoT
|
|
- Rodger Horn
- 5 years ago
- Views:
Transcription
1 THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS TECHNICAL REPORT OF IEICE. SC ( ) IoT GPU NTT IoT IoT IoT Tacit Computing Tacit Computing GPU Tacit Computing GPU C/C IoT, GPGPU Tacit Computing Study of Automatic GPU Offloading Technology for Open IoT Yoji YAMATO, Tatsuya DEMIZU, Hirofumi NOGUCHI, and Misao KATAOKA Network Service Systems Laboratories, NTT Corporation, , Midori-cho, Musashino-shi, Tokyo yamato.yoji@lab.ntt.co.jp Abstract IoT technologies have been progressed. Now Open IoT concept has attracted attentions which achieve various IoT services by integrating horizontal separated devices and services. For Open IoT era, we have proposed the Tacit Computing technology to discover the devices with necessary data for users on demand and use them dynamically. However, existing Tacit Computing does not care about performance and operation cost. Therefore in this paper, we propose an automatic GPU offloading technology as an elementary technology of Tacit Computing which uses Genetic Algorithm to extract appropriate offload loop statements to improve performances. evaluate a C/C++ matrix manipulation to verify effectiveness of GPU offloading and confirm more than 35 times performances within 1 hour tuning time. Key words Open IoT, GPGPU, Tacit Computing, Genetic Algorithm, Automatic Offloading We 1. IoT Internet of Things [1]- [18] [19]- [39] IoT Open IoT Open IoT CPU IoT CPU GPU Graphics Processing Unit FPGA Field Programmable Gate Array Amazon Web Services (AWS) GPU FPGA Microsoft FPGA Open IoT 1
2 CUDA Compute Unified Device Architecture, OpenCL Open Computing Language GPU FPGA IoT Open IoT Open IoT GPU FPGA Open IoT Tacit Computing [40] Tacit Computing Open IoT GPU FPGA CPU C/C++ GPU 2. GPU GPGPU General Purpose GPU CUDA CUDA GPGPU GPU, FPGA, CPU OpenCL CUDA OpenCL C GPU CPU CUDA OpenCL GPGPU OpenACC [41] PGI [42] OpenACC C/C++/Fortran OpenACC PGI GPU CPU GPU OpenCL, CUDA, OpenACC GPU Intel for GPU CPU-GPU GPU OpenCL CUDA PGI [43] for for GPU for 3. Tacit Computing GPU 3. 1 Tacit Computing Tacit Computing Open IoT 3 1) [40] 3 Tacit Computing IoT Tacit Computing Tacit Computing OpenCV IoT Tacit Computing Semantic Web RDF/OWL Semantic Web Services [44]- [87] 2
3 1 Tacit Computing 3. 2 Tacit Computing Tacit Computing Open IoT Tacit Computing Tacit Computing Tacit Computing Tacit Computing Tacit Computing 1-2 Tacit Computing FFT (Fast Fourier Transformation) 1-3 Tacit Computing FFT GPU FPGA 1-4 Tacit Computing GPU FPGA DB 2-5 IoT 3. 3 GA GPU Tacit Computing GPU GPU GPU GPU IoT GPU 2 GPU CPU GPU PGI for IoT for for GA for GA 3
4 3 GA GPU GA GA Simple GA Simple GA 1, 0 1 GA for for GPU 1 0 M 1, 0 Pc Pm T GA GPU C/C++ GPU PGI C/C++ OSS proprietary C/C++ OSS GPU PGI PGI OpenACC C/C++/Fortran for OpenACC #pragma acc kernels GPU GPU for for Perl 5 C/C++ C/C++ C/C++ for CPU GPU for for #pragma acc kernels for for break for for #pragma a a 1 0 a C/C++ C/C++ GPU PGI GA GA GA C/C++ GA IoT 4
5 4 5 GA Open IoT GA Simple GA 12 M 12 T 12 ( ) 1/2 Pc 0.9 Pm GPU NVIDIA Quadro K5200 NVIDIA Quadro K5200 CUDA 2304 PGI CUDA Toolkit GA CPU 5 12 GA 0 CPU GA GA 20 CPU 6. Open IoT GPU Open IoT GPU GA C/C++ OSS PGI 12 1 CPU 35 GA [1] Y. Yamato, et al., "Predictive Maintenance Platform with Sound Stream Analysis in Edges," Journal of Information Processing, Vol.25, pp , [2] T. Demizu, et al., "Data Management and Packet Transmission Method based on Receivers Attributes," IEEE WF-IoT 2018, [3] H. Noguchi, et al., "Autonomous Device Identification Architecture for Internet of Things," IEEE WF-IoT 2018, [4] Y. Yamato, et al., "Maintenance of Business Machines with Edge and Cloud Collaboration," IEEJ Transactions on Electrical and Electronic Engineering, Vol.13, No.8, [5] M. Kataoka, et al., "Tacit computing and its application for open IoT era," IEEE CCNC 2018, pp.1-5, [6] Y. Yamato, et al., "A study of coordination logic description and execution for dynamic device coordination services," The 2018 IEICE General Conference, BS-4-1, [7] "hitoe," NTT Vol.29, No.7, pp.13-18, [8] Y. Yamato, "Study of Vital Data Analysis Platform Using Wearable Sensor," IEICE Technical Report, SC , Mar [9] Y. Yamato, et al., "A Study of Optimizing Heterogeneous Resources for Open IoT," IEICE Technical Report, SC , Nov [10] Y. Yamato, et al., "Analyzing machine noise for real time maintenance," ICGIP 2016, [11] Y. Yamato, "Proposal of Vital Data Analysis Platform using Wearable Sensor," ICIAE 2017, pp , [12] " ERP,", Vol.116, No.201, pp.19-22, [13] Y. Yamato, et al., "Proposal of Lambda Architecture Adoption for Real Time Predictive Maintenance," IEEE CANDAR 2016, pp , [14] Y. Yamato, et al., "Proposal of Real Time Predictive Maintenance Platform with 3D Printer for Business Vehicles," International Journal of Information and Electronics Engineering, Vol.6, No.5, pp , [15] Y. Yamato, et al.,"security Camera Movie and ERP Data Matching System to Prevent Theft," IEEE CCNC 2017, pp , [16] Y. Yamato, "A study of posture judgement on vehicles using wearable 5
6 acceleration sensor," The 2017 IEICE General Conference, BS-3-1, [17] Y. Yamato, "Experiments of posture estimation on vehicles using wearable acceleration sensors," IEEE BigDataSecurity 2017, [18] Y. Yamato, et al., "Proposal of Shoplifting Prevention Service Using Image Analysis and ERP Check," IEEJ Transactions on Electrical and Electronic Engineering, Vol.12, pp , [19] Y. Yamato, et al., "Fast and Reliable Restoration Method of Virtual Resources on OpenStack," IEEE Transactions on Cloud Computing, 2015 [20] Y. Yamato, "Server Structure Proposal and Automatic Verification Technology on IaaS Cloud of Plural Type Servers," NETs2015, [21] Y. Yamato, "Proposal of Optimum Application Deployment Technology for Heterogeneous IaaS Cloud," WCSE 2016, pp.34-37, [22] Y. Yamato, "Use case study of HDD-SSD hybrid storage, distributed storage and HDD storage on OpenStack," IDEAS 15, pp , [23] Y. Yamato, et al.,"fast Restoration Method of Virtual Resources on OpenStack," IEEE CCNC 2015, pp , [24] Y. Yamato, et al., "Survey of Public IaaS Cloud Computing API," IEEJ Transactions on Electronics, Information and Systems, Vol.132, pp , [25] Y. Yamato, "Implementation Evaluation of Amazon SQS Compatible Functions," IEEJ Transactions on Electronics, Information and Systems, Vol.135, pp , [26] Y. Yamato, et al.,"development of Template Management Technology for Easy Deployment of Virtual Resources on OpenStack," Journal of Cloud Computing, Springer, 3:7, June [27] Y. Yamato, "Automatic verification for plural virtual machines patches," IEEE ICUFN 2015, pp , [28] Y. Yamato, "Cloud Storage Application Area of HDD-SSD Hybrid Storage, Distributed Storage and HDD Storage," IEEJ Transactions on Electrical and Electronic Engineering, Vol.11, pp , [29] Y. Yamato, et al., "Evaluation of Agile Software Development Method for Carrier Cloud Service Platform Development," IEICE Transactions on Information & Systems, Vol.E97-D, pp , [30] Y. Yamato, et al., "Software Maintenance Evaluation of Agile Software Development Method Based on OpenStack," IEICE Transactions on Information & Systems, Vol.E98-D, pp , [31] Y. Yamato, et al., "Development of Resource Management Server for Production IaaS Services Based on OpenStack," Journal of Information Processing, Vol.23, pp.58-66, [32] Y. Yamato, "Performance-Aware Server Architecture Recommendation and Automatic Performance Verification Technology on IaaS Cloud," Service Oriented Computing and Applications, Springer, [33] Y. Yamato, "Server Selection, Configuration and Reconfiguration Technology for IaaS Cloud with Multiple Server Types," Journal of Network and Systems Management, Springer, [34] ",", Vol.J95-B, pp , [35] ",", Vol.15, No.3, pp.3-8, [36] Y. Yamato, "Automatic verification technology of software patches for user virtual environments on IaaS cloud," Journal of Cloud Computing, Springer, 4:4, Feb [37] Y. Yamato, "Automatic system test technology of user virtual machine software patch on IaaS cloud," IEEJ Transactions on Electrical and Electronic Engineering, Vol.10, pp , [38] Y. Yamato, "Optimum Application Deployment Technology for Heterogeneous IaaS Cloud," Journal of Information Processing, Vol.25, pp.56-58, [39] Y. Yamato, "OpenStack Hypervisor, Container and Baremetal Servers Performance Comparison," IEICE Communication Express, Vol.4, pp , [40] Y. Yamato, et al., "A study to optimize heterogenesous resources for Open IoT," IEEE CANDAR 2017, pp , Nov [41] S. Wienke, et al., "OpenACC-first experiences with real-world applications," Euro-Par 2012 Parallel Processing, pp , [42] M. Wolfe, "Implementing the PGI accelerator model," ACM the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, pp.43-50, [43] Y. Tanaka, et al., "Evaluation of Optimization Method for Fortran Codes with GPU Automatic Parallelization Compiler," IPSJ SIG Technical Report, 2011(9), pp.1-6, [44] H. Sunaga, et al., "Service Delivery Platform Architecture for the Next-Generation Network," ICIN2008, Oct [45] Y. Yokohata, et al., "Service Composition Architecture for Programmability and Flexibility in Ubiquitous Communication Networks," IEEE SAINTW 06, pp , [46] ",", Vol.J91-B, pp , [47] Y. Yamato, et al., "Development of Service Control Server for Web- Telecom Coordination Service," IEEE ICWS 2008, pp , [48] "Web-,", Vol.J91-B, pp , [49] "Web Web,", Vol.49, pp , [50] Y. Yokohata, et al., "Context-Aware Content-Provision Service for Shopping Malls Based on Ubiquitous Service-Oriented Network Framework and Authentication and Access Control Agent Framework," IEEE CCNC 2006, pp , [51], ",", Vol.48, pp , [52] Y. Nakano, et al., "Method of creating web services from web applications," IEEE SOCA 2007, pp.65-71, [53] Y. Yamato, et al., "Context-Aware Service Composition and Component Change-over using Semantic Web Techniques," IEEE ICWS 2007, pp , [54] Y. Yamato, et al., "Context-aware Ubiquitous Service Composition Technology," IFIP Research and Practical Issues of Enterprise Information Systems (CONFENIS 2006), pp.51-61, [55] Y. Miyagi, et al., "Support System for Developing Web Service Composition Scenarios without Need for Web Application Skills," IEEE SAINT 09, pp , [56] ",", Vol.105, No.407, pp.35-40, [57] ",", Vol.105, No.127, pp.5-8, [58] Y. Yamato, et al., "Study and Evaluation of Context-Aware Service Composition and Change-Over Using BPEL Engine and Semantic Web Techniques," IEEE CCNC 2008, pp , [59] H. Sunaga, et al., "Ubiquitous Life Creation through Service Composition Technologies," WTC2006, May [60] Y. Yamato, et al., "Study of Service Processing Agent for Context- Aware Service Coordination," IEEE SCC 2008, pp , [61] M. Takemoto, et al. "Service-composition Method and Its Implementation in Service-provision Architecture for Ubiquitous Computing Environments," IPSJ Journal, Vol.46, pp , [62] "BPEL,", Vol.51, pp , [63] M. Takemoto, et al.,"service Elements and Service Templates for Adaptive Service Composition in a Ubiquitous Computing Environment," IEEE APCC 2003, Vol.1, pp , [64] Y. Nakano, et al.,"effective Web-Service Creation Mechanism for Ubiquitous Service Oriented Architecture," IEEE CEC/EEE 2006, pp.85, [65] Y. Nakano, et al., "Web-Service-Based Avatar Service Modeling in the Next Generation Network," IEEE APSITT 2008, pp.52-57, [66] Y. Yamato, "Method of Service Template Generation on a Service Coordination Framework," UCS2004, Nov [67] ",", Vol.48, pp , [68] "BPEL,", Vol.J91-B, pp , [69] ",,",, [70] Y. Yamato, et al., "Peer-to-Peer Content Searching Method using Evaluation of Semantic Vectors," IEEE CCNC 2006, pp , [71] Y. Yamato, et al., "Peer-to-Peer Non-document Content Searching Method Using User Evaluation of Semantic Vectors," IEICE Transactions on Communications, Vol.E89-B, pp , [72] ",", Vol.104, No.691, pp , [73] ",", Vol.103, No.575, pp.1-6, [74] ",", Vol.116, No.287, pp.49-52, [75] M. Takemoto, et al., "A Service-Coordination Framework for Application to Ubiquitous and Context-aware Computing Environments," The 66th IPSJ General Conference, pp , [76] Y. Yamato, "P2P Content Searching Method Using Evaluation of Semantic Vectors," IEEE SAINT 2005 Workshops, pp , [77] "SOAP-REST,", Vol.51, pp , [78] "BPEL,", Vol.6, pp , [79] ",", Vol.4, pp , [80] "Service Delivery Platform,", Vol.J93-B, pp , 2010 [81] "," 66, pp , [82] "Grid,", Vol.105, No.178, pp.49-54, [83] " SOAP-REST,", Vol.109, No.3, pp.83-86, [84] "R&D Web Web," NTT Vol.20, No.6, pp.40-43, [85] "Web," (DEWS 2007), [86] M. Takemoto, et al., "A Future Network Service based on PC Grid Computing and P2P File Sharing: YOHEIX (Your Own Hybrid Environment for Information Exchange)," IPSJ SIG Technical Report, 2004, No.89 (2004-DPS-119), pp.91-96, [87] ",", Vol.J91-B, pp ,
Server Selection, Configuration and Reconfiguration Technology for IaaS Cloud with Multiple Server Types
J Netw Syst Manage (2018) 26:339 360 https://doi.org/10.1007/s10922-017-9418-z Server Selection, Configuration and Reconfiguration Technology for IaaS Cloud with Multiple Server Types Yoji Yamato 1 Received:
More informationService Delivery Platform Architecture for the Next-Generation Network
Service Delivery Platform Architecture for the Next-Generation Network Hiroshi Sunaga, Yoji Yamato, Hiroyuki Ohnishi, Masashi Kaneko, Masami Iio, and Miki Hirano NTT Network Service Systems Laboratories,
More informationOpenStack hypervisor, container and Baremetal servers performance comparison
OpenStack hypervisor, container and Baremetal servers performance comparison Yoji Yamato a) Software Innovation Center, NTT Corporation, 3 9 11 Midori-cho, Musashino-shi, Tokyo 180 8585, Japan a) yamato.yoji@lab.ntt.co.jp
More informationNTT Com Press Conference March 1, 2016 #enterprisecloud
NTT Com Press Conference March 1, 2016 #enterprisecloud 1 Significant Enhancement of Enterprise Cloud - Realizing Digital Transformation - NTT Communications March 1, 2016 2 NTT Communications Initiatives
More informationCUDA. Matthew Joyner, Jeremy Williams
CUDA Matthew Joyner, Jeremy Williams Agenda What is CUDA? CUDA GPU Architecture CPU/GPU Communication Coding in CUDA Use cases of CUDA Comparison to OpenCL What is CUDA? What is CUDA? CUDA is a parallel
More informationNVIDIA DLI HANDS-ON TRAINING COURSE CATALOG
NVIDIA DLI HANDS-ON TRAINING COURSE CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial
More informationExpressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks A full-day tutorial proposal for SC17
Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks A full-day tutorial proposal for SC17 Tutorial Instructors [James Reinders, Michael J. Voss, Pablo Reble, Rafael Asenjo]
More informationGPU GPU CPU. Raymond Namyst 3 Samuel Thibault 3 Olivier Aumage 3
/CPU,a),2,2 2,2 Raymond Namyst 3 Samuel Thibault 3 Olivier Aumage 3 XMP XMP-dev CPU XMP-dev/StarPU XMP-dev XMP CPU StarPU CPU /CPU XMP-dev/StarPU N /CPU CPU. Graphics Processing Unit GP General-Purpose
More informationTechnology for a better society. hetcomp.com
Technology for a better society hetcomp.com 1 J. Seland, C. Dyken, T. R. Hagen, A. R. Brodtkorb, J. Hjelmervik,E Bjønnes GPU Computing USIT Course Week 16th November 2011 hetcomp.com 2 9:30 10:15 Introduction
More informationIntroduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620
Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved
More informationNVIDIA DEEP LEARNING INSTITUTE
NVIDIA DEEP LEARNING INSTITUTE TRAINING CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial
More informationMartin Dubois, ing. Contents
Martin Dubois, ing Contents Without OpenNet vs With OpenNet Technical information Possible applications Artificial Intelligence Deep Packet Inspection Image and Video processing Network equipment development
More informationECE 8823: GPU Architectures. Objectives
ECE 8823: GPU Architectures Introduction 1 Objectives Distinguishing features of GPUs vs. CPUs Major drivers in the evolution of general purpose GPUs (GPGPUs) 2 1 Chapter 1 Chapter 2: 2.2, 2.3 Reading
More informationgpot: Intelligent Compiler for GPGPU using Combinatorial Optimization Techniques
gpot: Intelligent Compiler for GPGPU using Combinatorial Optimization Techniques Yuta TOMATSU, Tomoyuki HIROYASU, Masato YOSHIMI, Mitsunori MIKI Graduate Student of School of Ewngineering, Faculty of Department
More informationVirtualization and Softwarization Technologies for End-to-end Networking
ization and Softwarization Technologies for End-to-end Networking Naoki Oguchi Toru Katagiri Kazuki Matsui Xi Wang Motoyoshi Sekiya The emergence of 5th generation mobile networks (5G) and Internet of
More informationEvaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices
Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices Jonas Hahnfeld 1, Christian Terboven 1, James Price 2, Hans Joachim Pflug 1, Matthias S. Müller
More informationPASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year
PASS4TEST \ http://www.pass4test.com We offer free update service for one year Exam : HP0-D31 Title : Designing HP Data Center and Cloud Solutions Vendor : HP Version : DEMO Get Latest & Valid HP0-D31
More informationCuda C Programming Guide Appendix C Table C-
Cuda C Programming Guide Appendix C Table C-4 Professional CUDA C Programming (1118739329) cover image into the powerful world of parallel GPU programming with this down-to-earth, practical guide Table
More informationAutomatic Verification Technology of Virtual Machine Software Patch on IaaS Cloud
Automatic Verification Technology of Virtual Machine Software Patch on IaaS Cloud Yoji Yamato International Science Index, Computer and Information Engineering waset.org/publication/10000705 Abstract In
More informationHybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS
+ Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics
More informationThomas Lin, Naif Tarafdar, Byungchul Park, Paul Chow, and Alberto Leon-Garcia
Thomas Lin, Naif Tarafdar, Byungchul Park, Paul Chow, and Alberto Leon-Garcia The Edward S. Rogers Sr. Department of Electrical and Computer Engineering University of Toronto, ON, Canada Motivation: IoT
More informationQuantitative study of computing time of direct/iterative solver for MoM by GPU computing
Quantitative study of computing time of direct/iterative solver for MoM by GPU computing Keisuke Konno 1a), Hajime Katsuda 2, Kei Yokokawa 1, Qiang Chen 1, Kunio Sawaya 3, and Qiaowei Yuan 4 1 Department
More informationVSC Users Day 2018 Start to GPU Ehsan Moravveji
Outline A brief intro Available GPUs at VSC GPU architecture Benchmarking tests General Purpose GPU Programming Models VSC Users Day 2018 Start to GPU Ehsan Moravveji Image courtesy of Nvidia.com Generally
More informationNTT Com Press Conference March 1, 2016 #enterprisecloud
NTT Com Press Conference March 1, 2016 #enterprisecloud 1 Hybrid IT: Living In A Multi-Cloud World Melanie Posey Research VP, IaaS/Hosting Infrastructure Services & Managed Network Services February 2016
More informationNew Value Chain through Service Platform
New Value Chain through Service Platform Oct. 24 th 2008 Ryozo Ito Senior Executive Consultant Hewlett-Packard Japan Technology for better business outcomes 2008 Hewlett-Packard Development Company, L.P.
More informationHuawei Agile Controller. Agile Controller
Huawei 1 1 Product Overview Product Features is the latest user-centric and application-based, automatic network resource control system developed by Huawei. This system is positioned as the "Smart Brain"
More informationProductive Performance on the Cray XK System Using OpenACC Compilers and Tools
Productive Performance on the Cray XK System Using OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 The New Generation of Supercomputers Hybrid
More informationNavigating the Vision API Jungle: Which API Should You Use and Why? Embedded Vision Summit, May 2015
Copyright Khronos Group 2015 - Page 1 Navigating the Vision API Jungle: Which API Should You Use and Why? Embedded Vision Summit, May 2015 Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem
More informationCUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav
CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CMPE655 - Multiple Processor Systems Fall 2015 Rochester Institute of Technology Contents What is GPGPU? What s the need? CUDA-Capable GPU Architecture
More informationEnabling Red Hat Virtualization for the Hybrid Cloud
Enabling Red Hat Virtualization for the Hybrid Cloud RHV 4 integration with CloudForms and Ansible Scott Herold Director, Product Management - Virtualization Business Red Hat Forum Israel November 2016
More informationGPUs and Emerging Architectures
GPUs and Emerging Architectures Mike Giles mike.giles@maths.ox.ac.uk Mathematical Institute, Oxford University e-infrastructure South Consortium Oxford e-research Centre Emerging Architectures p. 1 CPUs
More informationGPGPU/CUDA/C Workshop 2012
GPGPU/CUDA/C Workshop 2012 Day-1: GPGPU/CUDA/C and WSU Presenter(s): Abu Asaduzzaman Nasrin Sultana Wichita State University July 10, 2012 GPGPU/CUDA/C Workshop 2012 Outline Introduction to the Workshop
More informationParallel Programming Libraries and implementations
Parallel Programming Libraries and implementations Partners Funding Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License.
More informationHPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Introduction to CUDA programming
KFUPM HPC Workshop April 29-30 2015 Mohamed Mekias HPC Solutions Consultant Introduction to CUDA programming 1 Agenda GPU Architecture Overview Tools of the Trade Introduction to CUDA C Patterns of Parallel
More informationOpenACC Course. Office Hour #2 Q&A
OpenACC Course Office Hour #2 Q&A Q1: How many threads does each GPU core have? A: GPU cores execute arithmetic instructions. Each core can execute one single precision floating point instruction per cycle
More informationInternet Routing Operation Technology (ENCORE)
: End-to-end Network Operation Internet Routing Operation Technology () Toshimitsu Oshima, Mitsuho Tahara, Kazuo Koike, Yoshihiro Otsuka, and Souhei Majima Abstract is a multi--based system for automatically
More informationNVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU
NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPUaccelerated
More informationAddressing Heterogeneity in Manycore Applications
Addressing Heterogeneity in Manycore Applications RTM Simulation Use Case stephane.bihan@caps-entreprise.com Oil&Gas HPC Workshop Rice University, Houston, March 2008 www.caps-entreprise.com Introduction
More informationFast Hardware For AI
Fast Hardware For AI Karl Freund karl@moorinsightsstrategy.com Sr. Analyst, AI and HPC Moor Insights & Strategy Follow my blogs covering Machine Learning Hardware on Forbes: http://www.forbes.com/sites/moorinsights
More informationState of OpenShift on Bare Metal
State of OpenShift on Bare Metal OpenShift Commons Gathering - Seattle Jose Palafox, Technical Program Manager for CNCF, Intel Jeremy Eder, Senior Principal Performance Engineer, Red Hat Dave Cain, Senior
More informationAccelerator programming with OpenACC
..... Accelerator programming with OpenACC Colaboratorio Nacional de Computación Avanzada Jorge Castro jcastro@cenat.ac.cr 2018. Agenda 1 Introduction 2 OpenACC life cycle 3 Hands on session Profiling
More informationRWTH GPU-Cluster. Sandra Wienke March Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky
RWTH GPU-Cluster Fotos: Christian Iwainsky Sandra Wienke wienke@rz.rwth-aachen.de March 2012 Rechen- und Kommunikationszentrum (RZ) The GPU-Cluster GPU-Cluster: 57 Nvidia Quadro 6000 (29 nodes) innovative
More informationOpenACC 2.6 Proposed Features
OpenACC 2.6 Proposed Features OpenACC.org June, 2017 1 Introduction This document summarizes features and changes being proposed for the next version of the OpenACC Application Programming Interface, tentatively
More informationAvailable online at ScienceDirect. Procedia Computer Science 98 (2016 )
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 98 (2016 ) 515 521 The 3rd International Symposium on Emerging Information, Communication and Networks (EICN 2016) A Speculative
More informationA cache-aware performance prediction framework for GPGPU computations
A cache-aware performance prediction framework for GPGPU computations The 8th Workshop on UnConventional High Performance Computing 215 Alexander Pöppl, Alexander Herz August 24th, 215 UCHPC 215, August
More informationPiz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design
Piz Daint: Application driven co-design of a supercomputer based on Cray s adaptive system design Sadaf Alam & Thomas Schulthess CSCS & ETHzürich CUG 2014 * Timelines & releases are not precise Top 500
More informationTaking Back Control of Your Network With SD-LAN
IHS TECHNOLOGY SEPTEMBER 2016 Taking Back Control of Your Network With SD-LAN Matthias Machowinski, Senior Research Director, Enterprise Networks and Video TABLE OF CONTENTS Access Networks Are Under Pressure...
More informationAdrian Tate XK6 / openacc workshop Manno, Mar
Adrian Tate XK6 / openacc workshop Manno, Mar6-7 2012 1 Overview & Philosophy Two modes of usage Contents Present contents Upcoming releases Optimization of libsci_acc Autotuning Adaptation Asynchronous
More informationCMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)
CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can
More informationHPC with GPU and its applications from Inspur. Haibo Xie, Ph.D
HPC with GPU and its applications from Inspur Haibo Xie, Ph.D xiehb@inspur.com 2 Agenda I. HPC with GPU II. YITIAN solution and application 3 New Moore s Law 4 HPC? HPC stands for High Heterogeneous Performance
More informationCloud Computing Introduction & Offerings from IBM
Cloud Computing Introduction & Offerings from IBM Gytis Račiukaitis IT Architect, IBM Global Business Services Agenda What is cloud computing? Benefits Risks & Issues Thinking about moving into the cloud?
More informationThe New Open Edge. IOT+Telecom+Cloud+Enterprise
The New Open Edge IOT+Telecom+Cloud+Enterprise Topics 1. LF Edge formation announcement 2. Why Edge, killer apps & defining the Edge 3. LF Edge Summary 2 LF Edge Open Source harmonized for Edge & IOT New
More informationProgramming NVIDIA GPUs with OpenACC Directives
Programming NVIDIA GPUs with OpenACC Directives Michael Wolfe michael.wolfe@pgroup.com http://www.pgroup.com/accelerate Programming NVIDIA GPUs with OpenACC Directives Michael Wolfe mwolfe@nvidia.com http://www.pgroup.com/accelerate
More informationThe GPU-Cluster. Sandra Wienke Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky
The GPU-Cluster Sandra Wienke wienke@rz.rwth-aachen.de Fotos: Christian Iwainsky Rechen- und Kommunikationszentrum (RZ) The GPU-Cluster GPU-Cluster: 57 Nvidia Quadro 6000 (29 nodes) innovative computer
More informationAccelerating Reinforcement Learning in Engineering Systems
Accelerating Reinforcement Learning in Engineering Systems Tham Chen Khong with contributions from Zhou Chongyu and Le Van Duc Department of Electrical & Computer Engineering National University of Singapore
More informationModelos de Negócio na Era das Clouds. André Rodrigues, Cloud Systems Engineer
Modelos de Negócio na Era das Clouds André Rodrigues, Cloud Systems Engineer Agenda Software and Cloud Changed the World Cisco s Cloud Vision&Strategy 5 Phase Cloud Plan Before Now From idea to production:
More informationLegUp: Accelerating Memcached on Cloud FPGAs
0 LegUp: Accelerating Memcached on Cloud FPGAs Xilinx Developer Forum December 10, 2018 Andrew Canis & Ruolong Lian LegUp Computing Inc. 1 COMPUTE IS BECOMING SPECIALIZED 1 GPU Nvidia graphics cards are
More informationOpenACC2 vs.openmp4. James Lin 1,2 and Satoshi Matsuoka 2
2014@San Jose Shanghai Jiao Tong University Tokyo Institute of Technology OpenACC2 vs.openmp4 he Strong, the Weak, and the Missing to Develop Performance Portable Applica>ons on GPU and Xeon Phi James
More informationGPU ACCELERATED DATABASE MANAGEMENT SYSTEMS
CIS 601 - Graduate Seminar Presentation 1 GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS PRESENTED BY HARINATH AMASA CSU ID: 2697292 What we will talk about.. Current problems GPU What are GPU Databases GPU
More informationGPU ARCHITECTURE Chris Schultz, June 2017
GPU ARCHITECTURE Chris Schultz, June 2017 MISC All of the opinions expressed in this presentation are my own and do not reflect any held by NVIDIA 2 OUTLINE CPU versus GPU Why are they different? CUDA
More informationOmpSs + OpenACC Multi-target Task-Based Programming Model Exploiting OpenACC GPU Kernel
www.bsc.es OmpSs + OpenACC Multi-target Task-Based Programming Model Exploiting OpenACC GPU Kernel Guray Ozen guray.ozen@bsc.es Exascale in BSC Marenostrum 4 (13.7 Petaflops ) General purpose cluster (3400
More informationAccelerating sequential computer vision algorithms using commodity parallel hardware
Accelerating sequential computer vision algorithms using commodity parallel hardware Platform Parallel Netherlands GPGPU-day, 28 June 2012 Jaap van de Loosdrecht NHL Centre of Expertise in Computer Vision
More informationPUBLIC AND HYBRID CLOUD: BREAKING DOWN BARRIERS
PUBLIC AND HYBRID CLOUD: BREAKING DOWN BARRIERS Jane R. Circle Manager, Red Hat Global Cloud Provider Program and Cloud Access Program June 28, 2016 WHAT WE'LL DISCUSS TODAY Hybrid clouds and multi-cloud
More informationInter-cloud computing: Use cases and requirements lessons learned 3.11
Inter-cloud computing: Use cases and requirements lessons learned 3.11 Oct 12, 2011 Global Inter-Cloud Technology Forum (GICTF) Institute of Information Security (IISEC) Atsuhiro Goto goto@iisec.ac.jp
More informationCloud Computing 1. CSCI 4850/5850 High-Performance Computing Spring 2018
Cloud Computing 1 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning
More informationGPGPU, 4th Meeting Mordechai Butrashvily, CEO GASS Company for Advanced Supercomputing Solutions
GPGPU, 4th Meeting Mordechai Butrashvily, CEO moti@gass-ltd.co.il GASS Company for Advanced Supercomputing Solutions Agenda 3rd meeting 4th meeting Future meetings Activities All rights reserved (c) 2008
More informationLatest Advances in MVAPICH2 MPI Library for NVIDIA GPU Clusters with InfiniBand
Latest Advances in MVAPICH2 MPI Library for NVIDIA GPU Clusters with InfiniBand Presentation at GTC 2014 by Dhabaleswar K. (DK) Panda The Ohio State University E-mail: panda@cse.ohio-state.edu http://www.cse.ohio-state.edu/~panda
More informationOpenACC. Introduction and Evolutions Sebastien Deldon, GPU Compiler engineer
OpenACC Introduction and Evolutions Sebastien Deldon, GPU Compiler engineer 3 WAYS TO ACCELERATE APPLICATIONS Applications Libraries Compiler Directives Programming Languages Easy to use Most Performance
More information[NEC Group Internal Use Only] IoT Security. - Challenges & Standardization status. Sivabalan Arumugam.
[NEC Group Internal Use Only] IoT Security - Challenges & Standardization status Sivabalan Arumugam Outline IoT Security Overview IoT Security Challenges IoT related Threats
More informationOpenACC programming for GPGPUs: Rotor wake simulation
DLR.de Chart 1 OpenACC programming for GPGPUs: Rotor wake simulation Melven Röhrig-Zöllner, Achim Basermann Simulations- und Softwaretechnik DLR.de Chart 2 Outline Hardware-Architecture (CPU+GPU) GPU computing
More informationWhat is Dell EMC Cloud for Microsoft Azure Stack?
What is Dell EMC Cloud for Microsoft Azure Stack? Karsten Bott @azurestack_guy Advisory Cloud Platform Specialist AzureStack GLOBAL SPONSORS Why Hybrid Cloud? The New Digital Customer Rising and continuously
More informationTrends and Challenges in Multicore Programming
Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores
More informationKako napraviti Cloud?
Kako napraviti Cloud? Tomislav Lukačević Converged Infrastructure Presales Consultant tomislav.lukacevic@hp.com Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein
More informationSurvey of the State of P2P File Sharing Applications
Survey of the State of P2P File Sharing Applications Keita Ooi, Satoshi Kamei, and Tatsuya Mori Abstract Recent improvements in Internet access have been accompanied by a dramatic spread of peer-to-peer
More informationHuawei Agile Controller. Agile Controller 1
Huawei Agile Controller Agile Controller 1 Agile Controller 1 Product Overview Agile Controller is the latest user- and application-based network resource auto control system offered by Huawei. Following
More informationIntroduction to GPU hardware and to CUDA
Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 35 Course outline Introduction to GPU hardware
More informationHiPANQ Overview of NVIDIA GPU Architecture and Introduction to CUDA/OpenCL Programming, and Parallelization of LDPC codes.
HiPANQ Overview of NVIDIA GPU Architecture and Introduction to CUDA/OpenCL Programming, and Parallelization of LDPC codes Ian Glendinning Outline NVIDIA GPU cards CUDA & OpenCL Parallel Implementation
More informationThe Integrated Smart & Security Platform Powered the Developing of IOT
The Integrated Smart & Security Platform Powered the Developing of IOT We Are Entering A New Era- 50million connections Smart-Healthcare Smart-Wearable VR/AR Intelligent Transportation Eco-Agriculture
More informationIntegrated Security Management Framework
Integrated Security Management Framework Critical for Securing the Future of IoT Nampuraja Enose Principal Consultant- Industry 4.0 / Industrial Internet 17 Nov 2016 Internet Of Things The Vision 2 Instrumented
More informationFull Scalable Media Cloud Solution with Kubernetes Orchestration. Zhenyu Wang, Xin(Owen)Zhang
Full Scalable Media Cloud Solution with Kubernetes Orchestration Zhenyu Wang, Xin(Owen)Zhang Agenda Media in the Network and Cloud Intel Media Server Reference Software Stack Container with MSS enablement
More informationParallel Programming. Libraries and Implementations
Parallel Programming Libraries and Implementations Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationCS 470 Spring Other Architectures. Mike Lam, Professor. (with an aside on linear algebra)
CS 470 Spring 2016 Mike Lam, Professor Other Architectures (with an aside on linear algebra) Parallel Systems Shared memory (uniform global address space) Primary story: make faster computers Programming
More informationDistributed and Cloud Computing
Distributed and Cloud Computing K. Hwang, G. Fox and J. Dongarra Chapter 9: Ubiquitous Clouds and The Internet of Things (suggested for use in 5 lectures in 250 minutes) Prepared by Kai Hwang University
More informationCross-Site Virtual Network Provisioning in Cloud and Fog Computing
This paper was accepted for publication in the IEEE Cloud Computing. The copyright was transferred to IEEE. The final version of the paper will be made available on IEEE Xplore via http://dx.doi.org/10.1109/mcc.2017.28
More informationDeep Learning: Transforming Engineering and Science The MathWorks, Inc.
Deep Learning: Transforming Engineering and Science 1 2015 The MathWorks, Inc. DEEP LEARNING: TRANSFORMING ENGINEERING AND SCIENCE A THE NEW RISE ERA OF OF GPU COMPUTING 3 NVIDIA A IS NEW THE WORLD S ERA
More informationHighly Reliable and Managed MPLS Network Using Type-X Products
: Development of Highly Reliable and Highly Reliable and Managed MPLS Network Using Products Hidehiko Mitsubori, Jun Nishikido, and Hisashi Komura Abstract NTT Network Service Systems Laboratories have
More informationThe OpenVX Computer Vision and Neural Network Inference
The OpenVX Computer and Neural Network Inference Standard for Portable, Efficient Code Radhakrishna Giduthuri Editor, OpenVX Khronos Group radha.giduthuri@amd.com @RadhaGiduthuri Copyright 2018 Khronos
More informationOpenACC. Part I. Ned Nedialkov. McMaster University Canada. October 2016
OpenACC. Part I Ned Nedialkov McMaster University Canada October 2016 Outline Introduction Execution model Memory model Compiling pgaccelinfo Example Speedups Profiling c 2016 Ned Nedialkov 2/23 Why accelerators
More informationThe Oracle Trust Fabric Securing the Cloud Journey
The Oracle Trust Fabric Securing the Cloud Journey Eric Olden Senior Vice President and General Manager Cloud Security and Identity 05.07.2018 Safe Harbor Statement The following is intended to outline
More informationAccelerating Vision Processing
Accelerating Vision Processing Neil Trevett Vice President Mobile Ecosystem at NVIDIA President of Khronos and Chair of the OpenCL Working Group SIGGRAPH, July 2016 Copyright Khronos Group 2016 - Page
More informationExploiting Full Potential of GPU Clusters with InfiniBand using MVAPICH2-GDR
Exploiting Full Potential of GPU Clusters with InfiniBand using MVAPICH2-GDR Presentation at Mellanox Theater () Dhabaleswar K. (DK) Panda - The Ohio State University panda@cse.ohio-state.edu Outline Communication
More informationThe Introduction of Sensor-Cloud and Its Architecture, Applications and Approaches. Mao-Lin Li 2013/11/5
The Introduction of Sensor-Cloud and Its Architecture, Applications and Approaches Mao-Lin Li 2013/11/5 What is a Sensor-Cloud? Sensors: Limited and are specific to their applications/services when linked
More informationThe Latest EMC s announcements
The Latest EMC s announcements Copyright 2014 EMC Corporation. All rights reserved. 1 TODAY S BUSINESS CHALLENGES Cut Operational Costs & Legacy More Than Ever React Faster To Find New Growth Balance Risk
More informationIntroduction of NEC s Open Source Activities. November, 2014 NEC Corporation
Introduction of NEC s Open Source Activities November, 2014 NEC Corporation Agenda NEC Cloud IaaS Overview Okinawa Open Laboratory Page 2 NEC Corporation 2014 NEC Cloud IaaS Overview Two kinds of IaaS
More informationStan Posey, NVIDIA, Santa Clara, CA, USA
Stan Posey, sposey@nvidia.com NVIDIA, Santa Clara, CA, USA NVIDIA Strategy for CWO Modeling (Since 2010) Initial focus: CUDA applied to climate models and NWP research Opportunities to refactor code with
More informationGPU-accelerated Verification of the Collatz Conjecture
GPU-accelerated Verification of the Collatz Conjecture Takumi Honda, Yasuaki Ito, and Koji Nakano Department of Information Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi Hiroshima 739-8527,
More information13th Asia-Pacific Eco-Business Forum in Kawasaki
13th Asia-Pacific Eco-Business Forum in Kawasaki Session 2 Future Eco Cities Session Challenge to Environment Monitoring using AI Technology 16 Feb 2017 Norio YABE Technical Computing Solutions Unit FUJITSU
More informationExperiences with Achieving Portability across Heterogeneous Architectures
Experiences with Achieving Portability across Heterogeneous Architectures Lukasz G. Szafaryn +, Todd Gamblin ++, Bronis R. de Supinski ++ and Kevin Skadron + + University of Virginia ++ Lawrence Livermore
More informationGPU Based Face Recognition System for Authentication
GPU Based Face Recognition System for Authentication Bhumika Agrawal, Chelsi Gupta, Meghna Mandloi, Divya Dwivedi, Jayesh Surana Information Technology, SVITS Gram Baroli, Sanwer road, Indore, MP, India
More informationUpdate on Khronos Open Standard APIs for Vision Processing Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem
Update on Khronos Open Standard APIs for Vision Processing Neil Trevett Khronos President NVIDIA Vice President Mobile Ecosystem Copyright Khronos Group 2015 - Page 1 Copyright Khronos Group 2015 - Page
More information