Evaluating Orthogonality between Application Auto-Tuning and Run-Time Resource Management for Adaptive OpenCL Applications
1 Evaluating Orthogonality between Application Auto-Tuning and Run-Time Resource Management for Adaptive OpenCL Applications. Edoardo Paone, Davide Gadioli, Gianluca Palermo, Vittorio Zaccaria, Cristina Silvano. Politecnico di Milano
2 Computer Architecture Evolution. [Figure: processor generations over time; surviving labels include Pentium, Core2 Duo (65nm) and Nehalem (45nm).] "The number of transistors incorporated in a chip will approximately double every two years." Gordon Moore, Intel co-founder
3 Moore's Law on Performance. [Figure: performance across process generations; surviving labels include Pentium 4, Core2 Duo (65nm) and Nehalem (45nm).]
- The Golden Era: single-processor; 1st power wall
- The Multicore Era: 2 to 16 cores; on-chip shared LL$; programmability challenge
- The Manycore Era: larger number of cores; networks-on-chip; programmability challenge plus dynamic resource management
7 Main Idea. In the context of resource consolidation, analyze the orthogonal effects of resource management and application auto-tuning. Other labels on the slide: approximate computing; target platforms (multicore platform).
9 Run-Time Resource Management. [Diagram labels: App1, App2, App3, RTRM, Target Platform.] Reference: Amit Kumar Singh, Muhammad Shafique, Akash Kumar, and Jörg Henkel. Mapping on multi/many core systems: survey of current and emerging trends. In Proceedings of the 50th Annual Design Automation Conference (DAC)
10 RTRM Overview. [Diagram labels: App1, App2, App3, RTRM (Accounting, Mapping), Target Platform.]
- Accounting: the resource accounting phase grants resources to critical workloads while optimizing resource usage by best-effort workloads.
- Mapping: the mapping phase maps virtual resources onto physical resources to achieve optimal platform usage and to handle run-time variations.
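The two RTRM phases can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the slides do not give the actual policies, so the "critical first, then first come, first served" accounting rule, the node-by-node mapping rule, and the names `Request`, `account` and `map_to_nodes` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    app: str        # application name
    cores: int      # number of virtual cores requested
    critical: bool  # critical workloads are served before best-effort ones


def account(requests, total_cores):
    """Accounting phase (assumed policy): grant critical requests first,
    then hand the remaining cores to best-effort requests in arrival order."""
    grants, free = {}, total_cores
    # sorted() is stable, so arrival order is kept inside each class.
    for req in sorted(requests, key=lambda r: not r.critical):
        granted = min(req.cores, free)
        grants[req.app] = granted
        free -= granted
    return grants


def map_to_nodes(grants, nodes, cores_per_node):
    """Mapping phase (assumed policy): bind granted virtual cores to physical
    (node, core) pairs, filling one NUMA node before moving to the next."""
    physical = [(n, c) for n in range(nodes) for c in range(cores_per_node)]
    mapping, cursor = {}, 0
    for app, count in grants.items():
        mapping[app] = physical[cursor:cursor + count]
        cursor += count
    return mapping


if __name__ == "__main__":
    requests = [
        Request("App1", 6, critical=False),
        Request("App2", 8, critical=True),
        Request("App3", 6, critical=False),
    ]
    # 4 nodes with 4 cores each is an assumption made for this example.
    grants = account(requests, total_cores=16)
    print(grants)
    print(map_to_nodes(grants, nodes=4, cores_per_node=4))
```

Keeping the two phases as separate functions mirrors the split on the slide: accounting decides how much each application gets, mapping decides where it runs.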
13 Application Auto-Tuning. Key idea: most applications are configurable through a set of parameters (in the slide's example: color, shape, size). These parameters act as run-time knobs.
19 Why Run-Time Knobs & Auto-Tuning? Some applications expose internal knobs that can be used to trade off quality of results (QoR) against performance. Example: an autonomous video-surveillance system, with video frame rate and video resolution as knobs.
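As a rough illustration of the QoR/performance trade-off, the sketch below picks an operating point from a design-time profile. The profile values, the dictionary keys and the function `select_operating_point` are invented for this example and do not come from the presentation.

```python
# Hypothetical design-time profile of a video application: each operating
# point pairs a knob value with the frame rate and error measured off line.
OPERATING_POINTS = [
    {"resolution": "1080p", "fps": 10.0, "error": 0.01},
    {"resolution": "720p", "fps": 22.0, "error": 0.04},
    {"resolution": "480p", "fps": 45.0, "error": 0.10},
]


def select_operating_point(points, fps_goal):
    """Return the lowest-error point that meets the frame-rate goal.
    If no point meets it, fall back to the fastest point."""
    feasible = [p for p in points if p["fps"] >= fps_goal]
    if feasible:
        return min(feasible, key=lambda p: p["error"])
    return max(points, key=lambda p: p["fps"])


if __name__ == "__main__":
    for goal in (5, 20, 60):
        chosen = select_operating_point(OPERATING_POINTS, goal)
        print(goal, chosen["resolution"])
```

Lowering the resolution knob buys frame rate at the cost of a higher error, which is the trade-off the slide describes.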
27 Application Auto-Tuning Framework. [Diagram, built up in steps: Execution Loop, Monitoring, Re-Configure.]
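The execution loop, monitoring and re-configure steps might fit together as in the following minimal Python sketch. It is not the framework's real interface: the class `AutoTuner`, the function `run`, the sliding-window average and the rescaling of the design-time profile by the observed/expected ratio are all assumptions made for illustration.

```python
from collections import deque


class AutoTuner:
    """Toy monitor/re-configure logic over (name, expected_fps, error) points."""

    def __init__(self, points, fps_goal, window=3):
        self.points = points
        self.fps_goal = fps_goal
        self.samples = deque(maxlen=window)              # recent observations
        self.current = min(points, key=lambda p: p[2])   # start at best quality

    def monitor(self, observed_fps):
        """Monitoring: record the throughput observed for the current point."""
        self.samples.append(observed_fps)

    def reconfigure(self):
        """Re-configure: rescale the profile by observed/expected throughput
        and pick the lowest-error point that still meets the goal."""
        if not self.samples:
            return self.current
        observed = sum(self.samples) / len(self.samples)
        ratio = observed / self.current[1]
        feasible = [p for p in self.points if p[1] * ratio >= self.fps_goal]
        if feasible:
            best = min(feasible, key=lambda p: p[2])
        else:
            best = max(self.points, key=lambda p: p[1])
        if best != self.current:
            self.samples.clear()   # old samples describe the previous point
            self.current = best
        return self.current


def run(tuner, measure, iterations):
    """Execution loop: re-configure, execute one block of work, monitor."""
    history = []
    for _ in range(iterations):
        config = tuner.reconfigure()
        tuner.monitor(measure(config))
        history.append(config[0])
    return history


if __name__ == "__main__":
    points = [("high", 30.0, 0.01), ("medium", 60.0, 0.05), ("low", 120.0, 0.15)]
    tuner = AutoTuner(points, fps_goal=25.0)
    # Simulated platform that delivers half of the profiled throughput.
    print(run(tuner, lambda cfg: cfg[1] * 0.5, iterations=3))
```

In the simulated run the tuner starts at the highest-quality point, observes that the goal is missed, and moves to the next point that satisfies it.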
31 Orthogonality Concept. [Diagram labels: Requests, Resources, Run-Time Resource Manager, Platform OS, Target HW Platform.] OpenCL device fission is exploited to limit resource requests.
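In OpenCL 1.2, device fission is exposed through `clCreateSubDevices`, which can split a device into sub-devices, for example by compute-unit counts. The Python sketch below does not call OpenCL; it only models the "partition by counts" idea on a list of compute-unit identifiers, and the function name `partition_by_counts` is hypothetical.

```python
def partition_by_counts(compute_units, counts):
    """Model of a by-counts partition: each sub-device receives the requested
    number of compute units, taken in order from the parent device."""
    if any(c <= 0 for c in counts):
        raise ValueError("each sub-device needs at least one compute unit")
    if sum(counts) > len(compute_units):
        raise ValueError("requested more compute units than the device has")
    sub_devices, start = [], 0
    for c in counts:
        sub_devices.append(compute_units[start:start + c])
        start += c
    return sub_devices


if __name__ == "__main__":
    device = list(range(16))  # a 16-compute-unit device is an assumption
    # An application limited to 4 compute units only ever sees the first
    # sub-device; the second one could be handed to another application.
    print(partition_by_counts(device, [4, 2]))
```

An application that builds its context on a sub-device cannot use more compute units than that sub-device holds, which is how fission can limit resource requests.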
34 The Multi-View Case Study: 2 eyes = 3 dimensions
35 Implementation (see reference 1). [Figure: stereo geometry with points P and Q seen by CAM LEFT and CAM RIGHT; surviving labels are P L, P R, Q L, Q R, D P and D Q.] Reference 1: Ke Zhang, Jiangbo Lu, and Gauthier Lafruit, Cross-Based Local Stereo Matching Using Orthogonal Integral Images, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 19, No. 7, July
36 Pixel disparity. [Figure: left camera and right camera images, numbered processing steps, and the reference disparity.] 5 application knobs. QoR: disparity error.
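The slides name disparity error as the QoR measure but do not define it. One plausible reading, assumed here, is the mean absolute difference between the computed disparity map and the reference disparity; the function name `disparity_error` is hypothetical.

```python
def disparity_error(computed, reference):
    """Mean absolute per-pixel difference between two disparity maps,
    given as equally sized lists of rows. This definition is an assumption."""
    total, count = 0.0, 0
    for row_c, row_r in zip(computed, reference):
        for d_c, d_r in zip(row_c, row_r):
            total += abs(d_c - d_r)
            count += 1
    if count == 0:
        raise ValueError("disparity maps are empty")
    return total / count


if __name__ == "__main__":
    computed = [[10, 12], [8, 8]]
    reference = [[10, 10], [8, 12]]
    print(disparity_error(computed, reference))  # (0 + 2 + 0 + 4) / 4 = 1.5
```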
38 Experimental Setup
- Target platform: AMD NUMA architecture, 4 nodes, 4 cores; OpenCL 1.2 run-time provided by AMD
- Workload definition: single application, multiple instances; dynamic workload in terms of start time, amount of data to process and frame-rate goal
- Evaluation metrics:
  - Normalized Actual Penalty (performance/quality metric): user satisfaction in terms of application frame rate
  - Normalized Application Error (quality metric): user satisfaction in terms of the quality of the resulting image (1/QoR)
  - Difference w.r.t. off-line profiling (predictability metric)
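The slides name the metrics without giving formulas. The sketch below shows one way two of them could be computed; both formulas and both function names are assumptions made for illustration, not the definitions used in the evaluation.

```python
def normalized_actual_penalty(fps_goal, fps_actual):
    """Assumed definition: fraction of the frame-rate goal that was missed
    (0.0 when the goal is met or exceeded)."""
    return max(0.0, fps_goal - fps_actual) / fps_goal


def profiling_difference(design_time_value, run_time_value):
    """Assumed definition: relative gap between the value profiled off line
    and the value observed at run time."""
    return abs(run_time_value - design_time_value) / design_time_value


if __name__ == "__main__":
    print(normalized_actual_penalty(20.0, 15.0))  # goal missed by a quarter
    print(normalized_actual_penalty(20.0, 25.0))  # goal met, no penalty
    print(profiling_difference(40.0, 30.0))       # run time 25% off the profile
```

Under these definitions a lower value is better for both metrics, and a small profiling difference means the design-time knowledge remains predictive at run time.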
41 Application Auto-Tuning Effects
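44 Comparative Analysis
- Application auto-tuning OFF, run-time resource management OFF: PLAIN-LINUX (no device fission)
- Application auto-tuning OFF, run-time resource management ON: PLAIN-RTRM
- Application auto-tuning ON, run-time resource management OFF: ADAPTIVE-LINUX
- Application auto-tuning ON, run-time resource management ON: ADAPTIVE-RTRM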
45 Run-Time Results. [Plots: missed deadlines, design-time vs. run-time profiling, and QoR (disparity error); axis label: #Multi-View Apps.]
50 Run-Time Results. [Plot; axis label: APPS.]
51 Resource-Aware AS-RTM. [Diagram labels: Requests, Resources, Resource Availability, Platform OS, Target HW Platform.]
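A resource-aware application manager could use the reported resource availability to correct its design-time profile before choosing a configuration. The sketch below assumes throughput scales linearly with the number of available cores, which is a simplification; the function `resource_aware_select` and the sample profile are hypothetical.

```python
def resource_aware_select(points, fps_goal, available_cores, profiled_cores):
    """points: list of (name, profiled_fps, error) measured on profiled_cores.
    Scale the profiled frame rate by the share of cores currently available
    (linear scaling is an assumption), then pick the lowest-error point that
    meets the goal, or the fastest point if none does."""
    scale = min(1.0, available_cores / profiled_cores)
    feasible = [p for p in points if p[1] * scale >= fps_goal]
    if feasible:
        return min(feasible, key=lambda p: p[2])
    return max(points, key=lambda p: p[1])


if __name__ == "__main__":
    points = [("high", 30.0, 0.01), ("medium", 60.0, 0.05), ("low", 120.0, 0.15)]
    for cores in (16, 8, 2):
        print(cores, resource_aware_select(points, 25.0, cores, 16)[0])
```

With all cores available the highest-quality point is kept; as availability drops, the manager gives up quality to keep the frame-rate goal.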
53 Conclusions
- We considered the problem of managing multiple OpenCL applications for server consolidation on multicore platforms.
- We implemented an approach exploiting run-time management frameworks operating both at application level and at OS/resource level.
- Analysis of results: auto-tuning is necessary to modulate performance and QoR; resource awareness is needed for predictability, by means of resource isolation (RTRM) or a simple monitor (RA AS-RTM).
More informationCS Computer Architecture
CS 35101 Computer Architecture Section 600 Dr. Angela Guercio Fall 2010 Structured Computer Organization A computer s native language, machine language, is difficult for human s to use to program the computer
More informationReal-Time Cache Management for Multi-Core Virtualization
Real-Time Cache Management for Multi-Core Virtualization Hyoseung Kim 1,2 Raj Rajkumar 2 1 University of Riverside, California 2 Carnegie Mellon University Benefits of Multi-Core Processors Consolidation
More informationTHREAD LEVEL PARALLELISM
THREAD LEVEL PARALLELISM Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 4 is due on Dec. 11 th This lecture
More informationArquitecturas y Modelos de. Multicore
Arquitecturas y Modelos de rogramacion para Multicore 17 Septiembre 2008 Castellón Eduard Ayguadé Alex Ramírez Opening statements * Some visionaries already predicted multicores 30 years ago And they have
More informationArchitecture-Conscious Database Systems
Architecture-Conscious Database Systems 2009 VLDB Summer School Shanghai Peter Boncz (CWI) Sources Thank You! l l l l Database Architectures for New Hardware VLDB 2004 tutorial, Anastassia Ailamaki Query
More informationMEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS
MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS INSTRUCTOR: Dr. MUHAMMAD SHAABAN PRESENTED BY: MOHIT SATHAWANE AKSHAY YEMBARWAR WHAT IS MULTICORE SYSTEMS? Multi-core processor architecture means placing
More informationHPC in the Multicore Era
HPC in the Multicore Era -Challenges and opportunities - David Barkai, Ph.D. Intel HPC team High Performance Computing 14th Workshop on the Use of High Performance Computing in Meteorology ECMWF, Shinfield
More informationCS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it
Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1
More informationVLSI Design Automation. Maurizio Palesi
VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 Outline Technology trends VLSI Design flow (an overview) 3 IC Products Processors CPU, DSP, Controllers Memory chips
More information