Modeling CPU Energy Consumption for Energy Efficient Scheduling
Abhishek Jaiantilal, Yifei Jiang, Shivakant Mishra
University of Colorado - Boulder
GCM '10: Proceedings of the 1st Workshop on Green Computing, 2010, ACM
Outline
- Introduction
- Energy Model Overview
- Power Consumed and CPU Cycles
- Experimental Results
- Conclusions
Introduction (1/2)
The processor is the system component that consumes the most power.
Introduction (2/2)
Dynamic Voltage and Frequency Scaling (DVFS) is used in CPUs; its operating points are referred to as P-states.
Per Core Power Gating (PCPG), or Dynamic Core Gating (DCG), is a hardware feature that allows the cores in a multicore CPU to shut themselves off. The resulting idle states are called C-states:
- C0 - Active state
- C1 - Inactive state, with the core not running during idle cycles
- C3 - Inactive state with the cache saved
- C6 - All the PLLs turned off
Energy Model Overview (1/3)
Black Box approach
PCPG is hardware controlled, so we use a black-box approach. We obtained the statistics from the /proc/stat file.
A scheduling policy that confines these loops to a few cores might not be the best choice compared with running them on all the cores. Running on all the cores can still give a low power profile, with a shorter execution time.
So we need to know the power consumption of a task.
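As a rough illustration of the kind of statistics /proc/stat exposes, the sketch below computes per-core busy fractions from two snapshots. The slides do not say which fields the authors used, so the choice of fields is an assumption. The field order follows the Linux proc(5) manual page, and the two snapshot strings are made-up sample data, not measurements.

```python
# Sketch: per-core CPU utilization from two /proc/stat snapshots.
# Assumed field order after the 'cpuN' label (per proc(5)):
# user, nice, system, idle, iowait, irq, softirq, steal.
# The snapshot text below is invented for illustration.

def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each 'cpuN' line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-core lines such as 'cpu0'; skip the aggregate 'cpu' line
        # and unrelated lines such as 'intr'.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + values[4]  # idle + iowait
        total = sum(values)
        stats[fields[0]] = (total - idle, total)
    return stats


def utilization(before, after):
    """Fraction of time each core was busy between the two snapshots."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before[cpu]
        elapsed = total1 - total0
        result[cpu] = (busy1 - busy0) / elapsed if elapsed else 0.0
    return result


SNAPSHOT_1 = """cpu  300 0 100 1600 0 0 0 0 0 0
cpu0 200 0 50 750 0 0 0 0 0 0
cpu1 100 0 50 850 0 0 0 0 0 0
intr 12345
"""
SNAPSHOT_2 = """cpu  500 0 150 1750 0 0 0 0 0 0
cpu0 290 0 60 850 0 0 0 0 0 0
cpu1 210 0 90 900 0 0 0 0 0 0
intr 12999
"""

if __name__ == "__main__":
    load = utilization(parse_proc_stat(SNAPSHOT_1), parse_proc_stat(SNAPSHOT_2))
    for cpu, frac in sorted(load.items()):
        print(f"{cpu}: {frac:.0%} busy")
```

On a Linux machine the two strings could be replaced by reading /proc/stat twice, some interval apart.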
Energy Model Overview (2/3)
Even though the processes all run at 100% load, the power consumed differs from task to task, because some tasks are floating-point-cycle intensive while others are integer- or memory-cycle intensive.
Energy Model Overview (3/3)
Modified Black Box approach
If we know how much power a task consumes, we can fit a schedule that allows for a shorter execution time and a lower energy consumption.
We need training data to choose the best task schedule, depending on the tradeoff between power consumption and execution time.
Disadvantages:
- Training data is needed from all the possible tasks first.
- Computers should have the same configuration.
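Power Consumed and CPU Cycles (1/7)
System power consumption:
\[ P_{\text{System}} = f\left( P_{\text{CPU}} + P_{\text{Memory}} + P_{\text{Fans}} + P_{\text{HDD}} + P_{\text{Northbridge}} + P_{\text{Southbridge}} + P_{\text{Graphics}} + P_{\text{Other components}} \right) \]
where f(·) is the efficiency of the power supply.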
Power Consumed and CPU Cycles (2/7)
Simplified system power consumption:
\[ P_{\text{System}} \approx P_{\text{CPU}} + P_{\text{Memory}} + P_{\text{Bias}} \]
where the bias term is the power of the fans, motherboard, north-bridge, south-bridge, graphics, HDD, and other components.
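Power Consumed and CPU Cycles (3/7)
We propose that if we know the CPU cycle profile of a task, we can build a simple linear model that relates the CPU load to the energy consumed.
\[ P_{\text{System}} \propto \text{Cycles}_{\text{FPU}} + \text{Cycles}_{\text{INT}} + \text{Cycles}_{\text{Memory}} + P_{\text{Bias}} \]
\[ P(\text{Task}_i) \propto \text{Cycles}_{\text{FPU}} + \text{Cycles}_{\text{IU}} + \text{Cycles}_{\text{Cache}} \]
\[ P_{\text{System}} = \sum_{i=1}^{N} \text{Power}_{\text{Task}_i} + P_{\text{Bias}} \]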
Power Consumed and CPU Cycles (4/7)
We need to know the counts and the types of CPU cycles executed by a task. Available tools:
- DTrace for Solaris
- OProfile and Intel VTune for Linux
We used VTune in an offline manner: we sampled the application and stored the cycle time over some period (30 minutes to 1 hour).
Power Consumed and CPU Cycles (5/7)
Linear Regression Model
\[ \text{Power}_{\text{Task}_i} = F \cdot (\text{number of FP cycles}) + I \cdot (\text{number of Int cycles}) + M \cdot (\text{number of Memory cycles}) \]
F, I, and M are multipliers giving the watt cost of running a single FP, INT, or Memory cycle. But there is no direct way to find them.
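The linear model can be sketched in a few lines. The multiplier values and the bias below are hypothetical placeholders chosen only to make the arithmetic visible; the slides give no numeric values for F, I, M, or the bias.

```python
# Sketch of the linear power model. All constants are assumed values for
# illustration; they do not come from the slides.
F_COST = 3.0e-9     # watts per FP cycle (hypothetical)
I_COST = 2.0e-9     # watts per integer cycle (hypothetical)
M_COST = 1.0e-9     # watts per memory cycle (hypothetical)
BIAS_WATTS = 40.0   # power of everything outside the model (hypothetical)


def task_power(fp_cycles, int_cycles, mem_cycles):
    """Power of one task: F * FP cycles + I * Int cycles + M * Memory cycles."""
    return F_COST * fp_cycles + I_COST * int_cycles + M_COST * mem_cycles


def system_power(tasks):
    """System power: sum of the per-task powers plus the bias term."""
    return sum(task_power(*cycles) for cycles in tasks) + BIAS_WATTS


if __name__ == "__main__":
    # Each tuple is (FP cycles, Int cycles, Memory cycles) for one task.
    tasks = [(2e9, 1e9, 1e9), (0, 3e9, 2e9)]
    for cycles in tasks:
        print(f"task {cycles}: {task_power(*cycles):.1f} W")
    print(f"system: {system_power(tasks):.1f} W")
```

A float-heavy task and an integer-heavy task with similar total cycle counts get different predicted wattages here, which is the effect the model is meant to capture.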
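Power Consumed and CPU Cycles (6/7)
We use the statistical approach of minimizing the squared error to find these unknown variables:
\[ \min_{F, I, M} \left( Y - \hat{Y} \right)^2 \]
where Y is the measured wattage and \hat{Y} is the predicted wattage:
\[ \hat{Y} = F \cdot (\text{number of FP cycles}) + I \cdot (\text{number of Int cycles}) + M \cdot (\text{number of Memory cycles}) + P_{\text{Bias}} = X\beta \]
\[ F, I, M > 0, \qquad \beta = \begin{bmatrix} F & I & M \end{bmatrix}^{T} \]
Once we know X and Y, then F, I, and M (stored in the β vector) can be obtained as:
\[ \beta = \left( X^{T} X + \lambda I \right)^{-1} X^{T} Y \]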
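The closed-form solution β = (XᵀX + λI)⁻¹XᵀY can be sketched in plain Python as follows. The training rows are synthetic, generated from assumed multipliers F = 3, I = 2, M = 1 (watts per billion cycles), and the bias is assumed to have been subtracted from the measured wattage beforehand. The helper names `solve` and `ridge_fit` are illustrative, not taken from the authors' code.

```python
# Sketch: ridge-regularized least squares, beta = (X^T X + lam*I)^-1 X^T y.
# Data are synthetic; cycle counts are in billions of cycles.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ridge_fit(X, y, lam):
    """Return beta for the rows of X (one sample each) and targets y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(A, b)


# Columns: FP cycles, Int cycles, Memory cycles (synthetic samples).
CYCLES = [[2, 1, 1], [0, 3, 2], [1, 1, 0], [1, 0, 3], [2, 2, 2]]
# Wattage with the bias already removed, generated from F=3, I=2, M=1.
WATTS = [9.0, 8.0, 5.0, 6.0, 12.0]

if __name__ == "__main__":
    F, I, M = ridge_fit(CYCLES, WATTS, lam=1e-6)
    print(f"F={F:.3f} I={I:.3f} M={M:.3f}")
    # The closed form does not enforce F, I, M > 0; check it explicitly.
    print("all positive:", min(F, I, M) > 0)
```

Larger values of λ shrink the coefficients toward zero, trading some fit on the training data for stability when the cycle counts are strongly correlated.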
Power Consumed and CPU Cycles (7/7)
We also used another statistical algorithm, Random Forests, in our experiments. Random Forests is a popular machine learning/statistical approach that uses decision trees. It is a non-linear algorithm, in contrast to the linear regression formulation.
Experimental Results (1/6)
Regression Model Training
We first obtained training data from the following benchmarks:
- memcpy
- While-float
- mprime
Then we obtained separate test data for:
- SPECjvm
- While-Int
- While-Branch
Experimental Results (2/6)
Results of Regression Model [figure not transcribed]
Experimental Results (3/6)
Energy Efficient Scheduler
We propose not waking a core from its idle state until it is needed. Cores that were not allocated any tasks were shut off. A core cannot execute more than a specific number of processor cycles.
We used the average number of cycles executed to predict the energy consumed and then chose the most energy-efficient schedule.
Ideally this would run online, re-evaluating the task schedule every second based on the current load and cycles executed.
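A minimal sketch of such a scheduler, under stated assumptions, is an exhaustive search over task-to-core assignments that rejects any assignment exceeding the per-core cycle limit and keeps the one with the lowest predicted energy. The energy predictor here is a stand-in (a bias plus a fixed cost per awake core, multiplied by the finishing time), not the authors' regression model, and every constant is hypothetical.

```python
# Sketch: pick the task-to-core assignment with the lowest predicted energy.
# All constants are assumed values for illustration only.
from itertools import product

BIAS_WATTS = 40.0         # power drawn regardless of the schedule (assumed)
ACTIVE_CORE_WATTS = 10.0  # extra power per core that is awake (assumed)
CORE_RATE = 1.0           # billions of cycles per second per core (assumed)
CORE_CAPACITY = 6.0       # max billions of cycles one core may execute (assumed)


def evaluate(assignment, task_cycles, n_cores):
    """Return (energy_joules, makespan_s, active_cores), or None if infeasible."""
    load = [0.0] * n_cores
    for task, core in enumerate(assignment):
        load[core] += task_cycles[task]
    if max(load) > CORE_CAPACITY:
        return None  # a core would exceed its cycle limit
    active = sum(1 for cycles in load if cycles > 0)  # idle cores stay off
    makespan = max(load) / CORE_RATE
    power = BIAS_WATTS + ACTIVE_CORE_WATTS * active
    return power * makespan, makespan, active


def best_schedule(task_cycles, n_cores):
    """Exhaustively search assignments; return (assignment, result) or None."""
    best = None
    for assignment in product(range(n_cores), repeat=len(task_cycles)):
        result = evaluate(assignment, task_cycles, n_cores)
        if result is None:
            continue
        if best is None or result[0] < best[1][0]:
            best = (assignment, result)
    return best


if __name__ == "__main__":
    tasks = [4, 2, 2, 1]  # average cycles per task, in billions (made up)
    assignment, (energy, makespan, active) = best_schedule(tasks, n_cores=4)
    print(f"assignment={assignment} cores={active} "
          f"time={makespan:.1f}s energy={energy:.1f}J")
```

With these made-up numbers the search settles on three awake cores: one core is infeasible, two cores take longer, and four cores finish no sooner but draw more power. Exhaustive search grows exponentially with the number of tasks, so this only illustrates the selection criterion.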
Experimental Results (4/6) to (6/6) [figure slides; no text transcribed]
Conclusions
We showed that a linear model and a Random Forests model can be used for predicting energy consumption.
We also proposed a simple scheduler that uses this model to minimize power consumption while still maintaining a similar execution time.
In the future, we propose to develop a better mathematical model for the scheduler. We also propose to use the model in an online fashion, allowing the OS to limit processes that consume more power than a fixed limit.