A Probabilistic Graphical Model-based Approach for Minimizing Energy under Performance Constraints

1 A Probabilistic Graphical Model-based Approach for Minimizing Energy under Performance Constraints. Nikita Mishra, Huazhe Zhang, John Lafferty and Hank Hoffmann, University of Chicago

2 [Figure: fraction of time vs. CPU utilization] Average CPU utilization of more than 5,000 servers during a 6-month period [1]. [1] Barroso, Luiz André, and Urs Hölzle. "The case for energy-proportional computing." IEEE Computer (2007).

3 Example of a configuration space. [Figure labels: 2.26 Hz clock speed; memory controller 1; memory controller 2; cores; memory controller]

5 Adaptive systems: automatically tune configurations for different utilizations to reach the most energy-efficient state. This requires the power and performance profile of the application.

9 Why is it a difficult problem? The configuration space can be quite large, so brute-force search may take a long time. The behavior of each application differs from machine to machine. Application behavior can even vary with the input (e.g., the video streaming application x264).

12 Example: streamcluster. [Figure: contour plot over cores and clock speed; annotation: multiple local solutions] A contour plot of performance rate (in iter/s) for the streamcluster benchmark at different configurations

13 Example: kmeans. [Figure annotation: optimal configuration frontier] Pareto frontier of performance rate (in Iter/s) vs. system power (in Watts) at different configurations
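The frontier on this slide can be illustrated with a short sketch. It assumes each configuration is summarized by a (performance, power) pair and keeps only the pairs that no other configuration beats on both axes. The function name `pareto_frontier` and the sample measurements are hypothetical and are not taken from the talk.

```python
from typing import List, Tuple

def pareto_frontier(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the (performance, power) points not dominated by any other point.

    A point is dominated if another point offers at least as much performance
    for no more power, with at least one of the two strictly better.
    """
    # Sort by power ascending; among equal power, highest performance first.
    ordered = sorted(set(points), key=lambda p: (p[1], -p[0]))
    frontier = []
    best_perf = float("-inf")
    for perf, power in ordered:
        # Keep a point only if it outperforms every point that costs no more power.
        if perf > best_perf:
            frontier.append((perf, power))
            best_perf = perf
    return frontier

# Hypothetical (performance in iter/s, power in W) measurements.
configs = [(10.0, 100.0), (20.0, 120.0), (15.0, 130.0), (30.0, 180.0), (30.0, 200.0)]
frontier = pareto_frontier(configs)
```

Here (15, 130) is dropped because (20, 120) is faster and cheaper, and (30, 200) is dropped because (30, 180) delivers the same rate for less power.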

15 LEO (Learning for Energy Optimization). [Diagram: historical data, target application] Incorporates the performance profiles of previously seen applications.

16 Example: kmeans. [Figure: performance rate (in Iter/s) vs. configuration index] Estimated Pareto-optimal frontiers vs. the true frontier found with exhaustive search

17 Outline: Motivation/Overview; Statistical modelling; Evaluation; Summary

19 Outline: Statistical modelling: Graphical models; Hierarchical Bayesian model; Expectation-maximization algorithm

23 Graphical Models. [Diagram nodes: z_1, z_2, ..., z_{m-1}, z_m; y_1, y_2, ..., y_{m-1}, y_m] y_i: vector of performance rates of the i-th application at the different configurations.

29 Hierarchical Bayesian Model. [Diagram: hidden nodes z_1, z_2, ..., z_{m-1}, z_m; nodes y_1, y_2, ..., y_{m-1}, y_m, labelled "All applications (observed data)" and "Target application (partially observed data)". Annotations: "Couples each of the applications"; "Penalizes large variations in the application".] y_i: vector of performance rates of the i-th application at the different configurations.

31 Hierarchical Bayesian Model. [Diagram: hidden nodes z_1, z_2, ..., z_{m-1}, z_m; nodes y_1, y_2, ..., y_{m-1}, y_m, labelled "All applications (observed data)". Annotation: "True value of target application".] y_i: vector of performance rates of the i-th application at the different configurations.

36 Expectation Maximization Algorithm. [Diagram: initialize the model parameters and the latent variables. E-step: create the expected log-likelihood function from the observed data. M-step: maximize the expected log-likelihood function to obtain θ_new. Repeat.]
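The E-step/M-step loop can be made concrete with a minimal sketch. The slides do not give the distributions, so the model below is an assumption: each application has a latent profile z_i drawn from a shared Gaussian N(mu, Sigma), and observed entries of y_i equal the matching entries of z_i plus Gaussian noise of variance sigma2. The function `em_estimate`, the `ridge` stabilizer, and the synthetic speedup curves are all hypothetical and are not the authors' implementation.

```python
import numpy as np

def em_estimate(Y, mask, n_iter=25, ridge=1e-3, min_noise=1e-6):
    """EM for an assumed Gaussian hierarchy: z_i ~ N(mu, Sigma), y_i = z_i + noise.

    Y    : (M, n) array, one row per application, one column per configuration.
    mask : (M, n) boolean array, True where Y was actually measured.
    Returns the posterior means Z of the latent profiles, shape (M, n).
    """
    M, n = Y.shape
    counts = np.maximum(mask.sum(axis=0), 1)
    mu = np.where(mask, Y, 0.0).sum(axis=0) / counts      # mean of observed values
    Sigma = (np.var(Y[mask]) + ridge) * np.eye(n)
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: Gaussian posterior of each z_i given its observed entries.
        Sigma_inv = np.linalg.inv(Sigma)
        Z = np.zeros((M, n))
        C_sum = np.zeros((n, n))
        resid = 0.0
        for i in range(M):
            obs = mask[i]
            C = np.linalg.inv(Sigma_inv + np.diag(obs / sigma2))   # posterior covariance
            Z[i] = C @ (Sigma_inv @ mu + np.where(obs, Y[i], 0.0) / sigma2)
            resid += np.sum((Y[i, obs] - Z[i, obs]) ** 2) + np.trace(C[np.ix_(obs, obs)])
            C_sum += C
        # M-step: closed-form maximizers of the expected log-likelihood.
        mu = Z.mean(axis=0)
        dev = Z - mu
        Sigma = (C_sum + dev.T @ dev) / M + ridge * np.eye(n)      # ridge is an added assumption
        sigma2 = max(resid / mask.sum(), min_noise)
    return Z

# Hypothetical data: five applications with saturating speedup curves over 8 core counts.
cores = np.arange(1, 9)
Y = np.array([s * cores / (1.0 + 0.1 * cores) for s in (1.0, 1.5, 2.0, 2.5, 1.8)])
mask = np.ones_like(Y, dtype=bool)
mask[-1] = False
mask[-1, [0, 3, 7]] = True        # the target application is sampled at 3 configurations
Y[~mask] = np.nan
Z = em_estimate(Y, mask)
target_estimate = Z[-1]           # estimated profile of the target at all 8 configurations
```

The unobserved entries of the target are filled in through the shared mean and covariance, which is how the fully profiled applications inform the partially profiled one in this sketch.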

37-43 Example: kmeans. [Figure sequence: performance (in Iter/s) vs. cores, showing the observed samples at initialization and the estimate after EM iterations 1 to 4] Different iterations of the EM algorithm for estimating performance rate (in Iter/s) vs. cores

44 LEO (Learning for Energy Optimization). [Diagram: LEO (power): set y_m = observed power, get p = estimated power. LEO (performance): set y_m = observed performance, get r = estimated performance. Controller: selects the configuration. Feedback from the controller returns to LEO.]
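One way to read the controller box is as a search over the estimates p and r for the cheapest configuration that still meets the performance requirement. The sketch below is a deliberate simplification that picks a single configuration and idles for the rest of the deadline; the slides do not describe the controller's actual optimization, and `select_configuration`, `idle_power`, and all numbers are hypothetical.

```python
def select_configuration(rates, powers, work, deadline, idle_power):
    """Pick the single configuration that finishes `work` by `deadline` with least energy.

    rates  : estimated performance r per configuration (work units per second)
    powers : estimated power p per configuration (watts)
    Returns (index, energy in joules), or (None, inf) if nothing meets the deadline.
    """
    best_index, best_energy = None, float("inf")
    for index, (rate, power) in enumerate(zip(rates, powers)):
        if rate <= 0:
            continue
        busy_time = work / rate
        if busy_time > deadline:
            continue  # violates the performance constraint
        # Energy while running plus energy while idling until the deadline.
        energy = power * busy_time + idle_power * (deadline - busy_time)
        if energy < best_energy:
            best_index, best_energy = index, energy
    return best_index, best_energy

# Hypothetical estimates for three configurations (slow, medium, fast).
r = [10.0, 20.0, 40.0]      # iter/s
p = [100.0, 150.0, 300.0]   # W
best = select_configuration(r, p, work=100.0, deadline=8.0, idle_power=80.0)
```

With these numbers the slow configuration misses the deadline, the medium one costs 990 J, and the fastest one (the race-to-idle choice) costs 1190 J, so the medium configuration is selected.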

47 Outline: Motivation/Overview; Statistical modelling; Evaluation (Experimental setup; Power and performance estimation; Energy savings/Phase transition); Summary

49 Experimental Setup: dual-socket Linux system with a SuperMICRO X9DRL-iF motherboard and two Intel Xeon E processors

56 Experimental Setup: configurations (1024 in total). Clock speed: set using the cpufrequtils package; 15 DVFS settings (from 1.2 to 2.9 GHz) plus TurboBoost, giving 16 settings. Memory controller: access is controlled with the numactl library; 2 memory controllers, giving 2 settings. Cores: two 8-core processors with hyper-threading, giving 32 settings. Measurements. Power: a WattsUp meter provides total system power at 1 s intervals. Performance: applications report their heart rate, which is application specific.
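The total of 1024 follows from multiplying the three knob counts: 16 x 2 x 32. A small sketch that enumerates the space is shown below; the index encoding of each knob (for example, clock settings numbered 0 to 15 and core settings numbered 1 to 32) is an assumption made for illustration.

```python
from itertools import product

clock_settings = list(range(16))      # 15 DVFS levels + TurboBoost (placeholder indices)
memory_controllers = [1, 2]           # 2 memory-controller settings
core_settings = list(range(1, 33))    # 32 core settings (assumed numbering 1..32)

# Every combination of the three knobs is one configuration.
configurations = list(product(clock_settings, memory_controllers, core_settings))
n_configurations = len(configurations)
```

Profiling every application at all of these points is the brute-force cost that the learning approach tries to avoid.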

62 Experimental Setup: benchmarks. We use 25 benchmarks drawn from three suites (PARSEC, Minebench, Rodinia) plus some others. Baseline heuristics. Online algorithm: multivariate polynomial regression over the configuration values, fitted on the observed data. Offline algorithm: averages over the rest of the applications to estimate the power and performance of the given application. Race-to-idle: allocates all resources to the application and, once it has finished, lets the system go idle.
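The offline baseline is simple enough to state in a few lines. The sketch assumes the profiles are stored as a matrix with one row per application and one column per configuration; `offline_estimate` and the toy numbers are hypothetical.

```python
import numpy as np

def offline_estimate(profiles, target_index):
    """Estimate the target's profile as the mean profile of all other applications."""
    others = np.delete(profiles, target_index, axis=0)
    return others.mean(axis=0)

# Hypothetical performance profiles: 3 applications x 3 configurations.
profiles = np.array([
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [9.0, 9.0, 9.0],   # target application (its own row is ignored)
])
estimate = offline_estimate(profiles, target_index=2)
```

Because the target's own measurements never enter the estimate, this baseline cannot adapt to an application that behaves unlike the ones seen before.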

64 Power and performance estimation. [Figure panels: performance rate (in Iter/s) vs. configuration index; system power (in Watts) vs. configuration index]

65 Power and performance estimation. [Figure panels: Swish (search web server); x264 (video encoder)]

66-68 Summary: accuracy of performance estimation. [Figure: accuracy of LEO, Online, and Offline for Kmeans, Jacobi, and overall]

69 Summary: accuracy of system-power estimation. [Figure: overall accuracy of LEO, Online, and Offline]

71 Summary: energy savings. Average energy relative to the optimal (over different utilizations and all the benchmarks): LEO +6%; Online +24%; Offline +29%; Race-to-idle +90%.

72 Phase transitions. [Figure] Performance and power for fluidanimate across phases with different computational demands

74 Multiple Applications. Comparison of performance estimation (in Iter/s) and system-power estimation (in Watts) for different algorithms over a set of application mixtures. [Table: rows LEO, Online, Offline; columns Mixture 1, Mixture 2, Overall, given once for performance (in Iter/s) and once for system power (in Watts)]

75 Summary

76 Sensitivity analysis of LEO vs. Online. LEO quickly reaches near-optimality, whereas our baseline method (online regression) cannot run with fewer than 15 samples because the design matrix of the regression model would be rank deficient.
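The rank argument can be checked numerically. The slides do not list the regression features behind the 15-sample threshold, so the sketch below uses a hypothetical degree-2 polynomial expansion of three knobs, which has 10 columns. The point it illustrates is general: with fewer samples than columns, the design matrix cannot have full column rank and least squares has no unique solution.

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_design_matrix(X):
    """Build [1, x_j, x_j * x_k] columns for every knob and knob pair (degree 2)."""
    n_samples, n_knobs = X.shape
    columns = [np.ones(n_samples)]
    columns += [X[:, j] for j in range(n_knobs)]
    columns += [X[:, j] * X[:, k]
                for j, k in combinations_with_replacement(range(n_knobs), 2)]
    return np.column_stack(columns)

rng = np.random.default_rng(0)
X_few = rng.uniform(1.0, 2.0, size=(6, 3))     # 6 sampled configurations, 3 knobs
X_many = rng.uniform(1.0, 2.0, size=(20, 3))   # 20 sampled configurations

A_few = quadratic_design_matrix(X_few)          # 6 x 10: more unknowns than samples
A_many = quadratic_design_matrix(X_many)        # 20 x 10
rank_few = np.linalg.matrix_rank(A_few)         # at most 6, so rank deficient
rank_many = np.linalg.matrix_rank(A_many)
```

A model that borrows structure from previously profiled applications does not face this lower bound on samples in the same way, which is the contrast the slide draws.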

77 Related Work. Offline optimization techniques (e.g., [59, 35, 33, 10, 2]) are limited by their reliance on a robust training phase. Online optimization techniques [44]: for example, Flicker is a configurable architecture and optimization framework that uses only online models to maximize performance under a power limitation. ParallelismDial uses online adaptation to tailor parallelism to the application workload.

COL862: Low Power Computing Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques

COL862: Low Power Computing Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques COL862: Low Power Computing Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques Authors: Huazhe Zhang and Henry Hoffmann, Published: ASPLOS '16 Proceedings

More information

THE UNIVERSITY OF CHICAGO STATISTICAL METHODS FOR PERFORMANCE ESTIMATION FOR IMPROVING SCHEDULING AND ENERGY MINIMIZATION A DISSERTATION SUBMITTED TO

THE UNIVERSITY OF CHICAGO STATISTICAL METHODS FOR PERFORMANCE ESTIMATION FOR IMPROVING SCHEDULING AND ENERGY MINIMIZATION A DISSERTATION SUBMITTED TO THE UNIVERSITY OF CHICAGO STATISTICAL METHODS FOR PERFORMANCE ESTIMATION FOR IMPROVING SCHEDULING AND ENERGY MINIMIZATION A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCE

More information

Evaluating the Effectiveness of Model Based Power Characterization

Evaluating the Effectiveness of Model Based Power Characterization Evaluating the Effectiveness of Model Based Power Characterization John McCullough, Yuvraj Agarwal, Jaideep Chandrashekhar (Intel), Sathya Kuppuswamy, Alex C. Snoeren, Rajesh Gupta Computer Science and

More information

Energy Models for DVFS Processors

Energy Models for DVFS Processors Energy Models for DVFS Processors Thomas Rauber 1 Gudula Rünger 2 Michael Schwind 2 Haibin Xu 2 Simon Melzner 1 1) Universität Bayreuth 2) TU Chemnitz 9th Scheduling for Large Scale Systems Workshop July

More information

Power-Aware Computing with Dynamic Knobs Henry Hoffmann, Stelios Sidiroglou, Michael Carbin, Sasa Misailovic, Anant Agarwal, and Martin Rinard

Power-Aware Computing with Dynamic Knobs Henry Hoffmann, Stelios Sidiroglou, Michael Carbin, Sasa Misailovic, Anant Agarwal, and Martin Rinard Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-00-07 May 4, 00 Power-Aware Computing with Dynamic Knobs Henry Hoffmann, Stelios Sidiroglou, Michael Carbin, Sasa Misailovic,

More information

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask Machine Learning and Data Mining Clustering (1): Basics Kalev Kask Unsupervised learning Supervised learning Predict target value ( y ) given features ( x ) Unsupervised learning Understand patterns of

More information

COL862 Programming Assignment-1

COL862 Programming Assignment-1 Submitted By: Rajesh Kedia (214CSZ8383) COL862 Programming Assignment-1 Objective: Understand the power and energy behavior of various benchmarks on different types of x86 based systems. We explore a laptop,

More information

PARSEC vs. SPLASH-2: A Quantitative Comparison of Two Multithreaded Benchmark Suites

PARSEC vs. SPLASH-2: A Quantitative Comparison of Two Multithreaded Benchmark Suites PARSEC vs. SPLASH-2: A Quantitative Comparison of Two Multithreaded Benchmark Suites Christian Bienia (Princeton University), Sanjeev Kumar (Intel), Kai Li (Princeton University) Outline Overview What

More information

Clustering Lecture 5: Mixture Model

Clustering Lecture 5: Mixture Model Clustering Lecture 5: Mixture Model Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics

More information

Power-Aware Scheduling of Virtual Machines in DVFS-enabled Clusters

Power-Aware Scheduling of Virtual Machines in DVFS-enabled Clusters Power-Aware Scheduling of Virtual Machines in DVFS-enabled Clusters Gregor von Laszewski, Lizhe Wang, Andrew J. Younge, Xi He Service Oriented Cyberinfrastructure Lab Rochester Institute of Technology,

More information

Derivative Delay Embedding: Online Modeling of Streaming Time Series

Derivative Delay Embedding: Online Modeling of Streaming Time Series Derivative Delay Embedding: Online Modeling of Streaming Time Series Zhifei Zhang (PhD student), Yang Song, Wei Wang, and Hairong Qi Department of Electrical Engineering & Computer Science Outline 1. Challenges

More information

Accurate Characterization of the Variability in Power Consumption in Modern Mobile Processors

Accurate Characterization of the Variability in Power Consumption in Modern Mobile Processors Accurate Characterization of the Variability in Power Consumption in Modern Mobile Processors Bharathan Balaji John McCullough, Rajesh Gupta, Yuvraj Agarwal Computer Science and Engineering, UC San Diego

More information

Cross-layer Optimization for Virtual Machine Resource Management

Cross-layer Optimization for Virtual Machine Resource Management Cross-layer Optimization for Virtual Machine Resource Management Ming Zhao, Arizona State University Lixi Wang, Amazon Yun Lv, Beihang Universituy Jing Xu, Google http://visa.lab.asu.edu Virtualized Infrastructures,

More information

Towards Energy Proportionality for Large-Scale Latency-Critical Workloads

Towards Energy Proportionality for Large-Scale Latency-Critical Workloads Towards Energy Proportionality for Large-Scale Latency-Critical Workloads David Lo *, Liqun Cheng *, Rama Govindaraju *, Luiz André Barroso *, Christos Kozyrakis Stanford University * Google Inc. 2012

More information

Modeling CPU Energy Consumption for Energy Efficient Scheduling

Modeling CPU Energy Consumption for Energy Efficient Scheduling Modeling CPU Energy Consumption for Energy Efficient Scheduling Abhishek Jaiantilal, Yifei Jiang, Shivakant Mishra University of Colorado - Boulder GCM '10 Proceedings of the 1st Workshop on Green Computing

More information

Thread Tailor Dynamically Weaving Threads Together for Efficient, Adaptive Parallel Applications

Thread Tailor Dynamically Weaving Threads Together for Efficient, Adaptive Parallel Applications Thread Tailor Dynamically Weaving Threads Together for Efficient, Adaptive Parallel Applications Janghaeng Lee, Haicheng Wu, Madhumitha Ravichandran, Nathan Clark Motivation Hardware Trends Put more cores

More information

Outline 1 Motivation 2 Theory of a non-blocking benchmark 3 The benchmark and results 4 Future work

Outline 1 Motivation 2 Theory of a non-blocking benchmark 3 The benchmark and results 4 Future work Using Non-blocking Operations in HPC to Reduce Execution Times David Buettner, Julian Kunkel, Thomas Ludwig Euro PVM/MPI September 8th, 2009 Outline 1 Motivation 2 Theory of a non-blocking benchmark 3

More information

IMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM

IMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM IMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM I5 AND I7 PROCESSORS Juan M. Cebrián 1 Lasse Natvig 1 Jan Christian Meyer 2 1 Depart. of Computer and Information

More information

Energy-centric DVFS Controlling Method for Multi-core Platforms

Energy-centric DVFS Controlling Method for Multi-core Platforms Energy-centric DVFS Controlling Method for Multi-core Platforms Shin-gyu Kim, Chanho Choi, Hyeonsang Eom, Heon Y. Yeom Seoul National University, Korea MuCoCoS 2012 Salt Lake City, Utah Abstract Goal To

More information

Dynamic Knobs for Responsive Power-Aware Computing

Dynamic Knobs for Responsive Power-Aware Computing Dynamic Knobs for Responsive Power-Aware Computing Henry Hoffmann Stelios Sidiroglou Michael Carbin Sasa Misailovic Anant Agarwal Martin Rinard Computer Science and Artificial Intelligence Laboratory Massachusetts

More information

Topics. CIT 470: Advanced Network and System Administration. Google DC in The Dalles. Google DC in The Dalles. Data Centers

Topics. CIT 470: Advanced Network and System Administration. Google DC in The Dalles. Google DC in The Dalles. Data Centers CIT 470: Advanced Network and System Administration Data Centers Topics Data Center: A facility for housing a large amount of computer or communications equipment. 1. Racks 2. Power 3. PUE 4. Cooling 5.

More information

Data Partitioning on Heterogeneous Multicore and Multi-GPU systems Using Functional Performance Models of Data-Parallel Applictions

Data Partitioning on Heterogeneous Multicore and Multi-GPU systems Using Functional Performance Models of Data-Parallel Applictions Data Partitioning on Heterogeneous Multicore and Multi-GPU systems Using Functional Performance Models of Data-Parallel Applictions Ziming Zhong Vladimir Rychkov Alexey Lastovetsky Heterogeneous Computing

More information

Managing Performance vs. Accuracy Trade-offs With Loop Perforation

Managing Performance vs. Accuracy Trade-offs With Loop Perforation Managing Performance vs. Accuracy Trade-offs With Loop Perforation Stelios Sidiroglou Sasa Misailovic Henry Hoffmann Martin Rinard Computer Science and Artificial Intelligence Laboratory Massachusetts

More information

ECE 571 Advanced Microprocessor-Based Design Lecture 7

ECE 571 Advanced Microprocessor-Based Design Lecture 7 ECE 571 Advanced Microprocessor-Based Design Lecture 7 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 9 February 2017 Announcements HW#4 will be posted, some readings 1 Measuring

More information

Expectation Maximization (EM) and Gaussian Mixture Models

Expectation Maximization (EM) and Gaussian Mixture Models Expectation Maximization (EM) and Gaussian Mixture Models Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 2 3 4 5 6 7 8 Unsupervised Learning Motivation

More information

Performance, Power, Die Yield. CS301 Prof Szajda

Performance, Power, Die Yield. CS301 Prof Szajda Performance, Power, Die Yield CS301 Prof Szajda Administrative HW #1 assigned w Due Wednesday, 9/3 at 5:00 pm Performance Metrics (How do we compare two machines?) What to Measure? Which airplane has the

More information

Statistical Performance Comparisons of Computers

Statistical Performance Comparisons of Computers Tianshi Chen 1, Yunji Chen 1, Qi Guo 1, Olivier Temam 2, Yue Wu 1, Weiwu Hu 1 1 State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences, Beijing,

More information

Optimization of Behavioral IPs in Multi-Processor System-on- Chips

Optimization of Behavioral IPs in Multi-Processor System-on- Chips Optimization of Behavioral IPs in Multi-Processor System-on- Chips Yidi Liu and Benjamin Carrion Schafer # Department of Electronic and Information Engineering b.carrionschafer@polyu.edu.hk # Outline High-Level

More information

Parallel Programming Multicore systems

Parallel Programming Multicore systems FYS3240 PC-based instrumentation and microcontrollers Parallel Programming Multicore systems Spring 2011 Lecture #9 Bekkeng, 4.4.2011 Introduction Until recently, innovations in processor technology have

More information

Note Set 4: Finite Mixture Models and the EM Algorithm

Note Set 4: Finite Mixture Models and the EM Algorithm Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for

More information

A Simple Model for Estimating Power Consumption of a Multicore Server System

A Simple Model for Estimating Power Consumption of a Multicore Server System , pp.153-160 http://dx.doi.org/10.14257/ijmue.2014.9.2.15 A Simple Model for Estimating Power Consumption of a Multicore Server System Minjoong Kim, Yoondeok Ju, Jinseok Chae and Moonju Park School of

More information

COL862 - Low Power Computing

COL862 - Low Power Computing COL862 - Low Power Computing Power Measurements using performance counters and studying the low power computing techniques in IoT development board (PSoC 4 BLE Pioneer Kit) and Arduino Mega 2560 Submitted

More information

Energy Proportional Datacenter Memory. Brian Neel EE6633 Fall 2012

Energy Proportional Datacenter Memory. Brian Neel EE6633 Fall 2012 Energy Proportional Datacenter Memory Brian Neel EE6633 Fall 2012 Outline Background Motivation Related work DRAM properties Designs References Background The Datacenter as a Computer Luiz André Barroso

More information

ibench: Quantifying Interference in Datacenter Applications

ibench: Quantifying Interference in Datacenter Applications ibench: Quantifying Interference in Datacenter Applications Christina Delimitrou and Christos Kozyrakis Stanford University IISWC September 23 th 2013 Executive Summary Problem: Increasing utilization

More information

Software within building physics and ground heat storage. HEAT3 version 7. A PC-program for heat transfer in three dimensions Update manual

Software within building physics and ground heat storage. HEAT3 version 7. A PC-program for heat transfer in three dimensions Update manual Software within building physics and ground heat storage HEAT3 version 7 A PC-program for heat transfer in three dimensions Update manual June 15, 2015 BLOCON www.buildingphysics.com Contents 1. WHAT S

More information

Myths in PMC-based Power Estimation. Jason Mair, Zhiyi Huang, David Eyers, and Haibo Zhang

Myths in PMC-based Power Estimation. Jason Mair, Zhiyi Huang, David Eyers, and Haibo Zhang Myths in PMC-based Power Estimation Jason Mair, Zhiyi Huang, David Eyers, and Haibo Zhang Outline PMC-based power modeling Experimental setup and configuration Myth 1: Sample rate Myth 2: Thermal effects

More information

Introduction to Trajectory Clustering. By YONGLI ZHANG

Introduction to Trajectory Clustering. By YONGLI ZHANG Introduction to Trajectory Clustering By YONGLI ZHANG Outline 1. Problem Definition 2. Clustering Methods for Trajectory data 3. Model-based Trajectory Clustering 4. Applications 5. Conclusions 1 Problem

More information

A Cross-Input Adaptive Framework for GPU Program Optimizations

A Cross-Input Adaptive Framework for GPU Program Optimizations A Cross-Input Adaptive Framework for GPU Program Optimizations Yixun Liu, Eddy Z. Zhang, Xipeng Shen Computer Science Department The College of William & Mary Outline GPU overview G-Adapt Framework Evaluation

More information

Adaptive QoS Control Beyond Embedded Systems

Adaptive QoS Control Beyond Embedded Systems Adaptive QoS Control Beyond Embedded Systems Chenyang Lu! CSE 520S! Outline! Control-theoretic Framework! Service delay control on Web servers! On-line data migration in storage servers! ControlWare: adaptive

More information

Managing Web server performance with AutoTune agents

Managing Web server performance with AutoTune agents Managing Web server performance with AutoTune agents by Y. Diao, J. L. Hellerstein, S. Parekh, J. P. Bigus Pipat Waitayaworanart Woohyung Han Outline Introduction Apache web server and performance tuning

More information

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE)

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) NATALIA GIMELSHEIN ANSHUL GUPTA STEVE RENNICH SEID KORIC NVIDIA IBM NVIDIA NCSA WATSON SPARSE MATRIX PACKAGE (WSMP) Cholesky, LDL T, LU factorization

More information

DyPO: Dynamic Pareto-Optimal Configuration Selection for Heterogeneous MpSoCs

DyPO: Dynamic Pareto-Optimal Configuration Selection for Heterogeneous MpSoCs 1 DyPO: Dynamic Pareto-Optimal Configuration Selection for Heterogeneous MpSoCs UJJWAL GUPTA, Arizona State University CHETAN ARVIND PATIL, Arizona State University GANAPATI BHAT, Arizona State University

More information

Simultaneous Multithreading on Pentium 4

Simultaneous Multithreading on Pentium 4 Hyper-Threading: Simultaneous Multithreading on Pentium 4 Presented by: Thomas Repantis trep@cs.ucr.edu CS203B-Advanced Computer Architecture, Spring 2004 p.1/32 Overview Multiple threads executing on

More information

Response Time and Throughput
