
1 ARTIST Summer School in Europe 2010, Autrans (near Grenoble), France, September 5-10, 2010
Thermal-Aware Design of 2D/3D Multi-Processor System-on-Chip Architectures
Invited Speaker: David Atienza, Professor and Director of Embedded Systems Laboratory (ESL), EPFL

2 Evolution of Electronics to Multi-Processor System-on-Chip (MPSoC)
Roadmap continues: CMOS 90 nm, CMOS 65 nm, CMOS 45 nm
Multi-Processor System-on-Chip (MPSoC) architectures
[Figure: MPSoC block diagram with processing elements (PE) and their local memory hierarchy, CPU, SRAM blocks, I/O, peripherals and SDRAM main memory]

3 Evolution of Electronics to Multi-Processor System-on-Chip (MPSoC)
Roadmap continues: CMOS 90 nm, CMOS 65 nm, CMOS 45 nm
Multi-Processor System-on-Chip (MPSoC) architectures
Examples: Cell Multi-Processor (PS3); 80-tile 1.28 TFLOPS MPSoC, Intel [ISSCC 07]
[Figure: MPSoC block diagram with processing elements (PE), SRAM blocks, I/O, peripherals and SDRAM main memory]

4 MPSoCs are Spreading Fast
[Figure: number of cores per chip, axis up to 512 [Amarasinghe06]. Chips shown: Pentium, P2, P3, P4, Itanium, Itanium 2, Athlon, Power4, Power6, PA-8800, Opteron, Opteron 4P, Xeon MP, Yonah, PExtreme, Tanglewood, Xbox360, Cell, Niagara, Raza XLR, Broadcom 1480, Cavium Octeon, Raw, Cisco CSR-1, Intel Tflops, Picochip PC102, Ambric AM2045]

5 Design Issues in MPSoCs
MPSoCs have very complex architectures
  Advanced components and CAD tools are very expensive
  Timing-closure issues, system speed decreased
Aggravated thermal issues
  Hot-spots, non-uniform thermal gradients
High chances of thermal wear-outs and very short lifetimes!
[Figures: Sun 1.8 GHz Sparc v9 microprocessor [Santarini, EDN, March 05]; Sun Niagara Broadband Processor [Coskun et al 07, Sun]]

6 Thermal Issues Become More Critical for 3D-MPSoCs
More power and more non-uniform heat spreading!
[Figure: 3D integration of a 2D MPSoC (PEs, SRAM, I/O, peripherals, SDRAM main memory) into stacked layers: I/O and peripherals, SDRAM, and PE layers with SRAM]

7 Reliability Degradation Factors in MPSoCs

8 Reliability Degradation Factors in MPSoCs: Thermal Hot Spots

10 Reliability Degradation Factors in MPSoCs
Fatigue failures increase with: magnitude of variation, frequency of cycles
Caused by: power on/off, power management (turning off cores)
[Figure: temperature T (ºC) versus time (sec), axis ticks at 80 and 90]

12 Reliability Degradation Factors in MPSoCs: Spatial Gradients

13 Advocating Thermal-Aware 2D/3D MPSoC Design
Integration of HW/SW modeling and management:
  Heat flow models
  Fast thermal exploration
  HW thermal monitoring
  HW tuning knobs
  HW-based thermal management policies
  SW-based thermal management policies

14 Outline
Part 1: Thermal Modeling and Management for 2D MPSoCs
Part 2: Thermal Modeling and Management for 3D MPSoCs with Active Cooling
Acknowledgements: Prof. Ayse K. Coskun (Boston University and Sun Microsystems), Dr. Srinivasan Murali (iNoCs and EPFL), Prof. Jose L. Ayala (UCM), Thomas Brunschwiler and Dr. Bruno Michel (IBM Zürich), Prof. Stephen Boyd (Stanford University)

15 Thermal Modeling, Analysis and Management of 2D Multi-Processor System-on-Chip
Prof. David Atienza Alonso, Embedded Systems Laboratory (ESL), Institute of EE, Faculty of Engineering
ARTIST Summer School 2010, Autrans (France)

16 Outline
MPSoC thermal modeling and analysis
HW-based thermal management for MPSoCs
SW-based thermal management for MPSoCs
Conclusions

18 MPSoC Thermal Modeling Problem
Continuous heat flow analysis
  Capture geometrical characteristics of MPSoCs
  Explore different packaging features and heat sink characteristics
Time-variant heat sources
  Transistor switching depends on MPSoC run-time activity (software)
  Dynamic interaction with heat flow analysis
Very complex computational problem!

19 MPSoC Thermal Modeling State-of-the-Art
MPSoC modeling and exploration:
  1. SW simulation: transactions, cycle-accurate (~100 kHz) [Synopsys Realview, Mentor Primecell, Madsen et al., Angiolini et al.]
     At the desired cycle-accurate level, they are too slow for thermal analysis of real-life applications!
  2. HW prototyping: core dependent (~ MHz) [Cadence Palladium II, ARM Integrator IP, Heron Engineering]
     Very expensive and late in design flow, no thermal modeling, only used for functional validation of MPSoC architectures!
Heat flow modeling:
  1. Software thermal/power models [Skadron et al., Kang et al.]
     Too computationally intensive and not able to interact at run-time with inputs from MPSoC components!

20 MPSoC Thermal Modeling State-of-the-Art
Same overview as the previous slide, with the conclusion: combination of cycle-accurate MPSoC behavior and IC heat flow modeling at run-time is unheard of

21 Orthogonalizing MPSoC Thermal Modeling and Analysis
Framework: MPSoC behavioral model on reconfigurable HW interacting with efficient thermal estimation
[Figure: MPSoC behavior emulation on FPGA (CPUs, SRAMs and I/O, each monitored by sniffers) sends energy to the SW thermal estimation tool on the host PC, which returns temperature (T)]

22 Chip and Package Heat Flow Modeling
Model interface
  Input: power model of MPSoC components, geometrical properties
  Output: temperature of MPSoC components at run-time
Thermal circuit: 1st order RC circuit
  Heat flow ~ electrical current; temperature ~ voltage
  Heat spreader and IC composed of elementary blocks (Cu and Si cells)
  Si thermal conductivity depends on temperature
$C\,\dot{t}_k = -G(t_k)\,t_k + p_k;\quad k = 1..m$
  $C$: thermal capacitance matrix; $G$: thermal conductance matrix; $t_k$: temperature vector at instant $k$; $\dot{t}_k$: temperature change; $p_k$: power consumption vector
[Figure: Si thermal conductivity (W/mºK) versus temperature (in Celsius), actual value (IMEC & Freescale, 90nm)]

23 SW Thermal Estimation Tool for MPSoCs
$C\,\dot{t}_k = -G(t_k)\,t_k + p_k;\quad k = 1..m$
Creating linear approximation while retaining variable Si thermal conductivity:
  Si thermal conductivity linearly approximated: $G_{i,j}(t_k) = I + q\,t_k$
Numerically integrating in discrete time domain: $t_{k+1} = A(t_k)\,t_k + B\,p_k;\quad k = 1..m$
  $A(t_k) = I - dt\,C^{-1}\,G(t_k)$; $B = dt\,C^{-1}$
  Time step $dt$ chosen small enough for convergence
Complexity scales linearly with the number of modeled cells (simulated on P4 @ 3 GHz)
Thermal library validated against 3D finite element model (IMEC & Freescale)
[Figure: Si thermal conductivity (W/mºK) versus temperature (in Celsius): actual value and linear fit]
[Figure: thermal estimation time (s) for 60 sec of MPSoC heat flow analysis versus number of cells: non-linear thermal estimation and proposed linear thermal estimation]
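The discrete-time update on this slide can be illustrated with a small sketch. It is not the thermal library described in the talk: it assumes only two cells, a diagonal capacitance matrix, temperatures expressed as a rise above ambient, and made-up numeric values. The function names and the parameters `g0`, `q` and `g_amb` are hypothetical.

```python
# Minimal sketch of the explicit update t[k+1] = A(t[k]) t[k] + B p[k],
# written as t + dt * C^-1 * (-G(t) t + p). All numbers are illustrative.

def conductance_matrix(temps, g0=1.0, q=-0.002, g_amb=0.5):
    """G(t) for two coupled cells; the coupling varies linearly with temperature."""
    g12 = g0 + q * 0.5 * (temps[0] + temps[1])  # linearised cell-to-cell conductance
    return [[g12 + g_amb, -g12],
            [-g12, g12 + g_amb]]

def euler_step(temps, power, cap, dt):
    """One forward-Euler step with a diagonal capacitance matrix `cap`."""
    g = conductance_matrix(temps)
    new_temps = []
    for i in range(len(temps)):
        heat_out = sum(g[i][j] * temps[j] for j in range(len(temps)))
        new_temps.append(temps[i] + dt / cap[i] * (-heat_out + power[i]))
    return new_temps

def simulate(power, cap, dt, steps):
    """Integrate from zero temperature rise with constant power."""
    temps = [0.0] * len(power)
    for _ in range(steps):
        temps = euler_step(temps, power, cap, dt)
    return temps

if __name__ == "__main__":
    # Cell 0 dissipates 1 W, cell 1 is idle.
    print(simulate(power=[1.0, 0.0], cap=[1.0, 1.0], dt=0.01, steps=5000))
```

The time step must stay small relative to the fastest thermal time constant, which is the convergence condition the slide refers to; with the values above, `dt = 0.01` is well inside that range.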

24 Case Study: 4-Core MPSoC
HW: MPSoC Philips board design: 4 processors, DVFS: 100/500 MHz, plastic packaging
Software: image watermarking, video rendering
Power values for 90nm:
  Element   | Max power (mW) at 100 MHz | Max power (mW) at 500 MHz
  Processor | 2,92 x [...]              | [...],02 x 10^3
  D-Cache   | 1,42 x [...]              | [...],10 x 10^2
  I-Cache   | 1,42 x [...]              | [...],10 x 10^2
  Priv Mem  | 0,61 x [...]              | [...],75 x 10^2
  AMBA      | 0,31 x [...]              | [...],68 x [...]

25 Results: Thermal Validation 4-core Philips MPSoC
MPARM: cycle-accurate SW architectural simulator
  Complete power/thermal models tuned to Philips/IMEC figures
  Simulations too slow: 2 days for 0.18 real sec (12 cells); many weeks of simulation?!
HW thermal emulation able to validate policies at run-time
  Dynamic Voltage and Frequency Scaling (DVFS) based on thresholds
  Emulation time 45 sec (128 cells)!
Very fast validation of MPSoC run-time thermal behavior and management
[Figure: average temperature (Kelvin) of emulated 4-core MPSoC versus time (seconds), with package limit (~85ºC); curves: simulation in MPARM and emulation at 500 MHz, at 100 MHz, and with DVFS ON (500/100 MHz)]

26 Outline
MPSoC thermal modeling and analysis
HW-based thermal management for MPSoCs
SW-based thermal management for MPSoCs
Conclusions

27 Temperature Management is Power Control under Thermal Constraints
Power consumption of cores determines thermal behavior
  Power consumption depends on frequency and voltage
  Setting frequencies/voltages can control power and temperature
Optimization problem: frequency/voltage assignment in MPSoCs under thermal constraints
  Meet processing requirements
  Respect thermal constraint at all times
  Minimize power consumption

28 HW-Based Thermal Management State-of-the-Art
Static approach: thermal-aware placement to try to even out worst-case thermal profile [Sapatnekar, Wong et al.]
  Computationally difficult problem (NP-complete)
  Cannot predict all working conditions, and leakage changes dynamically, so it is not useful in real systems
  No formalization of the thermal optimization problem
Dynamic approach: HW-based dynamic thermal management
  Clock gating based on time-out [Xie et al., Brooks et al.]
  DVFS based on thresholds [Chaparro et al., Mukherjee et al.]
  Heuristics for component shut down, limited history [Donald et al.]
  These techniques aim to minimize power; they only achieve thermal management as a by-product

29 Formalization of Thermal Management Problem in MPSoCs
Control theory problem
  Observable: geometrical properties and behavior
    Heat flow model and thermal profile estimation
    Performance counters of the cores
  Controllable: max. throughput under thermal constraints
    Tuning knobs: frequencies/voltages of the system (DVFS)
Optimal frequency assignment module, 2-phase approach:
  1) Design-time phase: find optimal sets of frequencies for different working conditions
  2) Run-time phase: apply one of the predefined sets found in phase 1 for the required system performance
[Figure: control loop. Observed system: MPSoC with run-time HW DVFS support, processor cores, performance counters (average frequency) and thermal sensors. Observer and control system: thermal profile estimation (thermal state) and optimal frequency assignment module. Requirements: max. throughput. Constraints: max. temperature. Control output: core frequencies]

30 Pro-Active HW-Based Thermal Control: Phase 1 Design-Time
Predictive model of thermal behavior given a set of frequency assignments
Phase inputs: allowed core power values and frequencies; chip floorplan; packaging, heat spreader information
Optimization problem (non-linear offline problem): minimize sum of power consumption of cores
Constraints:
  Performance constraint: on average, freq. is f_avg
  Thermal equation: meet temp. constraints at all time points (Si conductivity depends on temp.)
  Power equation: quadratic dependence on frequency
  Frequency in predefined range
Method outputs: table of core frequency assignments

31 Making Power and Thermal Constraints Convex
Power constraint adaptation
  Change non-affine (quadratic equality): $p_{max}\,(f_{i,k})^2 / (f_{max})^2 = p_{i,k};\; i = 1,..,n,\; \forall k$
  To convex inequality: $p_{max}\,(f_{i,k})^2 / (f_{max})^2 \le p_{i,k};\; i = 1,..,n,\; \forall k$
Thermal constraint adaptation
  Use worst case thermal conductivity in the range of allowed temperatures, and iterate (if needed) to optimum
Solve convex problem and get table of optimal frequencies for different working conditions in polynomial time (number of processors)
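A short worked argument for why relaxing the quadratic equality to an inequality does not change the optimum. This is a sketch, and it rests on one assumption not stated on the slide: temperatures are non-decreasing in every core power $p_{i,k}$. The objective is the sum of core powers from the Phase 1 slide.

```latex
% Relaxed power constraint for core i at time step k
\[
  g(f_{i,k}, p_{i,k}) \;=\; p_{max}\,\frac{(f_{i,k})^2}{(f_{max})^2} - p_{i,k} \;\le\; 0 .
\]
% g is a convex quadratic in f_{i,k} minus a linear term in p_{i,k},
% so g is convex and the set {g <= 0} is convex.
%
% Suppose an optimal point had g < 0 for some (i,k). Lowering p_{i,k} until g = 0
% leaves the frequencies (and so the performance constraint) unchanged,
% decreases the objective \sum_{i,k} p_{i,k}, and under the monotonicity assumption
% does not raise any temperature. The point was therefore not optimal, and at the optimum
\[
  p_{i,k} \;=\; p_{max}\,\frac{(f_{i,k})^2}{(f_{max})^2}, \qquad i = 1,\dots,n,\ \forall k .
\]
```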

32 Pro-Active HW-Based Thermal Control: Phase 2 Run-Time, Putting It All Together
Use table of frequency assignments and index by actual conditions at regular run-time intervals
Method inputs: targeted operating frequency of cores; current temperature of cores
Run-time optimal DVFS assignment HW module:
  1) Index table output of phase 1 with current working conditions
  2) Compare to current assignment to cores and generate required signaling to modify DVFS values
Phase output: run-time DVFS changes for processors
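The two run-time steps above amount to a table lookup followed by a comparison. The sketch below shows one possible shape of that logic in software; the table contents, the temperature bin edges and the function names are all hypothetical, and the slide describes a HW module rather than Python code.

```python
import bisect

# Hypothetical phase-1 output: per-core frequencies (MHz), indexed by
# (temperature bin, target average frequency). All values are made up.
TEMP_BIN_EDGES_C = [60.0, 75.0]  # two edges give three temperature bins
FREQ_TABLE = {
    (0, 800): [800, 800, 800, 800],
    (1, 800): [900, 700, 700, 900],
    (2, 800): [600, 600, 600, 600],
}

def lookup(max_temp_c, target_mhz):
    """Step 1: index the design-time table with the current working conditions."""
    temp_bin = bisect.bisect_right(TEMP_BIN_EDGES_C, max_temp_c)
    return FREQ_TABLE[(temp_bin, target_mhz)]

def dvfs_changes(current_mhz, max_temp_c, target_mhz):
    """Step 2: return {core index: new frequency} for cores that must change."""
    wanted = lookup(max_temp_c, target_mhz)
    return {core: new for core, (old, new) in enumerate(zip(current_mhz, wanted))
            if old != new}

if __name__ == "__main__":
    print(dvfs_changes([800, 800, 800, 800], max_temp_c=70.0, target_mhz=800))
```

Because the expensive optimization is done at design time, the run-time cost is one lookup and one comparison per interval.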

33 Case Study: 8-Core Sun MPSoC
MPSoC: Sun Niagara architecture
  8 processing cores SPARC T1
  Max. frequency of each core: 1 GHz
  10 DVFS values, applied every 100 ms
  Max. power per core: 4 W
Execution characteristics of workloads [Sun Microsystems]:
  Mixes of 10 different benchmarks, from web-accessing to multimedia
  60,000 iterations of basic benchmarks, tens of seconds of actual system execution
[Figure: Sun's Niagara MPSoC]

34 Results: Thermal Constraints Respected
Total run-time of benchmarks:
  DVFS: 180 sec
  2-phase convex method: 106 sec (45% less exec. time)
Proposed method achieves better throughput than standard DVFS while satisfying thermal constraints

35 Outline
MPSoC thermal modeling and analysis
HW-based thermal management for MPSoCs
SW-based thermal management for MPSoCs
Conclusions

36 MPSoC System-Level Architecture: HW and SW Layers
SW layers introduced to better exploit the HW of MPSoCs
  Applications divided in tasks: blocks of operations to be executed
  Multi-processor Operating System (MPOS) distributes the tasks
  Load balancing: equal distribution of work between processors
Can we control the MPSoC thermal profile by controlling software execution?
[Figure: MPOS task manager assigning and migrating tasks A, B and C between PROC 1 and PROC 2, with per-core load and frequency bars]

37 Task Migration for Load vs. Thermal Balancing
Plain load balancing
  No improvement in workload distribution possible: no migration
  Hot-spot!
[Figure: load/frequency bars for PROC 1 and PROC 2 running tasks A (FSE load 40%), B (FSE load 50%) and C (FSE load 40%); temperature versus time for processor 1, processor 2 and mean temperature]

38 Task Migration for Load vs. Thermal Balancing
Heat&Run: load balancing with local knowledge of temperature in MPSoC components
[Figure: task migration from PROC 1 (source) to PROC 2 (target) for tasks A (FSE load 40%), B (FSE load 50%) and C (FSE load 40%), with load/frequency bars and temperature versus time]

39 Task Migration for Load vs. Thermal Balancing
Heat&Run: load balancing with local knowledge of temperature in MPSoC components
  Helping with hot-spots, but no thermal balancing
Existing approaches do not consider global thermal dynamics for task migration
[Figure: task migration from PROC 1 (source) to PROC 2 (target) for tasks A (FSE load 40%), B (FSE load 50%) and C (FSE load 40%), with load/frequency bars and temperature versus time]

40 Task Migration for Load vs. Thermal Balancing
Contribution: migration strategy for thermal balancing
  Global knowledge of temperature at MPOS level
  Adjusted to particular thermal dynamics of each platform
Formalization
  Dynamic number of tasks, no control theory formalization possible
  Knapsack problem, move N largest tasks between cores: estimated increase in temperature and minimizing performance penalty
Reduces hot-spots and reaches thermal balancing
[Figure: temperature versus time, kept between the lower threshold and the upper threshold]
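To make the threshold-based migration idea concrete, here is a greedy stand-in. It is not the knapsack formulation named on the slide: it moves at most one task, and its trigger rule (hottest core above the upper threshold and coolest core below the lower threshold) is an assumption. The function name, data layout and threshold values are hypothetical.

```python
def pick_migration(temps, tasks, upper, lower):
    """Choose one task to move from the hottest to the coolest core.

    temps: {core: temperature in degrees C}
    tasks: {core: {task name: load in percent}}
    Returns (task, source core, target core), or None if nothing should move.
    """
    hot = max(temps, key=temps.get)
    cool = min(temps, key=temps.get)
    # Assumed trigger: act only when both cores are outside the threshold band.
    if temps[hot] <= upper or temps[cool] >= lower:
        return None
    free_capacity = 100 - sum(tasks[cool].values())
    fitting = {name: load for name, load in tasks[hot].items()
               if load <= free_capacity}
    if not fitting:
        return None
    # Largest task that still fits on the target moves the most heat per migration.
    task = max(fitting, key=fitting.get)
    return task, hot, cool

if __name__ == "__main__":
    temps = {"proc1": 72.0, "proc2": 60.0}
    tasks = {"proc1": {"A": 40, "B": 50}, "proc2": {"C": 40}}
    print(pick_migration(temps, tasks, upper=70.0, lower=65.0))
```

A full policy would also estimate the temperature increase on the target core and the performance penalty of each migration, as the slide indicates.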

41 Case Study: Freescale MPSoC
Board hardware
  3 RISC processor cores
  16KB caches, 32KB shared mem.
  AMBA bus, 2GB ext. mem
Software
  uClinux-based MPOS
  Multimedia applications: audio and video
Two packaging options
  Mobile embedded SoCs (slow temperature variations)
  High performance SoCs (fast temperature variations)

42 Results and Comparisons
Good thermal balancing
  Average: 40.5ºC, variations of < 3ºC (+/-3º)
  Small performance overhead (2 migrat/s)
Comparisons with other policies
  Load balancing inefficient (>7ºC diffs)
  Heat&Run inefficient or causes many deadline misses (40% below performance requirements)
Good performance and uniform temperature adjusting globally to thermal dynamics with MPOS
Contribution: performance requirements met for both types of packaging
[Figure label: 400MHz (1% overhead)]

43 Adapt2D-MIGRA: Combination of HW and SW-Based Pro-Active Thermal Management
HW-based management: convex-based dynamic voltage and frequency scaling (DVFS) exploration
SW-based management: proactive task scheduling and migration
  Support of multi-processor operating system: Solaris Multi-Core
Good thermal control in commercial MPSoCs in 90nm, what about 3D integration?
[Figure: thermal maps. Initial: large gradients. New: thermal balancing]

44 Outline
MPSoC thermal modeling and analysis
HW-based thermal management for MPSoCs
SW-based thermal management for MPSoCs
Conclusions

45 Conclusions
Progress in semiconductor technologies enables new MPSoCs
  Thermal/reliability issues must be addressed for safe human interaction
  Thermal monitoring and control are key
Clear benefits of thermal-aware design methods for MPSoCs
  Novel, fast and low-cost thermal modeling approach at system-level
  Formalization of HW-based thermal management problem as convex, and solved in polynomial time
  New SW-based thermal balancing method with very limited overhead
Validation on commercial 2D MPSoCs (Sun, Freescale, Philips)
  Fast exploration of thermal behavior of complex MPSoCs
  Effective HW- and SW-based pro-active thermal management

46 Key References and Bibliography
Thermal modeling and FPGA-based emulation
  HW-SW Emulation Framework for Temperature-Aware Design in MPSoCs, D. Atienza, et al., ACM TODAES, Vol. 12, Nr. 3, pp. 1-26, August 2007
Thermal management for 2D MPSoCs
  Thermal Balancing Policy for Multiprocessor Stream Computing Platforms, F. Mulas, et al., IEEE T-CAD, Vol. 28, Nr. 12, December
  Processor Speed Control with Thermal Constraints, A. Mutapcic, S. Boyd, et al., IEEE TCAS-I, Vol. 56, Nr. 9, Sept.
  Inducing Thermal-Awareness in Multi-Processor Systems-on-Chip Using Networks-on-Chip, E. Martinez, et al., Proc. ISVLSI
  Temperature Control of High-Performance Multi-core Platforms Using Convex Optimization, S. Murali, et al., Proc. DATE

47 QUESTIONS?
Acknowledgements: Swiss National Science Foundation, European Commission, UCSD / Sun Microsystems, IMEC / Philips, IBM Zürich, Bologna / Freescale Semiconductors

48 Thermal Modeling and Management for 3D MPSoCs with Active Cooling
Prof. David Atienza Alonso, Embedded Systems Laboratory (ESL), Institute of EE, Faculty of Engineering
ESL/EPFL 2010
ARTIST Summer School 2010, Autrans (France)

49 Promises: Advantages of 3D vs. 2D Chips
Reduce average length of on-chip global wires
Increase number of devices reachable in given time budget
Greatly facilitate heterogeneous integration (e.g. logic-DRAM stacks)
[Figures: Ray Yarema, Fermilab; Samsung Wafer Stack Package (WSP) memory]

51 Thermal-Reliability Issues in 3D Chips
Latest chips increase power density
  Non-uniform hot-spots in 2D chips
  In 3D chips, heat affects several layers! (even more cool components)
Higher chances of thermal wear-outs and very short lifetimes!
[Figures: Sun 1.8 GHz Sparc v9 microprocessor; Sun Niagara Broadband Processor; 3D chip, courtesy of IBM and Irvine Sens.]

52 Run-Time Heat Spreading in 3D Chips
5-tier 3D stack: 10 heat sources and sensors
Inject between 1.5 W and 4 W
Large and non-uniform heat propagation! (up to 130 ºC on top tier)
[Figure: per-tier thermal maps of the stack, width (um) versus length (um)]

54 NanoTera CMOSAIC Project: Design of 3D MPSoCs with Advanced Cooling
3D systems require novel electro-thermal co-design
  Academic partners: EPFL and ETHZ
  Industrial: IBM Zürich
3D stacked MPSoC chips: microchannels etched on back side to circulate liquid coolant
System-level active cooling manager (3D heat flow prediction): adjustment of coolant flux; task scheduling and execution control

55 Outline
Introduction
3D chip thermal modeling framework
  Validation of 3D thermal model
Liquid cooling modeling
  Liquid cooling model validation
Closed-loop 3D MPSoCs thermal management with active cooling
Experiments and conclusions

56 Compact RC-Based Tier Thermal Model
Gate-level thermal model: RC network of Si/metal layer cells
2D tier modeled as heat flux moving between adjacent cells
  Heat fluxes through the faces of a cell: $q_{b\_top}$, $q_{b\_bottom}$, $q_{b\_left}$, $q_{b\_right}$, $q_{b\_front}$, $q_{b\_back}$
Convective boundary conditions between layers in tier:
  $q_{b\_top} = h_{top}\,A\,(T_a - T_{top})$
  $q_{b\_bottom} = h_{bottom}\,A\,(T_a - T_{bottom})$
[Figure: adjacent cells I-1, I, I+1 exchanging heat flux $(q_{bi})_I$, $(q_{bi})_{I+1}$ through face i]
[Atienza et al., TODAES 2007]

57 Complete 3D Chip Thermal Modeling
Multi-level execution for thermal convergence in 3D
  Local (2D-tier), liquid channels and global (3D) propagation
[Flow chart: for N iterations, evaluate local temperature for each cell and update with neighbour temperature diffusion (feedback temperature) until tier-level convergence; then go to next tier or microchannel]
[Ayala et al., NanoNet 2009]

59 3D Chip Thermal Library Validation
Extensible set of layers in 3D stack
  Up to 9 tiers and heat spreader
  Pre-defined layers: silicon, copper (10 layers), glue, overmold, interposer, bump
Configurable nr. of cells and iterations per tier
  Initially 10 ms thermal interval (1000 iterat./tier)
Multi-tier test chip manufactured at EPFL: three types of tiers

62 3D Thermal Library Validation: Creating Various 3D Thermal Maps
Flexibility for thermal characterization
  10 heat sources and sensors per layer, accessible to be simultaneously activated

64 3D Thermal Library Validation: Correlation with 5-Tier 3D Stack
Variations of less than 1.5% between 3D stack measurements and new 3D thermal model
[Figure: 3D chip, EPFL, layer 3 characterization: sensor voltage (mV) versus heater current (mA) applied to Dev 7. Blue curve: 3D current-heat model for D8. Pink curve: heater current measured in D8]
[Figure: 3D chip, EPFL, multi-tier characterization: sensor voltage (mV) versus heater current (mA) applied to Dev 7. Blue/pink curves: D7 (tier 1) and D8 (tier 4). Red curve: 3D current-heat model for D8]
[Ayala et al., Nano-Nets 09]

65 Modeling Through Silicon Vias (TSVs) in 3D Stacks
TSVs: size: 5-10 um x [...] um
TSVs change resistivity of interlayer material (IM)
Modeling granularities (from lower to higher accuracy and complexity):
  1. Homogeneous distribution, one R value for the IM
  2. Different R value per unit (core, cache, etc.)
  3. Exact locations of TSVs
[Figure: LSM-EPFL. Source: IBM Zürich and Y.Heights]

66 TSV Modeling Accuracy in 3D Stacks
Chosen approach: model TSV groups in localized positions of 3D MPSoCs

67 Liquid Flux Model for Laminar Flow
Local junction temperature modeled as RC network: $R_{tot} = R_{cond} + R_{conv} + R_{heat}$
  $R_{tot} = 1/(G_{si}/t + 1/R_b) + A/(b\,A_t) + A/(V\,P\,c_p)$
  Annotated terms: thermal resistance of Si, Si base thickness, thermal resistance of wiring, heating area, total area, flow rate and density
Dependence of thermal resistance on liquid flux modeled as a quadratic form
  Variable value of coolant flux ($\Phi$): $R_{heat} \approx a\Phi + b\Phi^2$; $b \ll a$
[Figure: heat source, chip back-side temperature, and channel quantities $T_1$, $P_1$, $q_1$, $P_2$, $q_2$]
[Atienza et al., THERMINIC 09 and DATE 10]

68 3D Thermal Model with Liquid Cooling
New set of layers in 3D stack
  3D stack (up to 9 tiers)
  1 microchannel and coolant flow per tier
5-tier stack with microchannels and manifold cooling seal manufactured at IBM/EPFL
  Enables different multi-tier liquid flux injection
[Figure: liquid, micro-heater, PCB and micro-channels. Source: IBM & ESL, EPFL]

69 Manufacturing of 5-Tier 3D Test Chip with Liquid Channels in Multiple Tiers
Adding multi-tier liquid cooling in-/out-lets
Multi-tier active cooling technology feasible for 3D-stacked chips
[Figure: front-side and back-side of the test chip. Figure: IBM & ESL, EPFL]

71 Correlation Results: Liquid Cooling and 3D Heat Transfer
Temperature evolution at the junction ($T_j$)
  Tested range: [...] to 0.15 L/min
  Similar accuracy results at different channels
  Avg. max. temp. error = 0.6%
Variations of less than 1% between measurements and RC-based 3D thermal model with liquid cooling
[Atienza et al., THERMINIC 09]

72 Complete 3D Chip Thermal Modeling Flow with Liquid Cooling
Scheduler (reactive, proactive). Inputs: workload information; floorplan, TSV areas, package; temperature (for dynamic policies)
Power manager (DPM). Inputs: workload information; activity of cores
3D thermal simulator with liquid cooling, based on EPFL-IBM 3D chips (integrated within internal HotSpot tool version). Inputs: power trace for each unit; floorplan, package and die properties (Niagara-1), TSV area percentage/distribution; flow rate
Output: transient temperature response for each unit

73 Run-Time HW/SW Thermal Modeling Framework for 3D Chips
Exploitation of both hardware and software benefits
  Zero-delay MPSoC architecture simulation: MPSoC behavior emulation on FPGA, running multi-proc. OS + DVFS + task migration and SW apps 1..N, with sniffers on CPUs, SRAMs and I/O
  Detailed thermal analysis of 2D MPSoC layout: software thermal model (Cu and Si cells) on host PC
  FPGA sends energy of 2D components and host PC returns temp. (T) of 2D components over standard Ethernet connection & dedicated HW monitor
[D. Atienza et al., TODAES 2007]

74 Run-Time HW/SW Thermal Modeling Framework for 3D Chips
Exploitation of both hardware and software benefits
  Zero-delay MPSoC architecture simulation: MPSoC behavior emulation on FPGA, running multi-proc. OS + DVFS + task migration and SW apps 1..N, with sniffers on CPUs, SRAMs and I/O
  3D stack thermal model (1st tier to Nth tier) on host PC
  FPGA sends energy of 3D components and host PC returns temp. of 3D components over standard Ethernet connection & dedicated HW monitor
[D. Atienza, THERMINIC 2009]

75 Thermal Management for 3D-MPSoCs with Liquid Cooling
Active-Adapt3D: combined policy manager (Best-Paper Award at IEEE/IFIP VLSI-SoC 2009)
  Predictive, floorplan-based task assignment and DVFS
  Closed-loop variable liquid cooling control: T ≥ 80 ºC: increment flow rate; T < 80 ºC: decrement
Policy can be applied reactively or proactively
  REACTIVE: thermal sensors provide temperature measurements to the flow rate tuner
  PROACTIVE: ARMA-based predictor provides a temperature forecast to the flow rate tuner
[Figure: system temperature dynamics, thermal sensors, ARMA-based predictor and flow rate tuner]
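The flow-rate rule on this slide is a simple threshold controller over a small set of discrete settings. The sketch below shows that rule only; the list of flow settings and the function name are hypothetical, and treating exactly 80 ºC as "increment" is an assumption. The same function serves both modes: pass a sensor reading for reactive control or a forecast for proactive control.

```python
# Hypothetical discrete pump settings in ml/min, lowest to highest.
FLOW_SETTINGS_ML_MIN = [10, 15, 20, 25]

def next_flow_index(index, temp_c, threshold_c=80.0):
    """Step the flow setting up when hot, down otherwise, within valid bounds.

    temp_c is a measured temperature (reactive) or a predicted one (proactive).
    """
    if temp_c >= threshold_c:
        return min(index + 1, len(FLOW_SETTINGS_ML_MIN) - 1)
    return max(index - 1, 0)

if __name__ == "__main__":
    index = 1
    for reading in [78.0, 81.5, 83.0, 84.0, 79.0]:
        index = next_flow_index(index, reading)
        print(reading, "->", FLOW_SETTINGS_ML_MIN[index], "ml/min")
```

Stepping one setting at a time keeps the coolant flux from oscillating between extremes when the temperature hovers near the threshold.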

76 Adaptive Thermal-Aware Task Assignment Policy for 3D MPSoCs
Cores on layers closer to the heat sink can be cooled faster in comparison to cores further away
Adapt-3D assigns a thermal index to each core i in order to distinguish the location of the cores
  Higher thermal index: core more prone to hot spots
[Figure: thermal indices for cores at locations 1, 2 and 3 of the chip]
[Coskun and Atienza, DATE 09]

77 Adaptive Thermal-Aware Task Assignment Policy for 3D MPSoCs
Probability of receiving workload at time t: $P_t$ is obtained from $P_{t-1}$ and a weight $W$, for each core
Weight $W$: an increment for a cool core and a decrement for a hot core, set by comparing $T_{avg}$ with $T_{preferred}$ and scaled from $W_{init}$ and the thermal index of core i
  $W_{init}$: empirical constant
  $T_{preferred}$: empirical constant, e.g., 80 ºC
  $T_{avg}$: measured by sensors
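A probability-weighted assignment of this kind could look like the sketch below. The exact weight formula is not recoverable from the slide, so everything numeric here is an assumption: fixed increment and decrement values stand in for the weights, the thermal index is left out, and the update is an additive step clamped to [0, 1]. The function names are hypothetical.

```python
import random

def update_probability(p_prev, t_avg, t_preferred, w_inc=0.1, w_dec=0.2):
    """Raise the probability of a cool core, lower it for a hot core.

    w_inc and w_dec are assumed constants, not the weights from the talk.
    """
    step = w_inc if t_avg < t_preferred else -w_dec
    return min(1.0, max(0.0, p_prev + step))

def choose_core(probabilities, rng=random):
    """Pick a core at random, weighted by its current probability."""
    cores = list(probabilities)
    weights = [probabilities[core] for core in cores]
    if sum(weights) == 0:
        return rng.choice(cores)  # every core is hot: fall back to a uniform pick
    return rng.choices(cores, weights=weights, k=1)[0]

if __name__ == "__main__":
    probs = {"core0": 0.5, "core1": 0.5}
    probs["core0"] = update_probability(probs["core0"], t_avg=85.0, t_preferred=80.0)
    probs["core1"] = update_probability(probs["core1"], t_avg=70.0, t_preferred=80.0)
    print(probs, choose_core(probs))
```

Keeping the choice probabilistic rather than always picking the coolest core spreads the workload and avoids repeatedly heating the same core.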

78 Experiments 3D Thermal Management: 3D MPSoCs with Microchannels
Target 3D systems based on a 3D version of the Sun UltraSPARC T1
  Power values and workloads from real traces measured in Sun platforms (multimedia players, web servers, databases, etc.)
  Cores and caches in separate layers
Channels: width 400 um, depth 250 um
Four flow rate settings, default at 15 ml/min
[Figure: 3D stack configurations (EXP1-2), (EXP3), (EXP4)]

79 Thermal Management for 3D Chips: Active-Adapt3D Comparisons
Predictive task scheduling, active cooling and floorplan-aware DVFS achieve less than 5% hotspots
Promising figures for thermal control in 3D-MPSoCs
[Coskun and Atienza, DATE 10]

80 Thermal Management in 3D Chips: Active-Adapt3D Comparisons
Variable multi-tier flow control useful for 3D systems with 3+ layers
Proactive thermal management achieves:
  75% reduction in spatial gradients on average, for fixed flow rate
  97% reduction in spatial gradients on average, for variable flow rate
Cooling power savings up to 67% relative to worst-case flux
*LC: multi-tier variable liquid cooling
[Coskun and Atienza, DATE 10]

81 Conclusions
Complexity of coming 3D MPSoC chips requires novel thermal modeling approaches
  Application of simple RC-based methods demonstrated, validated with 3D test chip
Initial model of liquid cooling channels in 3D chips
  Simple RC laminar flow model, works well with variable liquid fluxes (errors of less than 2%)
  Integrated the compact model into custom HotSpot tool
New thermal management: feedback controller adjusts flow rate to allowed temperature with job assignment and DVFS
  Proactive control improves the hot spot reduction to 95% for systems with variable flow rates, and reduces thermal variations
  Dynamic flow rate adjustment is helpful in reducing the energy cost of the pump and overall system (67% power savings)

82 Key References and Bibliography
3D thermal modeling and FPGA-based emulation
  3D-ICE: Compact transient thermal model for 3D ICs with liquid cooling via enhanced heat transfer cavity geometries, A. Sridhar, et al., Proc. of ICCAD 2010, USA, November
  Transient Thermal Modeling of 2D/3D Systems-on-Chip with Active Cooling, David Atienza, Proc. of THERMINIC 2009, Belgium, October
Thermal management for 3D MPSoCs
  Fuzzy Control for Enforcing Energy Efficiency in High-Performance 3D Systems, M. Sabry, Ayse K. Coskun, David Atienza, Proc. of ICCAD 2010, USA, November
  Energy-Efficient Variable-Flow Liquid Cooling in 3D Stacked Architectures, Ayse K. Coskun, David Atienza, et al., Proc. of DATE 2010, Germany, March
  Modeling and Dynamic Management of 3D Multicore Systems with Liquid Cooling, Ayse K. Coskun, et al., Proc. of VLSI-SoC 2009, Brazil, October (Best Paper Award)
  Dynamic Thermal Management in 3D Multicore Architectures, Ayse K. Coskun, et al., Proc. of DATE 2009, France, April

83 QUESTIONS? Nano-Tera.ch Swiss Engineering Programme; European Commission; Swiss National Science Foundation.


More information

MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS

MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS INSTRUCTOR: Dr. MUHAMMAD SHAABAN PRESENTED BY: MOHIT SATHAWANE AKSHAY YEMBARWAR WHAT IS MULTICORE SYSTEMS? Multi-core processor architecture means placing

More information

Spiral 2-8. Cell Layout

Spiral 2-8. Cell Layout 2-8.1 Spiral 2-8 Cell Layout 2-8.2 Learning Outcomes I understand how a digital circuit is composed of layers of materials forming transistors and wires I understand how each layer is expressed as geometric

More information

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF.

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems Liang-Gee Chen Distinguished Professor General Director, SOC Center National Taiwan University DSP/IC Design Lab, GIEE, NTU 1

More information

Xilinx SSI Technology Concept to Silicon Development Overview

Xilinx SSI Technology Concept to Silicon Development Overview Xilinx SSI Technology Concept to Silicon Development Overview Shankar Lakka Aug 27 th, 2012 Agenda Economic Drivers and Technical Challenges Xilinx SSI Technology, Power, Performance SSI Development Overview

More information