Tera-scale Computing and Interconnect Challenges

1 Tera-scale Computing and Interconnect Challenges: 3D Stacking Considerations. Dr. Jerry Bautista, Director, Microprocessor Technology Management, and Co-Director, Tera-scale Computing Research. 2007 Intel Corporation

2 Agenda
- Tera-scale computing: an I/O inflection point
- Implications to interconnects
- Options
- 3D die/wafer stacking considerations
- Summary

3 Multi-core is now mainstream. Multi-core top to bottom: add cores to deliver the historic 2x performance every 2 years.

4 A Tera-scale Platform Vision
[Diagram labels: caches; special-purpose engines; integrated I/O devices; scalable on-die interconnect fabric; last-level caches; integrated memory controllers; off-die interconnect; high-bandwidth memory; I/O; socket interconnect]

5 Multi-core for Energy-Efficient Performance. Dynamic power: $P = C V^2 f$. [Chart: relative single-core frequency and Vcc]
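
The relation on this slide can be explored numerically. The following is a minimal sketch, not a model of any Intel part: the 0.8x scaling factors are hypothetical, capacitance is normalized to 1, and throughput is assumed to scale linearly with frequency and core count.

```python
def dynamic_power(c, v, f):
    """Dynamic switching power P = C * V^2 * f, in arbitrary consistent units."""
    return c * v ** 2 * f


def relative_power(v_scale, f_scale):
    """Power of a core relative to a baseline core when Vcc and frequency are scaled."""
    return dynamic_power(1.0, v_scale, f_scale) / dynamic_power(1.0, 1.0, 1.0)


# Hypothetical operating point: Vcc and frequency both reduced to 0.8x.
one_core_scaled = relative_power(0.8, 0.8)
two_cores_scaled = 2 * one_core_scaled

print(f"baseline core:               power {relative_power(1.0, 1.0):.3f}, throughput 1.0x")
print(f"one core at 0.8x V, 0.8x f:  power {one_core_scaled:.3f}, throughput 0.8x")
print(f"two cores at 0.8x V, 0.8x f: power {two_cores_scaled:.3f}, throughput 1.6x (ideal scaling)")
```

Under these assumptions, two slower cores draw roughly the power of one baseline core while offering more peak throughput, which is the argument the slide title makes.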

6 What is Tera-scale? Model-based apps: recognition, mining, synthesis. [Diagram labels: 3D & video; multimedia; text; models]

7

8 More Compute, Better Experience
- Perceptual accuracy and graphical realism scale with compute
- Algorithms are ideal for an array of general-purpose CPUs
- Shared algorithms: e.g., ray tracing for lighting aids physics collision detection and path selection
[Diagram labels: AI; effects; physics. Today: Second Life fluid, 1X compute (now). Tomorrow: particle fluid, 10X compute. Future: natural-looking fluid, 1000X compute (10+ years)]

9 Research Shows Applications Scale Well

10 Teraflops Research Processor
Goals:
- Deliver Tera-scale performance: single-precision TFLOP at desktop power; frequency target 5 GHz; bisection bandwidth on the order of terabits/s; link bandwidth in hundreds of GB/s
- Prototype two key technologies: on-die interconnect fabric; 3D stacked memory
- Develop a scalable design methodology: tiled design approach; mesochronous clocking; power-aware capability
Chip: 1 TeraFLOP in a < 60 W envelope; 100 million transistors; 80 tiles; 275 mm² die (21.72 mm x 12.64 mm)
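
The headline figures on this slide can be cross-checked with a few lines of arithmetic. This is a rough sketch using only the numbers quoted on the slide; the per-tile and per-watt values are derived here and are not stated in the deck. The 60 W figure is an upper bound on power, so the efficiency computed is a lower bound.

```python
DIE_WIDTH_MM = 21.72      # from the slide
DIE_HEIGHT_MM = 12.64     # from the slide
TILES = 80                # from the slide
PEAK_FLOPS = 1e12         # 1 TeraFLOP target, from the slide
POWER_ENVELOPE_W = 60.0   # "< 60 W", treated here as an upper bound


def die_area_mm2():
    """Die area implied by the two edge lengths."""
    return DIE_WIDTH_MM * DIE_HEIGHT_MM


def gflops_per_tile():
    """Peak GFLOPS each tile must supply if the load is spread evenly."""
    return PEAK_FLOPS / TILES / 1e9


def min_gflops_per_watt():
    """Lower bound on efficiency, since actual power is below the envelope."""
    return PEAK_FLOPS / POWER_ENVELOPE_W / 1e9


print(f"die area:        {die_area_mm2():.1f} mm^2 (slide quotes 275 mm^2)")
print(f"per-tile peak:   {gflops_per_tile():.1f} GFLOPS")
print(f"efficiency:      > {min_gflops_per_watt():.1f} GFLOPS/W")
```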

11 Agenda
- Tera-scale computing: an I/O inflection point
- Implications to interconnects
- Options
- 3D die/wafer stacking considerations
- Summary

12 Tera-scale Research Challenges

13 Desktop Platform Bandwidth Roadmap
[Chart: FSB and memory effective bandwidth (GB/s) and effective CPU core frequency (MHz) over time. Generations shown: Pentium III FSB, 64-bit EDO, 64-bit SDRAM, Pentium 4 FSB, 2-channel RDRAM, 128-bit DDR, 128-bit DDR2, 192-bit DDR2/3, CSI. Series: effective CPU core frequency, memory effective BW, FSB effective BW. Trend lines: 2x / 21 months, 2x / 24 months, 2x / 27 months]
Tera-scale computing requirements: staying on-trend puts the memory BW target at … GB/s for the … time frame, but Tera-scale (highly parallel) workloads are 5x that.
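
The three trend lines on this chart differ only in doubling period, and small differences compound quickly. A minimal sketch of that compounding follows. The 10-year horizon is an arbitrary illustration, and the sketch does not say which trend line belongs to which series, since the transcript does not preserve that mapping.

```python
def growth_factor(months, doubling_period_months):
    """Multiplicative growth after `months` for a quantity that doubles every period."""
    return 2.0 ** (months / doubling_period_months)


HORIZON_MONTHS = 120  # hypothetical 10-year window, chosen only for illustration

for period in (21, 24, 27):  # doubling periods quoted on the slide
    factor = growth_factor(HORIZON_MONTHS, period)
    print(f"2x every {period} months -> {factor:.1f}x after {HORIZON_MONTHS} months")

# Gap that opens between the fastest and slowest trend over the horizon.
gap = growth_factor(HORIZON_MONTHS, 21) / growth_factor(HORIZON_MONTHS, 27)
print(f"fastest / slowest trend after {HORIZON_MONTHS} months: {gap:.1f}x")
```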

14 Tera-Scale Performance: Future CPUs
[Diagram: an array of interconnected processing cores (on-die power ~0.1 mW/Gb/s) with 1s of TB/s of bandwidth; 100s of GB/s to DIMMs (bulk memory), marked "This is the problem area: cost and power"; 10s of GB/s to bulk storage; 1s of GB/s to another box, shelf, or rack]
TeraFLOPS implies terabytes/sec of bandwidth.

15 Agenda
- Tera-scale computing: an I/O inflection point
- Implications to interconnects
- Options
- 3D die/wafer stacking considerations
- Summary

16 Approaches to Increase BW
- Examples: software optimization; add last-level cache; SIP/MCP; 3D stacking
- Pin counts: challenging package limits
- Speed (power, complexity, Si area): power scaling is exponential with increasing speed
[Diagram: Si / package / socket / motherboard signal path, ~10 Gb/s]

17 Bulk Memory I/O Pins and Power
[Chart: I/O power (mW/Gb/s) vs. signaling rate (Gb/s), state of the art vs. research. Source: Randy Mooney, Intel]
[Chart: pin count: total package pins, power pins, I/O pins. Source: Ravi Mahajan, Intel]
Power: state of the art at 100 GB/s ~ 1 Tb/s = 1,000 Gb/s; 1,000 Gb/s x 25 mW/Gb/s = 25 W. Could be reduced significantly, but requires major changes to the physical layer.
Pins: bus width = 1,000 / 5 = 200, or about 400 pins (differential).
Too much power, too many signal pins. I/O power should be < 10% of total CPU socket power.
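
The slide's power and pin arithmetic can be reproduced directly. The sketch below uses the slide's 1,000 Gb/s and 25 mW/Gb/s figures, plus the ~0.1 mW/Gb/s on-die figure from slide 14. It assumes the divisor 5 in the bus-width calculation is a per-signal rate of 5 Gb/s, which the slide does not state explicitly; the function names are illustrative.

```python
import math


def io_power_watts(bandwidth_gbps, cost_mw_per_gbps):
    """I/O power in watts for a bandwidth in Gb/s and a signaling cost in mW per Gb/s."""
    return bandwidth_gbps * cost_mw_per_gbps / 1000.0


def signal_pins(bandwidth_gbps, rate_per_signal_gbps, differential=True):
    """Signal pins needed; differential signaling uses two pins per signal."""
    signals = math.ceil(bandwidth_gbps / rate_per_signal_gbps)
    return signals * 2 if differential else signals


BANDWIDTH_GBPS = 1000        # ~100 GB/s, rounded to 1 Tb/s as on the slide
OFF_DIE_MW_PER_GBPS = 25     # state-of-the-art off-die signaling, from the slide
ON_DIE_MW_PER_GBPS = 0.1     # on-die interconnect figure from slide 14
RATE_PER_SIGNAL_GBPS = 5     # assumption: the "5" in 1,000/5 is Gb/s per signal

print(f"off-die I/O power: {io_power_watts(BANDWIDTH_GBPS, OFF_DIE_MW_PER_GBPS):.1f} W")
print(f"on-die equivalent: {io_power_watts(BANDWIDTH_GBPS, ON_DIE_MW_PER_GBPS):.2f} W")
print(f"signal pins:       {signal_pins(BANDWIDTH_GBPS, RATE_PER_SIGNAL_GBPS)}")
```

The gap between the off-die and on-die power figures is what motivates moving memory closer to the CPU later in the deck.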

18 Technologies in the Memory Hierarchy: SRAM and/or eDRAM; DRAM; non-volatile memory (NAND or PCM or ?); magnetic storage

19 Memory Flow Model. Model: constant-size L1 and L2, a varying-size L3, and a very large L4. Method: capture L3 accesses per instruction and L4 accesses per instruction, then project L3 and L4 BW requirements. [Diagram: L1, L2, L3, L4]
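
The projection step of this method amounts to multiplying an access rate by an instruction rate and a transfer size. The sketch below illustrates that step only. Every input value is hypothetical (the deck gives no measured access rates), and the 64-byte transfer size is an assumption, not something the slide specifies.

```python
def required_bandwidth_gb_s(accesses_per_instr, instr_per_sec, bytes_per_access):
    """Projected bandwidth in GB/s at one level of the hierarchy."""
    return accesses_per_instr * instr_per_sec * bytes_per_access / 1e9


# Hypothetical inputs, for illustration only.
INSTR_PER_SEC = 1e12          # a tera-scale instruction rate
BYTES_PER_ACCESS = 64         # assumed cache-line-sized transfers
L3_ACCESSES_PER_INSTR = 0.01  # hypothetical captured value
L4_ACCESSES_PER_INSTR = 0.001 # hypothetical captured value

l3_bw = required_bandwidth_gb_s(L3_ACCESSES_PER_INSTR, INSTR_PER_SEC, BYTES_PER_ACCESS)
l4_bw = required_bandwidth_gb_s(L4_ACCESSES_PER_INSTR, INSTR_PER_SEC, BYTES_PER_ACCESS)

print(f"projected L3 bandwidth: {l3_bw:.0f} GB/s")
print(f"projected L4 bandwidth: {l4_bw:.0f} GB/s")
```

Varying the L3 size in the model changes the captured L4 accesses per instruction, which is how a larger L3 shows up as a lower L4 bandwidth requirement.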

20 Bandwidth Requirements Outgrow Roadmap
[Chart: bandwidth requirements (1 GB/s to 10 TB/s) of Terascale workloads vs. BW roadmap trends. Kernels: matrix operations, equation solvers, regression analysis. Applications: financial analytics, physics simulation, media processing]
- Only 25-40 GB/s of BW available by …
- Even by 2013, only ~100 GB/s of BW available based on DDRx trends
Insufficient bandwidth will limit performance.
Source: Albert Lin, Intel/Stanford; Yen-Kuang Chen, Intel

21 Solving BW Limitations
[Chart: performance of Terascale workloads, axis ticks at 100%, 10%, 1%, 0%, where 100% = ideal performance with no BW limitations]
Source: Albert Lin, Intel/Stanford; Yen-Kuang Chen, Intel

22 Hiding Memory/Storage Bandwidth Limitations
- DIMMs
- Under CPU: 3D stack
- Near CPU: MCP or C2C
Tradeoffs: capacity (density); latency; power/thermals; cost; integration path; SW execution (working set sizes)

23 On-Socket DRAM Caches for Memory Scalability
- Enable large-capacity L4s: low latency; high bandwidth
- Technologies: multi-chip packages (MCP); 3D stacking
- Benefits: significant miss rate reduction; avoids the bandwidth wall; at better latency
[Chart: Benefits of Large Caches; normalized miss rate vs. threads / shared cache size (MB) for OLTP, ERP, Java-1, and Java workloads]
[Diagram: MCP, ~200 GB/s between processor and DRAM cache; 3D stack, >1 TB/s between processor and DRAM cache]
Sources: Iyer, R., et al., Datacenter-on-Chip Architectures: Tera-scale Opportunities and Challenges, and Polka, L.A., et al., Package Technology to Address the Memory Bandwidth Challenge for Tera-scale Computing, Intel Technology Journal, Volume 11, Issue 3,

24 CPU and Nearby Memory
- Under CPU: 3D stack
- Near CPU: MCP or C2C
Considerations: SW working set sizes; power delivery; heat dissipation; yield/process flow; reliability; stacking method (chip to chip, wafer to wafer)
Bottom line: no major technical issues. Existing DRAM devices are I/O constrained, which makes stacking attractive.
[Diagram label: package substrate]

25 Agenda
- Tera-scale computing: an I/O inflection point
- Implications to interconnects
- Options
- 3D die/wafer stacking considerations
- Summary

26 Work in Progress: Stacked Memory Prototype
- 256 KB SRAM per core
- 4X C4 bump density
- 3200 thru-silicon vias
[Diagram labels: Polaris; Freya; package; thru-silicon via; denser than C4 pitch; C4 pitch]

27 Current Die and Wafer Stacking Structure Comparison
Die stacking (source: Intel). Possible application: logic + memory. TSV size: ~50 µm. Thickness: ~100 µm. Bonding structure: ~bump size. Bonding pitch: ~bump pitch.
Wafer stacking (source: Intel). Possible application: logic + logic. TSV size: <~5 µm. Thickness: ~10 µm. *Bonding structure: <~5 µm. *Bonding pitch: <~8 µm.
*Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)

28 300 mm Wafer Bonding. Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)

29 Bond Interface Electrical Test Configuration
[Diagram: current path through bond pads and through-Si vias between wafer #1 and wafer #2; ~4096 links]
For this study: pitch <~ 9 µm
Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)

30 Stacking Memory for Cache: Thermals
[Figure: Intel Core 2 Duo power and thermal map. Source: Intel]
Must carefully consider the thermal map for 3D stacking.
Source: Venkat Natarajan, Intel

31 Impact of Powermap Alignment
- 0.5 W/die dissipation in one quarter of the die
- Thermal floorplanning for die stacks is a critical aspect of thermal design
- Hot spot alignment creates the greatest increase in temperature
[Diagram labels: die 1; die 2; hot spots; airflow]
[Chart: Effect of Powermap on Heat Transfer from a Four-Die Stack; temperature (°C) vs. die number for the uniform heat load, aligned powermap, and non-aligned powermap cases; ΔT max ~ 8.5 °C]
Source: Venkat Natarajan, Intel

32 Thermal Through Silicon Vias (TTSV)
[Diagram labels: stacked dies; signal layers; thermal through silicon vias (TTSV); die-to-die vias; through silicon vias (TSV); substrate]
- May need dedicated thermal through-silicon vias
- Large in size: ~100 microns diameter; ~200 microns deep
- Filled with copper to provide adequate thermal paths
Source: Venkat Natarajan, Intel
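
To get a feel for why a copper fill matters, the one-dimensional conduction resistance of a single via can be estimated as R = L / (k A). This is a rough sketch under strong assumptions: the via is treated as a uniform cylinder with purely axial heat flow, and the thermal conductivities are approximate room-temperature textbook values (about 400 W/(m·K) for copper and 150 W/(m·K) for silicon), not figures from the deck.

```python
import math

COPPER_W_PER_M_K = 400.0   # assumed approximate conductivity of copper
SILICON_W_PER_M_K = 150.0  # assumed approximate conductivity of silicon


def via_thermal_resistance(diameter_m, depth_m, conductivity_w_per_m_k):
    """1-D conduction resistance R = L / (k * A) of a cylindrical via, in K/W."""
    area_m2 = math.pi * (diameter_m / 2.0) ** 2
    return depth_m / (conductivity_w_per_m_k * area_m2)


DIAMETER_M = 100e-6  # ~100 microns, from the slide
DEPTH_M = 200e-6     # ~200 microns, from the slide

r_copper = via_thermal_resistance(DIAMETER_M, DEPTH_M, COPPER_W_PER_M_K)
r_silicon = via_thermal_resistance(DIAMETER_M, DEPTH_M, SILICON_W_PER_M_K)

print(f"copper-filled via:       {r_copper:.1f} K/W")
print(f"same column of silicon:  {r_silicon:.1f} K/W")
```

Many such vias in parallel divide the resistance by their count, which is one reason the slide describes them as dedicated and large.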

33 Packaging BW Options
[Figure: CPU socket pin map showing CSI, FBD, Vss, Vio, Vcc, 12V, and miscellaneous pin assignments]
- Super Socket: interconnect density (1 cm x 1 cm die) very low (~100); interconnect BW < 0.2 TB/s; flexibility low
- MCP: interconnect density low (~800); interconnect BW … TB/s; flexibility high
- Edge Stacked: interconnect density medium (< 5K, 100 µm bump pitch); interconnect BW ~1 TB/s; flexibility high
- Stacked, TSV (through-Si via): interconnect density wafer-wafer ~1 M, die-die > 5K; interconnect BW 1 TB/s; flexibility medium-low
Cost/BW tradeoff will be key.
[Photo: qualified Prescott dual-core MCP]
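
The density figures in this comparison follow from connection pitch. As a rough sketch, the number of sites in a full square area array on a 1 cm x 1 cm die gives an upper bound for a given pitch. It assumes every grid site is usable, which real designs do not achieve, so the slide's figures can be lower. The 10 µm pitch is a hypothetical value; 100 µm and 8 µm are pitches quoted in the deck.

```python
def area_array_sites(die_edge_mm, pitch_um):
    """Upper bound on connections in a full square grid at the given pitch."""
    per_edge = int(die_edge_mm * 1000 // pitch_um)
    return per_edge * per_edge


DIE_EDGE_MM = 10  # 1 cm x 1 cm die, as on the slide

for label, pitch_um in (
    ("100 um bump pitch", 100),            # quoted on the slide
    ("10 um pitch (hypothetical)", 10),    # illustrative value only
    ("8 um wafer-bond pitch", 8),          # wafer stacking pitch from slide 27
):
    print(f"{label:28s} -> up to {area_array_sites(DIE_EDGE_MM, pitch_um):,} sites")
```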

34 Summary
- An I/O inflection point is on the horizon: increased consolidation of BW at the CPU socket; parallel execution and new workloads
- To increase BW, architectural and packaging options are likely the first approach to enabling high BW: 3D stacking is an attractive solution for both a large last-level cache and increasing bulk DRAM capacities
- Challenges: socket power increases with DRAM integration; thermals; cost/yield
- No major technical issues
- Lots of opportunity for architectural innovation!

Interconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp

Interconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Interconnect Challenges in a Many Core Compute Environment Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Agenda Microprocessor general trends Implications Tradeoffs Summary

More information

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Aim High Intel Technical Update Teratec 07 Symposium June 20, 2007 Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Risk Factors Today s s presentations contain forward-looking statements.

More information

Emerging IC Packaging Platforms for ICT Systems - MEPTEC, IMAPS and SEMI Bay Area Luncheon Presentation

Emerging IC Packaging Platforms for ICT Systems - MEPTEC, IMAPS and SEMI Bay Area Luncheon Presentation Emerging IC Packaging Platforms for ICT Systems - MEPTEC, IMAPS and SEMI Bay Area Luncheon Presentation Dr. Li Li Distinguished Engineer June 28, 2016 Outline Evolution of Internet The Promise of Internet

More information

Moore s s Law, 40 years and Counting

Moore s s Law, 40 years and Counting Moore s s Law, 40 years and Counting Future Directions of Silicon and Packaging Bill Holt General Manager Technology and Manufacturing Group Intel Corporation InterPACK 05 2005 Heat Transfer Conference

More information

1. NoCs: What s the point?

1. NoCs: What s the point? 1. Nos: What s the point? What is the role of networks-on-chip in future many-core systems? What topologies are most promising for performance? What about for energy scaling? How heavily utilized are Nos

More information

BREAKING THE MEMORY WALL

BREAKING THE MEMORY WALL BREAKING THE MEMORY WALL CS433 Fall 2015 Dimitrios Skarlatos OUTLINE Introduction Current Trends in Computer Architecture 3D Die Stacking The memory Wall Conclusion INTRODUCTION Ideal Scaling of power

More information

edram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next?

edram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next? edram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next? 1 Integrating DRAM and Logic Integrate with Logic without impacting logic Performance,

More information

Multi-Core Microprocessor Chips: Motivation & Challenges

Multi-Core Microprocessor Chips: Motivation & Challenges Multi-Core Microprocessor Chips: Motivation & Challenges Dileep Bhandarkar, Ph. D. Architect at Large DEG Architecture & Planning Digital Enterprise Group Intel Corporation October 2005 Copyright 2005

More information

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Gigascale Integration Design Challenges & Opportunities Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Outline CMOS technology challenges Technology, circuit and μarchitecture solutions Integration

More information

On GPU Bus Power Reduction with 3D IC Technologies

On GPU Bus Power Reduction with 3D IC Technologies On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The

More information

3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER

3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER 3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER CODES+ISSS: Special session on memory controllers Taipei, October 10 th 2011 Denis Dutoit, Fabien Clermidy, Pascal Vivet {denis.dutoit@cea.fr}

More information

The Memory Hierarchy 1

The Memory Hierarchy 1 The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow

More information

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis I NVENTIVE Physical Design Implementation for 3D IC Methodology and Tools Dave Noice Vassilios Gerousis Outline 3D IC Physical components Modeling 3D IC Stack Configuration Physical Design With TSV Summary

More information

3D Integration & Packaging Challenges with through-silicon-vias (TSV)

3D Integration & Packaging Challenges with through-silicon-vias (TSV) NSF Workshop 2/02/2012 3D Integration & Packaging Challenges with through-silicon-vias (TSV) Dr John U. Knickerbocker IBM - T.J. Watson Research, New York, USA Substrate IBM Research Acknowledgements IBM

More information

PicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor

PicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor PicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor Taeho Kgil, Shaun D Souza, Ali Saidi, Nathan Binkert, Ronald Dreslinski, Steve Reinhardt, Krisztian Flautner,

More information

Philippe Thierry Sr Staff Engineer Intel Corp.

Philippe Thierry Sr Staff Engineer Intel Corp. HPC@Intel Philippe Thierry Sr Staff Engineer Intel Corp. IBM, April 8, 2009 1 Agenda CPU update: roadmap, micro-μ and performance Solid State Disk Impact What s next Q & A Tick Tock Model Perenity market

More information

HPC Technology Trends

HPC Technology Trends HPC Technology Trends High Performance Embedded Computing Conference September 18, 2007 David S Scott, Ph.D. Petascale Product Line Architect Digital Enterprise Group Risk Factors Today s s presentations

More information

Advancing high performance heterogeneous integration through die stacking

Advancing high performance heterogeneous integration through die stacking Advancing high performance heterogeneous integration through die stacking Suresh Ramalingam Senior Director, Advanced Packaging European 3D TSV Summit Jan 22 23, 2013 The First Wave of 3D ICs Perfecting

More information

Memory Demand Trends and what they Mean to Packaging Technology

Memory Demand Trends and what they Mean to Packaging Technology Memory Demand Trends and what they Mean to Packaging Technology Ravi Mahajan May 31, 2016 Key Contributors: Suresh Chittor, Randy Osborne, Bob Sankman IEEE 66 th ECTC Las Vegas, NV, USA May 31 June 3,

More information

TSV Test. Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea

TSV Test. Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea TSV Test Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea # Agenda TSV Test Issues Reliability and Burn-in High Frequency Test at Probe (HFTAP) TSV Probing Issues DFT Opportunities

More information

Workloads, Scalability and QoS Considerations in CMP Platforms

Workloads, Scalability and QoS Considerations in CMP Platforms Workloads, Scalability and QoS Considerations in CMP Platforms Presenter Don Newell Sr. Principal Engineer Intel Corporation 2007 Intel Corporation Agenda Trends and research context Evolving Workload

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 21: Memory Hierarchy Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Ideally, computer memory would be large and fast

More information

Thermal Management Challenges in Mobile Integrated Systems

Thermal Management Challenges in Mobile Integrated Systems Thermal Management Challenges in Mobile Integrated Systems Ilyas Mohammed March 18, 2013 SEMI-THERM Executive Briefing Thermal Management Market Visions & Strategies, San Jose CA Contents Mobile computing

More information

Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks

Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks Aditya Agrawal, Josep Torrellas and Sachin Idgunji University of Illinois at Urbana Champaign and Nvidia Corporation http://iacoma.cs.uiuc.edu

More information

Microelettronica. J. M. Rabaey, "Digital integrated circuits: a design perspective" EE141 Microelettronica

Microelettronica. J. M. Rabaey, Digital integrated circuits: a design perspective EE141 Microelettronica Microelettronica J. M. Rabaey, "Digital integrated circuits: a design perspective" Introduction Why is designing digital ICs different today than it was before? Will it change in future? The First Computer

More information

MICROPROCESSOR TECHNOLOGY

MICROPROCESSOR TECHNOLOGY MICROPROCESSOR TECHNOLOGY Assis. Prof. Hossam El-Din Moustafa Lecture 20 Ch.10 Intel Core Duo Processor Architecture 2-Jun-15 1 Chapter Objectives Understand the concept of dual core technology. Look inside

More information

Memory: Past, Present and Future Trends Paolo Faraboschi

Memory: Past, Present and Future Trends Paolo Faraboschi Memory: Past, Present and Future Trends Paolo Faraboschi Fellow, Hewlett Packard Labs Systems Research Lab Quiz ( Excerpt from Intel Developer Forum Keynote 2000 ) ANDREW GROVE: is there a role for more

More information

The Road from Peta to ExaFlop

The Road from Peta to ExaFlop The Road from Peta to ExaFlop Andreas Bechtolsheim June 23, 2009 HPC Driving the Computer Business Server Unit Mix (IDC 2008) Enterprise HPC Web 100 75 50 25 0 2003 2008 2013 HPC grew from 13% of units

More information

Five Emerging DRAM Interfaces You Should Know for Your Next Design

Five Emerging DRAM Interfaces You Should Know for Your Next Design Five Emerging DRAM Interfaces You Should Know for Your Next Design By Gopal Raghavan, Cadence Design Systems Producing DRAM chips in commodity volumes and prices to meet the demands of the mobile market

More information

EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration

EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration 1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 10: Three-Dimensional (3D) Integration Instructor: Ron Dreslinski Winter 2016 University of Michigan 1 1 1 Announcements

More information

CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology. Moore s Law: 2X transistors / year

CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology. Moore s Law: 2X transistors / year CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology Moore s Law: 2X transistors / year Cramming More Components onto Integrated Circuits Gordon Moore, Electronics, 1965 # on transistors

More information

3D systems-on-chip. A clever partitioning of circuits to improve area, cost, power and performance. The 3D technology landscape

3D systems-on-chip. A clever partitioning of circuits to improve area, cost, power and performance. The 3D technology landscape Edition April 2017 Semiconductor technology & processing 3D systems-on-chip A clever partitioning of circuits to improve area, cost, power and performance. In recent years, the technology of 3D integration

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Advanced Computer Architecture (CS620)

Advanced Computer Architecture (CS620) Advanced Computer Architecture (CS620) Background: Good understanding of computer organization (eg.cs220), basic computer architecture (eg.cs221) and knowledge of probability, statistics and modeling (eg.cs433).

More information

Addressing the Memory Wall

Addressing the Memory Wall Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the

More information

Enabling Technology for the Cloud and AI One Size Fits All?

Enabling Technology for the Cloud and AI One Size Fits All? Enabling Technology for the Cloud and AI One Size Fits All? Tim Horel Collaborate. Differentiate. Win. DIRECTOR, FIELD APPLICATIONS The Growing Cloud Global IP Traffic Growth 40B+ devices with intelligence

More information

DDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture. Paul Washkewicz Vice President Marketing, Inphi

DDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture. Paul Washkewicz Vice President Marketing, Inphi DDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture Paul Washkewicz Vice President Marketing, Inphi Theme Challenges with Memory Bandwidth Scaling How LRDIMM Addresses this Challenge Under

More information

3D & Advanced Packaging

3D & Advanced Packaging Tuesday, October 03, 2017 Company Overview March 12, 2015 3D & ADVANCED PACKAGING IS NOW WITHIN REACH WHAT IS NEXT LEVEL INTEGRATION? Next Level Integration blends high density packaging with advanced

More information

3D SYSTEM INTEGRATION TECHNOLOGY CHOICES AND CHALLENGE ERIC BEYNE, ANTONIO LA MANNA

3D SYSTEM INTEGRATION TECHNOLOGY CHOICES AND CHALLENGE ERIC BEYNE, ANTONIO LA MANNA 3D SYSTEM INTEGRATION TECHNOLOGY CHOICES AND CHALLENGE ERIC BEYNE, ANTONIO LA MANNA OUTLINE 3D Application Drivers and Roadmap 3D Stacked-IC Technology 3D System-on-Chip: Fine grain partitioning Conclusion

More information

2GB DDR3 SDRAM SODIMM with SPD

2GB DDR3 SDRAM SODIMM with SPD 2GB DDR3 SDRAM SODIMM with SPD Ordering Information Part Number Bandwidth Speed Grade Max Frequency CAS Latency Density Organization Component Composition Number of Rank 78.A2GC6.AF1 10.6GB/sec 1333Mbps

More information

Overcoming the Memory System Challenge in Dataflow Processing. Darren Jones, Wave Computing Drew Wingard, Sonics
