Tera-scale Computing and Interconnect Challenges
Transcription
1 Tera-scale Computing and Interconnect Challenges: 3D Stacking Considerations. Dr. Jerry Bautista, Director, Microprocessor Technology Management; Co-Director, Tera-scale Computing Research. 2007 Intel Corporation
2 Agenda: Tera-scale computing - an I/O inflection point; Implications to interconnects; Options; 3D die/wafer stacking considerations; Summary
3 Multi-core is now mainstream. Multi-core top to bottom. Add cores to deliver the historic 2x performance every 2 years.
4 A Tera-scale Platform Vision. [Diagram labels: Cache (x3); Special Purpose Engines; Integrated IO devices; Scalable On-die Interconnect Fabric; Last Level Cache (x3); Integrated Memory Controllers; Off Die interconnect; High Bandwidth Memory; IO; Socket Inter-Connect]
5 Multi-core for Energy-Efficient Performance. P = CV²f. [Chart: relative single-core frequency and Vcc]
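The relation P = CV²f on this slide is the usual dynamic-power approximation. The following is a minimal sketch of why adding slower cores can be more energy-efficient than raising single-core frequency. It assumes, purely for illustration, that Vcc scales linearly with frequency and that the work is perfectly parallel; neither assumption is stated on the slide, and the function names are hypothetical.

```python
def dynamic_power(c, v, f):
    """Dynamic switching power, P = C * V^2 * f (arbitrary units)."""
    return c * v ** 2 * f


def relative_power(n_cores, freq_scale):
    """Power of n_cores running at freq_scale, relative to one core at 1.0.

    Assumption (not from the slide): Vcc scales linearly with frequency,
    so per-core power scales as freq_scale ** 3.
    """
    baseline = dynamic_power(1.0, 1.0, 1.0)
    per_core = dynamic_power(1.0, freq_scale, freq_scale)
    return n_cores * per_core / baseline


if __name__ == "__main__":
    # Two cores at half frequency deliver the same aggregate cycles per
    # second as one full-speed core, at a quarter of the power.
    print(relative_power(2, 0.5))  # 0.25
```

Under these assumptions the power saving comes almost entirely from the voltage term, which is why the slide plots frequency and Vcc together.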
6 What is Tera-scale? Model-based Apps: Recognition, Mining, Synthesis. 3D & Video; Multi-Media; Text; Models
8 More Compute, Better Experience. Perceptual accuracy & graphical realism scale with compute. Algorithms are ideal for an array of general purpose CPUs. Shared algorithms: e.g. ray-tracing for lighting aids physics collision detection and path selection AI. Today: Second Life Fluid (1X, now). Tomorrow: Particle Fluid (10X compute; effects physics). Future: Natural Looking Fluid (1000X compute, 10+ years).
9 Research Shows Applications Scale Well
10 Teraflops Research Processor. Goals: Deliver Tera-scale performance (single precision TFLOP at desktop power; frequency target 5GHz; bi-section B/W order of Terabits/s; link bandwidth in hundreds of GB/s). Prototype two key technologies (on-die interconnect fabric; 3D stacked memory). Develop a scalable design methodology (tiled design approach; mesochronous clocking; power-aware capability). 1 TeraFLOP in < 60 W envelope. 100 Million Transistors; 80 Tiles; 275 mm² (21.72 mm x 12.64 mm die)
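The headline figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only numbers from the slide (1 TFLOP, under 60 W, 80 tiles, 5 GHz frequency target, 21.72 mm x 12.64 mm die). The derived per-tile throughput is what the targets imply, not a figure stated in the talk, and the function names are hypothetical.

```python
TARGET_FLOPS = 1e12       # 1 TeraFLOP (from the slide)
POWER_ENVELOPE_W = 60.0   # "< 60 W envelope", used here as an upper bound
TILES = 80                # from the slide
FREQ_TARGET_HZ = 5e9      # "Frequency target 5GHz"
DIE_MM = (21.72, 12.64)   # die dimensions from the slide


def gflops_per_watt(flops, watts):
    """Energy efficiency in GFLOPS per watt."""
    return flops / 1e9 / watts


def flops_per_tile_per_cycle(flops, tiles, freq_hz):
    """FLOPs each tile must complete per clock to reach the aggregate target."""
    return flops / (tiles * freq_hz)


if __name__ == "__main__":
    # At least ~16.7 GFLOPS/W, since the envelope is an upper bound on power.
    print(round(gflops_per_watt(TARGET_FLOPS, POWER_ENVELOPE_W), 1))
    # FLOPs per tile per cycle implied by the 5 GHz target.
    print(flops_per_tile_per_cycle(TARGET_FLOPS, TILES, FREQ_TARGET_HZ))
    # Die area in mm^2, which rounds to the quoted 275.
    print(round(DIE_MM[0] * DIE_MM[1], 1))
```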
11 Agenda: Tera-scale computing - an I/O inflection point; Implications to interconnects; Options; 3D die/wafer stacking considerations; Summary
12 Tera-scale Research Challenges
13 Desktop Platform Bandwidth Roadmap. [Chart: FSB and memory effective bandwidth (GB/s) and effective CPU core MHz. Series: Eff CPU Core Freq, Mem Eff BW, FSB Eff BW. Trend lines: 2x / 21 mon, 2x / 24 mon, 2x / 27 mon. Data points: PentiumIII FSB, 64 bit EDO, 64 bit SDRAM, Pentium4 FSB, 2 ch. RDRAM, 128 bit DDR, 128 bit DDR2, 192 bit DDR2/3, CSI.] Tera-scale Computing Requirements: Staying on-trend puts mem BW target at [...] GB/s for the [...] time frame. But Tera-scale (highly parallel) workloads are 5x that.
14 Future CPUs: Tera-Scale Performance. Array of interconnected processing cores: 1s TB/s (on-die power ~0.1 mW/Gb/s). DIMMs (bulk memory): 100s GB/s. This is the problem area: cost and power. Bulk storage: 10s GB/s. Another box, shelf or rack: 1s GB/s. Tera FLOPS implies terabytes/sec bandwidth.
15 Agenda: Tera-scale computing - an I/O inflection point; Implications to interconnects; Options; 3D die/wafer stacking considerations; Summary
16 Approaches to Increase BW. Ex.: software opt.; add last level cache; SIP/MCP; 3D stacking. Pin counts are challenging package limits. Speed (power, complexity, Si area). [Diagram: Si, Package, Socket, Mother Board, Socket, Package, Si; ~10 Gb/s] Power scaling is exponential with increasing speed.
17 Bulk Memory I/O Pins and Power. [Chart: power (mW/Gbps) vs. signaling rate (Gbit/sec), state of the art and research. Source: Randy Mooney, Intel] [Chart: pin count: total package pins, power pins, IO pins. Source: Ravi Mahajan, Intel] SoA: 100 GB/sec ~ 1 Tb/sec = 1,000 Gb/sec. Power: 25 mW/Gb/sec = 25 Watts. Could be reduced significantly, but requires major changes to the physical layer. Pins: bus width = 1,000/5 = 200, about 400 pins (differential). Too much power, too many signal pins. I/O power should be < 10% of total CPU socket power.
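The power and pin estimates on this slide follow from two short calculations, sketched below. The inputs (1,000 Gb/s aggregate bandwidth, 25 mW/Gb/s, and the 5 Gb/s per-lane rate implied by "1,000/5") come from the slide. The function names and the socket-power helper are illustrative assumptions.

```python
import math


def io_power_watts(bandwidth_gbps, energy_mw_per_gbps):
    """Total I/O power in watts for a given aggregate bandwidth."""
    return bandwidth_gbps * energy_mw_per_gbps / 1000.0


def signal_pins(bandwidth_gbps, lane_rate_gbps, differential=True):
    """Signal pins needed to carry bandwidth_gbps at lane_rate_gbps per lane."""
    lanes = math.ceil(bandwidth_gbps / lane_rate_gbps)
    return lanes * 2 if differential else lanes


def min_socket_power_watts(io_watts, io_budget_fraction=0.10):
    """Smallest socket power for which io_watts stays within the I/O budget."""
    return io_watts / io_budget_fraction


if __name__ == "__main__":
    bandwidth = 1000  # Gb/s, the slide's rounding of 100 GB/s
    power = io_power_watts(bandwidth, 25)
    print(power)                          # 25.0
    print(signal_pins(bandwidth, 5))      # 400
    print(min_socket_power_watts(power))  # socket power implied by the 10% rule
```

Read against the 10% guideline, 25 W of I/O power would only fit in a socket budget of roughly 250 W, which is the sense in which the slide calls it "too much power".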
18 Technologies in the Memory Hierarchy: SRAM and/or eDRAM; DRAM; Non-volatile Memory (NAND or PCM or ?); Magnetic Storage
19 Memory Flow Model. Model: a constant size L1 & L2, a varying size L3, and a very large L4. Method: capture L3 accesses per instruction and L4 accesses per instruction; project L3, L4 BW requirements. [Diagram: L1, L2, L3, L4]
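The projection step in this method can be sketched as a single multiplication: accesses per instruction, times instruction rate, times bytes moved per access. The code below is a minimal illustration, not the model used in the study. The function name, the 64-byte transfer size, and all input values are hypothetical.

```python
def required_bandwidth_gb_s(accesses_per_kilo_instr, instr_per_sec,
                            bytes_per_access=64):
    """Project the bandwidth (GB/s) a cache level must sustain.

    accesses_per_kilo_instr: measured accesses to that level per 1000
        instructions (hypothetical input; the slide says only
        "accesses per instruction").
    instr_per_sec: aggregate instruction throughput of the chip.
    bytes_per_access: bytes moved per access; 64 is an assumed line size.
    """
    accesses_per_sec = accesses_per_kilo_instr * instr_per_sec / 1000
    return accesses_per_sec * bytes_per_access / 1e9


if __name__ == "__main__":
    instr_rate = 1e12  # hypothetical: 10^12 instructions per second
    # Hypothetical access rates for the L3 and L4 levels of the model.
    for level, apki in (("L3", 10), ("L4", 2)):
        print(level, required_bandwidth_gb_s(apki, instr_rate), "GB/s")
```

Varying the L3 size in the model changes the measured L4 access rate, which is how the approach ties cache capacity to off-die bandwidth demand.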
20 Bandwidth Requirements Outgrow Roadmap. [Chart: bandwidth requirements of Terascale workloads, 1GB/s to 10TB/s, against BW roadmap trends. Kernels: matrix operations, equation solvers, regression analysis. Applications: financial analytics, physics simulation, media processing.] Only 25-40GB/s of BW available by [...]. Even by 2013, only ~100GB/s of BW available based on DDRx trends. Insufficient bandwidth will limit performance. Source: Albert Lin, Intel/Stanford; Yen-Kuang Chen, Intel
21 Solving BW Limitations. [Chart: performance of Terascale workloads, 0% to 100%, where 100% = ideal performance with no BW limitations.] Source: Albert Lin, Intel/Stanford; Yen-Kuang Chen, Intel
22 Hiding Memory/Storage Bandwidth Limitations. DIMMs. Under CPU: 3D stack. Near CPU: MCP or C2C. Tradeoffs: capacity (density); latency; power/thermals; cost; integration path; SW execution (working set sizes).
23 On-Socket DRAM Caches For Memory Scalability. Enable large capacity L4s: low latency, high bandwidth. Technologies: multi-chip packages (MCP); 3D stacking. Benefits: significant miss rate reduction; avoids bandwidth wall; at better latency. [Chart: Benefits of Large Caches; normalized miss rate vs. threads / shared cache size (MB) for OLTP, ERP, Java-1, Java.] MCP (Proc + DRAM $): ~200GB/s. 3D stack (Proc + DRAM $): >1TB/s. Sources: Iyer, R., et al., Datacenter-on-Chip Architectures: Tera-scale Opportunities and Challenges, and Polka, L.A., et al., Package Technology to Address the Memory Bandwidth Challenge for Tera-scale Computing, Intel Technology Journal, Volume 11, Issue 3
24 CPU and Nearby Memory. Under CPU: 3D stack. Near CPU: MCP or C2C. Considerations: SW working set sizes; power delivery; heat dissipation; yield/process flow; reliability; stacking method (chip to chip, wafer to wafer). Bottom line: no major technical issues. [Diagram: package substrate] Existing DRAM devices are I/O constrained, so stacking is attractive.
25 Agenda: Tera-scale computing - an I/O inflection point; Implications to interconnects; Options; 3D die/wafer stacking considerations; Summary
26 Work in Progress: Stacked Memory Prototype. 256 KB SRAM per core; 4X C4 bump density; 3200 thru-silicon vias. [Diagram labels: Polaris; Package; Thru-Silicon Via; Denser than C4 pitch; Freya; C4 pitch]
27 Current Die and Wafer Stacking Structure Comparison.
Die Stacking (Source: Intel). Possible application: Logic + Memory. TSV size: ~50 µm. Thickness: ~100 µm. Bonding structure: ~bump size. Bonding pitch: ~bump pitch.
Wafer Stacking (Source: Intel). Possible application: Logic + Logic. TSV size: <~5 µm. Thickness: ~10 µm. *Bonding structure: <~5 µm. *Bonding pitch: <~8 µm.
*Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)
28 300 mm Wafer Bonding. Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)
29 Bond Interface Electrical Test Configuration. [Diagram: current path through bond pads and through-Si vias between Wafer #1 and Wafer #2; ~4096 links.] For this study: pitch <~9 µm. Source: Morrow et al., Wafer-level 3D interconnects via Cu bonding, Proc. AMC (2004)
30 Stacking Memory for Cache - Thermals. Intel Core 2 Duo power and thermal map (Source: Intel). Must carefully consider the thermal map for 3D stacking. Source: Venkat Natarajan, Intel
31 Impact of Powermap Alignment. 0.5 W/die dissipation in one quarter of the die. Thermal floorplanning for die stacks is a critical aspect of thermal design: hot spot alignment creates the greatest increase in temperature. [Chart: Effect of Powermap on Heat Transfer from a Four-Die Stack; temperature (C) vs. die number for uniform heat load, aligned powermap, and non-aligned powermap cases; DT max ~8.5 C. Diagram labels: Die 1, Die 2, Hot Spots, Airflow.] Source: Venkat Natarajan, Intel
32 Thermal Through Silicon Vias (TTSV). [Diagram labels: stacked dies; signal layers; thermal through silicon vias (TTSV); die-to-die vias; through silicon vias (TSV); substrate.] May need dedicated thermal through-silicon vias. Large in size: ~100 microns diameter; ~200 microns deep. Filled with copper to provide adequate thermal paths. Source: Venkat Natarajan, Intel
33 Packaging BW Options. [Figure: CPU socket pin map showing CSI, FBD, Vss, Vio, 12V and miscellaneous pin assignments]
Options (in column order): Super Socket; MCP; Edge Stacked; Stacked
Interconnect Density (1 cm x 1 cm die): Very Low (~100); Low (~800); Medium (< 5K) (100 µm bump pitch); Wafer-wafer ~1 M, Die-die > 5K
Interconnect BW: < 0.2 TB/s; [...] TB/s; ~1 TB/s; 1 TB/s
Flexibility: Low; High; High; Medium-Low
TSV (Through Si Via). Cost/BW tradeoff will be key. Qualified Prescott dual core MCP
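The interconnect-density figures in the Packaging BW Options table can be sanity-checked from pitch alone. The sketch below counts how many sites fit on a full area array for the 1 cm x 1 cm die used in the table. It gives an upper bound: power, ground and keep-out sites reduce the usable signal count, which is an assumption about why the table's figures are lower and is not stated in the slides. The function name is hypothetical; the 100 µm bump pitch and the ~8 µm wafer bonding pitch are taken from the slides.

```python
def area_array_sites(die_edge_mm, pitch_um):
    """Upper bound on connection sites in a full square array on a square die."""
    per_edge = int(die_edge_mm * 1000 // pitch_um)
    return per_edge * per_edge


if __name__ == "__main__":
    # 100 um bump pitch on a 1 cm x 1 cm die (die-level stacking).
    print(area_array_sites(10, 100))  # 10000
    # ~8 um bonding pitch quoted for wafer stacking on slide 27.
    print(area_array_sites(10, 8))    # 1562500
```

The second result is the same order of magnitude as the table's "wafer-wafer ~1 M" entry, which suggests that entry is driven by the fine bonding pitch rather than by bump technology.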
34 Summary. An I/O inflection point is on the horizon: increased consolidation of BW at the CPU socket; parallel execution and new workloads. To increase BW, architectural and packaging options are likely the first approach: 3D stacking is an attractive solution for both a large last level cache and increasing bulk DRAM capacities. Challenges: socket power increases with DRAM integration; thermals; cost/yield. No major technical issues. Lots of opportunity for architectural innovation!
Interconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp
Interconnect Challenges in a Many Core Compute Environment Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Agenda Microprocessor general trends Implications Tradeoffs Summary
More informationAim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group
Aim High Intel Technical Update Teratec 07 Symposium June 20, 2007 Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Risk Factors Today s s presentations contain forward-looking statements.
More informationEmerging IC Packaging Platforms for ICT Systems - MEPTEC, IMAPS and SEMI Bay Area Luncheon Presentation
Emerging IC Packaging Platforms for ICT Systems - MEPTEC, IMAPS and SEMI Bay Area Luncheon Presentation Dr. Li Li Distinguished Engineer June 28, 2016 Outline Evolution of Internet The Promise of Internet
More informationMoore s s Law, 40 years and Counting
Moore s s Law, 40 years and Counting Future Directions of Silicon and Packaging Bill Holt General Manager Technology and Manufacturing Group Intel Corporation InterPACK 05 2005 Heat Transfer Conference
More information1. NoCs: What s the point?
1. Nos: What s the point? What is the role of networks-on-chip in future many-core systems? What topologies are most promising for performance? What about for energy scaling? How heavily utilized are Nos
More informationBREAKING THE MEMORY WALL
BREAKING THE MEMORY WALL CS433 Fall 2015 Dimitrios Skarlatos OUTLINE Introduction Current Trends in Computer Architecture 3D Die Stacking The memory Wall Conclusion INTRODUCTION Ideal Scaling of power
More informationedram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next?
edram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next? 1 Integrating DRAM and Logic Integrate with Logic without impacting logic Performance,
More informationMulti-Core Microprocessor Chips: Motivation & Challenges
Multi-Core Microprocessor Chips: Motivation & Challenges Dileep Bhandarkar, Ph. D. Architect at Large DEG Architecture & Planning Digital Enterprise Group Intel Corporation October 2005 Copyright 2005
More informationGigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004
Gigascale Integration Design Challenges & Opportunities Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Outline CMOS technology challenges Technology, circuit and μarchitecture solutions Integration
More informationOn GPU Bus Power Reduction with 3D IC Technologies
On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The
More information3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER
3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER CODES+ISSS: Special session on memory controllers Taipei, October 10 th 2011 Denis Dutoit, Fabien Clermidy, Pascal Vivet {denis.dutoit@cea.fr}
More informationThe Memory Hierarchy 1
The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow
More informationPhysical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis
I NVENTIVE Physical Design Implementation for 3D IC Methodology and Tools Dave Noice Vassilios Gerousis Outline 3D IC Physical components Modeling 3D IC Stack Configuration Physical Design With TSV Summary
More information3D Integration & Packaging Challenges with through-silicon-vias (TSV)
NSF Workshop 2/02/2012 3D Integration & Packaging Challenges with through-silicon-vias (TSV) Dr John U. Knickerbocker IBM - T.J. Watson Research, New York, USA Substrate IBM Research Acknowledgements IBM
More informationPicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor
PicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor Taeho Kgil, Shaun D Souza, Ali Saidi, Nathan Binkert, Ronald Dreslinski, Steve Reinhardt, Krisztian Flautner,
More informationPhilippe Thierry Sr Staff Engineer Intel Corp.
HPC@Intel Philippe Thierry Sr Staff Engineer Intel Corp. IBM, April 8, 2009 1 Agenda CPU update: roadmap, micro-μ and performance Solid State Disk Impact What s next Q & A Tick Tock Model Perenity market
More informationHPC Technology Trends
HPC Technology Trends High Performance Embedded Computing Conference September 18, 2007 David S Scott, Ph.D. Petascale Product Line Architect Digital Enterprise Group Risk Factors Today s s presentations
More informationAdvancing high performance heterogeneous integration through die stacking
Advancing high performance heterogeneous integration through die stacking Suresh Ramalingam Senior Director, Advanced Packaging European 3D TSV Summit Jan 22 23, 2013 The First Wave of 3D ICs Perfecting
More informationMemory Demand Trends and what they Mean to Packaging Technology
Memory Demand Trends and what they Mean to Packaging Technology Ravi Mahajan May 31, 2016 Key Contributors: Suresh Chittor, Randy Osborne, Bob Sankman IEEE 66 th ECTC Las Vegas, NV, USA May 31 June 3,
More informationTSV Test. Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea
TSV Test Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea # Agenda TSV Test Issues Reliability and Burn-in High Frequency Test at Probe (HFTAP) TSV Probing Issues DFT Opportunities
More informationWorkloads, Scalability and QoS Considerations in CMP Platforms
Workloads, Scalability and QoS Considerations in CMP Platforms Presenter Don Newell Sr. Principal Engineer Intel Corporation 2007 Intel Corporation Agenda Trends and research context Evolving Workload
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 21: Memory Hierarchy Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Ideally, computer memory would be large and fast
More informationThermal Management Challenges in Mobile Integrated Systems
Thermal Management Challenges in Mobile Integrated Systems Ilyas Mohammed March 18, 2013 SEMI-THERM Executive Briefing Thermal Management Market Visions & Strategies, San Jose CA Contents Mobile computing
More informationXylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks
Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks Aditya Agrawal, Josep Torrellas and Sachin Idgunji University of Illinois at Urbana Champaign and Nvidia Corporation http://iacoma.cs.uiuc.edu
More informationMicroelettronica. J. M. Rabaey, "Digital integrated circuits: a design perspective" EE141 Microelettronica
Microelettronica J. M. Rabaey, "Digital integrated circuits: a design perspective" Introduction Why is designing digital ICs different today than it was before? Will it change in future? The First Computer
More informationMICROPROCESSOR TECHNOLOGY
MICROPROCESSOR TECHNOLOGY Assis. Prof. Hossam El-Din Moustafa Lecture 20 Ch.10 Intel Core Duo Processor Architecture 2-Jun-15 1 Chapter Objectives Understand the concept of dual core technology. Look inside
More informationMemory: Past, Present and Future Trends Paolo Faraboschi
Memory: Past, Present and Future Trends Paolo Faraboschi Fellow, Hewlett Packard Labs Systems Research Lab Quiz ( Excerpt from Intel Developer Forum Keynote 2000 ) ANDREW GROVE: is there a role for more
More informationThe Road from Peta to ExaFlop
The Road from Peta to ExaFlop Andreas Bechtolsheim June 23, 2009 HPC Driving the Computer Business Server Unit Mix (IDC 2008) Enterprise HPC Web 100 75 50 25 0 2003 2008 2013 HPC grew from 13% of units
More informationFive Emerging DRAM Interfaces You Should Know for Your Next Design
Five Emerging DRAM Interfaces You Should Know for Your Next Design By Gopal Raghavan, Cadence Design Systems Producing DRAM chips in commodity volumes and prices to meet the demands of the mobile market
More informationEECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration
1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 10: Three-Dimensional (3D) Integration Instructor: Ron Dreslinski Winter 2016 University of Michigan 1 1 1 Announcements
More informationCMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology. Moore s Law: 2X transistors / year
CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology Moore s Law: 2X transistors / year Cramming More Components onto Integrated Circuits Gordon Moore, Electronics, 1965 # on transistors
More information3D systems-on-chip. A clever partitioning of circuits to improve area, cost, power and performance. The 3D technology landscape
Edition April 2017 Semiconductor technology & processing 3D systems-on-chip A clever partitioning of circuits to improve area, cost, power and performance. In recent years, the technology of 3D integration
More informationMemory Systems IRAM. Principle of IRAM
Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several
More informationAdvanced Computer Architecture (CS620)
Advanced Computer Architecture (CS620) Background: Good understanding of computer organization (eg.cs220), basic computer architecture (eg.cs221) and knowledge of probability, statistics and modeling (eg.cs433).
More informationAddressing the Memory Wall
Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the
More informationEnabling Technology for the Cloud and AI One Size Fits All?
Enabling Technology for the Cloud and AI One Size Fits All? Tim Horel Collaborate. Differentiate. Win. DIRECTOR, FIELD APPLICATIONS The Growing Cloud Global IP Traffic Growth 40B+ devices with intelligence
More informationDDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture. Paul Washkewicz Vice President Marketing, Inphi
DDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture Paul Washkewicz Vice President Marketing, Inphi Theme Challenges with Memory Bandwidth Scaling How LRDIMM Addresses this Challenge Under
More information3D & Advanced Packaging
Tuesday, October 03, 2017 Company Overview March 12, 2015 3D & ADVANCED PACKAGING IS NOW WITHIN REACH WHAT IS NEXT LEVEL INTEGRATION? Next Level Integration blends high density packaging with advanced
More information3D SYSTEM INTEGRATION TECHNOLOGY CHOICES AND CHALLENGE ERIC BEYNE, ANTONIO LA MANNA
3D SYSTEM INTEGRATION TECHNOLOGY CHOICES AND CHALLENGE ERIC BEYNE, ANTONIO LA MANNA OUTLINE 3D Application Drivers and Roadmap 3D Stacked-IC Technology 3D System-on-Chip: Fine grain partitioning Conclusion
More information2GB DDR3 SDRAM SODIMM with SPD
2GB DDR3 SDRAM SODIMM with SPD Ordering Information Part Number Bandwidth Speed Grade Max Frequency CAS Latency Density Organization Component Composition Number of Rank 78.A2GC6.AF1 10.6GB/sec 1333Mbps
More informationOvercoming the Memory System Challenge in Dataflow Processing. Darren Jones, Wave Computing Drew Wingard, Sonics
Overcoming the Memory System Challenge in Dataflow Processing Darren Jones, Wave Computing Drew Wingard, Sonics Current Technology Limits Deep Learning Performance Deep Learning Dataflow Graph Existing
More informationPlatforms Design Challenges with many cores
latforms Design hallenges with many cores Raj Yavatkar, Intel Fellow Director, Systems Technology Lab orporate Technology Group 1 Environmental Trends: ell 2 *Other names and brands may be claimed as the
More informationOVERCOMING THE MEMORY WALL FINAL REPORT. By Jennifer Inouye Paul Molloy Matt Wisler
OVERCOMING THE MEMORY WALL FINAL REPORT By Jennifer Inouye Paul Molloy Matt Wisler ECE/CS 570 OREGON STATE UNIVERSITY Winter 2012 Contents 1. Introduction... 3 2. Background... 5 3. 3D Stacked Memory...
More informationEE5780 Advanced VLSI CAD
EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html
More informationFuture Memories. Jim Handy OBJECTIVE ANALYSIS
Future Memories Jim Handy OBJECTIVE ANALYSIS Hitting a Brick Wall OBJECTIVE ANALYSIS www.objective-analysis.com Panelists Michael Miller VP Technology, Innovation & Systems Applications MoSys Christophe
More informationBringing 3D Integration to Packaging Mainstream
Bringing 3D Integration to Packaging Mainstream Enabling a Microelectronic World MEPTEC Nov 2012 Choon Lee Technology HQ, Amkor Highlighted TSV in Packaging TSMC reveals plan for 3DIC design based on silicon
More informationDon t Forget the Memory. Dean Klein, VP Memory System Development Micron Technology, Inc.
Don t Forget the Memory Dean Klein, VP Memory System Development Micron Technology, Inc. Memory is Everywhere 2 One size DOES NOT fit all 3 Question: How many different memories does your computer use?
More informationPower dissipation! The VLSI Interconnect Challenge. Interconnect is the crux of the problem. Interconnect is the crux of the problem.
The VLSI Interconnect Challenge Avinoam Kolodny Electrical Engineering Department Technion Israel Institute of Technology VLSI Challenges System complexity Performance Tolerance to digital noise and faults
More informationFuture of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1
Future of Interconnect Fabric A ontrarian View Shekhar Borkar June 13, 2010 Intel orp. 1 Outline Evolution of interconnect fabric On die network challenges Some simple contrarian proposals Evaluation and
More informationLecture 1: CS/ECE 3810 Introduction
Lecture 1: CS/ECE 3810 Introduction Today s topics: Why computer organization is important Logistics Modern trends 1 Why Computer Organization 2 Image credits: uber, extremetech, anandtech Why Computer
More information3D-IC is Now Real: Wide-IO is Driving 3D-IC TSV. Samta Bansal and Marc Greenberg, Cadence EDPS Monterey, CA April 5-6, 2012