EMERGING TECHNOLOGIES AND ARCHITECTURES FOR LOW-POWER AND HETEROGENEOUS COMPUTE NODE TARGETING EXASCALE LEVEL COMPUTING

1 EMERGING TECHNOLOGIES AND ARCHITECTURES FOR LOW-POWER AND HETEROGENEOUS COMPUTE NODE TARGETING EXASCALE LEVEL COMPUTING
Denis Dutoit, E. Guthmuller, JP Noël, Y. Thonnart, P. Vivet
CEA-Leti
Strategic Marketing Manager, Advanced Computing

2 TECHNOLOGICAL BACKGROUND
Supercomputer (TERA, CEA), Cluster, Blade, Compute Node
Exascale = 10^18 Flop/s (Nov. 2018: 0.2 x 10^18 Flop/s)
Technologies? Architectures? European effort?
Scope of the talk: the compute node

3 OUTLINE
Part 1: Compute node architecture evolution
Part 2: Architectures and associated technologies
- Heterogeneous Architectures & Integration
- Photonic interposer
- In-Memory-Computing
Part 3: European HPC Exascale effort
- EuroHPC
- European Processor Initiative (EPI)
Conclusion

4 PART 1: COMPUTE NODE ARCHITECTURE EVOLUTION

5 HIGH PERFORMANCE COMPUTING EVOLUTION
Starting from high performance compute only, HPC evolves towards new workloads and massive volumes of data.
[Diagram: Data in, Compute / Analyze, Data out. Illustration: TERA, CEA]
New drivers:
- New workloads (deep learning)
- Massive volume of data (big data)
Requirements:
- More computing performance (Ops per second) for simple operations (FP16, FP8, INT).
- Energy efficiency (Ops per Watt).
- Increased Bytes per Flop.
- High-bandwidth, low-latency access to all data.
Solutions:
- Heterogeneity
- In-Memory-Computing
- Optical Network-on-Chip

6 CHALLENGES FOR ADVANCED COMPUTING
10x energy efficiency improvement every 4 years
[Figure: performance roadmap from 1 TFLOPS to 100 EFLOPS (x10 every 4 years) plotted against energy per operation, from 2 nJ/flop down to 0.2 pJ/flop (/10 every 4 years). Energy per operation assumes a 20 MW supercomputer.]
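The two axes of this figure are tied together by the 20 MW assumption: the energy available per operation is the power budget divided by the operation rate. A minimal sketch of that arithmetic follows; the function name and the table of performance levels are illustrative choices made here, not part of the talk.

```python
# Energy available per floating-point operation under a fixed power budget.
POWER_BUDGET_W = 20e6  # 20 MW, the assumption stated on the slide


def energy_per_flop_pj(flops_per_second, power_w=POWER_BUDGET_W):
    """Return the energy per operation in picojoules (1 J = 1e12 pJ)."""
    joules_per_flop = power_w / flops_per_second
    return joules_per_flop * 1e12


# Illustrative table: performance levels taken from the figure's axis.
LEVELS = {
    "10 PFLOPS": 1e16,
    "100 PFLOPS": 1e17,
    "1 EFLOPS": 1e18,
    "10 EFLOPS": 1e19,
    "100 EFLOPS": 1e20,
}

if __name__ == "__main__":
    for name, rate in LEVELS.items():
        print(f"{name:>10}: {energy_per_flop_pj(rate):8.1f} pJ/flop")
```

Under this budget an exascale machine (10^18 Flop/s) has about 20 pJ per operation, and each further x10 in performance requires /10 in energy per operation.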

7 TECHNOLOGY & ARCHITECTURE SOLUTIONS FOR ADVANCED COMPUTING
[Figure: performance (1 TFLOPS to 100 EFLOPS) against energy per operation (2 nJ/flop down to 0.2 pJ/flop, assuming a 20 MW supercomputer), divided into four successive eras:
- Happy scaling
- Many-core
- Heterogeneous architectures
- Disruptive architectures
The eras are separated by the end of Dennard's scaling, the end of Amdahl's law and the end of Moore's law.]

8 COMPUTE NODE ARCHITECTURE EVOLUTION
[Figure: compute node block diagrams for the four eras (Happy scaling, Many-core, Heterogeneous architectures, Disruptive architectures), separated by the end of Dennard's scaling, the end of Amdahl's law and the end of Moore's law. Recoverable block labels: CPU, cache, bus, memory, NIC (Network InterConnect); caches with NoC + LLC; generic processing and HW accelerator with close memories and a coherent link; data-centric interconnect with far memory, NIC and In-Memory Computing.]

9 ARCHITECTURES AND ASSOCIATED TECHNOLOGIES
Part 2: Architectures and associated technologies
[Figure: compute node diagram (generic processing, HW accelerator, close memories, coherent link, data-centric interconnect, far memory, NIC, In-Memory Computing) annotated with the three topics below]
1 - Heterogeneous Integration
2 - Photonics
3 - In-Memory-Computing

10 52ND EDITION OF THE TOP500 LIST (NOVEMBER 11TH, 2018)
Top #1 today (peak Flop/s): 1/5 of exascale-level performance
Users:
- #1-#2: US (Department of Energy)
- #3-#4: China (National Supercomputing Center)
- #5: Switzerland (National Supercomputing Center)
Processor design & technology (columns: Chip, Design, Manuf.): IBM POWER9, NVIDIA Volta GV100, Sunway SW26010, Intel Xeon E5

11 EUROPEAN HPC EXASCALE EFFORT
Part 3: European HPC Exascale effort
How to bring Europe back into the race?
- Supercomputing infrastructure
- Processor design

12 PART 2: ARCHITECTURES AND ASSOCIATED TECHNOLOGIES

13 1 - HETEROGENEOUS ARCHITECTURE & INTEGRATION: OUTLINE

14 SCALABLE VECTOR EXTENSION: FUJITSU & ARM
Generic processing is moving towards ultra-high memory bandwidth.
Source: Fujitsu, Hot Chips 2018

15 GPU: NVIDIA TESLA V100 + IBM POWER9
- Computing performance from the GPU
- High-speed link between components
Integration technologies for heterogeneous compute.
Sources: IBM, NVIDIA

16 ADVANCED PACKAGING TECHNOLOGIES
Advanced packaging integration: System-in-Package (SiP), Multi-Chip-Module, 2.5D, 3D die stacking / 3D Integrated-Circuit (3D IC)
Interconnect densities shown: 100 µm x 100 µm, 10 µm x 10 µm, 10 µm x 10 µm
Examples: AMD EPYC 7260, 4-chiplet chip (source: AMD); High-Bandwidth-Memory (source: Micron)

17 MULTI-CHIP-MODULE: INTEGRATION WITH CHIPLETS
- AMD Zen architecture integrates up to 4 chiplets on a substrate, for a scalable solution and more silicon area in one chip than the reticle size allows. Example: EPYC 7260, 4-chiplet chip.
- AMD Zen 2 architecture integrates up to 9 chiplets on a substrate. Example: EPYC Rome, 9-chiplet chip.

18 3D DIE STACKING: HBM MEMORY
- High memory density and bandwidth in a small footprint
- First used on GPUs, now also used on high performance chips
- (2015) Hynix HBM1 on AMD Fiji

19 2.5D INTERPOSER: HBM INTEGRATION FOR MEMORY BANDWIDTH
NEC Aurora SX-10+, first product with 6x HBM2:
- 1.2 TB/s total memory bandwidth
- 2.45 TFLOPS
- ~0.5 Byte/Flop
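The Byte/Flop figure is simply memory bandwidth divided by compute rate. A short sketch of that ratio, using the two numbers from this slide (the function name is an illustrative choice):

```python
def bytes_per_flop(bandwidth_bytes_per_s, flops_per_s):
    """Memory bandwidth available per floating-point operation."""
    return bandwidth_bytes_per_s / flops_per_s


# Figures quoted on the slide: 1.2 TB/s and 2.45 TFLOPS.
ratio = bytes_per_flop(1.2e12, 2.45e12)

if __name__ == "__main__":
    print(f"{ratio:.2f} Byte/Flop")
```

The result rounds to the ~0.5 Byte/Flop quoted above.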

20 LETI'S DEMONSTRATOR ROADMAP FOR HETEROGENEOUS COMPUTING
[Figure: roadmap with three tracks (IP design, Heterogeneous integration, Processor design) and the following milestones:
- LOGIC-ON-LOGIC: 3D interconnect
- MICRO-SERVER: multicore architecture
- SERVER: chiplet on active interposer
- HIGH PERFORMANCE COMPUTING: optical link on photonic interposer
- EUROPEAN PROCESSOR, for HPC, servers & eHPC: 23 partners officially in the EPI consortium, 120M funding
- EUROPEAN EXASCALE & POST-EXASCALE HPC: compute node on photonic interposer, neuromorphic, quantum computing]

21 3D NETWORK ON CHIP
LOGIC-ON-LOGIC: 3D interconnect

22 3D NOC DEMONSTRATOR FROM LETI (ISSCC 2016)
Power-efficient 3D interconnect, a first step towards 3D-based computing architectures.
- Architecture: 3D Network-on-Chip for heterogeneous and homogeneous multi-cores
- Design: 3D Plug with high throughput, low latency and DFT
- Technology: TSV middle (AR 1:8), µ-bumps, 50 µm x 40 µm pitch; Face2Back stacking, Die2Die assembly
- Demonstrator: 3D NoC at 0.32 pJ/b
[Figure: 3D cross-section showing top die, bottom die, molding and package substrate]

23 CHIPLET ON ACTIVE INTERPOSER
SERVER: chiplet on active interposer

24 ACTIVE INTERPOSER PARTITIONING FOR MANY-CORE
"Active" interposer: what is the added value?
- Heterogeneous 3D: advanced tech node for computation within chiplets; mature tech node for communication, power, DFT, etc.
- Chip-to-chip interconnect: hierarchical NoC, for energy-efficient communications
- System IOs: on the interposer, for off-chip memory accesses
- Power management: chiplet power supply, without any external passives
And most of all, preserve the (active) interposer cost: target a low logic density (e.g. < 10%) to preserve interposer yield and cost.
Source: Vivet, ISVLSI'15

25 ACTIVE INTERPOSER FROM LETI (INTACT)
Symposium DIC 2015, ISVLSI 2015
96-core compute fabric with 6 chiplets stacked on an active interposer.
System architecture, a cache-coherent compute fabric with:
- 96 cores (MIPS32), 3 levels of caches, integrated power management
- Performance targets: 100 GOPS, 10 GOPS/Watt, 25 Watts total
Design, a heterogeneous 3D partitioning with:
- 28nm FDSOI chiplets (x6): low-power compute fabric, wide voltage range (0.6V to 1.2V), body biasing for logic boost & leakage control
- 65nm active interposer: power unit (switched-capacitor DC-DC converter), interconnect (Network-on-Chip), test, clocking, thermal sensors, etc.
Technology:
- TSV: Ø 10 µm, height 100 µm
- µ-bumps: Ø 10 µm, pitch 20 µm

26 CHIPLET ARCHITECTURE & MAIN FEATURES
Top dies: TSARLET chiplets, 4 clusters of 4 cores
- 4 CPU clusters; each CPU (CPU 0 to CPU 3) is a 32-bit scalar core with MMU and 16 KB I & D L1 caches
- 256 KB shared distributed L2 cache
- 4 L3 cache tiles: 1 MB L3 cache, low frequency / high bandwidth, L3 controller, 64<->512 bits SER/DES
- Peripherals (ICU, UART, SPI), local clock generators, network interfaces
- Circuit IOs (250 signals)
Interconnects:
- (N0) 5 crossbars
- (N1) Coherent L1<->L2 2D-mesh interconnect
- (N2) Adaptive L2->L3 2D-mesh interconnect
- (N3) L3->DRAM 2D-mesh interconnect
- Future 3D interconnects
Bottom die: active silicon interposer
- Die-2-die interconnects: (N1) passive links within the interposer; hierarchical 3D NoCs (N2-N3) combining a 2D NoC within chiplets, a 2D NoC within the interposer and a 3D NoC vertical link (3D Plug)
- Clock generators, PVT + timing fault sensors, DFT & memory BISTs, configuration registers
- South bridge: UART, SPI, circuit configuration
- North bridge: system + memory IOs
[Figure: chiplet floorplan with four clusters (PE0 to PE3 with L1$, shared L2$), four L3$ tiles, 3D Plugs, circuit IO interface, FUSE memory and clock/reset/test blocks]
Technology and main features:
- Technology: FDSOI 28nm (STMicroelectronics), LVT option, 10 metal layers
- Area: 4.0 mm x 5.6 mm = 22 mm2, 395 million transistors
- Primary (2D) circuit IOs: 249 signals, 237 powers, 200 µm pitch
- 3D circuit IOs: 2618 signals, implemented up to Metal 10, 20 µm pitch
- Package: flip-chip, 4-layer substrate, 19x19
- Power supplies: [0.5 V - 1.3 V] for core logic, [-2 V - +2 V] for body biasing, 1.8 V for circuit IOs
- Clocking: FLLs for local clock generation, timing fault sensors for worst path
- PVT sensors: 8 PVT sensors based on 7 ring oscillators
- Thermal sensors: 1 absolute for thermal reference, 4 PVT sensors for gradient
Chiplet main innovations:
- Cache-coherent shared memory
- Fully scalable architecture for future integration in 3D using an interposer
- Fault-tolerant and adaptive L3 caches
- FDSOI 28 nm technology, ultra-large voltage range, energy efficiency
[Guthmuller, ESSCIRC 2018]

27 ACTIVE INTERPOSER ARCHITECTURE
Active interposer, integrating 6 chiplets, offering 96 cores.
CMOS 65nm, 15 x 13 mm (about 200 mm2), 15 million transistors, 0.6 % of the total complexity.
DC-DC converters [G. Pillonnet, 3DIC 2015]:
- Switched capacitors
- Poly + MIM + MOM
- Power density 0.31 W/mm2
- Vin 1.8V, Vout [0.6V to 1.2V]
- Power efficiency up to 80%, no external passives
Off-chip links:
- LVDS IOs for L3$ access
- Delay calibration per lane
- 4 x Tx/Rx, 32 bits @ 300 MHz DDR
- Total bandwidth 19 GByte/s
3D cross-section:
- TSVs for IO signals and DC-DC converter power supplies
- µ-bumps on the interposer (signals, power)
- C4 bumps for package connection
3D communication links:
- L1/L2/L3 NoC interconnects
- Passive links (L1 refill)
- Active links (L2 & L3 refill)
Interposer infrastructure [J. Durupt, ETS 2016]:
- Configuration interface
- Thermal sensors
- Stress sensors
- IEEE 1687 DFT (IJTAG)
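Several headline figures of the INTACT demonstrator can be cross-checked from the numbers quoted on the chiplet and interposer slides. The sketch below does so under two assumptions that the slides do not spell out: that "4 x Tx/Rx" means four transmit plus four receive links, and that "complexity" is measured in transistor count. All names are illustrative.

```python
def link_bandwidth_gbyte_s(width_bits, clock_mhz, ddr=True):
    """Raw bandwidth of one parallel link, in GByte/s."""
    transfers_per_s = clock_mhz * 1e6 * (2 if ddr else 1)  # DDR: two transfers per cycle
    return width_bits * transfers_per_s / 8 / 1e9


# Off-chip links: 32 bits @ 300 MHz DDR per link.
per_link = link_bandwidth_gbyte_s(32, 300)
# Assumption: 4 Tx links plus 4 Rx links are counted in the total.
total_off_chip = per_link * 4 * 2

# Core count: 6 chiplets, each with 4 clusters of 4 cores.
cores = 6 * 4 * 4

# Assumption: "complexity" means transistor count.
chiplet_transistors = 395e6
interposer_transistors = 15e6
interposer_share = interposer_transistors / (
    6 * chiplet_transistors + interposer_transistors
)

if __name__ == "__main__":
    print(f"per link: {per_link:.1f} GByte/s, total: {total_off_chip:.1f} GByte/s")
    print(f"cores: {cores}")
    print(f"interposer share of transistors: {interposer_share:.1%}")
```

Under these assumptions the sketch gives 19.2 GByte/s, 96 cores and about 0.6 %, in line with the slide.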

28 MULTI-CHIP-MODULE AND ACTIVE INTERPOSER
HIGH PERFORMANCE COMPUTING: heterogeneous integration

29 ExaNoDe
H2020-FETHPC-2014. Starting date: 01/10/2015. Duration: 42 months.
ExaNoDe designs core technologies for an integrated heterogeneous compute node. ExaNoDe will deliver an integrated HW/SW prototype comprising:
- Technology and design solutions for an interposer-based computing device targeting HPC applications,
- Integration of devices in a Multi-Chip-Module (MCM),
- System and middleware SW stack.
COORDINATING ORGANISATION
- CEA - Commissariat à l'Energie Atomique et aux énergies alternatives, France
OTHER PARTNERS
- Arm Limited, UK
- ETH Zürich, Switzerland
- FORTH, Greece
- Fraunhofer ITWM, Germany
- Scapos AG, Germany
- University of Manchester, UK
- Bull SAS (Atos Group), France
- Virtual Open Systems, France
- Barcelona Supercomputing Centre, Spain
- Forschungszentrum Jülich, Germany
- Kalray SA, France
- CNRS - Centre National de la Recherche Scientifique, France

30 ExaNoDe HW Prototype: Integration Hierarchy
Node: Xilinx MPSoC FPGA for ARM cores and reconfigurable HW; chiplet for HW accelerator (CNN)
[Figure: integration hierarchy. Recoverable labels: chiplet SoCs on a silicon interposer; MPSoC FPGAs; accelerator, DMA, coherence island; organic substrate; Multi-Chip-Module; daughter board with DDR memories; mezzanine board (ExaNeSt project).]

31 Multi-Chip-Module
- Objective & challenge: coarse-grain heterogeneous integration; warpage
- Architecture: laminate substrate; two FPGA bare dies and one silicon interposer; Cu/Ni lid
- Design: interposer routing to FPGA and decoupling capacitors
[Figure labels: chiplet, interposer]

32 2 - PHOTONICS

33 KEY TECHNOLOGIES FOR CHIP-TO-CHIP PHOTONIC COMMUNICATION IN POST-EXASCALE COMPUTING
Three areas: Si photonic subsystem; large-scale circuit integration; architecture & circuit design.
Heterogeneous computing system on large-scale silicon interposer:
- Full WDM link integration, die assembly on interposer, optical NoC topology
- Integrated lasers, fine-grain thermal control, circuit-switched routing
- Optical IO fiber coupling, TSV & microbump IOs, thermal dissipation, mechanical stress
Generic E/O chiplet for communication:
- Routing, flow-control & arbitration
- Tx/Rx electro-optical drivers
- Dense integration
- Autonomous thermal control
- Integration in computing fabric

34 OPTICAL LINK ON PHOTONIC INTERPOSER
HIGH PERFORMANCE COMPUTING: heterogeneous integration

35 LETI'S OPTICAL NETWORK-IN-PACKAGE (NOCS 2015, ISSCC 2018)
Optical network-in-package to interconnect microprocessors and memories.
- System architecture: cache-coherent compute fabric with 96 cores (MIPS32) and an optical NoC
- Design: challenges demonstrated on silicon, namely thermal control of WDM devices, E/O co-design of drivers, and an optimized short-range optical WDM system
- Technology: reference design with 6 chiplets, 96 cores, 8 transceivers available; preliminary integration of E/O transceiver and photo-diode; photonic interposer

36 DEMONSTRATION OF A THERMALLY TUNED WDM ELECTRO-OPTICAL LINK
- CMOS + Si-Photonics 3D stack, optical fiber array, chip-on-board integration
- 1 Tbps/mm² bandwidth density
- Tight technology integration of E/O ring modulators within a 3D stack
- Integrated thermal tuning, robust to compute fabric heating
[Y. Thonnart et al., ISSCC 2018]

37 LETI'S SI-PHOTONICS ROADMAP FOR POST-EXASCALE COMPUTING
[Figure: roadmap with three tracks (Si-Photonics, Architecture, Packaging) and milestones: TSV for CMOS, TSV for Si-Pho, WDM link, E/O micro rings, ONoC, thermal tuning, generic E/O chiplets]
Target demonstrator:
- Cache-coherent multi-core processor
- Generic E/O chiplets
- 8-node optical NoC
- 576 Gbit/s aggregated bandwidth
- 384 microring resonators
- ~10 ns electro-optical latency

38 3 - IN-MEMORY COMPUTING

39 BREAK THE MEMORY WALL! BUT HOW?
The "memory wall" is nowadays the main limitation for high performance computing. It is time to consider data-centric architectures, moving toward in-memory computing.
[Figure: bandwidth and Byte/Flop (BF) ratio at three levels of a processor's memory hierarchy; only "11 TB/s" and "BF ratio = 0.37" are recoverable. Source: Fujitsu, Hot Chips 2018]

40 HOW TO DEFINE AN IN-MEMORY COMPUTING UNIT?
In-memory computing = memory with computing.
- Computing means that the data selected during a computing operation (datax, datay, dataz, read from their rows) is processed in situ, without external processing.
- Computing means that the in-situ processed data, data_op(x,y,z), can be directly written back to a row.
Source: Leti

41 HOW TO INTERACT WITH AN IN-MEMORY COMPUTING UNIT?
In-memory computing = memory able to execute microinstructions in situ.
- Memory unit: row decoder, bitcell array, FSM, IO (read/write); R/W, DATA_IN and DATA_OUT on the system bus
- IMC unit: row selector with multi-row selection, bitcell array, instruction decoding, FSM, IO (read/write), processing unit, FSM + IO (compute); R/W, DATA_IN and DATA_OUT on the system bus
Source: Leti
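The behaviour described on these two slides (multi-row selection, in-situ processing, direct write-back) can be illustrated with a toy software model. This is only a sketch: the class, its method names and the choice of bitwise AND/OR/XOR as the in-situ operations are assumptions made here for illustration, not Leti's design.

```python
from functools import reduce
from operator import and_, or_, xor


class IMCUnit:
    """Toy in-memory computing unit; each row is an int used as a bit vector."""

    # Hypothetical microinstruction set; the slides do not list the operations.
    OPS = {"AND": and_, "OR": or_, "XOR": xor}

    def __init__(self, rows, width):
        self.mask = (1 << width) - 1
        self.cells = [0] * rows

    def write(self, row, data):
        """Plain memory write (R/W path)."""
        self.cells[row] = data & self.mask

    def read(self, row):
        """Plain memory read (R/W path)."""
        return self.cells[row]

    def compute(self, op, src_rows, dst_row=None):
        """Select several rows, combine them in place, optionally write back."""
        operands = [self.cells[r] for r in src_rows]   # multi-row selection
        result = reduce(self.OPS[op], operands) & self.mask
        if dst_row is not None:                        # direct write-back
            self.cells[dst_row] = result
        return result


imc = IMCUnit(rows=8, width=8)
imc.write(0, 0b11001100)
imc.write(1, 0b10101010)
imc.write(2, 0b11110000)
imc.compute("AND", [0, 1, 2], dst_row=3)  # result stays inside the memory

if __name__ == "__main__":
    print(f"row 3 = {imc.read(3):08b}")
```

The point of the model is that the operands never leave the array between the read and the write-back; in a conventional memory unit each operand would cross the system bus to an external processor first.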

42 EXPECTED GAIN: PRELIMINARY RESULTS FROM LETI
Exploration and simulation platform:
- Stages: compilation, interpretation, analysis
- Artifacts: algorithm, C code + intrinsics, LLVM IR, execution trace, performance
Preliminary evaluation results:
- Execution time speed-up factor from 10x to 10000x
- Energy reduction factor from 3x to 29x

43 NON VOLATILE MEMORY (NVM) LANDSCAPE
Technologies: PCM, MRAM, OXRAM/CBRAM
Examples shown: Everspin 64Mb DDR3 STT-RAM; embedded RRAM 2Mb; Avalanche Technologies STT-MRAM 32Mb; 128kbit CBRAM; 8-bit controller with embedded RRAM

44 NEW ARCHITECTURE PARADIGMS WITH NVM
Non-volatile memories invade logic. Data movement energy is reduced.
Steps from the current system:
1 - Smart SSDs
2 - Main memory replacement
3 - Cache replacement
4 - Post-exascale architectures
[Figure: for each step, the memory hierarchy (SoC with CPUs, L1 $ and L2/L3 $; external memory; I/O; other nodes) with NVM progressively taking the place of disk/SSD, DRAM and caches, and a bar splitting execution time between compute and memory stalls. Recoverable labels: 25 %, 75 %, 100 %; memory-bound architecture, balanced architectures, disruptive architecture (In-Memory Computing).]

45 IT'S NOT OVER: EXASCALE LEVEL COMPUTING IS ALSO
- Low power design: it is not possible to run all cores at (Vmax, Fmax) because of heat dissipation. Low voltage operation, power domains, probes, actuators, control.
- Security: HPC compute nodes are open to the external world (access to data). Root of trust, secure boot, isolation for secure services, chip monitoring.
- Variable precision: fixed-precision floating point arithmetic reaches its limits (undesirable calculation effects, too heavy for new workloads). New arithmetic formats: variable precision, UNUM, with fields s (sign), e (exponent) and f (fraction).
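The sign, exponent and fraction fields named on this slide also exist in today's fixed formats, where their widths are frozen. As a point of comparison, the sketch below splits an IEEE 754 half-precision (FP16) value into its 1-bit sign, 5-bit exponent and 10-bit fraction; it does not implement UNUM, where the exponent and fraction sizes can vary. The function name is an illustrative choice.

```python
import struct


def fp16_fields(value):
    """Return (sign, biased exponent, fraction) of the FP16 encoding of value."""
    # Pack as half precision ('e'), then reinterpret the 2 bytes as an unsigned int.
    (bits,) = struct.unpack(">H", struct.pack(">e", value))
    sign = bits >> 15               # 1 bit
    exponent = (bits >> 10) & 0x1F  # 5 bits, bias 15
    fraction = bits & 0x3FF         # 10 bits
    return sign, exponent, fraction


if __name__ == "__main__":
    for x in (1.0, -2.5, 0.1):
        s, e, f = fp16_fields(x)
        print(f"{x:>5}: sign={s} exponent={e:05b} fraction={f:010b}")
```

With only 10 fraction bits, a value such as 0.1 is stored approximately, which is one of the calculation effects that motivates formats whose precision can be adapted to the workload.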

46 EUROPEAN HPC EXASCALE EFFORT

47 EUROHPC: A NEW LEGAL AND FUNDING ENTITY FOR DEPLOYING IN EUROPE A WORLD-CLASS SUPERCOMPUTING INFRASTRUCTURE
The EuroHPC Joint Undertaking represents a public investment of EUR 1 billion shared between the European Union and 25 participating European countries.

48 NEXT STEPS
EuroHPC will initially operate from 2019 to 2026.
EuroHPC will support activities through procurement and open calls from 2019.
EuroHPC foresees an initial co-investment with Member States of about EUR 1 billion:
- to acquire two pre-exascale machines and several petascale systems by 2020,
- for R&I actions covering the full HPC ecosystem.
Further funds would allow full coverage of the HPC strategy:
- acquisition of two exascale systems, at least one of them with European technology,
- one post-exascale system.

49 European Processor Initiative
- European independence in High Performance Computing technologies
- EU exascale machine based on an EU processor by 2023
- Based on a solid, long-term economic model
16/11/2018 PN European Processor Initiative. Subject to change.

50 European Commission (EC) expectations & EPI value proposal
EPI expected impacts (as per EC request):
- Get a world-class processor for the exascale machines supplied by EuroHPC in 2023
- Develop a sustainable economic model
From technology drive to business drive:
- High Performance Computing needs for exascale machines and beyond
- Connected mobility & Advanced Driver Assistance Systems (ADAS) computing needs beyond 2023
- Servers, cloud, edge low-power CPU needs

51 Multiple areas of expertise and an excellent combination
AUTOMOTIVE EXPERTS, INDUSTRY EXPERTS, HPC & RESEARCH

52 CONCLUSION
- Era of heterogeneous integration: Heterogeneous Architecture & Integration
- Era of disruptive architectures: Optical Network-on-Chip, In-Memory-Computing
- European HPC strategy

53 THANK YOU. QUESTIONS?
D. Dutoit, E. Guthmuller, JP Noël, Y. Thonnart, P. Vivet
Leti, technology research institute
Commissariat à l'énergie atomique et aux énergies alternatives
Minatec Campus, 17 rue des Martyrs, Grenoble Cedex, France


More information

EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration

EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration 1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 10: Three-Dimensional (3D) Integration Instructor: Ron Dreslinski Winter 2016 University of Michigan 1 1 1 Announcements

More information

Chapter 0 Introduction

Chapter 0 Introduction Chapter 0 Introduction Jin-Fu Li Laboratory Department of Electrical Engineering National Central University Jhongli, Taiwan Applications of ICs Consumer Electronics Automotive Electronics Green Power

More information

OpenCAPI and its Roadmap

OpenCAPI and its Roadmap OpenCAPI and its Roadmap Myron Slota, President OpenCAPI Speaker name, Consortium Title Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration and Innovation OpenCAPI and

More information

Bringing 3D Integration to Packaging Mainstream

Bringing 3D Integration to Packaging Mainstream Bringing 3D Integration to Packaging Mainstream Enabling a Microelectronic World MEPTEC Nov 2012 Choon Lee Technology HQ, Amkor Highlighted TSV in Packaging TSMC reveals plan for 3DIC design based on silicon

More information

Vector Engine Processor of SX-Aurora TSUBASA

Vector Engine Processor of SX-Aurora TSUBASA Vector Engine Processor of SX-Aurora TSUBASA Shintaro Momose, Ph.D., NEC Deutschland GmbH 9 th October, 2018 WSSP 1 NEC Corporation 2018 Contents 1) Introduction 2) VE Processor Architecture 3) Performance

More information

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance

More information

Non-contact Test at Advanced Process Nodes

Non-contact Test at Advanced Process Nodes Chris Sellathamby, J. Hintzke, B. Moore, S. Slupsky Scanimetrics Inc. Non-contact Test at Advanced Process Nodes June 8-11, 8 2008 San Diego, CA USA Overview Advanced CMOS nodes are a challenge for wafer

More information

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013 A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company

More information

Adaptable Intelligence The Next Computing Era

Adaptable Intelligence The Next Computing Era Adaptable Intelligence The Next Computing Era Hot Chips, August 21, 2018 Victor Peng, CEO, Xilinx Pervasive Intelligence from Cloud to Edge to Endpoints >> 1 Exponential Growth and Opportunities Data Explosion

More information
