HPC Technology Trends



HPC Technology Trends
High Performance Embedded Computing Conference, September 18, 2007
David S. Scott, Ph.D., Petascale Product Line Architect, Digital Enterprise Group

Risk Factors: Today's presentations contain forward-looking statements. All statements made that are not historical facts are subject to a number of risks and uncertainties, and actual results may differ materially. Please refer to our most recent Earnings Release and our most recent Form 10-Q or 10-K filing available on our website for more information on the risk factors that could cause actual results to differ. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations (http://www.intel.com/performance/resources/limits.htm). Rev. 7/19/06

Real World Problems Driving Petascale & Beyond

[Chart: #1 and SUM of the Top500 list, projected from 1993 to 2029 on a log scale from 100 MFlops to 1 ZFlops; what we can model today fits under roughly 100 TFlops. Source: Dr. Steve Chen, "The Growing HPC Momentum in China", June 30th, 2006, Dresden, Germany]

Real-world exascale problems and the performance they demand:
- Aerodynamic Analysis: 1 Petaflops
- Laser Optics: 10 Petaflops
- Molecular Dynamics in Biology: 20 Petaflops
- Aerodynamic Design: 1 Exaflops
- Computational Cosmology: 10 Exaflops
- Turbulence in Physics: 100 Exaflops
- Computational Chemistry: 1 Zettaflops

Example real-world challenges: full modeling of an aircraft in all conditions, green airplanes, genetically tailored medicine, understanding the origin of the universe, synthetic fuels everywhere, accurate extreme weather prediction.

Silicon Future

Roadmap (a new Intel technology generation every 2 years): 90nm (2003), 65nm (2005), 45nm (2007), 32nm (2009), 22nm (2011)
Research: 16nm (2013), 11nm (2015), 8nm (2017)

Intel R&D technologies drive this pace well into the next decade.
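That cadence follows from the usual ~0.7x linear shrink per process generation; a back-of-the-envelope check (my arithmetic, not a figure from the slide):

\[
\frac{65}{90} \approx \frac{45}{65} \approx \frac{32}{45} \approx 0.7, \qquad 0.7^2 \approx 0.5
\]

so the area per transistor halves, and transistor density doubles, every 2 years.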

Yesterday, Today and Tomorrow in HPC

1946: ENIAC, 20 numbers in main memory.
1965-1977: CDC 6600, the first successful supercomputer, 9 MFlops.
1997-2006: ASCI Red, the first Teraflop computer, 9,298 Intel Pentium II Xeon processors (world's fastest on the Top500 until 2000).
2006: Intel ENDEAVOR, 464 Intel Xeon 5100 series processors, 6.85 Teraflops MP Linpack, #68 on the Top500.
~2008 and beyond: PetaScale platforms for climate, astrophysics, and cell-based community simulation.

Yesterday's supercomputing is today's personal computing.

Intel Design & Process Cadence

65nm: shrink/derivative Xeon 5000, then the new Intel Core Microarchitecture (Xeon 5100 & 5300), five microprocessors in one platform
2 years later, 45nm: shrink/derivative PENRYN, then the new microarchitecture NEHALEM
2 years later, 32nm: shrink/derivative WESTMERE, then the new microarchitecture SANDY BRIDGE

All dates, product descriptions, availability and plans are forecasts and subject to change without notice.

Multi and Many Core

Multi-core is the current progression of putting multiple cores on a processor die. Many-core is a discontinuity: putting many simpler cores on a die. Jim Held will be talking about Intel's research in this area tomorrow.

Increasing I/O Signaling Rate to Fill the Gap

[Chart: frequency (MHz), 0 to 5,000, 1985 to 2010. Core frequency has risen steeply while bus signaling rates have lagged, opening a widening gap; silicon photonics is one technology aimed at filling it. Source: Intel]

Increasing Memory Bandwidth

[Chart: memory bandwidth (GB/sec), 0.1 to 100 on a log scale, 1980 to 2020. Memory bandwidth is constrained by power; 3D memory stacking provides higher bandwidth within the power envelope (under 2W) to keep pace with the CPU. Source: Intel]

3D memory stacking: a thin DRAM die is stacked between the heat sink and the CPU, and power and I/O signals pass through the DRAM to the CPU via through-DRAM vias in the package.
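Why power caps bandwidth: interface power is roughly the signaling energy per bit times the bit rate, so a fixed budget fixes the achievable bandwidth. A sketch with an assumed, illustrative energy figure (not an Intel-published number):

\[
P = BW \times E_{bit} \;\Rightarrow\; BW_{max} = \frac{P}{E_{bit}} = \frac{2\,\mathrm{W}}{20\,\mathrm{pJ/bit}} = 100\,\mathrm{Gbit/s} \approx 12.5\,\mathrm{GB/s}
\]

Stacking the DRAM on the CPU shortens the wires, cuts the energy per bit, and so raises the achievable bandwidth inside the same 2W envelope.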

Power and Cooling Cost Today

10 cents per kilowatt-hour, a 9 Megawatt system, plus power delivery and datacenter cooling = $14.6M electricity costs/year

Assumptions: 9MW system power, 90% power delivery efficiency, cooling coefficient of performance (COP) = 1.5.
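The $14.6M figure is reproducible from those assumptions (my reconstruction of the arithmetic, not shown on the slide): cooling a 9MW load at COP 1.5 draws another 9/1.5 = 6MW, and delivering the total at 90% efficiency gives

\[
\frac{9\,\mathrm{MW} + 6\,\mathrm{MW}}{0.9} \times 8760\,\mathrm{h/yr} \times \$0.10/\mathrm{kWh} \approx 16.7\,\mathrm{MW} \times 8760\,\mathrm{h/yr} \times \$0.10/\mathrm{kWh} \approx \$14.6\mathrm{M/yr}
\]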

Managing Power and Cooling Efficiency

Power management spans every level of the stack:
- Silicon: Moore's law, strained silicon, transistor leakage control techniques, clock gating
- Processor: policy-based power allocation, multi-threaded cores
- Power delivery: fine-grain power management, ultra-fine-grain power management
- Facilities: air cooling and liquid cooling options, vertical integration of cooling solutions

Power Management: From Transistors to Facilities

DELIVERY INEFFICIENCY

[Chart: cumulative datacenter power delivery efficiency, 50% to 100%, falling as power passes through the uninterruptible power supply, power distribution unit, power supply unit, and voltage regulators; the high-efficiency AC chain retains substantially more power than the baseline AC chain. Source: Intel]

CONVERSION OVERKILL

Conventional datacenter AC power distribution: 480V AC enters the uninterruptible power supply, which converts AC to DC and back to AC; 480V AC feeds the power distribution unit, which steps it down to 208V AC for the power supply unit; the power supply unit converts AC to DC and then DC to DC down to 12V. Every conversion stage loses power.

SIMPLIFIED DISTRIBUTION

High voltage DC power distribution: 480V AC is rectified once at the uninterruptible power supply to 360V DC, distributed at 360V DC through the power distribution unit to the power supply unit, and converted DC to DC down to 12V.
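The efficiency argument of the preceding slides is a product of stage efficiencies; a minimal worked comparison with assumed per-stage numbers (illustrative only, not Intel's measurements):

\[
\eta_{total} = \eta_{UPS} \times \eta_{PDU} \times \eta_{PSU} \times \eta_{VR}
\]

For example, a baseline AC chain of 0.90 x 0.97 x 0.75 x 0.85 yields about 56% cumulative efficiency, while a chain with better converters, say 0.96 x 0.99 x 0.92 x 0.90, yields about 79%; eliminating conversion stages outright, as high voltage DC does, pushes this higher still.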

HIGH VOLTAGE DC DATACENTER PROTOTYPE

EFFICIENCY REALIZED

[Bar chart, scale 0 to 2000, comparing the baseline AC distribution against the high voltage DC prototype. Source: Intel]

Reliability Challenge: Billions of Transistors

Relative FIT/bit (memory cell) is expected to be roughly constant, while Moore's law increases the bit count exponentially, 2x every 2 years. Assuming chip size stays roughly the same with each generation (180, 130, 90, 65, 45, 32, 22, 16nm), the result is exponential growth in soft error FIT/chip (logic & memory).

Soft Errors or Single Event Upsets (SEU) are caused by charge collection following an energetic particle strike: along the ion path, charge is collected from the depletion region by drift and diffusion.

Soft error is one of many challenges. Source: Intel
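The same argument as a formula (the symbols are mine, paraphrasing the slide):

\[
\mathrm{FIT}_{chip}(t) \approx \mathrm{FIT}_{bit} \times N_{bits}(0) \times 2^{t/2}
\]

with t in years: a constant per-bit rate times an exponentially growing bit count gives a per-chip soft error rate that doubles every process generation.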

SIO Innovation with Acceleration

Intel Architecture with multi-core offers general-purpose performance, scalability, and economies of scale. Intel Architecture with accelerators adds special-purpose performance: accelerators attach through the memory hub and I/O hub, alongside PCI, LPC, and Gb Ethernet* add-ins, supported by Geneseo (PCIe extensions) and QuickAssist (software & tools).

Energy efficient performance with multi-threaded cores & accelerators.

System bottlenecks in supporting accelerators (attached via the memory hub and I/O hub, alongside the CPU, SIO, LPC, PCI, and Gb Ethernet* add-ins):

Interface performance:
- Reduce hardware overhead
- Lower latency for small packets
- Higher bandwidth for large packets

Programming model and tools:
- Ease of programming; reduce software overhead
- Commands and data movement
- Status and synchronization
- Configuration and error handling
- Virtualization and power management
- Scheduling and memory management
- Buffer allocation policy
- Linear addressing

Software and platform latencies are 100X the physical I/O latency for most accelerators.
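To see why that 100X matters, here is a minimal break-even model; all numbers and names are my own illustration under assumed latencies, not measurements from the talk:

```python
# Hypothetical break-even model for accelerator offload.
# Assumed numbers for illustration only, not Intel measurements.

PHYS_IO_LATENCY_US = 1.0   # physical I/O round trip
SW_OVERHEAD_US = 100.0     # software/platform overhead (~100X physical I/O)
HOST_US_PER_KB = 5.0       # host compute time per KB of work
ACCEL_US_PER_KB = 0.5      # accelerator compute time per KB (10x faster)

def host_time_us(kb: float) -> float:
    """Total time if the host just does the work itself."""
    return HOST_US_PER_KB * kb

def offload_time_us(kb: float) -> float:
    """Total time if the work is shipped to the accelerator."""
    return SW_OVERHEAD_US + PHYS_IO_LATENCY_US + ACCEL_US_PER_KB * kb

for kb in (1, 4, 64):
    print(f"{kb:>3} KB: host {host_time_us(kb):7.1f} us, "
          f"offload {offload_time_us(kb):7.1f} us")

# Break-even size: host_time_us(kb) == offload_time_us(kb)
break_even_kb = (SW_OVERHEAD_US + PHYS_IO_LATENCY_US) / (HOST_US_PER_KB - ACCEL_US_PER_KB)
print(f"Offload pays off above ~{break_even_kb:.1f} KB per request")

# If software overhead were cut to the physical I/O latency alone,
# the break-even size would drop by roughly 50x:
break_even_no_sw_kb = (2 * PHYS_IO_LATENCY_US) / (HOST_US_PER_KB - ACCEL_US_PER_KB)
print(f"Without the 100X software overhead: ~{break_even_no_sw_kb:.2f} KB")
```

Under these assumptions only requests above roughly 22 KB benefit from offload; stripping the software overhead would move the break-even down to well under 1 KB, which is why the slide stresses reducing software overhead rather than raw link speed.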

Questions?