POWER7: IBM's Next Generation Server Processor

Similar documents
POWER7: IBM's Next Generation Server Processor

POWER7+ TM IBM IBM Corporation

POWER9 Announcement. Martin Bušek IBM Server Solution Sales Specialist

Power 7. Dan Christiani Kyle Wieschowski

Power Technology For a Smarter Future

Ron Kalla, Balaram Sinharoy, Joel Tendler IBM Systems Group

Open Innovation with Power8

POWER4 Systems: Design for Reliability. Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX

IBM's POWER5 Micro Processor Design and Methodology

Puey Wei Tan. Danny Lee. IBM zenterprise 196

1. PowerPC 970MP Overview

Agenda. System Performance Scaling of IBM POWER6 TM Based Servers

PowerPC TM 970: First in a new family of 64-bit high performance PowerPC processors

POWER6 Processor and Systems

AMD Opteron 4200 Series Processor

HP solutions for mission critical SQL Server Data Management environments

Exam Questions C

A 5.2 GHz Microprocessor Chip for the IBM zenterprise

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation

Eric Schwarz. IBM Accelerators. July 11, IBM Corporation

HW Trends and Architectures

SPARC64 X: Fujitsu s New Generation 16 Core Processor for the next generation UNIX servers

Six-Core AMD Opteron Processor

Bridging High Performance and Low Power in the era of Big Data and Heterogeneous Computing

M7: Next Generation SPARC. Hotchips 26 August 12, Stephen Phillips Senior Director, SPARC Architecture Oracle

Jeff Stuecheli, PhD IBM Power Systems IBM Systems & Technology Group Development International Business Machines Corporation 1

IBM Power 755 server. High performance compute node for scalable clusters using InfiniBand architecture interconnect products.

FUJITSU SPARC ENTERPRISE SERVER FAMILY MID-RANGE AND HIGH- END MODELS ARCHITECTURE FLEXIBLE, MAINFRAME- CLASS COMPUTE POWER

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

Concurrent High Performance Processor design: From Logic to PD in Parallel

Best Practices for Setting BIOS Parameters for Performance

1 Copyright 2013 Oracle and/or its affiliates. All rights reserved.

PowerPC 740 and 750

EPYC VIDEO CUG 2018 MAY 2018

The PowerEdge M830 blade server

White Paper. First the Tick, Now the Tock: Next Generation Intel Microarchitecture (Nehalem)

IBM POWER6 microarchitecture

Advanced Multimedia Architecture Prof. Cristina Silvano June 2011 Amir Hossein ASHOURI

Sun Fire V880 System Architecture. Sun Microsystems Product & Technology Group SE

POWER8 for DB2 and SAP

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

IBM System x3850 M2 servers feature hypervisor capability

Gemini: Sanjiv Kapil. A Power-efficient Chip Multi-Threaded (CMT) UltraSPARC Processor. Gemini Architect Sun Microsystems, Inc.

IBM POWER7 Systems Express Blades Quick Reference Guide November 2011

ECE 571 Advanced Microprocessor-Based Design Lecture 24

Overview of the SPARC Enterprise Servers

SPARC64 X: Fujitsu s New Generation 16 core Processor for UNIX Server

SAP HANA x IBM POWER8 to empower your business transformation. PETER LEE Distinguished Engineer Systems Hardware, IBM Greater China Group

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

AMD Opteron Processors In the Cloud

The Future of SPARC64 TM

These slides contain projections or other forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, as amended, and

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

eslim SV Xeon 1U Server

eslim SV Xeon 2U Server

A 1.5GHz Third Generation Itanium Processor

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions

Intel Enterprise Processors Technology

IBM i テクニカル ワークショップ IBM i and the Future. Tim Rowe. IBM i Architect Application Development Systems Management.

A Preliminary evalua.on of OpenPOWER through op.mizing stencil based algorithms

This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources

OPENSPARC T1 OVERVIEW

IBM Power AC922 Server

Advanced Parallel Programming I

IBM System p5 550 and 550Q Express servers

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console

Power Systems AC922 Overview. Chris Mann IBM Distinguished Engineer Chief System Architect, Power HPC Systems December 11, 2017

IBM Power Advanced Compute (AC) AC922 Server

p5 520 server Robust entry system designed for the on demand world Highlights

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD

Dell PowerEdge 11 th Generation Servers: R810, R910, and M910 Memory Guidance

All About the Cell Processor

Reference Architecture Microsoft Exchange 2013 on Dell PowerEdge R730xd 2500 Mailboxes

IBM EXAM QUESTIONS & ANSWERS

Roadrunner. By Diana Lleva Julissa Campos Justina Tandar

Blue Gene/Q. Hardware Overview Michael Stephan. Mitglied der Helmholtz-Gemeinschaft

A Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache

Portland State University ECE 588/688. Cray-1 and Cray T3E

IBM Power Facts and Features IBM Power Systems, IBM PureFlex and Power Blades

Multi-Threading. Last Time: Dynamic Scheduling. Recall: Throughput and multiple threads. This Time: Throughput Computing

Crusoe Processor Model TM5800

Agenda. Sun s x Sun s x86 Strategy. 2. Sun s x86 Product Portfolio. 3. Virtualization < 1 >

SGI UV 300RL for Oracle Database In-Memory

An Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki

Computer system energy management. Charles Lefurgy

Cisco UCS B440 M1High-Performance Blade Server

Introduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes

IBM System zenterprise 196: Memory Subsystem Overview

POWER3: Next Generation 64-bit PowerPC Processor Design

IBM System x3455 AMD Opteron SMP 1 U server features Xcelerated Memory Technology to meet the needs of HPC environments

PEX8764, PCI Express Gen3 Switch, 64 Lanes, 16 Ports

Intel Workstation Platforms

Modern CPU Architectures

SGI UV for SAP HANA. Scale-up, Single-node Architecture Enables Real-time Operations at Extreme Scale and Lower TCO

Addressing the Memory Wall

Mainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation

IBM POWER4: a 64-bit Architecture and a new Technology to form Systems

Each Milliwatt Matters

Moorestown Platform: Based on Lincroft SoC Designed for Next Generation Smartphones

Transcription:

POWER7: IBM's Next Generation Server Processor Acknowledgment: This material is based upon work supported by the Defense Advanced Research Projects Agency under its Agreement No. HR0011-07-9-0002

Outline POWER7 Design Overview POWER Processor History POWER7 TM Motivation Design detals Summary How to Realize Multi-core Potential Cache Hierarchy Technology Advances in Memory Subsystem Advances in Off-chip Signaling Technology Coherence Innovation 2

POWER7: IBM s Next Generation, Balanced POWER Server Chip 45nm 20+ Years of POWER Processors RS64IV Sstar 130nm RS64III Pulsar.18um.25um.35um -Cobra A10-64 bit.22um.5um POWER3TM -630.35um.72um POWER2TM P2SC.25um RSC 1.0um.35um.6um 604e POWER1 -AMERICA s 1990 3-603 POWER7 -Multi-core POWER5TM -SMT.5um.5um Muskie A35 Next Gen. POWER6TM -Ultra High Frequency 180nm RS64II North Star RS64I Apache BiCMOS 65nm POWER4TM -Dual Core Major POWER Innovation -1990 RISC Architecture -1994 SMP -1995 Out of Order Execution -1996 64 Bit Enterprise Architecture -1997 Hardware Multi-Threading -2001 Dual Core Processors -2001 Large System Scaling -2001 Shared Caches -2003 On Chip Memory Control -2003 SMT -2006 Ultra High Frequency -2006 Dual Scope Coherence Mgmt -2006 Decimal Float/VSX -2006 Processor Recovery/Sparing -2009 Balanced Multi-core Processor -2009 On Chip EDRAM -601 1995 2000 2005 2010 * Dates represent approximate processor power-on dates, not system availability

POWER7 Processor Chip 567mm 2 Technology: 45nm lithography, Cu, SOI, edram 1.2B transistors Equivalent function of 2.7B edram efficiency Eight processor cores 12 execution units per core 4 Way SMT per core 32 Threads per chip 256KB L2 per core 32MB on chip edram shared L3 Dual DDR3 Memory Controllers 100GB/s Memory bandwidth per chip sustained Scalability up to 32 Sockets 360GB/s SMP bandwidth/chip 20,000 coherent operations in flight Advanced pre-fetching Data and Instruction Binary Compatibility with POWER6 4 * Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.

POWER7 Design Principles: Multiple optimization Points Balanced Design Multiple optimization points Improved energy efficiency RAS improvements Improved Thread Performance Dynamic allocation of resources Shared L3 Increased Core parallelism 4 Way SMT Aggressive out of order execution Extreme Increase in Socket Throughput System Continued growth in socket bandwidth Balanced core, cache, memory improvements Scalable interconnect Reduced coherence traffic POWER6 POWER7 5 * Statements regarding SMP servers do not imply that IBM will introduce a system with this capability. Graphs for illustration purposes only (Not actual data)

POWER7 Design Principles: Flexibility and Adaptability Cores: 8, 6, and 4-core offerings with up to 32MB of L3 Cache Dynamically turn cores on and off, reallocating energy Dynamically vary individual core frequencies, reallocating energy Dynamically enable and disable up to 4 threads per core Memory Subsystem: Full 8 channel or reduced 4 channel configurations System Topologies: Standard, half-width, and double-width SMP busses supported Multiple System Packages 2/4s Blades and Racks Single Chip Organic High-End and Mid-Range Single Chip Glass Ceramic Compute Intensive Quad-chip MCM 6 1 Memory Controller 3 4B local links 2 Memory Controllers 3 8B local links 2 8B Remote links 8 Memory Controllers 3 16B local links (on MCM) * Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.

POWER7: IBM s Next Generation, Balanced POWER Server Chip POWER7: Core DFU Execution Units 2 Fixed point units 2 Load store units ISU 4 Double precision floating point 1 Vector unit 1 Branch FXU Add Boxes 1 Condition register 1 Decimal floating point unit 6 Wide dispatch/8 Wide Issue Recovery Function Distributed 1,2,4 Way SMT Support IFU CRU/BRU LSU Out of Order Execution 32KB I-Cache 32KB D-Cache 256KB L2 Tightly coupled to core 7 256KB L2 VSX FPU

POWER7: Performance Estimates POWER7 Continues Tradition of Excellent Scalability Core performance increased by: Re-pipelined execution units Reduced L1 cache latency Tightly coupled L2 cache Additional execution units More flexible execution units Increased pipeline utilization with SMT4 and aggressive out of order execution Chip Performance Improved Greater then 4X: System High performance on chip interconnect Improved cache utilization Dual high speed integrated memory controllers Advanced SMP links will provide near linear scaling for larger POWER7 systems. 8 * Performance estimates relate to processor only and should not be used to estimate projected server performance.

Energy Management: Architected Idle Modes Two Design Points Chosen for Technology 4 PowerPC Architected States Nap (optimized for wake-up time) Turn off clocks to execution units Reduce frequency to core Caches and TLB remain coherent Fast wake-up Sleep (optimized for power reduction) Purge caches and TLB Turn off clocks to full core and caches Reduce voltage to V-retention Leakage current reduced substantially Voltage ramps-up on wake up No core re-initialization required Wake-Up Latency Doze Nap RV Winkle Power gate Sleep Energy Reduction 9

Adaptive Energy Management: Energy Scale TM Chip FO4 Tuned for Optimal Performance/ Watt in Technology DVFS (Dynamic Voltage and Frequency Slewing) -50% to +10% frequency slew independent per core Frequency and voltage adjusted based on: Work load and utilization. On board activity monitors Turbo-Mode Up to 10% frequency boost Leverages excess energy capacity from: Non worst case work loads Idle cores Processor and Memory Energy Usage can be independently Balanced. Real time hardware performance monitors used. On board power proxy logic estimates power Power Capping Support Allows budgeting of power to different parts of system Vmin 10

POWER7: IBM s Next Generation, Balanced POWER Server Chip POWER7: Reliability and Availability Features Dynamic Oscillator OSC0 Failover OSC1 Fabric Interface Fabric Bus Interface to other Chips and Nodes ECC protected Node hot add /repair Core Recovery Leverage speculative execution resources to enable recovery Error detected in GPRs FPRs VSR, flushed and retried Stacked latches to improve SER BUF BUF BUF Alternate Processor Recovery Partition isolation for core checkstops BUF L3 edram X8 Dimms 64 Byte ECC on Memory Corrects full chip kill on X8 dimms Spare X8 devices implemented Dual memory chip failures do not cause outage Selective memory mirror capability to recover partition from dimm failures HW assisted scrubbing SUE handling Dynamic sparing on channel interface PowerVM Hypervisor protected from full dimm failures 11 IO Hub PCI Bridge ECC protected SUE handling Line delete Spare rows and columns GX IO Bus ECC protected Hot add InfiniBand Interface Redundant paths PCI Adapter * Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.

Summary Power Systems continue strong 7th Generation Power chip: Balanced Multi-Core design EDRAM technology SMT4 Greater then 4X performance in same power envelope as previous generation Scales to 32 socket, 1024 threads balanced system Building block for peta-scale PERCS project Power7 High Volume Card POWER7 Systems Running in Lab AIX, IBM i, Linux all operational 12 * Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.