This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources

Similar documents
This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console

Spring 2010 Prof. Hyesoon Kim. Xbox 360 System Architecture, Anderews, Baker

Xbox 360 high-level architecture

Xbox 360 Architecture. Lennard Streat Samuel Echefu

CONSOLE ARCHITECTURE

Computer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley

Spring 2011 Prof. Hyesoon Kim

IBM "Broadway" 512Mb GDDR3 Qimonda

Power 7. Dan Christiani Kyle Wieschowski

Evolution of CPUs & Memory in Video Game Consoles. Curtis Geiger & Matthew Meehan

CSCI-GA Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Microarchitecture Overview. Performance

Processing Unit CS206T

Introduction to Microprocessor

Computer Architecture. Introduction. Lynn Choi Korea University

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

Outline Marquette University

Parallel Computing: Parallel Architectures Jin, Hai

Intel released new technology call P6P

CSE : Introduction to Computer Architecture

Microarchitecture Overview. Performance

Introduction to Computing and Systems Architecture

Roadrunner. By Diana Lleva Julissa Campos Justina Tandar

Multimedia in Mobile Phones. Architectures and Trends Lund

Figure 1-1. A multilevel machine.

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

Memory Systems IRAM. Principle of IRAM

Advanced Computer Architecture

Computer Architecture. Fall Dongkun Shin, SKKU

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design

Computer Architecture

Computer Organization

Computer System architectures

Microelectronics. Moore s Law. Initially, only a few gates or memory cells could be reliably manufactured and packaged together.

Lecture 1: Introduction

Sony/Toshiba/IBM (STI) CELL Processor. Scientific Computing for Engineers: Spring 2008

Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

CS427 Multicore Architecture and Parallel Computing

EN164: Design of Computing Systems Lecture 24: Processor / ILP 5

CS 152, Spring 2011 Section 10

GPU Architecture. Michael Doggett Department of Computer Science Lund university

Computer Architecture!

CPU Architecture Overview. Varun Sampath CIS 565 Spring 2012

Computer Architecture!

Workload Optimized Systems: The Wheel of Reincarnation. Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013

! Readings! ! Room-level, on-chip! vs.!

Challenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008

Computer & Microprocessor Architecture HCA103

Vector Processors and Graphics Processing Units (GPUs)

Putting it all Together: Modern Computer Architecture

All About the Cell Processor

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

Lecture 21: Parallelism ILP to Multicores. Parallel Processing 101

Computer Evolution. Computer Generation. The Zero Generation (3) Charles Babbage. First Generation- Time Line

Hakam Zaidan Stephen Moore

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

MICROPROCESSOR TECHNOLOGY

HW Trends and Architectures

ECE 571 Advanced Microprocessor-Based Design Lecture 4

Advanced Processor Architecture

PowerPC 620 Case Study

UNIT 8 1. Explain in detail the hardware support for preserving exception behavior during Speculation.

Trends in the Infrastructure of Computing

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor

HISTORY OF MICROPROCESSORS

PowerPC TM 970: First in a new family of 64-bit high performance PowerPC processors

Computer Evolution. Budditha Hettige. Department of Computer Science

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology

Computer Architecture!

History. PowerPC based micro-architectures. PowerPC ISA. Introduction

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

CS 152, Spring 2011 Section 8

Anatomy of AMD s TeraScale Graphics Engine

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

IBM's POWER5 Micro Processor Design and Methodology

Keywords and Review Questions

Von Neumann architecture. The first computers used a single fixed program (like a numeric calculator).

CC312: Computer Organization

AMD Radeon HD 2900 Highlights

Chapter 2: Computer-System Structures. Hmm this looks like a Computer System?

Uniprocessor Computer Architecture Example: Cray T3E

Portland State University ECE 588/688. Graphics Processors

The University of Texas at Austin

ROEVER ENGINEERING COLLEGE DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Computer Systems Architecture

Portland State University ECE 588/688. Cray-1 and Cray T3E

Multi-Processors and GPU

Administrivia. HW0 scores, HW1 peer-review assignments out. If you re having Cython trouble with HW2, let us know.

Power Technology For a Smarter Future

Advanced Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Each Milliwatt Matters

Fundamentals of Computer Design

COSC 6385 Computer Architecture - Data Level Parallelism (III) The Intel Larrabee, Intel Xeon Phi and IBM Cell processors

Transcription:

This Unit: Putting It All Together CIS 501 Computer Architecture Unit 12: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital Circuits Gates & Transistors Anatomy of a game console Microsoft XBox 360 Focus mostly on CPU chip Briefly talk about system Graphics processing unit (GPU) I/O and other devices CIS 501 (Martin/Roth): XBox 360 1 CIS 501 (Martin/Roth): XBox 360 2 Sources Application-customized CPU design: The Microsoft Xbox 360 CPU story, Brown, IBM, Dec 2005 http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/ XBox 360 System Architecture, Andrews & Baker, IEEE Micro, March/April 2006" Microprocessor Report" IBM Speeds XBox 360 to Market, Krewell, Oct 31, 2005" Powering Next-Gen Game Consoles, Krewell, July 18, 2005 What is Computer Architecture? The role of a computer architect: Technology Logic Gates SRAM DRAM Circuit Techniques Packaging Magnetic Storage Flash Memory Design Plans Manufacturing Goals Function Performance Reliability Cost/Manufacturability Energy Efficiency Time to Market Computer PCs Servers PDAs Mobile Phones Supercomputers Game Consoles Embedded CIS 501 (Martin/Roth): XBox 360 3 CIS 501 (Martin/Roth): XBox 360 4

Microsoft XBox Game Console History XBox First game console by Microsoft, released in 2001, $299 Glorified PC 733 Mhz x86 Intel CPU, 64MB DRAM, NVIDIA GPU (graphics) Ran modified version of Windows OS ~25 million sold XBox 360 Second generation, released in 2005, $299-$399 All-new custom hardware 3.2 Ghz PowerPC IBM processor (custom design for XBox 360) ATI graphics chip (custom design for XBox 360) 22+ million sold Microsoft Turns to IBM for XBox 360 Microsoft is mostly a software company Turned to IBM & ATI for XBox 360 design Sony & Nintendo also turned to IBM (for PS3 & Wii, respectively) Design principles of XBox 360 [Andrews & Baker] Value for 5-7 years! big performance increase over last generation Support anti-aliased high-definition video (720*1280*4 @ 30+ fps)! extremely high pixel fill rate (goal: 100+ million pixels/s) Flexible to suit dynamic range of games! balance hardware, homogenous resources Programmability (easy to program)! listened to software developers (quote) CIS 501 (Martin/Roth): XBox 360 5 CIS 501 (Martin/Roth): XBox 360 6 More on Games Workload XBox 360 System from 30,000 Feet Graphics, graphics, graphics Special highly-parallel graphics processing unit (GPU) Much like on PCs today But general-purpose, too The high-level game code is generally a database management problem, with plenty of object-oriented code and pointer manipulation. Such a workload needs a large L2 and high integer performance. [Andrews & Baker] Wanted only a modest number of modest, fast cores Not one big core Not dozens of small cores (leave that to the GPU) Quote from Seymour Cray CIS 501 (Martin/Roth): XBox 360 7 CIS 501 (Martin/Roth): XBox 360 [Krewell, Microprocessor Report, Oct 21, 2005] 8

XBox 360 System XBox 360 Xenon Processor ISA: 64-bit PowerPC chip RISC ISA Like MIPS, but with condition codes Fixed-length 32-bit instructions 32 64-bit general purpose registers (GPRs) ISA++: Extended with VMX-128 operations 128 registers, 128-bits each Packed vector operations Example: four 32-bit floating point numbers One instruction: VR1 * VR2! VR3 Four single-precision operations Also supports conversion to MS DirectX data formats Similar to Altivec (and Intel s MMX, SSE, SSE2, etc.) Works great for 3D graphics kernels and compression CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 9 CIS 501 (Martin/Roth): XBox 360 10 XBox 360 Xenon Processor XBox 360 Xenon Chip (IBM) Peak performance: ~75 gigaflops Gigaflop = 1 billion floating points operations per second Pipelined superscalar processor 3.2 Ghz operation Superscalar: two-way issue VMX-128 instructions (four single-precision operations at a time) Hardware multithreading: two threads per processor Three processor cores per chip 165 million transistors IBM s 90nm process Three cores 3.2 Ghz Two-way superscalar Two-way multithreaded Result: 3.2 * 2 * 4 * 3 = ~77 gigaflops CIS 501 (Martin/Roth): XBox 360 11 CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 12

Xenon Processor Pipeline XBox 360 Memory Hiearchy Four-instruction fetch 128B cache blocks throughout Two-instruction dispatch Five functional units 32KB 2-way set-associative instruction cache (per core) VMX128 execution decoupled from other units 14-cycle VMX dot-product 32KB 4-way set-associative data cache (per core) Write-through, lots of store buffering Parity Branch predictor: 4K G-share predictor Unclear if 4KB or 4K 2-bit counters Per thread 1MB 8-way set-associative second-level cache (per chip) Special skip L2 prefetch instruction MESI cache coherence ECC 512MB GDDR3 DRAM, dual memory controllers Total of 22.4 GB/s of memory bandwidth Direct path to GPU (not supported in current PCs) CIS 501 (Martin/Roth): XBox 360 [Brown, IBM, Dec 2005] 13 CIS 501 (Martin/Roth): XBox 360 14 Xenon Multicore Interconnect XBox 360 System CIS 501 (Martin/Roth): XBox 360 [Brown, IBM, Dec 2005] 15 CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 16

XBox Graphics Subsystem Graphics Parent Die (ATI) 10.8 GB/s FSB bandwidth link each way 232 million transistors 500 Mhz 48 unified shader ALUs Mini-cores for graphics 22.4 GB/s DRAM bandwidth 28.8 GB/s link bandwidth CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 17 CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 18 GPU daughter die (NEC) Putting It All Together 100 million transistors 10MB edram Embedded NEC Electronics Anti-aliasing Render at 4x resolution, then sample Z-buffering Track the depth of pixels 256GB/s internal bandwidth Unit 0: Introduction Unit 1: Technology Unit 2: Performance Unit 3: ISAs Unit 4: Caches Unit 5: Virtual Memory & I/O Unit 6: Pipelining & Branch Prediction Unit 7/8: Superscalar/Scheduling Unit 9: Multicore Unit 10: Multithreading Unit 11: Vectors CIS 501 (Martin/Roth): XBox 360 [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 19 CIS 501 (Martin/Roth): XBox 360 20