This Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources

Similar documents
This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console

Spring 2010 Prof. Hyesoon Kim. Xbox 360 System Architecture, Anderews, Baker

Xbox 360 high-level architecture

Xbox 360 Architecture. Lennard Streat Samuel Echefu

CONSOLE ARCHITECTURE

Computer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley

IBM "Broadway" 512Mb GDDR3 Qimonda

Evolution of CPUs & Memory in Video Game Consoles. Curtis Geiger & Matthew Meehan

CSCI-GA Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore

Spring 2011 Prof. Hyesoon Kim

Introduction to Computing and Systems Architecture

Spring 2010 Prof. Hyesoon Kim. AMD presentations from Richard Huddy and Michael Doggett

Outline Marquette University

Introduction to Microprocessor

Computer Architecture. Introduction. Lynn Choi Korea University

! Readings! ! Room-level, on-chip! vs.!

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

Multimedia in Mobile Phones. Architectures and Trends Lund

Workload Optimized Systems: The Wheel of Reincarnation. Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013

CSE : Introduction to Computer Architecture

Computer System architectures

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

Sony/Toshiba/IBM (STI) CELL Processor. Scientific Computing for Engineers: Spring 2008

MICROPROCESSOR TECHNOLOGY

Power 7. Dan Christiani Kyle Wieschowski

Memory Systems IRAM. Principle of IRAM

GPU Architecture. Michael Doggett Department of Computer Science Lund university

Computer Architecture. Fall Dongkun Shin, SKKU

Figure 1-1. A multilevel machine.

Microarchitecture Overview. Performance

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

Lecture 1: Introduction

CS 152, Spring 2011 Section 10

Parallel Computing: Parallel Architectures Jin, Hai

Processing Unit CS206T

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers

Microelectronics. Moore s Law. Initially, only a few gates or memory cells could be reliably manufactured and packaged together.

Architectures. Michael Doggett Department of Computer Science Lund University 2009 Tomas Akenine-Möller and Michael Doggett 1

Computer Architecture

GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology

CSE 591/392: GPU Programming. Introduction. Klaus Mueller. Computer Science Department Stony Brook University

Roadrunner. By Diana Lleva Julissa Campos Justina Tandar

Intel released new technology call P6P

Computer Evolution. Computer Generation. The Zero Generation (3) Charles Babbage. First Generation- Time Line

Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources

Anatomy of AMD s TeraScale Graphics Engine

HW Trends and Architectures

Chapter 2: Computer-System Structures. Hmm this looks like a Computer System?

Uniprocessor Computer Architecture Example: Cray T3E

Computer Systems Architecture

Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks

Computer Organization

HISTORY OF MICROPROCESSORS

Vector Processors and Graphics Processing Units (GPUs)

CS427 Multicore Architecture and Parallel Computing

Cell Broadband Engine. Spencer Dennis Nicholas Barlow

The Mont-Blanc approach towards Exascale

CSE 591: GPU Programming. Introduction. Entertainment Graphics: Virtual Realism for the Masses. Computer games need to have: Klaus Mueller

Microarchitecture Overview. Performance

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Computer Evolution. Budditha Hettige. Department of Computer Science

History. PowerPC based micro-architectures. PowerPC ISA. Introduction

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

Computer Architecture!

Computer Architecture!

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design

Computer Architecture!

Systems Design and Programming. Instructor: Chintan Patel

How to Write Fast Code , spring th Lecture, Mar. 31 st

Putting it all Together: Modern Computer Architecture

Lecture 21: Parallelism ILP to Multicores. Parallel Processing 101

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology

Motivation for Parallelism. Motivation for Parallelism. ILP Example: Loop Unrolling. Types of Parallelism

Portland State University ECE 588/688. Graphics Processors

AMD Radeon HD 2900 Highlights

Manycore Processors. Manycore Chip: A chip having many small CPUs, typically statically scheduled and 2-way superscalar or scalar.

UNIT 8 1. Explain in detail the hardware support for preserving exception behavior during Speculation.

What is Computer Architecture?

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor

CS 152 Computer Architecture and Engineering. Lecture 16: Graphics Processing Units (GPUs)

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 6. Parallel Processors from Client to Cloud

Challenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008

GPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC

All About the Cell Processor

COSC 6385 Computer Architecture - Data Level Parallelism (III) The Intel Larrabee, Intel Xeon Phi and IBM Cell processors

Simultaneous Multithreading on Pentium 4

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

Computer & Microprocessor Architecture HCA103

UMBC. Rubini and Corbet, Linux Device Drivers, 2nd Edition, O Reilly. Systems Design and Programming

Computer Architecture

Advanced Computer Architecture

Power Technology For a Smarter Future

IT 252 Computer Organization and Architecture. Introduction. Chia-Chi Teng

GPU Basics. Introduction to GPU. S. Sundar and M. Panchatcharam. GPU Basics. S. Sundar & M. Panchatcharam. Super Computing GPU.

ECE 471 Embedded Systems Lecture 2

45-year CPU Evolution: 1 Law -2 Equations

Transcription:

This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital Circuits Gates & Transistors! Anatomy of a game console! Microsoft XBox 360! Focus mostly on CPU chip! Briefly talk about system! Graphics processing unit (GPU)! I/O and other devices CIS 371 (Martin/Roth): Recap 1 CIS 371 (Martin/Roth): Recap 2 Sources! Application-customized CPU design: The Microsoft Xbox 360 CPU story, Brown, IBM, Dec 2005! http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/! XBox 360 System Architecture, Andrews & Baker, IEEE Micro, March/April 2006"! Microprocessor Report"! IBM Speeds XBox 360 to Market, Krewell, Oct 31, 2005"! Powering Next-Gen Game Consoles, Krewell, July 18, 2005 What is Computer Architecture? The role of a computer architect: Technology Logic Gates SRAM DRAM Circuit Techniques Packaging Magnetic Storage Flash Memory Design Plans Manufacturing Goals Function Performance Reliability Cost/Manufacturability Energy Efficiency Time to Market Computer PCs Servers PDAs Mobile Phones Supercomputers Game Consoles Embedded CIS 371 (Martin/Roth): Recap 3 CIS 371 (Martin/Roth): Recap 4

Microsoft XBox Game Console History! XBox! First game console by Microsoft, released in 2001, $299! Glorified PC! 733 Mhz x86 Intel CPU, 64MB DRAM, NVIDIA GPU (graphics)! Ran modified version of Windows OS! ~25 million sold! XBox 360! Second generation, released in 2005, $299-$399! All-new custom hardware! 3.2 Ghz PowerPC IBM processor (custom design for XBox 360)! ATI graphics chip (custom design for XBox 360)! 19+ million sold Microsoft Turns to IBM for XBox 360! Microsoft is mostly a software company! Turned to IBM & ATI for XBox 360 design! Sony & Nintendo also turned to IBM (for PS3 & Wii, respectively)! Design principles of XBox 360 [Andrews & Baker]! Value for 5-7 years!! big performance increase over last generation! Support anti-aliased high-definition video (720*1280*4 @ 30+ fps)!! extremely high pixel fill rate (goal: 100+ million pixels/s)! Flexible to suit dynamic range of games!! balance hardware, homogenous resources! Programmability (easy to program)!! listened to software developers (quote) CIS 371 (Martin/Roth): Recap 5 CIS 371 (Martin/Roth): Recap 6 More on Games Workload XBox 360 System! Graphics, graphics, graphics! Special highly-parallel graphics processing unit (GPU)! Much like on PCs today! But general-purpose, too! The high-level game code is generally a database management problem, with plenty of object-oriented code and pointer manipulation. Such a workload needs a large L2 and high integer performance. [Andrews & Baker]! Wanted only a modest number of modest, fast cores! Not one big core! Not dozens of small cores (leave that to the GPU)! Quote from Seymour Cray CIS 371 (Martin/Roth): Recap 7 CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 8

XBox 360 Xenon Processor! ISA: 64-bit PowerPC chip! RISC ISA! Like MIPS, but with condition codes! Fixed-length 32-bit instructions! 32 64-bit general purpose registers (GPRs)! ISA++: Extended with VMX-128 operations! 128 registers, 128-bits each! Packed vector operations! Example: four 32-bit floating point numbers! One instruction: VR1 * VR2! VR3! Four single-precision operations! Also supports conversion to MS DirectX data formats! Similar to Altivec (and Intel s MMX, SSE, SSE2, etc.)! Works great for 3D graphics kernels and compression XBox 360 Xenon Processor! Peak performance: ~75 gigaflops! Gigaflop = 1 billion floating points operations per second! Pipelined Superscalar processor! 3.2 Ghz operation! Superscalar: two-way issue! VMX-128 instructions (four single-precision operations at a time)! Hardware multithreading: two threads per processor! Three processor cores per chip! Result:! 3.2 * 2 * 4 * 3 = ~77 gigaflops CIS 371 (Martin/Roth): Recap 9 CIS 371 (Martin/Roth): Recap 10 XBox 360 Xenon Chip (IBM) Xenon Processor Pipeline! 165 million transistors! IBM s 90nm process! Three cores! 3.2 Ghz! Two-way superscalar! Two-way multithreaded CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 11 CIS 371 (Martin/Roth): Recap [Brown, IBM, Dec 2005] 12

XBox 360 Memory Hiearchy Xenon Multicore Interconnect! 128B cache blocks throughout! 32KB 2-way set-associative instruction cache (per core)! 32KB 4-way set-associative data cache (per core)! Write-through, lots of store buffering! 1MB 8-way set-associative second-level cache (per chip)! Special skip L2 prefetch instruction! MESI cache coherence! 512MB GDDR3 DRAM, dual memory controllers! Total of 22.4 GB/s of memory bandwidth! Direct path to GPU (not supported in current PCs) CIS 371 (Martin/Roth): Recap 13 CIS 371 (Martin/Roth): Recap [Brown, IBM, Dec 2005] 14 XBox 360 System XBox Graphics Subsystem 10.8 GB/s FSB bandwidth link each way 22.4 GB/s DRAM bandwidth 28.8 GB/s link bandwidth CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 15 CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 16

Graphics Parent Die (ATI) GPU daughter die (NEC)! 232 million transistors! 500 Mhz! 48 unified shader ALUs CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 17! 100 million transistors! 10MB edram! Embedded! NEC Electronics! Anti-aliasing! Render at 4x resolution, then sample! Z-buffering! Track the depth of pixels! 256GB/s internal bandwidth CIS 371 (Martin/Roth): Recap [Andrews & Baker, IEEE Micro, Mar/Apr 2006] 18 Putting It All Together! Unit 0: Abstraction and design goals! Unit 1: ISAs! Unit 2: Single cycle! Unit 3: Performance! Unit 4/5: Integer/floating point ALU! Unit 6/7: Pipelining & Superscalar! Unit 8: Static & Dynamic Scheduling! Unit 9: Caches! Unit 10: Shared memory & multicore! Unit 11: Virtual memory and I/O! Unit 12/13/14: Reliability/power/vectors CIS 371 (Martin/Roth): Recap 19