A 1.5GHz Third Generation Itanium Processor
|
|
- Anastasia Fields
- 6 years ago
- Views:
Transcription
1 A 1.5GHz Third Generation Itanium Processor Jason Stinson, Stefan Rusu Intel Corporation, Santa Clara, CA 1
2 Outline Processor highlights Process technology details Itanium processor evolution Block diagram Cache circuit design details Package details Front-side bus interface Clock distribution Power dissipation RAS, DFT and DFM features Frequency shmoo Summary 2
3 130nm Itanium 2 Processor Highlights 410M transistors 374mm 2 die size 6MB on-die L3 cache 1.5GHz at 1.3V 6.4GB/s 400MT/s 4-way bus interface Plug-in compatible with existing platforms Extensive RAS, DFT and DFM features Largest microprocessor transistor count and on-die cache 3
4 130nm Process Characteristics Attribute Lgate M1 pitch M2 pitch M3 pitch M4 pitch M5 pitch M6 pitch Dielectric Memory cell Value 60nm 350nm 448nm 448nm 756nm 1120nm 1204nm FSG, K= µm 2 S. Thompson et al, IEDM 2001 (top) S. Tyagi, et al, IEDM 2000 (bottom) 4
5 Itanium Processor Evolution Attribute Architecture Process Device Count On-die L3 cache Frequency Itanium Processor Explicitly Parallel Instruction Computing 180nm 25M 0* 800MHz Itanium 2 Processor 180nm 221M 3MB 1.0GHz This work 130nm 410M 6MB 1.5GHz Supply Voltage 1.6V 1.5V 1.3V Power 130W* 130W 130W * Includes 4MB cache on cartridge 5
6 Block Diagram L3 Cache ECC ECC ECC L2 Cache Quad Port ECC Branch Prediction 11 Issue Ports Scoreboard, Predicate, NATs, Exceptions L1 Instruction Cache and Fetch/Pre-fetch Engine Instruction Queue ITLB B B B M M M M I I F F Register Stack Engine / Re-Mapping Branch & Predicate Registers Branch Units IA-32 Decode and Control 128 Integer Registers 128 FP Registers Integer and MM Units 8 bundles Quad-Port L1 Data Cache and DTLB ALAT Floating Point Units ECC ECC Bus (128b data, 6
7 Cache Summary Attribute L1I L1D L2 L3 Size 16K 16K 256K 6M Line Size 64B 64B 128B 128B Ways Replacement LRU NRU NRU NRU Latency 1-Fetch:1 INT:1 FP: NA INT: 5 FP: 6 14 Write Policy - WT (RA) WB (WA) WB (WA) Bandwidth R: 48GBs R: 24GBs W: 24GBs R: 48GBs W: 48GBs R: 48GBs W: 48GBs Cache bandwidth increased by 50% L3 cache latency increased to 14 clocks and set associativity doubled 7
8 L1 Instruction Cache Circuit Detail p0wl p1wl p1_local_bl p0_local_bl p1_global_bl p0_global_bl p1_write_data write_en# fill_cycle_1 p0_write_data coreclock fill_cycle_1 write_en# p1_local_bl write-1 read-2 read-3 p0_local_bl read-1 write-1 8
9 L1 Instruction Cache Circuit Detail p0wl p1wl p1_local_bl p0_local_bl p1_global_bl p0_global_bl p1_write_data write_en# fill_cycle_1 p0_write_data coreclock fill_cycle_1 write_en# p1_local_bl write-1 read-2 read-3 p0_local_bl read-1 write-1 9
10 L1 Instruction Cache Circuit Detail p0wl p1wl p1_local_bl p0_local_bl p1_global_bl p0_global_bl p1_write_data write_en# fill_cycle_1 p0_write_data coreclock fill_cycle_1 write_en# p1_local_bl write-1 read-2 read-3 p0_local_bl read-1 write-1 10
11 Way[11:0] BLOCK7 BLOCK6 Way[23:12] BLOCK7 BLOCK6 Decoders Sense amplifiers Timers L3 Subarray BLOCK5 BLOCK5 Core BLOCK4 BLOCK4 1588u BLOCK3 I/O mux BLOCK3 L3 Cache BLOCK2 BLOCK1 BLOCK0 BLOCK2 BLOCK1 BLOCK0 L3 cache contains 140 subarrays tiled to fit irregular shape of core 776u 11
12 Package Details Power delivery connector Server Management Components Flip-Chip BGA package with Integrated Heat Spreader Interposer Substrate 12
13 Package Decoupling Vdd (core) Vtt (FSB) PLL filter Power delivery connector 13
14 Front Side Bus P1 P3 CS P2 P4 Interface Support System Topology Termination Voltage Voltage Reference Data Bus Width Data Bus Speed Data Strobes Peak BW Address, Control Speed Glueless 4-way Multi-Processor Dual-sided board, staggered vias 1.2V, common ground with core Ground-referenced, 0.75V Vref 128-bit 400MT/s source synchronous 1 differential strobe for 16b of data 6.4GB/s 200MHz common clock 14
15 Front-Side Bus Topology Previous implementation Four linear stripes This work U-shape Core Core L3 Cache L3 Cache Data I/O Address I/O Control I/O 15
16 Itanium Processor Clock Distribution Trends Attribute Itanium Processor Itanium 2 Processor This work Process/Metal 180nm / Al 180nm / Al 130nm / Cu Primary tree Single ended Differential Differential Local distribution Grid Tree Tree Clock skew [ps] 28ps 62ps 24ps Deskew Method Active On-demand Fuse-based 16
17 Fuse-Based Clock Deskew ITC (TAP) SLCB scan chain SLCB Ck Gen Unit PD SLCB SLCBO zones FUSE Unit SLCB FUSE settings SLCB (3-bit/SLCB) SLCB = Second Level Clock Buffer 17
18 Skew (ps) Clock Zone Skew Plot No deskew Fuse-based deskew Scan-based deskew Clock Zone Worst case clock skew is 24ps in fuse mode and 7ps in scan mode 18
19 Power Same 130W power envelope as the 0.18um Itanium 2 processor 50% frequency increase 2X larger L3 cache Leakage increased 3.5X 5% 14% 7% 74% This work dynamic power I/O power core leakage/static cache leakage Aggressive management of dynamic power Reduced clock loading Reduced contention power L3 cache power management 5% 5% 90% Itanium 2 Processor dynamic power I/O power all leakage/static 19
20 L3D Power Reduction Scheme Previous Implementation This work Index [4:3] Index [4:3] == 11 Bank 1 Request Index [4:3] == 10 Bank 0 Request Index [4:3] == 01 Index [4:3] == Bank 1 Request Bank 0 Request Data in [7:0] Data out [7:0] 2 Data in [7:0] Data out [7:0]
21 Reliability Features Core Pipeline L1I (16KB) L2 (256KB) Data L3 (6MB) Data + Tag L1D (16KB) Tag Bus Unit Error Log Timer ECC protected ITLB1 (32) DTLB1 (32) IPC (128) DTLB2 (128) HW Page Walker Bus Queues Data Poisoning Data Bus Addr Bus Parity protected System bus 21
22 DFT/DFM Feature Summary Feature Itanium Processor Itanium 2 Processor This work Scan Coverage 48K 140K 140K Scanout Coverage 5.5K 24K 24K Cache DAT Mode (major arrays) Yes Yes Yes L3 Redundancy / Repair N/A Dual Quad Weak-Write Test Mode Fixed Fixed Programmable IO DFT Basic IO Loopback Limited IO Loopback Enhanced IO Loopback Dynamic Frequency Adjustment Multi-cycle shrink/stretch Single cycle shrink/stretch Multi-cycle shrink/stretch On-die process monitors No No Yes 22
23 Thermal Protection Features Temperature Thermal Alert ETM Thermal Trip I res I diode ETM Throttle Thermal Trip Time Diode Reference 23
24 Frequency Shmoo PASS FAIL 1:9 Bus Period (ns) Core Frequency 1.80GHz 1.73GHz 1.66GHz 1.61GHz 1.55GHz 1.50GHz 1.45GHz 1.40GHz Frequency increased by 50% from previous generation 24
25 Summary The 2003 version of the Itanium 2 processor (Madison) delivers 2X larger on-die cache and 50% increase in frequency Compatible with today s Itanium 2-based systems Enterprise-class RAS, DFT and DFM features Largest on-die cache and transistor count ever reported for a microprocessor 6.4GB/s multi-drop bus interface Clock de-skew technique achieves 24ps skew in fuse-mode and 7ps in scan mode across entire die 25
New 130nm Itanium 2 Processors for 2003
New 130nm Itanium s for 003 Harry Muljono, Stefan usu, Brian Cherkauer, Jason Stinson Intel Corporation, Santa Clara, CA 1 Outline highlights Itanium processor evolution Block diagram Power dissipation
More informationA Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache
A Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache Stefan Rusu Intel Corporation Santa Clara, CA Intel and the Intel logo are registered trademarks of Intel Corporation or its subsidiaries in
More informationSam Naffziger. Gary Hammond. Next Generation Itanium Processor Overview. Lead Circuit Architect Microprocessor Technology Lab HP Corporation
Next Generation Itanium Processor Overview Gary Hammond Principal Architect Enterprise Platform Group Corporation August 27-30, 2001 Sam Naffziger Lead Circuit Architect Microprocessor Technology Lab HP
More informationItanium 2 Processor Microarchitecture Overview
Itanium 2 Processor Microarchitecture Overview Don Soltis, Mark Gibson Cameron McNairy, August 2002 Block Diagram F 16KB L1 I-cache Instr 2 Instr 1 Instr 0 M/A M/A M/A M/A I/A Template I/A B B 2 FMACs
More informationUpdate for New Implementations. As new implementations of the Itanium architecture
U Update for New Implementations As new implementations of the Itanium architecture are announced, we attempt to post appropriate updates on the support page for Itanium Architecture for Programmers: Understanding
More informationUNIT 8 1. Explain in detail the hardware support for preserving exception behavior during Speculation.
UNIT 8 1. Explain in detail the hardware support for preserving exception behavior during Speculation. July 14) (June 2013) (June 2015)(Jan 2016)(June 2016) H/W Support : Conditional Execution Also known
More information2GB DDR3 SDRAM 72bit SO-DIMM
2GB 72bit SO-DIMM Speed Max CAS Component Number of Part Number Bandwidth Density Organization Grade Frequency Latency Composition Rank 78.A2GCF.AF10C 10.6GB/sec 1333Mbps 666MHz CL9 2GB 256Mx72 256Mx8
More informationJim Keller. Digital Equipment Corp. Hudson MA
Jim Keller Digital Equipment Corp. Hudson MA ! Performance - SPECint95 100 50 21264 30 21164 10 1995 1996 1997 1998 1999 2000 2001 CMOS 5 0.5um CMOS 6 0.35um CMOS 7 0.25um "## Continued Performance Leadership
More informationBasics DRAM ORGANIZATION. Storage element (capacitor) Data In/Out Buffers. Word Line. Bit Line. Switching element HIGH-SPEED MEMORY SYSTEMS
Basics DRAM ORGANIZATION DRAM Word Line Bit Line Storage element (capacitor) In/Out Buffers Decoder Sense Amps... Bit Lines... Switching element Decoder... Word Lines... Memory Array Page 1 Basics BUS
More informationINTEL Architectures GOPALAKRISHNAN IYER FALL 2009 ELEC : Computer Architecture and Design
INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 GBI0001@AUBURN.EDU ELEC 6200-001: Computer Architecture and Design Silicon Technology Moore s law Moore's Law describes a long-term trend in the history
More informationSeveral Common Compiler Strategies. Instruction scheduling Loop unrolling Static Branch Prediction Software Pipelining
Several Common Compiler Strategies Instruction scheduling Loop unrolling Static Branch Prediction Software Pipelining Basic Instruction Scheduling Reschedule the order of the instructions to reduce the
More informationGemini: Sanjiv Kapil. A Power-efficient Chip Multi-Threaded (CMT) UltraSPARC Processor. Gemini Architect Sun Microsystems, Inc.
Gemini: A Power-efficient Chip Multi-Threaded (CMT) UltraSPARC Processor Sanjiv Kapil Gemini Architect Sun Microsystems, Inc. Design Goals Designed for compute-dense, transaction oriented systems (webservers,
More informationCS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory
CS65 Computer Architecture Lecture 9 Memory Hierarchy - Main Memory Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 9: Main Memory 9-/ /6/ A. Sohn Memory Cycle Time 5
More informationGigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004
Gigascale Integration Design Challenges & Opportunities Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Outline CMOS technology challenges Technology, circuit and μarchitecture solutions Integration
More informationOPENSPARC T1 OVERVIEW
Chapter Four OPENSPARC T1 OVERVIEW Denis Sheahan Distinguished Engineer Niagara Architecture Group Sun Microsystems Creative Commons 3.0United United States License Creative CommonsAttribution-Share Attribution-Share
More informationIntel Enterprise Processors Technology
Enterprise Processors Technology Kosuke Hirano Enterprise Platforms Group March 20, 2002 1 Agenda Architecture in Enterprise Xeon Processor MP Next Generation Itanium Processor Interconnect Technology
More informationEECS 322 Computer Architecture Superpipline and the Cache
EECS 322 Computer Architecture Superpipline and the Cache Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses powerpoint animation: please viewshow Summary:
More informationIntel released new technology call P6P
P6 and IA-64 8086 released on 1978 Pentium release on 1993 8086 has upgrade by Pipeline, Super scalar, Clock frequency, Cache and so on But 8086 has limit, Hard to improve efficiency Intel released new
More informationIMM128M72D1SOD8AG (Die Revision F) 1GByte (128M x 72 Bit)
Product Specification Rev. 1.0 2015 IMM128M72D1SOD8AG (Die Revision F) 1GByte (128M x 72 Bit) 1GB DDR Unbuffered SO-DIMM RoHS Compliant Product Product Specification 1.0 1 IMM128M72D1SOD8AG Version: Rev.
More informationReal Time Embedded Systems
Real Time Embedded Systems " Memories " rene.beuchat@epfl.ch LAP/ISIM/IC/EPFL Chargé de cours LSN/hepia Prof. HES 1998-2008 2 General classification of electronic memories Non-volatile Memories ROM PROM
More informationPOWER7: IBM's Next Generation Server Processor
Hot Chips 21 POWER7: IBM's Next Generation Server Processor Ronald Kalla Balaram Sinharoy POWER7 Chief Engineer POWER7 Chief Core Architect Acknowledgment: This material is based upon work supported by
More informationKiloCore: A 32 nm 1000-Processor Array
KiloCore: A 32 nm 1000-Processor Array Brent Bohnenstiehl, Aaron Stillmaker, Jon Pimentel, Timothy Andreas, Bin Liu, Anh Tran, Emmanuel Adeagbo, Bevan Baas University of California, Davis VLSI Computation
More informationThis Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources
This Unit: Putting It All Together CIS 501 Computer Architecture Unit 12: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital Circuits
More informationIMM64M72D1SCS8AG (Die Revision D) 512MByte (64M x 72 Bit)
Product Specification Rev. 1.0 2015 IMM64M72D1SCS8AG (Die Revision D) 512MByte (64M x 72 Bit) RoHS Compliant Product Product Specification 1.0 1 IMM64M72D1SCS8AG Version: Rev. 1.0, MAY 2015 1.0 - Initial
More informationThe Alpha Microprocessor: Out-of-Order Execution at 600 Mhz. R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA
The Alpha 21264 Microprocessor: Out-of-Order ution at 600 Mhz R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA 1 Some Highlights z Continued Alpha performance leadership y 600 Mhz operation in
More informationDDR SDRAM UDIMM. Draft 9/ 9/ MT18VDDT6472A 512MB 1 MT18VDDT12872A 1GB For component data sheets, refer to Micron s Web site:
DDR SDRAM UDIMM MT18VDDT6472A 512MB 1 MT18VDDT12872A 1GB For component data sheets, refer to Micron s Web site: www.micron.com 512MB, 1GB (x72, ECC, DR) 184-Pin DDR SDRAM UDIMM Features Features 184-pin,
More informationMultilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823
More informationUnit 11: Putting it All Together: Anatomy of the XBox 360 Game Console
Computer Architecture Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Milo Martin & Amir Roth at University of Pennsylvania! Computer Architecture
More informationDDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture. Paul Washkewicz Vice President Marketing, Inphi
DDR3 Memory Buffer: Buffer at the Heart of the LRDIMM Architecture Paul Washkewicz Vice President Marketing, Inphi Theme Challenges with Memory Bandwidth Scaling How LRDIMM Addresses this Challenge Under
More informationComputer Science 146. Computer Architecture
Computer Architecture Spring 2004 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture 12: Hardware Assisted Software ILP and IA64/Itanium Case Study Lecture Outline Review of Global Scheduling,
More informationPOWER7: IBM's Next Generation Server Processor
POWER7: IBM's Next Generation Server Processor Acknowledgment: This material is based upon work supported by the Defense Advanced Research Projects Agency under its Agreement No. HR0011-07-9-0002 Outline
More informationInterconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp
Interconnect Challenges in a Many Core Compute Environment Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Agenda Microprocessor general trends Implications Tradeoffs Summary
More informationCPI IPC. 1 - One At Best 1 - One At best. Multiple issue processors: VLIW (Very Long Instruction Word) Speculative Tomasulo Processor
Single-Issue Processor (AKA Scalar Processor) CPI IPC 1 - One At Best 1 - One At best 1 From Single-Issue to: AKS Scalar Processors CPI < 1? How? Multiple issue processors: VLIW (Very Long Instruction
More informationIMM64M64D1SOD16AG (Die Revision D) 512MByte (64M x 64 Bit)
Product Specification Rev. 2.0 2015 IMM64M64D1SOD16AG (Die Revision D) 512MByte (64M x 64 Bit) 512MB DDR Unbuffered SO-DIMM RoHS Compliant Product Product Specification 2.0 1 IMM64M64D1SOD16AG Version:
More informationA 65nm LEVEL-1 CACHE FOR MOBILE APPLICATIONS
A 65nm LEVEL-1 CACHE FOR MOBILE APPLICATIONS ABSTRACT We describe L1 cache designed for digital signal processor (DSP) core. The cache is 32KB with variable associativity (4 to 16 ways) and is pseudo-dual-ported.
More informationThe University of Adelaide, School of Computer Science 13 September 2018
Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per
More information4GB Unbuffered VLP DDR3 SDRAM DIMM with SPD
4GB Unbuffered VLP DDR3 SDRAM DIMM with SPD Ordering Information Part Number Bandwidth Speed Grade Max Frequency CAS Latency Density Organization Component Composition 78.B1GE3.AFF0C 12.8GB/sec 1600Mbps
More information1. NoCs: What s the point?
1. Nos: What s the point? What is the role of networks-on-chip in future many-core systems? What topologies are most promising for performance? What about for energy scaling? How heavily utilized are Nos
More information2GB DDR3 SDRAM SODIMM with SPD
2GB DDR3 SDRAM SODIMM with SPD Ordering Information Part Number Bandwidth Speed Grade Max Frequency CAS Latency Density Organization Component Composition Number of Rank 78.A2GC6.AF1 10.6GB/sec 1333Mbps
More informationIMM128M64D1DVD8AG (Die Revision F) 1GByte (128M x 64 Bit)
Product Specification Rev. 1.0 2015 IMM128M64D1DVD8AG (Die Revision F) 1GByte (128M x 64 Bit) 1GB DDR VLP Unbuffered DIMM RoHS Compliant Product Product Specification 1.0 1 IMM128M64D1DVD8AG Version: Rev.
More informationMainstream Computer System Components
Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II
ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,
More informationHP PA-8000 RISC CPU. A High Performance Out-of-Order Processor
The A High Performance Out-of-Order Processor Hot Chips VIII IEEE Computer Society Stanford University August 19, 1996 Hewlett-Packard Company Engineering Systems Lab - Fort Collins, CO - Cupertino, CA
More informationThe Memory Hierarchy 1
The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow
More informationSPARC64 X: Fujitsu s New Generation 16 Core Processor for the next generation UNIX servers
X: Fujitsu s New Generation 16 Processor for the next generation UNIX servers August 29, 2012 Takumi Maruyama Processor Development Division Enterprise Server Business Unit Fujitsu Limited All Rights Reserved,Copyright
More informationCOSC 6385 Computer Architecture - Memory Hierarchies (II)
COSC 6385 Computer Architecture - Memory Hierarchies (II) Edgar Gabriel Spring 2018 Types of cache misses Compulsory Misses: first access to a block cannot be in the cache (cold start misses) Capacity
More informationOrganization Row Address Column Address Bank Address Auto Precharge 256Mx4 (1GB) based module A0-A13 A0-A9 BA0-BA2 A10
GENERAL DESCRIPTION The Gigaram GR2DR4BD-E4GBXXXVLP is a 512M bit x 72 DDDR2 SDRAM high density ECC REGISTERED DIMM. The GR2DR4BD-E4GBXXXVLP consists of eighteen CMOS 512M x 4 STACKED DDR2 SDRAMs for 4GB
More informationLecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5)
Lecture: Large Caches, Virtual Memory Topics: cache innovations (Sections 2.4, B.4, B.5) 1 Techniques to Reduce Cache Misses Victim caches Better replacement policies pseudo-lru, NRU Prefetching, cache
More informationDesign Objectives of the 0.35µm Alpha Microprocessor (A 500MHz Quad Issue RISC Microprocessor)
Design Objectives of the 0.35µm Alpha 21164 Microprocessor (A 500MHz Quad Issue RISC Microprocessor) Gregg Bouchard Digital Semiconductor Digital Equipment Corporation Hudson, MA 1 Outline 0.35µm Alpha
More informationPOWER4 Test Chip. Bradley D. McCredie Senior Technical Staff Member IBM Server Group, Austin. August 14, 1999
Bradley D. McCredie Senior Technical Staff Member Server Group, Austin August 14, 1999 Presentation Overview Design objectives Chip overview Technology Circuits Implementation Results Test Chip Objectives
More informationMemory latency: Affects cache miss penalty. Measured by:
Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory
More informationOrganization Row Address Column Address Bank Address Auto Precharge 128Mx8 (1GB) based module A0-A13 A0-A9 BA0-BA2 A10
GENERAL DESCRIPTION The Gigaram is ECC Registered Dual-Die DIMM with 1.25inch (30.00mm) height based on DDR2 technology. DIMMs are available as ECC modules in 256Mx72 (2GByte) organization and density,
More informationMemory latency: Affects cache miss penalty. Measured by:
Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory
More informationECE 571 Advanced Microprocessor-Based Design Lecture 24
ECE 571 Advanced Microprocessor-Based Design Lecture 24 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 25 April 2013 Project/HW Reminder Project Presentations. 15-20 minutes.
More informationM2U1G64DS8HB1G and M2Y1G64DS8HB1G are unbuffered 200-Pin Double Data Rate (DDR) Synchronous DRAM Unbuffered Dual In-Line
184 pin Based on DDR400/333 512M bit Die B device Features 184 Dual In-Line Memory Module (DIMM) based on 110nm 512M bit die B device Performance: Speed Sort PC2700 PC3200 6K DIMM Latency 25 3 5T Unit
More informationDesign of Clock Distribution in High Performance Processors
Design of Clock Distribution in High Performance Processors Ian Young Intel Senior Fellow and Director, Advanced Circuits and Technology Integration (ACTI) Technology Manufacturing Group (TMG) Intel Corporation,
More informationIMME256M64D2SOD8AG (Die Revision E) 2GByte (256M x 64 Bit)
Product Specification Rev. 1.0 2015 IMME256M64D2SOD8AG (Die Revision E) 2GByte (256M x 64 Bit) 2GB DDR2 Unbuffered SO-DIMM By ECC DRAM RoHS Compliant Product Product Specification 1.0 1 IMME256M64D2SOD8AG
More informationCOSC 6385 Computer Architecture - Memory Hierarchies (III)
COSC 6385 Computer Architecture - Memory Hierarchies (III) Edgar Gabriel Spring 2014 Memory Technology Performance metrics Latency problems handled through caches Bandwidth main concern for main memory
More informationCS 152 Computer Architecture and Engineering
CS 152 Computer Architecture and Engineering Lecture 19 Advanced Processors III 2006-11-2 John Lazzaro (www.cs.berkeley.edu/~lazzaro) TAs: Udam Saini and Jue Sun www-inst.eecs.berkeley.edu/~cs152/ 1 Last
More informationBOBCAT: AMD S LOW-POWER X86 PROCESSOR
ARCHITECTURES FOR MULTIMEDIA SYSTEMS PROF. CRISTINA SILVANO LOW-POWER X86 20/06/2011 AMD Bobcat Small, Efficient, Low Power x86 core Excellent Performance Synthesizable with smaller number of custom arrays
More informationMainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation
Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer
More informationLecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed.
Lecture 13: SRAM Slides courtesy of Deming Chen Slides based on the initial set from David Harris CMOS VLSI Design Outline Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports
More informationEach Milliwatt Matters
Each Milliwatt Matters Ultra High Efficiency Application Processors Govind Wathan Product Manager, CPG ARM Tech Symposia China 2015 November 2015 Ultra High Efficiency Processors Used in Diverse Markets
More informationFuture of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1
Future of Interconnect Fabric A ontrarian View Shekhar Borkar June 13, 2010 Intel orp. 1 Outline Evolution of interconnect fabric On die network challenges Some simple contrarian proposals Evaluation and
More informationLE4ASS21PEH 16GB Unbuffered 2048Mx64 DDR4 SO-DIMM 1.2V Up to PC CL
LE4ASS21PEH 16GB Unbuffered 2048Mx64 DDR4 SO-DIMM 1.2V Up to PC4-2133 CL 15-15-15 General Description This Legacy device is a JEDEC standard unbuffered SO-DIMM module, based on CMOS DDR4 SDRAM technology,
More information256MEGX72 (DDR2-SOCDIMM W/PLL)
www.centon.com MEMORY SPECIFICATIONS 256MEGX72 (DDR2-SOCDIMM W/PLL) 128MX8 BASED Lead-Free 268,435,456 words x 72Bit Double Data Rate Memory Module Centon's 2GB DDR2 ECC PLL SODIMM Memory Module is 268,435,456
More informationEEM 486: Computer Architecture. Lecture 9. Memory
EEM 486: Computer Architecture Lecture 9 Memory The Big Picture Designing a Multiple Clock Cycle Datapath Processor Control Memory Input Datapath Output The following slides belong to Prof. Onur Mutlu
More informationDDR2 SDRAM UDIMM MT8HTF12864AZ 1GB
Features DDR2 SDRAM UDIMM MT8HTF12864AZ 1GB For component data sheets, refer to Micron's Web site: www.micron.com Figure 1: 240-Pin UDIMM (MO-237 R/C D) Features 240-pin, unbuffered dual in-line memory
More informationGetting CPI under 1: Outline
CMSC 411 Computer Systems Architecture Lecture 12 Instruction Level Parallelism 5 (Improving CPI) Getting CPI under 1: Outline More ILP VLIW branch target buffer return address predictor superscalar more
More informationMain Memory. EECC551 - Shaaban. Memory latency: Affects cache miss penalty. Measured by:
Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row (~every 8 msec). Static RAM may be
More informationDDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB
DDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB For component data sheets, refer to Micron s Web site: www.micron.com 512MB, 1GB, 2GB (x64, DR) 184-Pin DDR SDRAM UDIMM Features
More information1. The values of t RCD and t RP for -335 modules show 18ns to align with industry specifications; actual DDR SDRAM device specifications are 15ns.
UDIMM MT4VDDT1664A 128MB MT4VDDT3264A 256MB For component data sheets, refer to Micron s Web site: www.micron.com 128MB, 256MB (x64, SR) 184-Pin UDIMM Features Features 184-pin, unbuffered dual in-line
More informationCS425 Computer Systems Architecture
CS425 Computer Systems Architecture Fall 2017 Multiple Issue: Superscalar and VLIW CS425 - Vassilis Papaefstathiou 1 Example: Dynamic Scheduling in PowerPC 604 and Pentium Pro In-order Issue, Out-of-order
More informationCPI < 1? How? What if dynamic branch prediction is wrong? Multiple issue processors: Speculative Tomasulo Processor
1 CPI < 1? How? From Single-Issue to: AKS Scalar Processors Multiple issue processors: VLIW (Very Long Instruction Word) Superscalar processors No ISA Support Needed ISA Support Needed 2 What if dynamic
More informationECE7995 (4) Basics of Memory Hierarchy. [Adapted from Mary Jane Irwin s slides (PSU)]
ECE7995 (4) Basics of Memory Hierarchy [Adapted from Mary Jane Irwin s slides (PSU)] Major Components of a Computer Processor Devices Control Memory Input Datapath Output Performance Processor-Memory Performance
More informationThe Alpha and Microprocessors:
The Alpha 21364 and 21464 icroprocessors: Continuing the Performance Lead Beyond Y2K Shubu ukherjee, Ph.D. Principal Hardware Engineer VSSAD Labs, Alpha Development Group Compaq Computer Corporation Shrewsbury,
More informationIMME256M64D2DUD8AG (Die Revision E) 2GByte (256M x 64 Bit)
Product Specification Rev. 1.0 2015 IMME256M64D2DUD8AG (Die Revision E) 2GByte (256M x 64 Bit) 2GB DDR2 Unbuffered DIMM By ECC DRAM RoHS Compliant Product Product Specification 1.0 1 IMME256M64D2DUD8AG
More informationThe Design of the KiloCore Chip
The Design of the KiloCore Chip Aaron Stillmaker*, Brent Bohnenstiehl, Bevan Baas DAC 2017: Design Challenges of New Processor Architectures University of California, Davis VLSI Computation Laboratory
More informationDDR SDRAM SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB
SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB 512MB, 1GB (x64, DR) 200-Pin DDR SODIMM Features For component data sheets, refer to Micron s Web site: www.micron.com Features 200-pin, small-outline dual
More informationUnleashing the Power of Embedded DRAM
Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers
More informationCSE 820 Graduate Computer Architecture. week 6 Instruction Level Parallelism. Review from Last Time #1
CSE 820 Graduate Computer Architecture week 6 Instruction Level Parallelism Based on slides by David Patterson Review from Last Time #1 Leverage Implicit Parallelism for Performance: Instruction Level
More information+1 (479)
Memory Courtesy of Dr. Daehyun Lim@WSU, Dr. Harris@HMC, Dr. Shmuel Wimer@BIU and Dr. Choi@PSU http://csce.uark.edu +1 (479) 575-6043 yrpeng@uark.edu Memory Arrays Memory Arrays Random Access Memory Serial
More informationAn Overview of Standard Cell Based Digital VLSI Design
An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,
More informationENEE 759H, Spring 2005 Memory Systems: Architecture and
SLIDE, Memory Systems: DRAM Device Circuits and Architecture Credit where credit is due: Slides contain original artwork ( Jacob, Wang 005) Overview Processor Processor System Controller Memory Controller
More informationAgenda. What is the Itanium Architecture? Terminology What is the Itanium Architecture? Thomas Siebold Technology Consultant Alpha Systems Division
What is the Itanium Architecture? Thomas Siebold Technology Consultant Alpha Systems Division thomas.siebold@hp.com Agenda Terminology What is the Itanium Architecture? 1 Terminology Processor Architectures
More informationIMM64M64D1DVS8AG (Die Revision D) 512MByte (64M x 64 Bit)
Product Specification Rev. 1.0 2015 IMM64M64D1DVS8AG (Die Revision D) 512MByte (64M x 64 Bit) 512MB DDR VLP Unbuffered DIMM RoHS Compliant Product Product Specification 1.0 1 IMM64M64D1DVS8AG Version:
More informationNetworks for Multi-core Chips A A Contrarian View. Shekhar Borkar Aug 27, 2007 Intel Corp.
Networks for Multi-core hips A A ontrarian View Shekhar Borkar Aug 27, 2007 Intel orp. 1 Outline Multi-core system outlook On die network challenges A simple contrarian proposal Benefits Summary 2 A Sample
More informationEE241 - Spring 2007 Advanced Digital Integrated Circuits. Announcements
EE241 - Spring 2007 Advanced Digital Integrated Circuits Lecture 22: SRAM Announcements Homework #4 due today Final exam on May 8 in class Project presentations on May 3, 1-5pm 2 1 Class Material Last
More informationUMBC. Rubini and Corbet, Linux Device Drivers, 2nd Edition, O Reilly. Systems Design and Programming
Systems Design and Programming Instructor: Professor Jim Plusquellic Text: Barry B. Brey, The Intel Microprocessors, 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium and Pentium Pro Processor Architecture,
More informationCS 33. Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.
CS 33 Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved. Hyper Threading Instruction Control Instruction Control Retirement Unit
More informationCS 152 Computer Architecture and Engineering
CS 152 Computer Architecture and Engineering Lecture 22 Advanced Processors III 2005-4-12 John Lazzaro (www.cs.berkeley.edu/~lazzaro) TAs: Ted Hong and David Marquardt www-inst.eecs.berkeley.edu/~cs152/
More informationMemory Hierarchies 2009 DAT105
Memory Hierarchies Cache performance issues (5.1) Virtual memory (C.4) Cache performance improvement techniques (5.2) Hit-time improvement techniques Miss-rate improvement techniques Miss-penalty improvement
More informationCOEN-4730 Computer Architecture Lecture 12. Testing and Design for Testability (focus: processors)
1 COEN-4730 Computer Architecture Lecture 12 Testing and Design for Testability (focus: processors) Cristinel Ababei Dept. of Electrical and Computer Engineering Marquette University 1 Outline Testing
More informationDDR3 Memory for Intel-based G6 Servers
DDR3 Memory for Intel-based G6 Servers March 2009 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. New memory technology with G6; DDR-3
More informationThe Alpha Microprocessor: Out-of-Order Execution at 600 MHz. Some Highlights
The Alpha 21264 Microprocessor: Out-of-Order ution at 600 MHz R. E. Kessler Compaq Computer Corporation Shrewsbury, MA 1 Some Highlights Continued Alpha performance leadership 600 MHz operation in 0.35u
More informationThe Processor That Don't Cost a Thing
The Processor That Don't Cost a Thing Peter Hsu, Ph.D. Peter Hsu Consulting, Inc. http://cs.wisc.edu/~peterhsu DRAM+Processor Commercial demand Heat stiffling industry's growth Heat density limits small
More informationDDR SDRAM UDIMM MT8VDDT3264A 256MB MT8VDDT6464A 512MB For component data sheets, refer to Micron s Web site:
DDR SDRAM UDIMM MT8VDDT3264A 256MB MT8VDDT6464A 512MB For component data sheets, refer to Micron s Web site: www.micron.com 256MB, 512MB (x64, SR) 184-Pin DDR SDRAM UDIMM Features Features 184-pin, unbuffered
More informationMemory Hierarchy and Caches
Memory Hierarchy and Caches COE 301 / ICS 233 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline
More informationCOMPUTER ARCHITECTURES
COMPUTER ARCHITECTURES Random Access Memory Technologies Gábor Horváth BUTE Department of Networked Systems and Services ghorvath@hit.bme.hu Budapest, 2019. 02. 24. Department of Networked Systems and
More information