
The VLSI Interconnect Challenge
Avinoam Kolodny, Electrical Engineering Department, Technion, Israel Institute of Technology

VLSI challenges: system complexity; performance; tolerance to digital noise and faults; more challenges. The dominant challenge is power dissipation!

Interconnect is the crux of the problem.
Old view of VLSI: speed and power are dominated by logic gates; wires are ideal.
New view: logic is fast and virtually free; speed and power are limited by wires.

Outline of this talk: background of the VLSI interconnect challenge; implications for energy-efficient computing; research directions.

2001: Extrapolation towards a power crisis
[Figure: power density (W/cm^2, logarithmic axis from 1 to 10000) versus year (1970 to 2020) for the 4004, 8080, 8085, 8086, i286, i386, i486, Pentium, Pentium Pro, Pentium 3, and Pentium 4, with the NMOS-to-CMOS transition marked and hot plate, nuclear reactor, and rocket nozzle shown as reference levels.]

VLSI hits the power wall
[Figure: the same power density versus year plot.]

Interconnect power: a case study (2004)
Intel's Pentium-M, a low-power microprocessor in 0.13 micron CMOS. Bit-transportation energy is larger than computation energy!

Chips are like cities: complexity is shown in connectivity.

Technology scaling: faster transistors, slower wires. In each generation of technology there are more transistors and more interconnect wires.

Technology scaling: faster transistors, slower wires. Trying to keep wire resistance in check leads to larger capacitances.
[Figure: a bundle of parallel wires numbered 1, 2, 3, ..., k.]
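The resistance-versus-capacitance tension noted above can be sketched with a first-order model. This is a minimal illustration, not the model used in the talk: it assumes a rectangular wire, parallel-plate capacitance to the layers above and below and to two side neighbours, no fringing fields, and made-up dimensions and material constants.

```python
# First-order wire model. All constants and dimensions are illustrative assumptions.
RHO = 1.7e-8             # ohm*m, roughly bulk copper
EPS = 3.9 * 8.854e-12    # F/m, assumes an SiO2-like dielectric


def wire_r_per_m(width, height):
    """Resistance per metre of a rectangular wire."""
    return RHO / (width * height)


def wire_c_per_m(width, height, spacing, ild):
    """Parallel-plate capacitance per metre (fringing ignored)."""
    c_vertical = 2 * EPS * width / ild       # to the layers above and below
    c_lateral = 2 * EPS * height / spacing   # to the two side neighbours
    return c_vertical + c_lateral


def describe(width, height, spacing, ild):
    return wire_r_per_m(width, height), wire_c_per_m(width, height, spacing, ild)


base = describe(200e-9, 200e-9, 200e-9, 200e-9)
shrunk = describe(100e-9, 100e-9, 100e-9, 100e-9)  # every dimension halved
tall = describe(100e-9, 200e-9, 100e-9, 100e-9)    # height kept to limit resistance

for name, (r, c) in (("shrunk", shrunk), ("tall", tall)):
    print(f"{name}: R x{r / base[0]:.2f}, C x{c / base[1]:.2f} relative to base")
```

Under these assumptions, keeping the wire tall halves the resistance penalty of the shrink but raises capacitance per unit length, because the sidewalls facing the neighbours become relatively larger.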

If bits were cars...

The nature of design for low power: there is no critical root cause, because power is cumulative. Power-saving efforts are needed at all levels!

3 types of improvement:
1. Reduce waste of energy
2. (Optimal) trade off power with delay (or another metric)
3. Change the algorithm or computational task

Wire layout optimization: wire widths and spaces in a wire bundle (ICCAD).

Wire layout optimization: finding optimal wire widths and spaces under delay constraints.

Wire layout optimization: optimal ordering theorem for power. Given an interconnect channel with wires of uniform width W, use symmetric hill ordering according to the activity factors of the signals. (Slide sources: Integration, 2014; Integration, 2007.)

Circuit optimization: optimal power-delay tradeoff for logic paths. For a 2.5% delay increase, get 12 x 2.5 = 30% energy reduction! For a 20% delay increase, get 2 x 20 = 40% energy reduction.

3-D integrated circuit technology
[Figure: 1st, 2nd, and 3rd planes stacked on a bulk CMOS substrate and connected by through-silicon vias (TSVs).*]
*R. J. Gutmann et al., Three-Dimensional (3D) ICs: A Technology Platform for Integrated Systems and Opportunities for New Polymeric Adhesives, Proceedings of the Conference on Polymers and Adhesives in Microelectronics and Photonics, pp. 173-180, October 2001
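The two power-delay figures quoted for logic paths follow from multiplying a delay increase by an energy-per-delay factor. The short sketch below only reproduces that arithmetic; the function name and the word "sensitivity" are assumptions introduced here, and the factors 12 and 2 are taken directly from the slide text.

```python
def energy_reduction_pct(delay_increase_pct, sensitivity):
    """Energy saved (%) = sensitivity * delay increase (%), as on the slide."""
    return sensitivity * delay_increase_pct


# (delay increase in %, percent of energy saved per percent of delay), from the slide
for delay_pct, sensitivity in ((2.5, 12), (20, 2)):
    saved = energy_reduction_pct(delay_pct, sensitivity)
    print(f"{delay_pct}% slower -> {saved}% less energy")
```

Read together, the two points suggest diminishing returns: the first few percent of delay slack buy far more energy per percent than a large relaxation does.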

Most power savings can be made at high abstraction levels.

Pollack's rule on power efficiency of uniprocessors:
$\text{Performance} = \alpha_1 \sqrt{\text{Area}}$, $\text{Power} = \alpha_2 \cdot \text{Area}$

Architecture optimization: processor system evolution to chip multi-processors (CMP).

Processor architectures: uni-core, symmetric multicore, asymmetric (heterogeneous).
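Pollack's rule, read here as performance growing with the square root of core area while power grows linearly with area, can be made concrete with a small sketch. The coefficients are set to 1 in arbitrary units and the areas are hypothetical, so only the ratios are meaningful.

```python
import math

ALPHA_1 = 1.0  # performance coefficient (assumed, arbitrary units)
ALPHA_2 = 1.0  # power coefficient (assumed, arbitrary units)


def core_performance(area):
    """Pollack's rule: performance scales with the square root of area."""
    return ALPHA_1 * math.sqrt(area)


def core_power(area):
    """Power is taken as proportional to area."""
    return ALPHA_2 * area


def perf_per_watt(area):
    return core_performance(area) / core_power(area)


for area in (1, 4, 16):
    print(f"area {area:2d}: perf {core_performance(area):.1f}, "
          f"power {core_power(area):.1f}, perf/power {perf_per_watt(area):.2f}")

# Same total area and power, spent two ways (assumes perfectly parallel work):
one_big_core = core_performance(4)
four_small_cores = 4 * core_performance(1)
print(one_big_core, four_small_cores)
```

Under this model a core four times larger is only twice as fast, which is the efficiency argument behind moving from a single large core to a chip multiprocessor when the work can be parallelized.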

Architecture optimization: asymmetric multi-core performance
[Figure: ACCMP performance vs. power (25% serial code); axes: relative power, relative performance.]

Classes of replicated cores. Standard modules (processors, accelerators, cache banks, ...). Network on Chip (NoC). Power management: different clocks, different operating voltages, dark silicon.

Architecture optimization: Network-on-Chip (NoC). If a chip is like a city, a Network-on-Chip (NoC) is like a subway system.
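Why an asymmetric design helps when 25% of the code is serial can be sketched with a simple Amdahl-style model. This is an assumed model, not necessarily the one behind the ACCMP plot: core performance is taken as the square root of core area, the small cores have unit area, and a fixed area budget stands in for the power budget (power proportional to area). The budget of 64 units and the large-core size of 16 are hypothetical.

```python
import math


def core_perf(area):
    """Assumed Pollack-style performance of one core."""
    return math.sqrt(area)


def symmetric_speedup(total_area, core_area, serial_fraction):
    """All cores identical; serial code runs on one of them."""
    n_cores = total_area / core_area
    perf = core_perf(core_area)
    return 1.0 / (serial_fraction / perf
                  + (1.0 - serial_fraction) / (perf * n_cores))


def asymmetric_speedup(total_area, big_area, serial_fraction):
    """One large core runs the serial code; it joins the unit-area
    small cores for the parallel part."""
    small_cores = total_area - big_area
    big = core_perf(big_area)
    return 1.0 / (serial_fraction / big
                  + (1.0 - serial_fraction) / (big + small_cores))


SERIAL = 0.25  # 25% serial code, as in the slide
sym = symmetric_speedup(64, 1, SERIAL)
asym = asymmetric_speedup(64, 16, SERIAL)
print(f"symmetric: {sym:.2f}x, asymmetric: {asym:.2f}x")
```

With these numbers the serial phase dominates the symmetric design, while the single large core in the asymmetric design shortens it at the cost of a few small cores.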

Issues addressed by NoC:
Global wire design (delay, power, noise, scalability, reliability issues)
System integration productivity (key to modular design)
Multi-core processor systems (key to power-efficient computing)

Interconnect-aware and NoC-aware architectural research: accessing on-chip cache banks through a NoC.

Where to store the shared data? A small number of lines, shared by many processors, is accessed numerous times. What can be done better? Bring shared data closer to all processors. Preserve vicinity of private data.
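The benefit of bringing shared data closer to all processors can be illustrated by counting hops. The sketch assumes a 5 x 5 mesh with one processor and one cache bank per tile, and a hop count equal to the Manhattan distance between tiles; the talk does not specify the topology or routing, so these are illustrative choices.

```python
def hops(a, b):
    """Manhattan distance between two tiles, used as the assumed hop count."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def total_hops(bank, processors):
    """Hops summed over one access from every processor to the given bank."""
    return sum(hops(bank, p) for p in processors)


SIZE = 5  # hypothetical 5 x 5 mesh
tiles = [(x, y) for x in range(SIZE) for y in range(SIZE)]

corner_cost = total_hops((0, 0), tiles)
center_cost = total_hops((2, 2), tiles)
best_bank = min(tiles, key=lambda bank: total_hops(bank, tiles))

print(f"corner bank: {corner_cost} hops, center bank: {center_cost} hops")
print(f"best bank for a line shared by all processors: {best_bank}")
```

A line used by a single processor is cheapest in that processor's own tile, which is the complementary goal of preserving the vicinity of private data.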

Memory bottleneck
[Figure: CPU (control unit, arithmetic/logic unit) connected to memory.]

Reducing distances by embedding memory in execution units.

Memristor devices: a sea of memory, dense and fast.

Towards memory-intensive machines. Constant-throughput curves: increase on-die memory!
[Figure: constant-throughput curves; labels: throughput, bandwidth, memory.]

Logic within the memory: beyond von Neumann architecture.
[Figure: CPU (control unit, arithmetic/logic unit) and memory.]

Summary
VLSI power is dominated by interconnect!
New architectures are driven by interconnect distances, latencies, and power.