ECE 5745 Complex Digital ASIC Design Course Overview

Size: px
Start display at page:

Download "ECE 5745 Complex Digital ASIC Design Course Overview"

Transcription

1 ECE 5745 Complex Digital ASIC Design Course Overview Christopher Batten School of Electrical and Computer Engineering Cornell University

2 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 2 / 48

3 The Computer Engineering Stack Application Gap too large to bridge in one step (but there are exceptions e.g., magnetic compass) Technology In its broadest definition, computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using available manufacturing technologies ECE 5745 Course Overview 3 / 48

4 The Computer Engineering Stack Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology In its broadest definition, computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using available manufacturing technologies ECE 5745 Course Overview 3 / 48

5 What is Computer Architecture? Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology Application Requirements Suggest how to improve architecture Provide revenue to fund development Architecture provides feedback to guide application and technology research directions Technology Constraints Restrict what can be done efficiently New technologies make new arch possible In its broadest definition, computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using available manufacturing technologies ECE 5745 Course Overview 4 / 48

6 Key Metrics in Computer Architecture I$ Network I$ I$ I$ I Primary Metrics. Execution time (cycles/task). Energy (Joules/task). Cycle time (ns/cycle). Area (µm 2 ) P D$ P P P Network D$ D$ D$ Network I Secondary Metrics. Performance (ns/task). Average power (Watts). Peak power (Watts). Cost ($). Design complexity. Reliability. Flexibility Discuss qualitative first-order analysis from ECE 4750 on board ECE 5745 Course Overview 5 / 48

7 Unanswered Questions from ECE 4750 P I$ D$ Network Accelerated Instructions Xcel Xcel Network Network I$ P D$ D$ D$ I How can we quantitatively evaluate area, cycle time, and energy? I How do we actually implement processors, memories, and networks in a real chip? I How should we implement/analyze application-specific accelerators?. Very loosely coupled memory-mapped accelerators. More tightly coupled co-processor accelerators. Specialized instructions and functional units ECE 5745 Course Overview 6 / 48

8 ASIC: Application-Specific Integrated Circuit Network Out-of-Order Superscalar Superpipelined C D P I$ Accelerated Instructions Xcel Xcel Network I$ P Energy (Joules per Task) Simple Proc Superscalar w/ Deeper Pipelines B A Multicore E F Processor Power Constraint Specialized Accelerators D$ D$ D$ D$ Network Performance (Tasks per Second) ECE 5745 Course Overview 7 / 48

9 ASIC: Application-Specific Integrated Circuit P I$ D$ Network Accelerated Instructions Xcel Xcel Network I$ P D$ D$ D$ Energy Efficiency (Tasks per Joule) Design Performance Constraint Embedded Architectures Simple Processor Custom ASIC Less Flexible Accelerator More Flexible Accelerator Design Power Constraint Flexibility vs. Specialization High-Performance Architectures Performance (Tasks per Second) Network ECE 5745 Course Overview 8 / 48

10 Goal for ECE 5745 is to answer these questions! P I$ D$ Network Accelerated Instructions Xcel Xcel Network Network I$ P D$ D$ D$ I How can we quantitatively evaluate area, cycle time, and energy? I How do we actually implement processors, memories, and networks in a real chip? I How should we implement/analyze application-specific accelerators?. Very loosely coupled memory-mapped accelerators. More tightly coupled co-processor accelerators. Specialized instructions and functional units ECE 5745 Course Overview 9 / 48

11 Full Custom Design vs. Standard-Cell Design iece of full-custom multiplier array Full-custom layout in 1.0µm w/ 2 metal layers I Full-Custom Design (ECE 4740). Designer is free to do anything, anywhere; though team usually imposes some design discipline. Most time consuming design style; reserved for very high performance or very high volume chips (Intel microprocessors, RF power amps for cellphones) I Standard-Cell Design (ECE 5745). Fixed library of standard cells and SRAM memory generators. Register-transfer-level description is automatically mapped to this library of standard cells, then these cells are placed and routed automatically. Enables agile hardware design methodology ECE 5745 Course Overview 10 / 48

12 Standard-Cell Design Methodology Cells arranged in rows Mem 1 Mem 2 Generated memory arrays Clock Rail (not typical) Power Rails in M1 Clock Rail VDD Rail Well Contact under Power Rail Cell I/O on M2 Ripple carry adder with carry chain highlighted NAND2 GND Rail Flip-flop ECE 5745 Course Overview 11 / 48

13 Standard-Cell Design Methodology Execution Time (cycles/task) Design in HDL HDL Simulator Switching Activity Standard Cells Synthesis Gate-Level Model Place&Route Layout Area ( m2) Cycle Time (ns) Energy (J/task) Area ( m2) Cycle Time (ns) Energy (J/task) Power Analysis Energy (J/task) ECE 5745 Course Overview 12 / 48

14 Example Standard-Cell Chip Plot ECE 5745 Course Overview 13 / 48

15 What is Complex Digital ASIC Design? Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology Complex digital ASIC design is the process of quantitatively exploring the area, cycle time, execution time, and energy trade-offs of various application-specific accelerators (and general-purpose proc+mem+net) using automated standard-cell CAD tools and then to transform the most promising design to layout ready for fabrication ECE 5745 Course Overview 14 / 48

16 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 15 / 48

17 Course Structure Prereq Computer Architecture Part 2 Digital CMOS Circuits Part 1 ASIC Design Overview P M P M Part 3 CAD Algorithms ECE 5745 Course Overview 16 / 48

18 Part 1: ASIC Design Overview P M P M Topic 1 Hardware Description Languages Topic 4 Full-Custom Design Methodology Topic 6 Closing the Gap Topic 5 Automated Design Methodologies Topic 8 Testing and Verification Topic 3 CMOS Circuits Topic 2 CMOS Devices Topic 7 Clocking, Power Distribution, Packaging, and I/O ECE 5745 Course Overview 17 / 48

19 Part 2: Digital CMOS Circuits Topic 9 Combinational Logic Topic 11 Interconnect Topic 10 Sequential State ECE 5745 Course Overview 18 / 48

20 Part 3: CAD Algorithms Topic 12 Synthesis Algorithms Topic 13 Physical Design Automation Placement x = a'bc + a'bc' y = b'c' + ab' + ac x = a'b y = b'c' + ac RTL to Logic Synthesis Technology Independent Synthesis Technology Dependent Synthesis Global Routing Detailed Routing ECE 5745 Course Overview 19 / 48

21 Five-Week Design Project P I$ D$ Network Accelerated Instructions Xcel Xcel Network I$ P D$ D$ D$ Energy Efficiency (Tasks per Joule) Design Performance Constraint Embedded Architectures Simple Processor Custom ASIC Less Flexible Accelerator More Flexible Accelerator Design Power Constraint Flexibility vs. Specialization High-Performance Architectures Performance (Tasks per Second) Network ECE 5745 Course Overview 20 / 48

22 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 21 / 48

23 Design starts have been decreasing Number of ASIC Starts (in thousands) % / year decline Todd Austin, Preparing for a Post Moore s Law World, MICRO-48 Keynote ECE 5745 Course Overview 22 / 48

24 ... and design costs are increasing Cost&to&Market&($&million) Mask&Costs S/W&Development&and&Testing H/W&Design&and&Verification $500K $88M $120M u 0.35u 0.25u 0.18u 0.13u 90nm 65nm 45nm 28nm 20nm Silicon&Technology&Node Source: International Business Strategies, T Todd Austin, Preparing for a Post Moore s Law World, MICRO-48 Keynote ECE 5745 Course Overview 23 / 48

25 ... but a renaissance in chip design is coming! I End of Moore s law = chip design in established tech nodes can flourish I If silicon is the canvas we paint on, engineers can do more than just shrink the canvas. David Brooks, Harvard Zvi Or-Bach, FPGAs as ASIC Alternatives: Past & Future, EE Times, April ECE 5745 Course Overview 24 / 48

26 Plenty of companies are making chips... ECE 5745 Course Overview 25 / 48

27 ... even some surprising ones! Google s TPU I Custom chip for accelerating Google s TensorFlow library I Tightly integrated into Google s data centers SiFive s HiFive 1 I First commercial RISC-V SoC I Basic RV32IMAC microcontroller ECE 5745 Course Overview 26 / 48

28 Cross-Layer Interaction is Critical Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology Most significant impact on area, cycle time, execution time, and energy is usually at architectural and microarchitectural levels ECE 5745 Course Overview 27 / 48

29 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Course Motivation: Comp Arch Research Perspective 7 Intel 48-Core Prototype 10 AMD 4-Core Opteron 6 10 Intel P4 5? /year SPECint % 5 1 ~ Performance 10 DEC Alpha Transistors (Thousands) ~9%/year Frequency (MHz) MIPS R2K 2 Typical Power (W) 1 Number of Cores ECE 5745 Course Overview 28 / 48

30 Cross-Layer Interaction is Critical Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology Need to quantitatively understand area, cycle time, and energy trade-offs to be able to create new revolutionary architectures that tackle the most challenging physical design issues ECE 5745 Course Overview 29 / 48

31 Course Motivation: Circuits Research Perspective Your Digital Circuit Here Your Analog Circuit Here Fighter Airplane ~100,000 parts Intel Sandy Bridge E 2.27 Billion transistors ECE 5745 Course Overview 30 / 48

32 Cross-Layer Interaction is Critical Computer Architecture Application Algorithm Programming Language Operating System Instruction Set Architecture Microarchitecture Register-Transfer Level Gate Level Circuits Devices Technology Circuit-level researchers need to appreciate the system-level context for their subsystems; cross-layer interaction can generate some of the most exciting research ideas ECE 5745 Course Overview 31 / 48

33 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 32 / 48

34 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Fixed-Latency Iterative Multiplier Datapath b_mux_sel b_reg b_lsb req_msg req_msg.b req_msg.a 32b 32b a_mux_sel 32b 32b a_reg >> << result_ mux_sel result_ reg add_ mux_sel 0 32b result_en 32b resp_msg ECE 5745 Course Overview 33 / 48

35 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 34 / 48

36 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scalar Processors with Multithreading Programmer's Logical View T0 T1 T2 T3 T4 Ti Memory Control Instructions Memory Instructions Execution Resources Architectural Registers ECE 5745 Course Overview 35 / 48

37 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scalar Processors with Multithreading Programmer's Logical View T0 T1 T2 T3 T4 Ti Memory Vector-Vector Add loop: load a, a_ptr load b, b_ptr add c, a, b store c, c_ptr a_ptr++ b_ptr++ c_ptr++ branch Masked Filter loop: load a, a_ptr a_ptr++ branch a = 0 op0 op1... branch ECE 5745 Course Overview 35 / 48

38 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scalar Processors with Multithreading Programmer's Logical View T0 T1 T2 T3 T4 Ti Memory Typical Core Micro- Architecture Instr Memory T0 T1 T2 T3 Multithreaded Cores Energy MIMD Data Memory Performance ECE 5745 Course Overview 35 / 48

39 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Vector-SIMD Processors Programmer's Logical View CT0 elm 0 elm 1 elm 2 elm 3 CTj elm i Memory Vector-SIMD Arithmetic Instructions Vector-SIMD Memory Instructions Architectural Vector Register with 4 Elements ECE 5745 Course Overview 36 / 48

40 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Vector-SIMD Processors Programmer's Logical View CT0 elm 0 elm 1 elm 2 elm 3 CTj elm i Memory Vector-Vector Add loop: vload A, a_ptr vload B, b_ptr vadd C, A, B vstore C, c_ptr a_ptr++ b_ptr++ c_ptr++ branch Masked Filter loop: vload A, a_ptr vset F, A = 0 vop0 under flag F vop1 under flag F... a_ptr++ branch ECE 5745 Course Overview 36 / 48

41 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Vector-SIMD Processors Programmer's Logical View CT0 elm 0 elm 1 elm 2 elm 3 CTj elm i Memory Instruction Memory VIU 1 3 Vector Lanes Energy MIMD Typical Core CP Micro- Architecture 0 2 Vector- SIMD VMU Data Memory Performance ECE 5745 Course Overview 36 / 48

42 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Quantitative Area Evaluation ECE 5745 Course Overview 37 / 48

43 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Quantitative Area Evaluation Normalized Area thread 2 threads 4 threads 8 threads 1 core w/ 4 lanes 4 cores w/ 1 lane ea control logic register file memory units floating-point functional units integer functional units control processor instruction cache data cache Single-core multi-lane design reduces area by 15% Quad-Core w/ Vertical Multithreading Vector- SIMD (8 elm/lane) Multi-core single-lane design increases area by 20% ECE 5745 Course Overview 38 / 48

44 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Quantitative Performance and Energy Evaluation Normalized Energy / Task Single-Lane Vector (increasing vlen) Multithreaded Multicore (increasing number of threads per core) Normalized Tasks / Second Performance reduction with increasing threads due to increased cycle time and thread management overhead on fine-grain loops Energy / Task ( J) thread 2 threads 4 threads 8 threads Quad-Core w/ Vertical Multithreading 1 elm/lane 2 elm/lane 4 elm/lane 8 elm/lane Vector- SIMD (4 core w/ 1 lane) control logic register file memory units int func units control proc inst cache data cache leakage ECE 5745 Course Overview 39 / 48

45 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Simple RISC Processor ASIC SP Control SP Datapath SP Regfile AHIP Controller RAM Interface VCO RAM Subbank (2KB) RAM Subbank (2KB) RAM Subbank (2KB) RAM Subbank (2KB) Course 2: STC1 chip plot and die photo. On the Overview chip plot, the Exec Unit is colored blue, and the 40 / 48 Fetch Unit is colored green and purple. ECE 5745 Figure

46 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Simple RISC Processor ASIC * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 410. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 240. * * * * * * * * * 230. * * * * * * * * * 220. * * * * * * * * * 210. * * * * * * * * * * * * * * * * * * * 190 * * * * * * * * * * 180 * * * * * * * * * * I RISC processor w/ 8 KB SRAM I TSMC 0.18 µm process I mm I 610K Transistors I 450 MHz at 1.8 V ECE 5745 Course Overview 41 / 48

47 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scale Vector-Thread Processor ASIC ECE 5745 Course Overview 42 / 48

48 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scale Vector-Thread Processor ASIC TSMC 0.18µm 7.14 Million Transistors 16.6 mm 2 Core Area Cache Tag CAMs CP Cache Control Cache Data RAM Banks X B A R V M U Vector Lane 0 Vector Lane 1 Vector Lane 2 V I U Vector Lane 3 ECE 5745 Course Overview 43 / 48

49 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Scale Energy vs. Performance Results ECE 5745 Course Overview 44 / 48

50 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 BRG Test Chip 1 LVDS clk out divided LVDS clk out clk out reset debug Taped out in March 2016, returned in Fall 2016 Currently packaging and testing diff clk (+) diff clk ( ) single ended clk LVDS Recv clk div clk tree reset tree host2chip chip2host Host Interface Ctrl Reg RISC Core Sort Accel Memory Arbitration Unit SRAM Bank (2KB) SRAM Bank (2KB) SRAM Bank (2KB) SRAM Bank (2KB) Post-Silicon Evaluation Strategy The testing platform enables running small test programs on BRGTC1 to compare the performance and energy of pure-software kernels versus the HLS-generated sorting accelerator Taped-out Layout for BRGTC1 2x2mm 1.3M transistors in IBM 130nm RISC processor, 16KB SRAM HLS-generated accelerators Static Timing Analysis 246 MHz ECE 5745 Course Overview 45 / 48

51 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Application Algorithm PL OS ISA Arch RTL Gates Circuits Complex Digital ASIC Design I Course goal, structure, motivation. What is the goal of the course?. How is the course structured?. Why should students want to take this course? I Activity 1: Evaluation of Integer Multiplier I Case Study: Scalar vs. Vector Processors. Example design-space exploration. Example real ASIC chips I Activity 2: Brainstorming for Sorting Accelerator Devices Technology ECE 5745 Course Overview 46 / 48

52 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Brainstorming for Sorting Accelerator I$ Sort array of integers stored in memory Proc sends Xcel base address and size Proc waits for Xcel to finish sorting I Software Baseline. Insertion sort O(n 2 ). 20% of dynamic instructions are lw/sw P Xcel Arbiter D$ Energy Efficiency (Tasks per Joule) Software Baseline I Hardware Accelerator. What is ideal speedup assuming we still use insertion sort?. What if we use a different algorithm?. Sketch hardware impl of a sorting accelerator Performance (Tasks per Second). Where will accelerator be in the energy/perf space? ECE 5745 Course Overview 47 / 48

53 Complex Digital ASIC Design Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Application Algorithm PL OS ISA Arch RTL Gates Circuits Devices Take-Away Points I Complex digital ASIC design is the process of quantitatively exploring the area, cycle time, execution time, and energy trade-offs of general-purpose and application-specific designs using automated standard-cell CAD tools and then to transform the most promising design to layout ready for fabrication I Course can provide useful experience with cross-layer interaction for students interested in pursuing research in computer architecture or circuits or students interested in pursuing a career in in industry development of ASICs Technology ECE 5745 Course Overview 48 / 48

ECE 5745 Complex Digital ASIC Design Course Overview

ECE 5745 Complex Digital ASIC Design Course Overview ECE 5745 Complex Digital ASIC Design Course Overview Christopher Batten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece5745 Application Algorithm

More information

EN105 : Computer architecture. Course overview J. CRENNE 2015/2016

EN105 : Computer architecture. Course overview J. CRENNE 2015/2016 EN105 : Computer architecture Course overview J. CRENNE 2015/2016 Schedule Cours Cours Cours Cours Cours Cours Cours Cours Cours Cours 2 CM 1 - Warmup CM 2 - Computer architecture CM 3 - CISC2RISC CM 4

More information

ECE 3400 Guest Lecture Design Principles and Methodologies in Computer Architecture

ECE 3400 Guest Lecture Design Principles and Methodologies in Computer Architecture ECE 3400 Guest Lecture Design Principles and Methodologies in Computer Architecture Christopher Batten Computer Systems Laboratory School of Electrical and Computer Engineering Cornell University What

More information

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System

More information

PACE: Power-Aware Computing Engines

PACE: Power-Aware Computing Engines PACE: Power-Aware Computing Engines Krste Asanovic Saman Amarasinghe Martin Rinard Computer Architecture Group MIT Laboratory for Computer Science http://www.cag.lcs.mit.edu/ PACE Approach Energy- Conscious

More information

The Design of the KiloCore Chip

The Design of the KiloCore Chip The Design of the KiloCore Chip Aaron Stillmaker*, Brent Bohnenstiehl, Bevan Baas DAC 2017: Design Challenges of New Processor Architectures University of California, Davis VLSI Computation Laboratory

More information

Lab. Course Goals. Topics. What is VLSI design? What is an integrated circuit? VLSI Design Cycle. VLSI Design Automation

Lab. Course Goals. Topics. What is VLSI design? What is an integrated circuit? VLSI Design Cycle. VLSI Design Automation Course Goals Lab Understand key components in VLSI designs Become familiar with design tools (Cadence) Understand design flows Understand behavioral, structural, and physical specifications Be able to

More information

Design Objectives of the 0.35µm Alpha Microprocessor (A 500MHz Quad Issue RISC Microprocessor)

Design Objectives of the 0.35µm Alpha Microprocessor (A 500MHz Quad Issue RISC Microprocessor) Design Objectives of the 0.35µm Alpha 21164 Microprocessor (A 500MHz Quad Issue RISC Microprocessor) Gregg Bouchard Digital Semiconductor Digital Equipment Corporation Hudson, MA 1 Outline 0.35µm Alpha

More information

Lecture 1: Introduction

Lecture 1: Introduction Contemporary Computer Architecture Instruction set architecture Lecture 1: Introduction CprE 581 Computer Systems Architecture, Fall 2016 Reading: Textbook, Ch. 1.1-1.7 Microarchitecture; examples: Pipeline

More information

Jim Keller. Digital Equipment Corp. Hudson MA

Jim Keller. Digital Equipment Corp. Hudson MA Jim Keller Digital Equipment Corp. Hudson MA ! Performance - SPECint95 100 50 21264 30 21164 10 1995 1996 1997 1998 1999 2000 2001 CMOS 5 0.5um CMOS 6 0.35um CMOS 7 0.25um "## Continued Performance Leadership

More information

EE5780 Advanced VLSI CAD

EE5780 Advanced VLSI CAD EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

Major Information Session ECE: Computer Engineering

Major Information Session ECE: Computer Engineering Major Information Session ECE: Computer Engineering Prof. Christopher Batten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/~cbatten What is Computer Engineering?

More information

GRE Architecture Session

GRE Architecture Session GRE Architecture Session Session 2: Saturday 23, 1995 Young H. Cho e-mail: youngc@cs.berkeley.edu www: http://http.cs.berkeley/~youngc Y. H. Cho Page 1 Review n Homework n Basic Gate Arithmetics n Bubble

More information

An Overview of Standard Cell Based Digital VLSI Design

An Overview of Standard Cell Based Digital VLSI Design An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,

More information

DIGITAL DESIGN TECHNOLOGY & TECHNIQUES

DIGITAL DESIGN TECHNOLOGY & TECHNIQUES DIGITAL DESIGN TECHNOLOGY & TECHNIQUES CAD for ASIC Design 1 INTEGRATED CIRCUITS (IC) An integrated circuit (IC) consists complex electronic circuitries and their interconnections. William Shockley et

More information

Vector IRAM: A Microprocessor Architecture for Media Processing

Vector IRAM: A Microprocessor Architecture for Media Processing IRAM: A Microprocessor Architecture for Media Processing Christoforos E. Kozyrakis kozyraki@cs.berkeley.edu CS252 Graduate Computer Architecture February 10, 2000 Outline Motivation for IRAM technology

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

Fundamentals of Quantitative Design and Analysis

Fundamentals of Quantitative Design and Analysis Fundamentals of Quantitative Design and Analysis Dr. Jiang Li Adapted from the slides provided by the authors Computer Technology Performance improvements: Improvements in semiconductor technology Feature

More information

Do we need more chips (ASICs)?

Do we need more chips (ASICs)? 6.375 Complex Digital System Spring 2007 Lecturers: Arvind & Krste Asanović TAs: Myron King & Ajay Joshi Assistant: Sally Lee L01-1 Do we need more chips (ASICs)? ASIC=Application-Specific Integrated Circuit

More information

6.375 Complex Digital System Spring 2006

6.375 Complex Digital System Spring 2006 6.375 Complex Digital System Spring 2006 Lecturer: TAs: Assistant: Arvind Chris Batten & Mike Pellauer Sally Lee L01-1 Do we need more chips (ASICs)? ASIC=Application Specific IC Some exciting possibilities

More information

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor

HP PA-8000 RISC CPU. A High Performance Out-of-Order Processor The A High Performance Out-of-Order Processor Hot Chips VIII IEEE Computer Society Stanford University August 19, 1996 Hewlett-Packard Company Engineering Systems Lab - Fort Collins, CO - Cupertino, CA

More information

Computer Architecture!

Computer Architecture! Informatics 3 Computer Architecture! Dr. Vijay Nagarajan and Prof. Nigel Topham! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors

More information

Spiral 2-8. Cell Layout

Spiral 2-8. Cell Layout 2-8.1 Spiral 2-8 Cell Layout 2-8.2 Learning Outcomes I understand how a digital circuit is composed of layers of materials forming transistors and wires I understand how each layer is expressed as geometric

More information

Lecture 41: Introduction to Reconfigurable Computing

Lecture 41: Introduction to Reconfigurable Computing inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 41: Introduction to Reconfigurable Computing Michael Le, Sp07 Head TA April 30, 2007 Slides Courtesy of Hayden So, Sp06 CS61c Head TA Following

More information

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture? This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital

More information

Constructing Virtual Architectures on Tiled Processors. David Wentzlaff Anant Agarwal MIT

Constructing Virtual Architectures on Tiled Processors. David Wentzlaff Anant Agarwal MIT Constructing Virtual Architectures on Tiled Processors David Wentzlaff Anant Agarwal MIT 1 Emulators and JITs for Multi-Core David Wentzlaff Anant Agarwal MIT 2 Why Multi-Core? 3 Why Multi-Core? Future

More information

Design Methodologies. Full-Custom Design

Design Methodologies. Full-Custom Design Design Methodologies Design styles Full-custom design Standard-cell design Programmable logic Gate arrays and field-programmable gate arrays (FPGAs) Sea of gates System-on-a-chip (embedded cores) Design

More information

Design Methodologies and Tools. Full-Custom Design

Design Methodologies and Tools. Full-Custom Design Design Methodologies and Tools Design styles Full-custom design Standard-cell design Programmable logic Gate arrays and field-programmable gate arrays (FPGAs) Sea of gates System-on-a-chip (embedded cores)

More information

More Course Information

More Course Information More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well

More information

Lecture 21: Parallelism ILP to Multicores. Parallel Processing 101

Lecture 21: Parallelism ILP to Multicores. Parallel Processing 101 18 447 Lecture 21: Parallelism ILP to Multicores S 10 L21 1 James C. Hoe Dept of ECE, CMU April 7, 2010 Announcements: Handouts: Lab 4 due this week Optional reading assignments below. The Microarchitecture

More information

COE 561 Digital System Design & Synthesis Introduction

COE 561 Digital System Design & Synthesis Introduction 1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

More information

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras CAD for VLSI Debdeep Mukhopadhyay IIT Madras Tentative Syllabus Overall perspective of VLSI Design MOS switch and CMOS, MOS based logic design, the CMOS logic styles, Pass Transistors Introduction to Verilog

More information

Putting it all Together: Modern Computer Architecture

Putting it all Together: Modern Computer Architecture Putting it all Together: Modern Computer Architecture Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. May 10, 2018 L23-1 Administrivia Quiz 3 tonight on room 50-340 (Walker Gym) Quiz

More information

Computer Architecture Today (I)

Computer Architecture Today (I) Fundamental Concepts and ISA Computer Architecture Today (I) Today is a very exciting time to study computer architecture Industry is in a large paradigm shift (to multi-core and beyond) many different

More information

CS 152, Spring 2011 Section 8

CS 152, Spring 2011 Section 8 CS 152, Spring 2011 Section 8 Christopher Celio University of California, Berkeley Agenda Grades Upcoming Quiz 3 What it covers OOO processors VLIW Branch Prediction Intel Core 2 Duo (Penryn) Vs. NVidia

More information

Trends in the Infrastructure of Computing

Trends in the Infrastructure of Computing Trends in the Infrastructure of Computing CSCE 9: Computing in the Modern World Dr. Jason D. Bakos My Questions How do computer processors work? Why do computer processors get faster over time? How much

More information

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model What is Computer Architecture? Structure: static arrangement of the parts Organization: dynamic interaction of the parts and their control Implementation: design of specific building blocks Performance:

More information

Keywords and Review Questions

Keywords and Review Questions Keywords and Review Questions lec1: Keywords: ISA, Moore s Law Q1. Who are the people credited for inventing transistor? Q2. In which year IC was invented and who was the inventor? Q3. What is ISA? Explain

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

Advanced Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Advanced Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Advanced Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Modern Microprocessors More than just GHz CPU Clock Speed SPECint2000

More information

Microarchitecture Overview. Performance

Microarchitecture Overview. Performance Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make

More information

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141 ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition

More information

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host

More information

KiloCore: A 32 nm 1000-Processor Array

KiloCore: A 32 nm 1000-Processor Array KiloCore: A 32 nm 1000-Processor Array Brent Bohnenstiehl, Aaron Stillmaker, Jon Pimentel, Timothy Andreas, Bin Liu, Anh Tran, Emmanuel Adeagbo, Bevan Baas University of California, Davis VLSI Computation

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

More information

Fundamentals of Computer Design

Fundamentals of Computer Design CS359: Computer Architecture Fundamentals of Computer Design Yanyan Shen Department of Computer Science and Engineering 1 Defining Computer Architecture Agenda Introduction Classes of Computers 1.3 Defining

More information

Computer Architecture s Changing Definition

Computer Architecture s Changing Definition Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction

More information

CS310 Embedded Computer Systems. Maeng

CS310 Embedded Computer Systems. Maeng 1 INTRODUCTION (PART II) Maeng Three key embedded system technologies 2 Technology A manner of accomplishing a task, especially using technical processes, methods, or knowledge Three key technologies for

More information

ECE 261: Full Custom VLSI Design

ECE 261: Full Custom VLSI Design ECE 261: Full Custom VLSI Design Prof. James Morizio Dept. Electrical and Computer Engineering Hudson Hall Ph: 201-7759 E-mail: jmorizio@ee.duke.edu URL: http://www.ee.duke.edu/~jmorizio Course URL: http://www.ee.duke.edu/~jmorizio/ece261/261.html

More information

This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources

This Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources This Unit: Putting It All Together CIS 501 Computer Architecture Unit 12: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital Circuits

More information

Computer Architecture

Computer Architecture Computer Architecture Mehran Rezaei m.rezaei@eng.ui.ac.ir Welcome Office Hours: TBA Office: Eng-Building, Last Floor, Room 344 Tel: 0313 793 4533 Course Web Site: eng.ui.ac.ir/~m.rezaei/architecture/index.html

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011 FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level

More information

Pipelining to Superscalar

Pipelining to Superscalar Pipelining to Superscalar ECE/CS 752 Fall 207 Prof. Mikko H. Lipasti University of Wisconsin-Madison Pipelining to Superscalar Forecast Limits of pipelining The case for superscalar Instruction-level parallel

More information

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

EE282 Computer Architecture. Lecture 1: What is Computer Architecture? EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer

More information

EITF35: Introduction to Structured VLSI Design

EITF35: Introduction to Structured VLSI Design EITF35: Introduction to Structured VLSI Design Part 1.1.2: Introduction (Digital VLSI Systems) Liang Liu liang.liu@eit.lth.se 1 Outline Why Digital? History & Roadmap Device Technology & Platforms System

More information

Today. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses

Today. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses Today Comments about assignment 3-43 Comments about assignment 3 ASICs and Programmable logic Others courses octor Per should show up in the end of the lecture Mealy machines can not be coded in a single

More information

45-year CPU Evolution: 1 Law -2 Equations

45-year CPU Evolution: 1 Law -2 Equations 4004 8086 PowerPC 601 Pentium 4 Prescott 1971 1978 1992 45-year CPU Evolution: 1 Law -2 Equations Daniel Etiemble LRI Université Paris Sud 2004 Xeon X7560 Power9 Nvidia Pascal 2010 2017 2016 Are there

More information

ARM Processors for Embedded Applications

ARM Processors for Embedded Applications ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or

More information

Advanced Processor Architecture

Advanced Processor Architecture Advanced Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong

More information

Portland State University ECE 588/688. Cray-1 and Cray T3E

Portland State University ECE 588/688. Cray-1 and Cray T3E Portland State University ECE 588/688 Cray-1 and Cray T3E Copyright by Alaa Alameldeen 2014 Cray-1 A successful Vector processor from the 1970s Vector instructions are examples of SIMD Contains vector

More information

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console

Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console Computer Architecture Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Milo Martin & Amir Roth at University of Pennsylvania! Computer Architecture

More information

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University Advanced d Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Modern Microprocessors More than just GHz CPU Clock Speed SPECint2000

More information

Computer Architecture. Introduction. Lynn Choi Korea University

Computer Architecture. Introduction. Lynn Choi Korea University Computer Architecture Introduction Lynn Choi Korea University Class Information Lecturer Prof. Lynn Choi, School of Electrical Eng. Phone: 3290-3249, 공학관 411, lchoi@korea.ac.kr, TA: 윤창현 / 신동욱, 3290-3896,

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

L2: Design Representations

L2: Design Representations CS250 VLSI Systems Design L2: Design Representations John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA) Engineering Challenge Application Gap usually too large to bridge in one step,

More information

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 19 Summary

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 19 Summary ECE 552 / CPS 550 Advanced Computer Architecture I Lecture 19 Summary Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall11.html

More information

A 1-GHz Configurable Processor Core MeP-h1

A 1-GHz Configurable Processor Core MeP-h1 A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface

More information

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources

This Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital

More information

CMPEN 411 VLSI Digital Circuits. Lecture 01: Introduction

CMPEN 411 VLSI Digital Circuits. Lecture 01: Introduction CMPEN 411 VLSI Digital Circuits Kyusun Choi Lecture 01: Introduction CMPEN 411 Course Website link at: http://www.cse.psu.edu/~kyusun/teach/teach.html [Adapted from Rabaey s Digital Integrated Circuits,

More information

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing CIT 668: System Architecture Parallel Computing Topics 1. What is Parallel Computing? 2. Why use Parallel Computing? 3. Types of Parallelism 4. Amdahl s Law 5. Flynn s Taxonomy of Parallel Computers 6.

More information

Computer Organization + DIGITAL DESIGN

Computer Organization + DIGITAL DESIGN Computer Organization + DIGITAL DESIGN SUKHENDU DAS www.cse.iitm.ac.in/~sdas in/~sdas sdas@iitm.ac.in Computer Level Hierarchy Program Execution Translation: The entire high level program is translated

More information

Power Reduction Techniques in the Memory System. Typical Memory Hierarchy

Power Reduction Techniques in the Memory System. Typical Memory Hierarchy Power Reduction Techniques in the Memory System Low Power Design for SoCs ASIC Tutorial Memories.1 Typical Memory Hierarchy On-Chip Components Control edram Datapath RegFile ITLB DTLB Instr Data Cache

More information

Computer Architecture

Computer Architecture Computer Architecture Lecture 2: Fundamental Concepts and ISA Dr. Ahmed Sallam Based on original slides by Prof. Onur Mutlu What Do I Expect From You? Chance favors the prepared mind. (Louis Pasteur) كل

More information

CS425 Computer Systems Architecture

CS425 Computer Systems Architecture CS425 Computer Systems Architecture Fall 2017 Multiple Issue: Superscalar and VLIW CS425 - Vassilis Papaefstathiou 1 Example: Dynamic Scheduling in PowerPC 604 and Pentium Pro In-order Issue, Out-of-order

More information

The Alpha Microprocessor: Out-of-Order Execution at 600 Mhz. R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA

The Alpha Microprocessor: Out-of-Order Execution at 600 Mhz. R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA The Alpha 21264 Microprocessor: Out-of-Order ution at 600 Mhz R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA 1 Some Highlights z Continued Alpha performance leadership y 600 Mhz operation in

More information

The Xilinx XC6200 chip, the software tools and the board development tools

The Xilinx XC6200 chip, the software tools and the board development tools The Xilinx XC6200 chip, the software tools and the board development tools What is an FPGA? Field Programmable Gate Array Fully programmable alternative to a customized chip Used to implement functions

More information

EC 513 Computer Architecture

EC 513 Computer Architecture EC 513 Computer Architecture Cache Organization Prof. Michel A. Kinsy The course has 4 modules Module 1 Instruction Set Architecture (ISA) Simple Pipelining and Hazards Module 2 Superscalar Architectures

More information

Performance of computer systems

Performance of computer systems Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type

More information

Computer Systems Architecture Spring 2016

Computer Systems Architecture Spring 2016 Computer Systems Architecture Spring 2016 Lecture 01: Introduction Shuai Wang Department of Computer Science and Technology Nanjing University [Adapted from Computer Architecture: A Quantitative Approach,

More information

EE241 - Spring 2004 Advanced Digital Integrated Circuits

EE241 - Spring 2004 Advanced Digital Integrated Circuits EE24 - Spring 2004 Advanced Digital Integrated Circuits Borivoje Nikolić Lecture 2 Impact of Scaling Class Material Last lecture Class scope, organization Today s lecture Impact of scaling 2 Major Roadblocks.

More information

ECE 5745 Complex Digital ASIC Design, Spring 2017 Lab 2: Sorting Accelerator

ECE 5745 Complex Digital ASIC Design, Spring 2017 Lab 2: Sorting Accelerator School of Electrical and Computer Engineering Cornell University revision: 2017-03-16-23-56 In this lab, you will explore a medium-grain hardware accelerator for sorting an array of integer values of unknown

More information

INTRODUCTION TO FPGA ARCHITECTURE

INTRODUCTION TO FPGA ARCHITECTURE 3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)

More information

Microarchitecture Overview. Performance

Microarchitecture Overview. Performance Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 18, 2005 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make

More information

ADVANCES IN PROCESSOR DESIGN AND THE EFFECTS OF MOORES LAW AND AMDAHLS LAW IN RELATION TO THROUGHPUT MEMORY CAPACITY AND PARALLEL PROCESSING

ADVANCES IN PROCESSOR DESIGN AND THE EFFECTS OF MOORES LAW AND AMDAHLS LAW IN RELATION TO THROUGHPUT MEMORY CAPACITY AND PARALLEL PROCESSING ADVANCES IN PROCESSOR DESIGN AND THE EFFECTS OF MOORES LAW AND AMDAHLS LAW IN RELATION TO THROUGHPUT MEMORY CAPACITY AND PARALLEL PROCESSING Evan Baytan Department of Electrical Engineering and Computer

More information

ECE 4750 Computer Architecture, Fall 2014 T01 Single-Cycle Processors

ECE 4750 Computer Architecture, Fall 2014 T01 Single-Cycle Processors ECE 4750 Computer Architecture, Fall 2014 T01 Single-Cycle Processors School of Electrical and Computer Engineering Cornell University revision: 2014-09-03-17-21 1 Instruction Set Architecture 2 1.1. IBM

More information

ECE 4514 Digital Design II. Spring Lecture 22: Design Economics: FPGAs, ASICs, Full Custom

ECE 4514 Digital Design II. Spring Lecture 22: Design Economics: FPGAs, ASICs, Full Custom ECE 4514 Digital Design II Lecture 22: Design Economics: FPGAs, ASICs, Full Custom A Tools/Methods Lecture Overview Wows and Woes of scaling The case of the Microprocessor How efficiently does a microprocessor

More information

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF.

Multicore SoC is coming. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems. Source: 2007 ISSCC and IDF. Scalable and Reconfigurable Stream Processor for Mobile Multimedia Systems Liang-Gee Chen Distinguished Professor General Director, SOC Center National Taiwan University DSP/IC Design Lab, GIEE, NTU 1

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 19: Multiprocessing Shuai Wang Department of Computer Science and Technology Nanjing University [Slides adapted from CSE 502 Stony Brook University] Getting More

More information

Chapter 5: ASICs Vs. PLDs

Chapter 5: ASICs Vs. PLDs Chapter 5: ASICs Vs. PLDs 5.1 Introduction A general definition of the term Application Specific Integrated Circuit (ASIC) is virtually every type of chip that is designed to perform a dedicated task.

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

EECS4201 Computer Architecture

EECS4201 Computer Architecture Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis These slides are based on the slides provided by the publisher. The slides will be

More information

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University EE 466/586 VLSI Design Partha Pande School of EECS Washington State University pande@eecs.wsu.edu Lecture 18 Implementation Methods The Design Productivity Challenge Logic Transistors per Chip (K) 10,000,000.10m

More information

Computer Architecture

Computer Architecture Informatics 3 Computer Architecture Dr. Vijay Nagarajan Institute for Computing Systems Architecture, School of Informatics University of Edinburgh (thanks to Prof. Nigel Topham) General Information Instructor

More information

ECE520 VLSI Design. Lecture 1: Introduction to VLSI Technology. Payman Zarkesh-Ha

ECE520 VLSI Design. Lecture 1: Introduction to VLSI Technology. Payman Zarkesh-Ha ECE520 VLSI Design Lecture 1: Introduction to VLSI Technology Payman Zarkesh-Ha Office: ECE Bldg. 230B Office hours: Wednesday 2:00-3:00PM or by appointment E-mail: pzarkesh@unm.edu Slide: 1 Course Objectives

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

Issue Logic for a 600-MHz Out-of-Order Execution Microprocessor

Issue Logic for a 600-MHz Out-of-Order Execution Microprocessor IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 33, NO. 5, MAY 1998 707 Issue Logic for a 600-MHz Out-of-Order Execution Microprocessor James A. Farrell and Timothy C. Fischer Abstract The logic and circuits

More information

CIT 668: System Architecture

CIT 668: System Architecture CIT 668: System Architecture Computer Systems Architecture I 1. System Components 2. Processor 3. Memory 4. Storage 5. Network 6. Operating System Topics Images courtesy of Majd F. Sakr or from Wikipedia

More information