CMOS VLSI Design. MIPS Processor Example. Outline


CMOS VLSI Design
MIPS Processor Example

Outline
- Design Partitioning
- MIPS Processor Example
  - Architecture
  - Microarchitecture
  - Logic Design
  - Circuit Design
  - Physical Design
- Fabrication, Packaging, Testing
Slide 2

Review
- Sketch a stick diagram for a 4-input NOR gate
Slide 3

Review
- Sketch a stick diagram for a 4-input NOR gate
[Figure: stick diagram of a 4-input NOR gate with VDD and GND rails, inputs A, B, C, D, and output Y]
Slide 4

Coping with Complexity
- How to design System-on-Chip?
  - Many millions (soon billions!) of transistors
  - Tens to hundreds of engineers
- Structured Design
- Design Partitioning
Slide 5

Structured Design
- Hierarchy: Divide and Conquer
  - Recursively divide system into modules
- Regularity
  - Reuse modules wherever possible
  - Ex: Standard cell library
- Modularity: well-formed interfaces
  - Allows modules to be treated as black boxes
- Locality
  - Physical and temporal
Slide 6

Design Partitioning
- Architecture: User's perspective, what does it do?
  - Instruction set, registers
  - MIPS, x86, Alpha, PIC, ARM, ...
- Microarchitecture
  - Single cycle, multi-cycle, pipelined, superscalar?
- Logic: how are functional blocks constructed
  - Ripple carry, carry lookahead, carry select adders
- Circuit: how are transistors used
  - Complementary CMOS, pass transistors, domino
- Physical: chip layout
  - Datapaths, memories, random logic
Slide 7

Gajski Y-Chart
[Figure: Gajski Y-chart]
Slide 8

MIPS Architecture
- Example: subset of MIPS processor architecture
  - Drawn from Patterson & Hennessy's textbook
- MIPS is a 32-bit architecture with 32 registers
  - Consider 8-bit subset using 8-bit datapath
  - Only implement 8 registers ($0 - $7)
  - $0 hardwired to 0
  - 8-bit program counter
- Illustrate the key concepts in VLSI design
Slide 9

MIPS Microarchitecture
- Multicycle μarchitecture from Patterson & Hennessy
[Figure: multicycle datapath and control. Blocks: PC, memory, instruction register, memory data register, registers, A and B registers, ALU, ALUOut, ALU control, shift left 2, and multiplexers. Control signals: PCEn, PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite[3:0], PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst, ALUControl. Instruction fields: [31:26], [25:21], [20:16], [15:0].]
Slide 10
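The reduced register set described on Slide 9 (eight 8-bit registers, with $0 reading as zero) can be sketched in Verilog. This is a minimal illustration, not the course's actual register file: the module name `regfile8`, the port names, and the single-edge clocking are assumptions made here (the slides use a 2-phase clock).

```verilog
// Hypothetical sketch: 8 x 8-bit register file with $0 hardwired to zero.
// Module name, port names, and single-clock timing are assumptions.
module regfile8(input        clk,
                input        regwrite,
                input  [2:0] ra1, ra2, wa,   // read and write addresses
                input  [7:0] wd,             // write data
                output [7:0] rd1, rd2);      // read data

  reg [7:0] rf [7:0];                        // eight 8-bit registers

  // Write on the rising clock edge when regwrite is asserted.
  always @(posedge clk)
    if (regwrite) rf[wa] <= wd;

  // Reads are combinational; register 0 always reads as zero,
  // whatever may have been written to it.
  assign rd1 = (ra1 != 3'b000) ? rf[ra1] : 8'b0;
  assign rd2 = (ra2 != 3'b000) ? rf[ra2] : 8'b0;
endmodule
```

Forcing the read value to zero, rather than blocking writes to register 0, is one of several ways to get the same architectural behavior.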

Multicycle Controller
[Figure: state-transition diagram of the multicycle controller. The signal values shown in each state are not legible here; the states and the signals they drive are:]
- Instruction fetch (states 0 to 3): MemRead, ALUSrcA, IorD, ALUSrcB, ALUOp, PCWrite, PCSource, with IRWrite3, IRWrite2, IRWrite1, IRWrite0 asserted in successive states
- Instruction decode / register fetch (state 4): ALUSrcA, ALUSrcB, ALUOp
- Memory address computation (state 5), entered when (Op = 'LB') or (Op = 'SB'): ALUSrcA, ALUSrcB, ALUOp
- Memory access (state 6), entered when (Op = 'LB'): MemRead, IorD
- Write-back step (state 7): RegDst, RegWrite, MemtoReg
- Memory access (state 8), entered when (Op = 'SB'): MemWrite, IorD
- Execution (state 9), entered when (Op = R-type): ALUSrcA, ALUSrcB, ALUOp
- R-type completion (state 10): RegDst, RegWrite, MemtoReg
- Branch completion (state 11), entered when (Op = 'BEQ'): ALUSrcA, ALUSrcB, ALUOp, PCWriteCond, PCSource
- Jump completion (state 12), entered when (Op = 'J'): PCWrite, PCSource
- Reset input
Slide 11

Logic Design
- Start at top level
  - Hierarchically decompose MIPS into units
- Top-level interface
[Figure: top-level interface. A crystal oscillator feeds a 2-phase clock generator producing ph1 and ph2. The MIPS processor takes ph1, ph2, and reset, and connects to external memory through memread, memwrite, and the 8-bit adr, writedata, and memdata buses.]
Slide 12
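The state sequencing of the multicycle controller can be sketched as a Verilog state machine. This is a hedged illustration of next-state logic only: the state numbers follow the diagram, the opcodes are the standard MIPS encodings, and the module name `nextstate`, the single clock, the synchronous reset, and the handling of unsupported opcodes are all assumptions made here. Output decoding is left out because the signal values are not legible in the transcription.

```verilog
// Sketch of the controller's state sequencing; control outputs are omitted.
// Module name, single clock, and reset style are assumptions.
module nextstate(input            clk, reset,
                 input      [5:0] op,
                 output reg [3:0] state);

  // Standard MIPS opcodes for the supported instructions
  localparam LB    = 6'b100000;
  localparam SB    = 6'b101000;
  localparam RTYPE = 6'b000000;
  localparam BEQ   = 6'b000100;
  localparam J     = 6'b000010;

  reg [3:0] next;

  // Next-state logic
  always @(*)
    case (state)
      4'd0, 4'd1, 4'd2: next = state + 4'd1;  // fetch: one instruction byte per state
      4'd3:  next = 4'd4;                     // last fetch state -> decode
      4'd4:  case (op)                        // decode / register fetch
               LB, SB:  next = 4'd5;          // memory address computation
               RTYPE:   next = 4'd9;          // execution
               BEQ:     next = 4'd11;         // branch completion
               J:       next = 4'd12;         // jump completion
               default: next = 4'd0;          // unsupported opcode (assumption)
             endcase
      4'd5:  next = (op == LB) ? 4'd6 : 4'd8; // load or store memory access
      4'd6:  next = 4'd7;                     // load -> write-back
      4'd9:  next = 4'd10;                    // execution -> R-type completion
      default: next = 4'd0;                   // states 7, 8, 10, 11, 12 return to fetch
    endcase

  // State register
  always @(posedge clk)
    if (reset) state <= 4'd0;
    else       state <= next;
endmodule
```

Fetch takes four states because the 32-bit instruction arrives over the 8-bit memory interface one byte at a time, which is also why the diagram has four IRWrite signals.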

Block Diagram
[Figure: block diagram, drawn over the multicycle datapath of the MIPS Microarchitecture slide. Blocks: controller, alucontrol, datapath. Signals: memwrite, memread, aluop[1:0], op[5:0], zero, alusrca, alusrcb[1:0], pcen, pcsource[1:0], memtoreg, regdst, iord, regwrite, irwrite[3:0], funct[5:0], alucontrol[2:0], ph1, ph2, reset, adr[7:0], writedata[7:0], memdata[7:0].]
Slide 13

Hierarchical Design
[Figure: design hierarchy tree; rows from top to bottom:]
- mips
- controller, alucontrol, datapath
- standard cell library, bitslice, zipper
- alu, inv4x, flop, ramslice
- fulladder, or2, and2, mux4, nor2, inv, nand2, mux2, tri
Slide 14
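As an illustration of one leaf of this hierarchy, the alucontrol block (inputs aluop[1:0] and funct[5:0], output alucontrol[2:0]) might look like the sketch below. The encodings are assumptions taken from the Patterson & Hennessy multicycle design and standard MIPS funct codes, not from the slides; the module name `aludec` and the default case are hypothetical.

```verilog
// Hypothetical ALU decoder sketch. ALUOp and ALU control encodings are
// assumed to follow Patterson & Hennessy; funct codes are standard MIPS.
module aludec(input      [1:0] aluop,
              input      [5:0] funct,
              output reg [2:0] alucontrol);

  always @(*)
    case (aluop)
      2'b00: alucontrol = 3'b010;            // add (address computation, PC increment)
      2'b01: alucontrol = 3'b110;            // subtract (BEQ comparison)
      default:
        case (funct)                         // R-type: decode the funct field
          6'b100000: alucontrol = 3'b010;    // ADD
          6'b100010: alucontrol = 3'b110;    // SUB
          6'b100100: alucontrol = 3'b000;    // AND
          6'b100101: alucontrol = 3'b001;    // OR
          6'b101010: alucontrol = 3'b111;    // SLT
          default:   alucontrol = 3'b010;    // unlisted funct: fall back to add (assumption)
        endcase
    endcase
endmodule
```

Keeping this decoder separate from the main controller matches the block diagram, where alucontrol sits between the controller's aluop output and the datapath.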

HDLs
- Hardware Description Languages
  - Widely used in logic design
  - Verilog and VHDL
- Describe hardware using code
  - Document logic functions
  - Simulate logic before building
  - Synthesize code into gates and layout
    - Requires a library of standard cells
Slide 15

Verilog Example

  module fulladder(input a, b, c,
                   output s, cout);
    sum   s1(a, b, c, s);
    carry c1(a, b, c, cout);
  endmodule

  module carry(input a, b, c,
               output cout);
    assign cout = (a&b) | (a&c) | (b&c);
  endmodule

[Figure: fulladder block containing sum (inputs a, b, c; output s) and carry (inputs a, b, c; output cout)]
Slide 16
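The fulladder example instantiates a `sum` module whose body the slide does not show. A minimal sketch of the complete adder with a self-checking testbench follows; the XOR implementation of `sum`, the instance names, and the testbench (`fulladder_tb`) are assumptions added for illustration.

```verilog
// sum is not shown on the slide; a 3-input XOR is the usual definition.
module sum(input a, b, c, output s);
  assign s = a ^ b ^ c;
endmodule

module carry(input a, b, c, output cout);
  assign cout = (a&b) | (a&c) | (b&c);
endmodule

module fulladder(input a, b, c, output s, cout);
  sum   s1(a, b, c, s);
  carry c1(a, b, c, cout);
endmodule

// Hypothetical testbench: applies all 8 input combinations and
// compares {cout, s} against the arithmetic sum a + b + c.
module fulladder_tb;
  reg        a, b, c;
  wire       s, cout;
  reg  [1:0] expected;
  integer    i, errors;

  fulladder dut(a, b, c, s, cout);

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {a, b, c} = i[2:0];
      #1;                              // let the outputs settle
      expected = a + b + c;            // evaluated at 2 bits wide
      if ({cout, s} !== expected) begin
        errors = errors + 1;
        $display("MISMATCH a=%b b=%b c=%b: got %b%b", a, b, c, cout, s);
      end
    end
    if (errors == 0) $display("fulladder: all 8 cases match");
    $finish;
  end
endmodule
```

This is the "simulate logic before building" step from the HDLs slide in its smallest form: three inputs allow an exhaustive check.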

Circuit Design
- How should logic be implemented?
  - NANDs and NORs vs. ANDs and ORs?
  - Fan-in and fan-out?
  - How wide should transistors be?
- These choices affect speed, area, power
- Logic synthesis makes these choices for you
  - Good enough for many applications
  - Hand-crafted circuits are still better
Slide 17

Example: Carry Logic

  assign cout = (a&b) | (a&c) | (b&c);

- Transistors? Gate Delays?
Slide 18
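One way to make the "NANDs and NORs vs. ANDs and ORs" question concrete for the carry function is to write it with NAND gates only. By De Morgan's law, an OR of three ANDs equals a NAND of three NANDs. The module name `carry_nand` and the wire names are hypothetical.

```verilog
// Hypothetical NAND-NAND form of the carry function.
// cout = ~( ~(a&b) & ~(a&c) & ~(b&c) ) = (a&b) | (a&c) | (b&c)
module carry_nand(input a, b, c, output cout);
  wire x_b, y_b, z_b;           // active-low partial products

  nand g1(x_b, a, b);
  nand g2(y_b, a, c);
  nand g3(z_b, b, c);
  nand g4(cout, x_b, y_b, z_b); // 3-input NAND merges the terms
endmodule
```

In static CMOS a 2-input NAND takes 4 transistors and a 3-input NAND takes 6, so this form uses 18 transistors and two gate delays. AND and OR gates built as NAND or NOR plus an inverter would cost more on both counts.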

Example: Carry Logic

  assign cout = (a&b) | (a&c) | (b&c);

[Figure: AND gates g1, g2, g3 with inputs (a, b), (a, c), (b, c) and outputs x, y, z, feeding OR gate g4, which produces cout]
- Transistors? Gate Delays?
Slide 19

Gate-level Netlist

  module carry(input a, b, c,
               output cout);
    wire x, y, z;

    and g1(x, a, b);
    and g2(y, a, c);
    and g3(z, b, c);
    or  g4(cout, x, y, z);
  endmodule

[Figure: the same gate schematic, with gates g1 to g4 and nets x, y, z]
Slide 20
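A gate-level netlist should compute the same function as the behavioral expression it came from. The sketch below checks that exhaustively in simulation. The module is renamed `carry_gates` here to keep the block self-contained, and the testbench `carry_equiv_tb` is an assumption added for illustration.

```verilog
// Gate-level carry, as in the netlist (renamed to avoid a name clash).
module carry_gates(input a, b, c, output cout);
  wire x, y, z;

  and g1(x, a, b);
  and g2(y, a, c);
  and g3(z, b, c);
  or  g4(cout, x, y, z);
endmodule

// Hypothetical testbench: compares the netlist against the
// behavioral expression for every input combination.
module carry_equiv_tb;
  reg     a, b, c;
  wire    cout;
  wire    expected = (a&b) | (a&c) | (b&c);  // behavioral reference
  integer i, errors;

  carry_gates dut(a, b, c, cout);

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {a, b, c} = i[2:0];
      #1;                                    // let the gates settle
      if (cout !== expected) begin
        errors = errors + 1;
        $display("MISMATCH a=%b b=%b c=%b: cout=%b expected=%b",
                 a, b, c, cout, expected);
      end
    end
    if (errors == 0) $display("netlist matches the behavioral expression");
    $finish;
  end
endmodule
```

Exhaustive simulation works here because there are only eight input combinations. Larger blocks rely on directed tests or formal equivalence checking instead.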

SPICE Netlist

  .SUBCKT CARRY A B C COUT VDD GND
  MN1 I1 A GND GND NMOS W=1U L=.8U AD=.3P AS=.5P
  MN2 I1 B GND GND NMOS W=1U L=.8U AD=.3P AS=.5P
  MN3 CN C I1 GND NMOS W=1U L=.8U AD=.5P AS=.5P
  MN4 I2 B GND GND NMOS W=1U L=.8U AD=.5P AS=.5P
  MN5 CN A I2 GND NMOS W=1U L=.8U AD=.5P AS=.5P
  MP1 I3 A VDD VDD PMOS W=2U L=.8U AD=.6P AS=1P
  MP2 I3 B VDD VDD PMOS W=2U L=.8U AD=.6P AS=1P
  MP3 CN C I3 VDD PMOS W=2U L=.8U AD=1P AS=1P
  MP4 I4 B VDD VDD PMOS W=2U L=.8U AD=.3P AS=1P
  MP5 CN A I4 VDD PMOS W=2U L=.8U AD=1P AS=.3P
  MN6 COUT CN GND GND NMOS W=2U L=.8U AD=1P AS=1P
  MP6 COUT CN VDD VDD PMOS W=4U L=.8U AD=2P AS=2P
  CI1 I1 GND 2FF
  CI3 I3 GND 3FF
  CA A GND 4FF
  CB B GND 4FF
  CC C GND 2FF
  CCN CN GND 4FF
  CCOUT COUT GND 2FF
  .ENDS

Slide 21

Physical Design
- Floorplan
- Standard cells
  - Place & route
- Datapaths
  - Slice planning
- Area estimation
Slide 22

MIPS Floorplan
[Figure: MIPS floorplan. I/O pads line all four sides of the die around the mips core. The core contains the control block, alucontrol, a wiring channel, the zipper, and the datapath, which is made of bitslices. Block dimensions and areas are given in λ and λ² but are not legible here.]
Slide 23

MIPS Layout
[Figure: MIPS chip layout]
Slide 24

Standard Cells
- Uniform cell height
- Uniform well height
- VDD and GND rails
- M2 access to I/Os
- Well / substrate taps
- Exploits regularity
Slide 25

Synthesized Controller
- Synthesize HDL into gate-level netlist
- Place & Route using standard cell library
Slide 26

Pitch Matching
- Synthesized controller area is mostly wires
  - Design is smaller if wires run through/over cells
  - Smaller = faster, lower power as well!
- Design snap-together cells for datapaths and arrays
  - Plan wires into cells
  - Connect by abutment
  - Exploits locality
  - Takes lots of effort
[Figure: array of abutting cells, four rows of A A A A B above a row of C C D]
Slide 27

MIPS Datapath
- 8-bit datapath built from 8 bitslices (regularity)
- Zipper at top drives control signals to datapath
Slide 28

MIPS ALU
- Arithmetic / Logic Unit is part of bitslice
Slide 29

Design Verification
- Fabrication is slow & expensive
  - MOSIS 0.6 μm: $[amount not legible], 3 months
  - State of art: $[amount not legible], [number not legible] month
- Debugging chips is very hard
  - Limited visibility into operation
- Prove design is right before building!
  - Logic simulation
  - Ckt. simulation / formal verification
  - Layout vs. schematic comparison
  - Design & electrical rule checks
- Verification is > 50% of effort on most chips!
[Figure: Specification = Architecture Design = Logic Design = Circuit Design = Physical Design, with Function checked at each "=" step; Timing and Power are also checked]
Slide 30

Fabrication & Packaging
- Tapeout final layout
- Fabrication
  - 6, 8, 12" wafers
  - Optimized for throughput, not latency ([number not legible] weeks!)
  - Processed wafers are sliced into dice (chips) and packaged.
- Packaging
  - Bond gold wires from die I/O pads to package
Slide 31

Testing
- Test that chip operates
  - Design errors
  - Manufacturing errors
- A single dust particle or wafer defect kills a die
  - Yields from 90% to < [number not legible]%
  - Depends on die size, maturity of process
- Test each part before shipping to customer
Slide 32