BeiHang Short Course, Part 2: Description and Synthesis
BeiHang Short Course, Part 2: Operation Centric Hardware Description and Synthesis
James C. Hoe
Department of ECE, Carnegie Mellon University
Collaborator: Arvind (MIT)
J. C. Hoe, CMU/ECE/CALCM, 2014, BHSC L2 s1

James C. Hoe and Arvind, Operation Centric Hardware Description and Synthesis, IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, Volume 23, Number 9, pp , September

Euclid's Algorithm: Greatest Common Divisor
Rules
    (mod)       Gcd(a, b) if a>=b, b!=0  ==>  Gcd(a%b, b)
    (mod iter)  Gcd(a, b) if a>=b, b!=0  ==>  Gcd(a-b, b)
    (flip)      Gcd(a, b) if a<b         ==>  Gcd(b, a)
    (done)      Gcd(a, 0)                ==>  a
Execution:
    Gcd(2,4) ==>(flip) Gcd(4,2) ==>(mod iter) Gcd(2,2) ==>(mod iter) Gcd(0,2) ==>(flip) Gcd(2,0)
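The Euclid rules above can be tried out directly. The following is a minimal Python sketch, not taken from the slides: it assumes the guard on the subtract rule is a >= b (otherwise Gcd(2,2) could not make progress in the execution shown), and the name `gcd_trace` and the trace labels are illustrative.

```python
def gcd_trace(a, b):
    """Apply the Gcd rewrite rules until (done) fires; return (result, trace)."""
    trace = [("start", a, b)]
    while True:
        if b == 0:                       # (done)      Gcd(a, 0) ==> a
            trace.append(("done", a, b))
            return a, trace
        if a < b:                        # (flip)      Gcd(a, b) ==> Gcd(b, a)
            a, b = b, a
            trace.append(("flip", a, b))
        else:                            # (mod iter)  a >= b, b != 0 ==> Gcd(a - b, b)
            a = a - b
            trace.append(("mod iter", a, b))


result, trace = gcd_trace(2, 4)
for step in trace:
    print(step)
```

For Gcd(2,4) the rules fire in the order flip, mod iter, mod iter, flip, done, which matches the execution listed on the slide.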
FSM#1: what is NS for b?
[Figure: register b with clock enable (ce) driven by flip, where flip comes from the comparator a < b; b loads a when flip is asserted]
    b_next = (a<b) ? a : b

FSM#2: what is NS for a?
[Figure: register a with clock enable (ce) driven by (flip or mod); inputs b and a_sub_b; comparators a < b (flip) and b = 0 (used for mod)]
    a_next = (flip or mod) ? (flip ? b : a-b) : a
Mapping to Hardware
[Figure: the two FSMs combined into one datapath; registers a and b with clock enables (ce), subtractor output a_sub_b, comparators a < b and b = 0, control signals flip and mod]
Is it clear that the two FSMs together implement GCD?

Cooperating FSM is State centric
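One way to convince yourself that the two FSMs cooperate correctly is to simulate them clock by clock. The sketch below is an illustration only. It assumes `mod` is asserted when a >= b and b != 0 (the figure's exact gating is not recoverable from the transcript), and `run_fsms` is a made-up helper name.

```python
def b_next(a, b):
    flip = a < b
    return a if flip else b                  # b_next = (a<b) ? a : b


def a_next(a, b):
    flip = a < b
    mod = (not flip) and b != 0              # assumption: mod means a >= b and b != 0
    if flip or mod:                          # clock enable of register a
        return b if flip else a - b
    return a


def run_fsms(a, b, max_cycles=1000):
    """Clock both registers together; stop when neither register changes."""
    for cycle in range(max_cycles):
        na, nb = a_next(a, b), b_next(a, b)  # both FSMs read the same current state
        if (na, nb) == (a, b):
            return a, cycle
        a, b = na, nb
    raise RuntimeError("did not settle")


print(run_fsms(2, 4))
```

Each FSM only describes its own register, so the correctness of the swap depends on both next-state functions sampling the same old values of a and b. That is the point the slide's question is probing.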
Operation Centric Decomposition
    when a < b
        a = b
        b = a
    when a >= b && b!=0
        a = a - b
        b = b
    Otherwise do nothing

A Very Complicated Real Life Example: Out of Order Speculative Processor
[Figure: block diagram with Fetch Unit, BTB, PC, Decode Unit, RF, ROB, status, IntU, BPU, MemU]
A Trivial Made Up Example: Decoupled Fetch/Execute
[Figure: Fetch side (PC, +1, Imem) feeding a FIFO that feeds the Execute side (RegFile, ALU)]

Just Two Instructions
Program visible state
- program counter: PC
- register file: RF[ ]
Add rd, r1, r2
    RF[rd] ← RF[r1] + RF[r2]
    PC ← PC + 1
Bz ra, rc
    if RF[rc]==0 then PC ← RF[ra] else PC ← PC + 1
Interactions between Fetch and Execute
[Figure: the decoupled Fetch/Execute datapath (+1, PC, Imem, FIFO, RegFile, ALU) with the signals PC_next and INST marked]
    PC_next = if (INST is a taken branch instruction) branch target else PC+1

Outline
- Motivations
- Operation centric hardware abstraction
- Synthesis of an operation centric description
- Wrap Up
Operation Centric Abstraction
State elements: PC (reg), IMEM (array, ROM), BF (FIFO), RF (array)
    STATE = Proc( pc, imem, bf, rf )

Processor Model: Fetch Rule
Fetch Rule
    Proc( pc, imem, bf, rf )
    ==> Proc( pc+1, imem, bf.enq(inst), rf )
        let inst = imem[pc]
[Figure: datapath with +1, PC, IMEM, BF, RF, ALU]
TRSpec Rewrite Rules
Takes notation from Term Rewriting Systems (TRS)
    <left hand side pattern>
        when <predicate expression>
    ==> <right hand side rewrite expression>
        let <variable bindings>

Atomic Execution Semantics
Given a set of rules and an initial term s
    While ( some rules are applicable to s ) {
        choose an applicable rule (non deterministic)
        apply the rule atomically to s        (Atomic Update Step)
    }
Note: after a rule fires, applicability of rules is re-evaluated from scratch on the new state
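The atomic execution loop is short enough to write down as a tiny interpreter. The sketch below is a hedged illustration, not TRSpec itself. Rules are modelled as hypothetical (name, predicate, action) triples, a seeded random choice stands in for the non-deterministic selection, and the example rules are the two guarded actions from the Operation Centric Decomposition of GCD.

```python
import random


def run_rules(state, rules, rng=None, max_steps=10_000):
    """Fire one applicable rule at a time until no rule applies."""
    rng = rng or random.Random(0)
    fired = []
    for _ in range(max_steps):
        enabled = [r for r in rules if r[1](state)]   # re-evaluated from scratch each step
        if not enabled:
            return state, fired
        name, _, action = rng.choice(enabled)         # non-deterministic choice
        state = action(state)                         # atomic update of the whole state
        fired.append(name)
    raise RuntimeError("rules still applicable after max_steps")


# State is the pair (a, b); the names "flip" and "sub" are illustrative.
gcd_rules = [
    ("flip", lambda s: s[0] < s[1],                 lambda s: (s[1], s[0])),
    ("sub",  lambda s: s[0] >= s[1] and s[1] != 0,  lambda s: (s[0] - s[1], s[1])),
]

print(run_rules((2, 4), gcd_rules))
```

In this example the two guards are mutually exclusive, so the choice is never actually exercised. With overlapping guards, different seeds would explore different legal executions.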
Processor Model: Execute Rules
Add Rule
    Proc( pc, imem, bf, rf )
        when bf.first( ) = Add(rd, r1, r2)
    ==> Proc( pc, imem, bf.deq( ), rf[ rd:=(rf[r1]+rf[r2]) ] )
[Figure: datapath with +1, PC, IMEM, BF, RF, ALU]

Processor Model: branch if zero
Bz Not Taken
    Proc( pc, imem, bf, rf )
        if rf[rc]!=0 when bf.first( ) = Bz(ra, rc)
    ==> Proc( pc, imem, bf.deq( ), rf )
Bz Taken
    Proc( pc, imem, bf, rf )
        if rf[rc]==0 when bf.first( ) = Bz(ra, rc)
    ==> Proc( rf[ra], imem, bf.clear( ), rf )

Is this (good) hardware description?
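The Fetch, Add, and Bz rules can be exercised together as an executable model. The Python sketch below is only an illustration under several assumptions: instructions are tuples such as ("Add", rd, r1, r2) and ("Bz", ra, rc), the FIFO is unbounded, Fetch is disabled once pc runs past the end of imem so that the run terminates, and the program and register contents are made up.

```python
import random
from collections import namedtuple

Proc = namedtuple("Proc", "pc imem bf rf")   # bf and rf are tuples, so a state is immutable


def head_is(s, opcode):
    return bool(s.bf) and s.bf[0][0] == opcode


def fetch(s):
    return s._replace(pc=s.pc + 1, bf=s.bf + (s.imem[s.pc],))


def add(s):
    _, rd, r1, r2 = s.bf[0]
    rf = list(s.rf)
    rf[rd] = s.rf[r1] + s.rf[r2]
    return s._replace(bf=s.bf[1:], rf=tuple(rf))


def bz_not_taken(s):
    return s._replace(bf=s.bf[1:])


def bz_taken(s):
    ra = s.bf[0][1]
    return s._replace(pc=s.rf[ra], bf=())    # bf.clear() drops wrong-path fetches


RULES = [  # (predicate, action) pairs
    (lambda s: s.pc < len(s.imem), fetch),   # assumption: stop fetching past imem
    (lambda s: head_is(s, "Add"), add),
    (lambda s: head_is(s, "Bz") and s.rf[s.bf[0][2]] != 0, bz_not_taken),
    (lambda s: head_is(s, "Bz") and s.rf[s.bf[0][2]] == 0, bz_taken),
]


def run(s, seed):
    rng = random.Random(seed)
    while True:
        enabled = [action for pred, action in RULES if pred(s)]
        if not enabled:
            return s
        s = rng.choice(enabled)(s)           # one atomic rule application


PROGRAM = (("Add", 2, 0, 1), ("Bz", 3, 4), ("Add", 2, 2, 2), ("Add", 5, 2, 1))
START = Proc(pc=0, imem=PROGRAM, bf=(), rf=(1, 2, 0, 3, 0, 0))
print(run(START, seed=1).rf)
```

Different seeds let Fetch run further ahead or stay closer to Execute, yet every run ends with the same register file. This is the sense in which any sequence of atomic rule applications is a legal execution.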
Operation Centric Abstraction
- Explicit declaration of storage (same as RTL)
- Describes system behavior as a collection of guarded actions (a.k.a. rules), instead of a collection of distributed state machine NS logic
  - a rule is guarded by a predicate condition; if the condition is true then it is always correct to apply the action
  - rule application is atomic, i.e., if multiple rules are enabled, pick only one to proceed
  - an execution corresponds to a sequence of rule applications

Excerpt from Superscalar Model: Dataflow Order Dispatch Rule
Dispatch Instruction : Non Branch
    IntU( Queue(.. { entry }[i]..), )
        if ( op is a valid type && ALU is available )
        where RsEntry(Valid, id, op, arg1, arg2) = entry
              Arg(Valid, value1, ) = arg1
              Arg(Valid, value2, ) = arg2
    ==> IntU( Queue(.. { RsEntry(Invalid,,,,,, ) }[i].. ),......, Result(Valid, id, val),.... )
        where val = execute(op, value1, value2)
Outline
- Motivations
- Operation centric hardware abstraction
- Synthesis of an operation centric description
- Wrap Up

Operations to Synchronous FSM: Mapping and Scheduling
Rule: A Functional Interpretation
A rule may be decomposed into two parts π(s) and δ(s) such that
    rule = λs. if π(s) then δ(s) else s

Rule: As a State Transition Logic
    Proc( pc, imem, bf, rf )
        if rf[rc]==0 when bf.first( ) = Bz(ra, rc)
    ==> Proc( rf[ra], imem, bf.clear( ), rf )
[Figure: combinational logic for the rule; inputs are the current state (PC, RF, IM, BF), outputs are the enable π and the next state values δ (PC, RF, IM, BF)]
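The functional reading of a rule maps almost one to one onto code. Here is a small Python sketch, offered as an illustration only: `pi` stands for the predicate part, `delta` for the state transformation part, and `make_rule` and `transition_logic` are hypothetical helper names. The second helper mirrors the State Transition Logic view, where the same two functions are evaluated side by side to produce an enable and a set of next state values.

```python
def make_rule(pi, delta):
    """rule = (s -> delta(s) if pi(s) else s)."""
    return lambda s: delta(s) if pi(s) else s


def transition_logic(pi, delta):
    """Hardware view: compute (enable, next state values) from the current state."""
    return lambda s: (pi(s), delta(s))


# Example on a GCD state (a, b): the flip operation.
flip_pi = lambda s: s[0] < s[1]
flip_delta = lambda s: (s[1], s[0])

flip = make_rule(flip_pi, flip_delta)
flip_logic = transition_logic(flip_pi, flip_delta)

print(flip((2, 4)), flip((4, 2)))
print(flip_logic((2, 4)), flip_logic((4, 2)))
```

In the hardware view the next state values are computed whether or not the rule is enabled. The enable alone decides if the registers latch them.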
Putting Them All Together
[Figure: update logic for PC. The enables (π) from the different rules that update PC are ORed to form the latch enable of PC. The next state values (δ_PC) from those rules go through a selector (sel) that produces the next state value of PC]
Single Rule per Cycle Scheduler
Scheduler: Priority Encoder, mapping π_1 ... π_n to φ_1 ... φ_n
    1. φ_i ⇒ π_i
    2. π_1 ∨ π_2 ∨ ... ∨ π_n ⇒ φ_1 ∨ φ_2 ∨ ... ∨ φ_n
    3. one rule at a time, i.e., at most one φ_i is true
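A single rule per cycle scheduler and the register update logic from "Putting Them All Together" can be sketched in a few lines of Python. This is an assumed reading of the slides, not the TRAC implementation. `pis` are the rule enables, `phis` are the scheduler's grants, the lowest index wins (an arbitrary priority choice), and `update_register` plays the role of the OR gate plus selector in front of one register.

```python
def priority_encoder(pis):
    """Grant at most one enabled rule; the lowest index has the highest priority."""
    phis = [False] * len(pis)
    for i, enabled in enumerate(pis):
        if enabled:
            phis[i] = True
            break
    return phis


def update_register(current, phis, next_values):
    """OR of the grants is the latch enable; the granted rule's value is selected."""
    latch_enable = any(phis)
    if not latch_enable:
        return current                       # register holds its value
    return next_values[phis.index(True)]


pis = [False, True, True]                    # rules 1 and 2 are both applicable
phis = priority_encoder(pis)
print(phis, update_register(10, phis, [11, 42, 7]))
```

A fixed priority encoder is deterministic, so it always picks the same legal execution among those the specification allows. Fairness between rules would need a rotating priority or a similar scheme.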
Correctness
- Implementation is deterministic but the spec is not
  - implementation's state transitions must correspond to some legal execution of TRSpec
  - implementation must maintain liveness
- Weak fairness can be achieved
  - if a transition stays applicable, it will be selected within a bounded number of steps

Good HW should fire Fetch and Execute rules together
[Figure: the decoupled Fetch/Execute datapath (+1, PC, Imem, FIFO, RegFile, ALU); the Fetch Rule covers the fetch side, the Execute Rules (except Bz Taken) cover the execute side]
Executing Rules Concurrently
- Applying Fetch and Add together on the same state when both are enabled
  - does not produce conflicting updates
  - gives the same results as if one were applied after the other
  - in particular, applying one doesn't invalidate the other
- Concurrent Execution
  - statically determine which transitions can be safely executed concurrently (formalizing the above)
  - generate a scheduler and update logic that allows as many concurrent transitions as possible

Conflict Free Rules
R_a and R_b are conflict free if ∀s. π_a(s) ∧ π_b(s) ⇒
    1. π_a(δ_b(s)) ∧ π_b(δ_a(s))
    2. δ_a(δ_b(s)) == δ_b(δ_a(s))
    3. δ_a(δ_b(s)) == the updates of δ_a(s) and δ_b(s) applied together (updates do not overlap or conflict)
You can fire any number of conflict free rules in a clock cycle as long as they are all pairwise conflict free!!
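On a small enough state space, the conflict free conditions can be checked by brute force. The sketch below is a hedged illustration rather than the static analysis a compiler would perform. The state is a dict, each rule is a hypothetical (predicate, update) pair whose update function returns only the fields it writes, and the FIFO is modelled by head and tail counters with an assumed capacity of 2. The rule named `redirect` is a made-up stand-in for a taken branch.

```python
from itertools import product


def apply(s, upd):
    return {**s, **upd}


def conflict_free(rule_a, rule_b, states):
    (pa, ua), (pb, ub) = rule_a, rule_b
    for s in states:
        if not (pa(s) and pb(s)):
            continue
        sa, sb = apply(s, ua(s)), apply(s, ub(s))
        if not (pa(sb) and pb(sa)):                  # 1. neither rule disables the other
            return False
        ab, ba = apply(sa, ub(sa)), apply(sb, ua(sb))
        if ab != ba:                                 # 2. both orders give the same state
            return False
        if set(ua(s)) & set(ub(s)):                  # 3. the written fields do not overlap
            return False
        if ab != apply(s, {**ua(s), **ub(s)}):       #    and firing together equals either order
            return False
    return True


CAP = 2  # assumed FIFO capacity
fetch = (lambda s: s["tail"] - s["head"] < CAP,
         lambda s: {"pc": s["pc"] + 1, "tail": s["tail"] + 1})
execute = (lambda s: s["tail"] > s["head"],
           lambda s: {"head": s["head"] + 1, "acc": s["acc"] + 1})
redirect = (lambda s: s["tail"] > s["head"],
            lambda s: {"pc": 0, "head": s["tail"]})

STATES = [dict(pc=pc, head=h, tail=t, acc=a)
          for pc, h, t, a in product(range(3), range(3), range(3), range(2)) if h <= t]

print(conflict_free(fetch, execute, STATES), conflict_free(fetch, redirect, STATES))
```

In this toy model, fetch and execute touch disjoint fields and never disable each other, so they pass. Both fetch and redirect write pc and the two orders disagree, so that pair fails, which mirrors the "except Bz Taken" remark on the earlier slide.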
Multiple Rule per Cycle Scheduler
Scheduler: mapping π_1 ... π_n to φ_1 ... φ_n
    1. φ_i ⇒ π_i
    2. π_1 ∨ π_2 ∨ ... ∨ π_n ⇒ φ_1 ∨ φ_2 ∨ ... ∨ φ_n
    3. multiple rules, such that φ_i ∧ φ_j ⇒ R_i and R_j are conflict free

Conflict Free Scheduler
- Partition rules into the maximum number of nonoverlapping sets such that rules in different sets are conflict free (Best case: all sets are of size 1!!)
- Schedule each set independently, e.g., one rule per cycle per set
- The state update logic is unchanged
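The partitioning step amounts to finding connected components of the conflict graph: any two rules that conflict must share a set, and nothing else forces rules together. The Python sketch below illustrates that idea. The rule names and the conflict edges are hypothetical, and `partition_rules` is an illustrative helper rather than the algorithm used by any particular compiler.

```python
def partition_rules(rules, conflicts):
    """Group rules so that rules in different groups are pairwise conflict free.

    rules:     list of rule names
    conflicts: iterable of (x, y) pairs that are NOT conflict free
    """
    neighbours = {r: set() for r in rules}
    for x, y in conflicts:
        neighbours[x].add(y)
        neighbours[y].add(x)

    groups, seen = [], set()
    for r in rules:
        if r in seen:
            continue
        seen.add(r)
        stack, group = [r], []
        while stack:                         # depth-first walk of one component
            n = stack.pop()
            group.append(n)
            for m in sorted(neighbours[n]):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        groups.append(sorted(group))
    return groups


rules = ["T1", "T2", "T3", "T4", "T5", "T6"]
conflicts = [("T1", "T2"), ("T2", "T3"), ("T4", "T5")]   # made-up conflict edges
print(partition_rules(rules, conflicts))
```

Each resulting group can then get its own small scheduler, for instance one priority encoder per group, and the groups fire independently in the same clock cycle.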
CF Scheduling Example
[Figure: Conflict Graph and CF Graph over six transitions T1 to T6]
Multiple Rule per Cycle Scheduler
[Figure: one scheduler per partition (Scheduler 1, Scheduler 2, ..., Scheduler n), each mapping its own π inputs to φ outputs]
    1. φ_i ⇒ π_i
    2. π_1 ∨ π_2 ∨ ... ∨ π_n ⇒ φ_1 ∨ φ_2 ∨ ... ∨ φ_n
    3. multiple rules, such that φ_i ∧ φ_j ⇒ R_i and R_j are conflict free

Performance Gain
- Multiple rules per cycle. But is this always optimal?
- CF scheduler does not increase critical path
  - partitioned schedulers are smaller and faster than a single monolithic scheduler
  - distributed scheduler lowers wiring delay for the π's and φ's
CF Schedule is too strict
R_a and R_b are sequentially composable (SC) if ∀s. π_a(s) ∧ π_b(s) ⇒
    1. π_b(δ_a(s))
    2. δ_b(δ_a(s)) == the result of applying R_a and R_b concurrently to s
Applying a pair of SC rules concurrently to the same state produces the same outcome as only one ordering, but that is all that is required

SC Scheduling
- For each CF scheduling group in a given clock cycle, φ_a ∧ φ_b ∧ φ_c ... ⇒ the transitive closure of R_a, R_b, R_c ... on SC is ordered
- For the sake of implementation, we further require the orderings to be consistent in all clock cycles
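The weaker sequentially composable condition can be checked by brute force in the same style as conflict freedom. The sketch below is an illustration under stated assumptions. Rules are hypothetical (predicate, update) pairs over a dict state, concurrent firing means both updates are computed from the same old state, and where both rules write the same field the second rule's write is assumed to win. The two example rules are made up to show a pair that composes in one order only.

```python
def apply(s, upd):
    return {**s, **upd}


def sequentially_composable(first, second, states):
    """True if firing both rules at once always equals 'first, then second'."""
    (p1, u1), (p2, u2) = first, second
    for s in states:
        if not (p1(s) and p2(s)):
            continue
        s1 = apply(s, u1(s))
        if not p2(s1):                               # second must stay enabled after first
            return False
        concurrent = apply(s, {**u1(s), **u2(s)})    # both read s; second wins overlaps
        if apply(s1, u2(s1)) != concurrent:
            return False
    return True


copy_y = (lambda s: True, lambda s: {"x": s["y"]})        # x := y
bump_y = (lambda s: True, lambda s: {"y": s["y"] + 1})    # y := y + 1

STATES = [{"x": x, "y": y} for x in range(3) for y in range(3)]

print(sequentially_composable(copy_y, bump_y, STATES),
      sequentially_composable(bump_y, copy_y, STATES))
```

Firing both rules on the same old state copies the old y into x, which is what "copy_y then bump_y" produces but not what the opposite order produces. The pair is therefore not conflict free, yet it can still be scheduled in the same cycle under the one ordering that matches.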
TRSpec and TRAC (aka my PhD Thesis)
[Figure: design flow. A TRSpec design is compiled by TRAC into RTL, which goes to RTL simulation or through Synopsys to a target technology (standard cell, gate array, FPGA)]
U.S. Patent #6,597,664 and #6,901,055

TRSpec vs. Verilog
5 stage pipelined, 32 bit MIPS R2000 Integer Core

                            CBA tc6a                    LSI 10K
                            Area (cells)   Clock        Area (gates)   Clock
    TRSpec                                 96.6MHz                     41.9MHz
    Hand-coded Verilog RTL                 96MHz                       42.1MHz
Recap the Last Hour
- Operation centric design abstracts away the synchronous clock as the marker of progress
  - designer thinks in terms of a sequence of atomic updates
  - many correct mappings to a synchronous FSM; let the compiler pick a good one
- What if precise timing is a part of the design specification?
  - need a way to mix abstractions smoothly

Bluespec: doing it for real
- A real commercial implementation
  - operation centric guarded atomic actions
  - full high level language with proper modular design support
  - mix seamlessly with RTL like timing control when necessary
- Free academic license available
- Visit
Computer Architecture Lab (CALCM)
Carnegie Mellon University
More informationLatches. IT 3123 Hardware and Software Concepts. Registers. The Little Man has Registers. Data Registers. Program Counter
IT 3123 Hardware and Software Concepts Notice: This session is being recorded. CPU and Memory June 11 Copyright 2005 by Bob Brown Latches Can store one bit of data Can be ganged together to store more
More informationUCT Algorithm Circle: Number Theory
UCT Algorithm Circle: 7 April 2011 Outline Primes and Prime Factorisation 1 Primes and Prime Factorisation 2 3 4 Some revision (hopefully) What is a prime number? An integer greater than 1 whose only factors
More informationCSE140L: Components and Design
CSE140L: Components and Design Techniques for Digital Systems Lab Tajana Simunic Rosing Source: Vahid, Katz, Culler 1 Grade distribution: 70% Labs 35% Lab 4 30% Lab 3 20% Lab 2 15% Lab 1 30% Final exam
More informationComputer Architecture ELEC3441
Computer Architecture ELEC3441 RISC vs CISC Iron Law CPUTime = # of instruction program # of cycle instruction cycle Lecture 5 Pipelining Dr. Hayden Kwok-Hay So Department of Electrical and Electronic
More informationLevels in Processor Design
Levels in Processor Design Circuit design Keywords: transistors, wires etc.results in gates, flip-flops etc. Logical design Putting gates (AND, NAND, ) and flip-flops together to build basic blocks such
More informationQuick Introduction to SystemVerilog: Sequental Logic
! Quick Introduction to SystemVerilog: Sequental Logic Lecture L3 8-545 Advanced Digital Design ECE Department Many elements Don Thomas, 24, used with permission with credit to G. Larson Today Quick synopsis
More informationC 1. Last time. CSE 490/590 Computer Architecture. Complex Pipelining I. Complex Pipelining: Motivation. Floating-Point Unit (FPU) Floating-Point ISA
CSE 490/590 Computer Architecture Complex Pipelining I Steve Ko Computer Sciences and Engineering University at Buffalo Last time Virtual address caches Virtually-indexed, physically-tagged cache design
More informationComputer Architecture: Out-of-Order Execution II. Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Out-of-Order Execution II Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-447 Spring 2013, Computer Architecture, Lecture 15 Video
More informationComputer Architecture (TT 2012)
Computer Architecture (TT 2012) The Register Transfer Level Daniel Kroening Oxford University, Computer Science Department Version 1.0, 2011 Outline Reminders Gates Implementations of Gates Latches, Flip-flops
More informationProcessor Design Pipelined Processor. Hung-Wei Tseng
Processor Design Pipelined Processor Hung-Wei Tseng Pipelining 7 Pipelining Break up the logic with isters into pipeline stages Each stage can act on different instruction/data States/Control signals of
More informationMain Points of the Computer Organization and System Software Module
Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a
More informationAdvanced issues in pipelining
Advanced issues in pipelining 1 Outline Handling exceptions Supporting multi-cycle operations Pipeline evolution Examples of real pipelines 2 Handling exceptions 3 Exceptions In pipelined execution, one
More informationECE 4514 Digital Design II. Spring Lecture 2: Hierarchical Design
ECE 4514 Digital Design II Spring 2007 Abstraction in Hardware Design Remember from last lecture that HDLs offer a textual description of a netlist. Through abstraction in the HDL, we can capture more
More informationMicroprogrammed Control Approach
Microprogrammed Control Approach Considering the FSM for our MIPS subset has 10 states, the complete MIPS instruction set, which contains more than 100 instructions, and considering that these instructions
More informationRegister Transfer Level in Verilog: Part I
Source: M. Morris Mano and Michael D. Ciletti, Digital Design, 4rd Edition, 2007, Prentice Hall. Register Transfer Level in Verilog: Part I Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National
More informationComputer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling)
18-447 Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 2/13/2015 Agenda for Today & Next Few Lectures
More informationCompiler Optimizations. Lecture 7 Overview of Superscalar Techniques. Memory Allocation by Compilers. Compiler Structure. Register allocation
Lecture 7 Overview of Superscalar Techniques CprE 581 Computer Systems Architecture, Fall 2013 Reading: Textbook, Ch. 3 Complexity-Effective Superscalar Processors, PhD Thesis by Subbarao Palacharla, Ch.1
More informationRTL Coding General Concepts
RTL Coding General Concepts Typical Digital System 2 Components of a Digital System Printed circuit board (PCB) Embedded d software microprocessor microcontroller digital signal processor (DSP) ASIC Programmable
More informationControl and Datapath 8
Control and Datapath 8 Engineering attempts to develop design methods that break a problem up into separate steps to simplify the design and increase the likelihood of a correct solution. Digital system
More informationA Process Model suitable for defining and programming MpSoCs
A Process Model suitable for defining and programming MpSoCs MpSoC-Workshop at Rheinfels, 29-30.6.2010 F. Mayer-Lindenberg, TU Hamburg-Harburg 1. Motivation 2. The Process Model 3. Mapping to MpSoC 4.
More information