Simple Asynchronous Microprocessor

Caltech CS/EE 181b, Winter 2014
Prof. Alain Martin
Simple Asynchronous Microprocessor
By Kangping Hu
March

1. Architecture Overview

SAM (Simple Asynchronous Microprocessor) is a 32-bit RISC architecture. A top-level flow diagram is shown in Figure 1. The major building blocks and their functions are described below:

PC (32-bit program counter): tracks which instruction to execute next.
IMEM (instruction memory): contains the instructions.
Decoder: decomposes the instruction.
RegFile: eight 32-bit general-purpose registers.
YMode: performs the operation that produces the y-operand.
Dispatch: forks the x-operand and y-operand and sends them to the execution units.
Branch: performs a comparison on the x-operand.
ALU: performs arithmetic operations.
DMEM: reads from or writes to data memory.
Shift: performs a left or right shift operation.
Opz Merge: controlled merge of the z-operand.

Figure 1: Top-level flow diagram of SAM

The register numbers rx, ry, and rz select which general-purpose registers to read from or write to. The x-operand (Opx) is read directly from the register file as gpr[rx]. The y-operand (Opy) is produced by an operation on gpr[ry] and the immediate (Imm); the YMode control signal selects which operation. Opx and Opy are the inputs to the execution units (Branch, ALU, DMEM, and Shift). The operation performed inside each execution unit is determined by the function field (Fxn). The z-operand (Opz) is the output of one execution unit, selected by the unit control signal (Unit). It is then written back to the general-purpose register indexed by rz. The Branch unit generates a branch flag for the PC, indicating whether the next instruction address is PC+4 or Opy. It also generates a halt flag, indicating the end of execution.

2. Description of Each Block

2.1 PC

Input: Opy, Branch Flag, Halt Flag
Output: PC

The PC process is composed of a 32-bit register and an adder. At the beginning of each cycle, it outputs PC to the environment and to the adder. It then waits for the branch flag and updates PC from either the branch target (Opy) or the adder output. It also receives a halt flag which, once set, suspends all the remaining processes. Since SAM is a non-terminating microprocessor, it never actually stops; the halt signal here merely indicates the end of the test program.

Figure 2-1: PC Process

2.2 IMEM

Input: PC
Output: Instruction

The instruction memory reads in a binary test program, which calculates the sum from 1 to 100. Based on the input PC, the instruction at the corresponding address is sent out. The test program is shown below:

    .=0x8
    jmp Start ; comment
    .=0x100
    Start:
    li r1=100
    li r2=0u ; upper immediate
    jmp r3=detour ; comment
    Label: ; comment
    add r2=r1,r2
    sw r2,(100)
    lw r2=(r1+0x3ff)
    lw r2=(100)
    sub r1=r1,1
    bne r1,label
    hlt
    jmp zero ; shouldnt get executed
    nop
    .=0x200 ; test comment
    Detour:
    jmp r3
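As a cross-check on what the test program should leave in data memory, the loop can be modelled in a few lines of Python. This is only a behavioural sketch, assuming that `add`, `sub`, `sw`, and `bne` behave as their mnemonics suggest. The detour jump and the two `lw` instructions are omitted, and the function name `run_sum_loop` is hypothetical.

```python
def run_sum_loop(n=100, addr=100):
    """Behavioural model of the test loop: r2 accumulates r1 while r1 counts down."""
    dmem = {}
    trace = []
    r1, r2 = n, 0            # li r1=100 ; li r2=0
    while True:
        r2 = r1 + r2         # add r2=r1,r2
        dmem[addr] = r2      # sw r2,(100)
        trace.append(r2)     # record every value stored to dmem[100]
        r1 = r1 - 1          # sub r1=r1,1
        if r1 == 0:          # bne r1,label falls through once r1 reaches zero
            break
    return dmem, trace


dmem, trace = run_sum_loop()
print(dmem[100])     # 5050
print(trace[-3:])    # [5047, 5049, 5050]
```

The last three stored values agree with the values of dmem[100] reported by the simulator in Section 3.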

2.3 Decoder

Input: Instruction
Output: rx, ry, rz, YMode, Imm, Fxn, Unit

Figure 2-3: Decoder

The decoder takes the instruction, decomposes it, and sends the smaller pieces to the corresponding processes:

Unit = instruction[31:30], to Opz Merge and Branch
Fxn = instruction[29:27], to ALU, Branch, Shift, and DMEM
YMode = instruction[26:25], to YMode
rz = instruction[24:22], to RegFile
rx = instruction[21:19], to RegFile
ry = instruction[18:16], to RegFile
imm = instruction[15:0], to YMode
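The field extraction above can be sketched in Python as plain shifts and masks. This is an illustration rather than the CHP decoder itself: it assumes each bit pair in the list is an inclusive range (high bit first), and the helper names `decode` and `encode` are hypothetical.

```python
# (name, high bit, low bit) for each field, as listed in the decoder description.
FIELDS = [
    ("unit", 31, 30),
    ("fxn", 29, 27),
    ("ymode", 26, 25),
    ("rz", 24, 22),
    ("rx", 21, 19),
    ("ry", 18, 16),
    ("imm", 15, 0),
]


def decode(instruction):
    """Split a 32-bit instruction word into its named fields."""
    out = {}
    for name, hi, lo in FIELDS:
        width = hi - lo + 1
        out[name] = (instruction >> lo) & ((1 << width) - 1)
    return out


def encode(fields):
    """Inverse of decode, used here only to build a test word."""
    word = 0
    for name, hi, lo in FIELDS:
        width = hi - lo + 1
        word |= (fields[name] & ((1 << width) - 1)) << lo
    return word


example = {"unit": 1, "fxn": 4, "ymode": 2, "rz": 3, "rx": 5, "ry": 6, "imm": 0x3FF}
print(decode(encode(example)) == example)   # True
```

The seven field widths (2 + 3 + 2 + 3 + 3 + 3 + 16) add up to exactly 32 bits, so every instruction bit belongs to exactly one field.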

2.4 RegFiles

Input: Data In (Opz), rx, ry, rz
Output: gpr[rx], gpr[ry]

The RegFiles block is probably the most complicated one in this design. Its diagram is shown in Figure 2-4. Inside the block, a State process guards the sequential input of rx, ry, rz, and Data In, and the sequential output of gpr[rx] and gpr[ry]. The State process has three states:

State 0: Addr_Sel selects rx. Data In is skipped. The Decoder sets one enable signal high. The Controlled Merge receives data from the selected register, and the Controlled Split outputs the data as gpr[rx].
State 1: Addr_Sel selects ry. Data In is skipped. The Decoder sets one enable signal high. The Controlled Merge receives data from the selected register, and the Controlled Split outputs the data as gpr[ry].
State 2: Addr_Sel selects rz. The Controlled Fork receives Data In and sends it to all registers. The Decoder sets one enable signal high, and the enabled register stores the data. The Controlled Merge skips receiving data.

The Decoder receives the address and sets one of the enable signals high. Depending on the state and the enable signal, one of the registers either reads out data or writes in data. The Controlled Merge also takes in the enable signals, so that data is received from only one register; this prevents deadlock. The State Update process cycles the state variable in the sequence 0 -> 1 -> 2 -> 0.
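The three-state sequencing can be modelled behaviourally as follows. This is a sequential Python sketch, not the CHP/prs implementation: it ignores handshaking and only mirrors the read-rx, read-ry, write-rz order and the one-hot enable. The class and method names are hypothetical.

```python
class RegFileModel:
    """Behavioural model of the 0 -> 1 -> 2 -> 0 register file sequencing."""

    def __init__(self, n_regs=8):
        self.gpr = [0] * n_regs
        self.state = 0

    def step(self, addr, data_in=None):
        # Decoder: exactly one enable line is high for a valid address.
        enable = [i == addr for i in range(len(self.gpr))]
        selected = enable.index(True)
        out = None
        if self.state in (0, 1):
            # States 0 and 1: read gpr[rx], then gpr[ry]; Data In is skipped.
            out = self.gpr[selected]
        else:
            # State 2: only the enabled register stores Data In.
            self.gpr[selected] = data_in & 0xFFFFFFFF
        self.state = (self.state + 1) % 3
        return out

    def access(self, rx, ry, rz, compute):
        """One full instruction: read both operands, then write back the result."""
        opx = self.step(rx)
        opy = self.step(ry)
        self.step(rz, compute(opx, opy))
        return opx, opy


rf = RegFileModel()
rf.gpr[1], rf.gpr[2] = 100, 5
print(rf.access(1, 2, 2, lambda x, y: x + y))   # (100, 5)
print(rf.gpr[2], rf.state)                      # 105 0
```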

Figure 2-4: RegFiles Diagram

2.5 YMode

Input: Imm, gpr[ry], YMode
Output: Opy

The YMode block, shown in Figure 2.5, generates the y-operand from gpr[ry] and Imm. A controlled merge takes in all the possible Opy values and outputs one of them based on the YMode value. For the sign-extended operation, I used 16 buffers that constantly output 1, while for the unsigned (zero-extended) operation, I used 16 buffers that constantly output 0.
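The two extensions of the 16-bit immediate can be sketched as below. This shows the conventional definitions, in which sign extension fills the upper 16 bits with copies of bit 15; the text does not say whether the constant-1 buffers in the design are selected only when the immediate is negative, so that part is an assumption. The function names are hypothetical.

```python
def zero_extend16(imm):
    """Upper 16 bits are constant 0."""
    return imm & 0xFFFF


def sign_extend16(imm):
    """Upper 16 bits replicate bit 15 (all 1s for a negative immediate)."""
    imm &= 0xFFFF
    upper = 0xFFFF0000 if imm & 0x8000 else 0x00000000
    return upper | imm


print(hex(zero_extend16(0xFFFF)))   # 0xffff
print(hex(sign_extend16(0xFFFF)))   # 0xffffffff
print(hex(sign_extend16(0x03FF)))   # 0x3ff
```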

Figure 2.5: YMode diagram

2.6 Dispatch

Input: Opx, Opy
Output: ALU_Opx, ALU_Opy, Branch_Opx, DMEM_Opx, DMEM_Opy, Shift_Opy

The function of this process is to fork Opx and Opy and send them to the ALU, Branch, DMEM, and Shift processes. Originally, I planned to put a controlled fork and a controlled merge on either side of the execution blocks. But I ran into a lot of deadlock issues, because every process conditioned by Fxn inside the execution blocks (DMEM, Branch) would also need to be conditioned by Unit. So I decided to let all the execution units run and to collect the wanted Opz at the very end based on the Unit control signal. This simplifies the design, but the price I paid is that the longest path is always executed.

2.7 Execution Blocks

2.7.1 ALU

Input: Opx, Opy, Fxn
Output: Opz

The ALU block is shown in Figure 2.7.1. I used a controlled fork and a controlled merge at the two ends for flow control, so the x-operand and y-operand are sent to only one process. For the small ALU blocks, I started with AND, OR, and inverter gates at the prs level and used those three to construct NOR and XOR. The ADD unit uses ripple carry. For SUB, I used 2's complement with the ADD unit.

Figure 2.7.1: ALU Blocks
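The ripple-carry ADD and the 2's-complement SUB can be illustrated at the bit level in Python. This is a sketch of the general technique, not the prs netlist: the carry is passed from bit 0 upward one full adder at a time, and subtraction reuses the adder by inverting the second operand and setting the carry-in to 1. The function names are hypothetical.

```python
WIDTH = 32


def full_adder(a, b, cin):
    """One-bit full adder built from XOR, AND, and OR."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_add(x, y, cin=0, width=WIDTH):
    """Add two width-bit values; the carry ripples from bit 0 to bit width-1."""
    result, carry = 0, cin
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result            # the final carry out is discarded


def ripple_sub(x, y, width=WIDTH):
    """x - y computed as x + (~y) + 1, reusing the adder."""
    mask = (1 << width) - 1
    return ripple_add(x, ~y & mask, cin=1, width=width)


print(ripple_add(100, 99))      # 199
print(ripple_sub(100, 1))       # 99
print(hex(ripple_sub(0, 1)))    # 0xffffffff
```

Because each bit must wait for the carry from the bit below it, the worst-case delay grows linearly with the word width, which is consistent with ADD and SUB showing up on the critical path.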

2.7.2 DMEM

Input: Opx, Opy, Fxn
Output: Data Out (Opz)

The DMEM block is shown in the figure below. It consists of an ex_dmem process and the data memory. Opy is the address into DMEM, and Opx is the input data. Depending on Fxn, ex_dmem sends different commands to DMEM through Cmd:

Fxn = 0 -> Cmd = True; Data Out = dmem[Opy];
Fxn = 4 -> Cmd = False; dmem[Opy] = Opx; Data Out = Opy;

Figure: DMEM Block

2.7.3 Shift

Input: Opy, Fxn
Output: Opz

Shift is a relatively simple block, as shown in the Shift Block figure. Since all the shift operations take about the same time to finish, I didn't put a guard on the input fork, and I let all four shift processes run in parallel. I then collect all the outputs and select one based on Fxn.

Figure: Shift Block

For each shift process, I just offset the input and output, add some buffers that constantly output zero or one, and connect the unwanted bits to sinks that simply return an acknowledge once the signal arrives. The figure below shows the Shift Right By 1 process.

Figure: Shift Right By 1

2.7.4 Branch

Input: Opx, PC, Fxn, Unit
Output: Opz, Branch Flag, Halt Flag

The Branch block consists of two major sub-blocks, as shown in the Branch Diagram figure. One is the branch portion, which outputs the branch and halt flags conditioned on Unit. If Unit does not select a branch operation, default branch and halt flags are sent out. The other sub-block increments PC and outputs it as Opz.

Figure: Branch Diagram

Figure: BEQ

In the branch portion, I used both a controlled fork and a controlled merge for flow control. The comparators are composed of basic AND, OR, and inverter gates. Take BEQ (Branch If Equal to Zero), shown in the BEQ figure, as an example: I just OR all the input bits sequentially and invert the final output to get the branch flag. BNE is implemented the same way, except that the inverter is removed. For BLT, I only need to check the first (sign) bit, and BGE is the inverse of that. For BGT, I need to check the first bit and OR the rest of the bits, and BLE is the inverse of that. This sequential comparison inevitably increases the latency. A better way is to do the comparison in a tree topology.
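The difference between the sequential OR used for BEQ and the suggested tree topology can be sketched as follows. Both functions return the BEQ flag (1 when every input bit is 0) together with the number of two-input OR stages on the longest path. This is an illustrative gate-depth count under the assumption of two-input gates, not a timing model of the actual circuit, and the function names are hypothetical.

```python
def to_bits(x, width=32):
    """Least-significant bit first."""
    return [(x >> i) & 1 for i in range(width)]


def beq_chain(bits):
    """OR the bits one after another, then invert."""
    acc, depth = bits[0], 0
    for b in bits[1:]:
        acc |= b
        depth += 1              # each OR waits for the previous one
    return acc ^ 1, depth


def beq_tree(bits):
    """OR the bits pairwise, level by level, then invert."""
    level, depth = list(bits), 0
    while len(level) > 1:
        nxt = [level[i] | level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd element passes through to the next level
        level, depth = nxt, depth + 1
    return level[0] ^ 1, depth


print(beq_chain(to_bits(0)))   # (1, 31)
print(beq_tree(to_bits(0)))    # (1, 5)
print(beq_tree(to_bits(4)))    # (0, 5)
```

For a 32-bit operand the chain has 31 OR stages in series, while the tree needs only 5 levels, which is why the tree topology reduces the latency. BNE would use the same reduction without the final inversion.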

3. Critical Path and Test Result

Shown below is the result of the test program. It adds up the numbers from 1 to 100 and saves the result in data memory location [100]. At the end, some threads are permanently suspended, indicating the end of the simulation.

    /dmem> dmem.chp[16:2] dmem[100] = 5047
    /dmem> dmem.chp[16:2] dmem[100] = 5047
    /dmem> dmem.chp[16:2] dmem[100] = 5049
    /dmem> dmem.chp[16:2] dmem[100] = 5049
    /dmem> dmem.chp[16:2] dmem[100] = 5049
    /dmem> dmem.chp[16:2] dmem[100] = 5049
    /dmem> dmem.chp[16:2] dmem[100] = 5050
    /dmem> dmem.chp[16:2] dmem[100] = 5050
    /dmem> dmem.chp[16:2] dmem[100] = error
    Error: deadlock
    32 threads are permanently suspended:
    (susp-perm) /pc/pc/pc2_bit2[4] at pc2.chp[86:27]

I examined the total number of transitions that occur between a wire at the PC output going up and going down. The result is shown below. Assuming each transition takes 100 time units, the longest cycle takes = 116 cycles (pretty long). Further testing showed that the critical path is caused by the ADD and SUB units.

    (watch) /pc/inc/add[5]/xor2/or1/en:o down at time
    (watch) /pc/inc/add[5]/xor2/or1/en:o up at time
    /dmem> dmem.chp[16:2] dmem[100] = 199
    (watch) /pc/inc/add[5]/xor2/or1/en:o down at time
    (watch) /pc/inc/add[5]/xor2/or1/en:o up at time
    (watch) /pc/inc/add[5]/xor2/or1/en:o down at time
    (watch) /pc/inc/add[5]/xor2/or1/en:o up at time

The SAM files are stored in the Home/181b/SAM29 directory. The files whose names begin with capital letters are processes that are not fully decomposed, and the ones with lower-case names are the bottom-level chp files. The top module is SAM11.chp. There is a tool.chp file and a gates.chp file that contain some common chp and prs used by several processes, and a type.chp file that defines all the variable types.

4. The Design Process

I started this design with a high-level sequential description of the SAM architecture based on its reference. It was pretty straightforward, except that I was not familiar with the syntax. Then came the decompositions using the meta-process. It was fine at the beginning. But as I broke SAM into smaller and smaller pieces, I didn't isolate the processes very well, and I ran into a lot of deadlock issues. It also became harder and harder to modify the architecture, adding and deleting ports, as the pieces became smaller. So I ended up adding quite a lot of redundant state signals to keep the operations sequential, as well as redundant buffers. At some point, I had to simplify the design. I decided to let some of the processes run in parallel, avoid some nested guards, and collect the result in a controlled merge.

Finally, at the prs level, the deadlocks appeared again, either because I didn't set the initial conditions for the gates correctly, because the enables were not connected properly, or because of instability issues. Eventually I managed to decompose most of the chp processes to the prs level. Thankfully the simulation still works.

I have been struggling with chpsim throughout this semester, fighting with deadlocks and syntax errors. I really wish it could provide more information about where the deadlock happened! Anyway, it turned out to be a pretty powerful simulator, and it can catch all kinds of errors I have made.

At the very beginning, I was planning to finish all the decompositions ahead of time and maybe start the layout. But it seems that I underestimated the complexity of the microprocessor. As I started the vertical decomposition, I realized how many control signals I needed to copy, and the completion trees are everywhere. The price of getting rid of the clock is not that cheap.

SAM Architecture Reference (Revised)

SAM Architecture Reference (Revised) SAM Architecture Reference (Revised) Mika Nyström Department of Computer Science California Institute of Technology Pasadena, California, U.S.A. 2000 April 19, 2000 Revised April 2, 2002 1. Introduction

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations Chapter 4 The Processor Part I Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations

More information

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University The Processor: Datapath and Control Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction CPU performance factors Instruction count Determined

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations

More information

EE 3170 Microcontroller Applications

EE 3170 Microcontroller Applications Lecture Overview EE 3170 Microcontroller Applications Lecture 7 : Instruction Subset & Machine Language: Conditions & Branches in Motorola 68HC11 - Miller 2.2 & 2.3 & 2.4 Based on slides for ECE3170 by

More information

COMP MIPS instructions 2 Feb. 8, f = g + h i;

COMP MIPS instructions 2 Feb. 8, f = g + h i; Register names (save, temporary, zero) From what I have said up to now, you will have the impression that you are free to use any of the 32 registers ($0,..., $31) in any instruction. This is not so, however.

More information

Chapter 4. The Processor Designing the datapath

Chapter 4. The Processor Designing the datapath Chapter 4 The Processor Designing the datapath Introduction CPU performance determined by Instruction Count Clock Cycles per Instruction (CPI) and Cycle time Determined by Instruction Set Architecure (ISA)

More information

The MIPS Processor Datapath

The MIPS Processor Datapath The MIPS Processor Datapath Module Outline MIPS datapath implementation Register File, Instruction memory, Data memory Instruction interpretation and execution. Combinational control Assignment: Datapath

More information

Processor (I) - datapath & control. Hwansoo Han

Processor (I) - datapath & control. Hwansoo Han Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two

More information

Simple Microprocessor Design By Dr Hashim Ali

Simple Microprocessor Design By Dr Hashim Ali Simple Microprocessor Design By Dr Hashim Ali This application note gives an introduction to microprocessor architecture. The goal of the project is to build a 4-bit processor at logic level and then simulate

More information

ECE 2300 Digital Logic & Computer Organization. More Single Cycle Microprocessor

ECE 2300 Digital Logic & Computer Organization. More Single Cycle Microprocessor ECE 23 Digital Logic & Computer Organization Spring 28 More Single Cycle Microprocessor Lecture 6: HW6 due tomorrow Announcements Prelim 2: Tues April 7, 7:3pm, Phillips Hall Coverage: Lectures 8~6 Inform

More information

CAD4 The ALU Fall 2009 Assignment. Description

CAD4 The ALU Fall 2009 Assignment. Description CAD4 The ALU Fall 2009 Assignment To design a 16-bit ALU which will be used in the datapath of the microprocessor. This ALU must support two s complement arithmetic and the instructions in the baseline

More information

BUILDING BLOCKS OF A BASIC MICROPROCESSOR. Part 1 PowerPoint Format of Lecture 3 of Book

BUILDING BLOCKS OF A BASIC MICROPROCESSOR. Part 1 PowerPoint Format of Lecture 3 of Book BUILDING BLOCKS OF A BASIC MICROPROCESSOR Part PowerPoint Format of Lecture 3 of Book Decoder Tri-state device Full adder, full subtractor Arithmetic Logic Unit (ALU) Memories Example showing how to write

More information

Topic Notes: MIPS Instruction Set Architecture

Topic Notes: MIPS Instruction Set Architecture Computer Science 220 Assembly Language & Comp. Architecture Siena College Fall 2011 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture.

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Computer Architecture, IFE CS and T&CS, 4 th sem. Single-Cycle Architecture

Computer Architecture, IFE CS and T&CS, 4 th sem. Single-Cycle Architecture Single-Cycle Architecture Data flow Data flow is synchronized with clock (edge) in sequential systems Architecture Elements - assumptions Program (Instruction) memory: All instructions & buses are 32-bit

More information

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University The Processor (1) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Chapter 4. The Processor. Computer Architecture and IC Design Lab Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS

More information

Data paths for MIPS instructions

Data paths for MIPS instructions You are familiar with how MIPS programs step from one instruction to the next, and how branches can occur conditionally or unconditionally. We next examine the machine level representation of how MIPS

More information

Week 7: Assignment Solutions

Week 7: Assignment Solutions Week 7: Assignment Solutions 1. In 6-bit 2 s complement representation, when we subtract the decimal number +6 from +3, the result (in binary) will be: a. 111101 b. 000011 c. 100011 d. 111110 Correct answer

More information

ECE 341 Midterm Exam

ECE 341 Midterm Exam ECE 341 Midterm Exam Time allowed: 75 minutes Total Points: 75 Points Scored: Name: Problem No. 1 (8 points) For each of the following statements, indicate whether the statement is TRUE or FALSE: (a) A

More information

For Example: P: LOAD 5 R0. The command given here is used to load a data 5 to the register R0.

For Example: P: LOAD 5 R0. The command given here is used to load a data 5 to the register R0. Register Transfer Language Computers are the electronic devices which have several sets of digital hardware which are inter connected to exchange data. Digital hardware comprises of VLSI Chips which are

More information

The RiSC-16 Instruction-Set Architecture

The RiSC-16 Instruction-Set Architecture The RiSC-16 Instruction-Set Architecture ENEE 646: Digital Computer Design, Fall 2002 Prof. Bruce Jacob This paper describes a sequential implementation of the 16-bit Ridiculously Simple Computer (RiSC-16),

More information

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set COMPSCI 313 S2 2018 Computer Organization 7 MIPS Instruction Set Agenda & Reading MIPS instruction set MIPS I-format instructions MIPS R-format instructions 2 7.1 MIPS Instruction Set MIPS Instruction

More information

THE MICROPROCESSOR Von Neumann s Architecture Model

THE MICROPROCESSOR Von Neumann s Architecture Model THE ICROPROCESSOR Von Neumann s Architecture odel Input/Output unit Provides instructions and data emory unit Stores both instructions and data Arithmetic and logic unit Processes everything Control unit

More information

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic CS 61C: Great Ideas in Computer Architecture Datapath Instructors: John Wawrzynek & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/fa15 1 Components of a Computer Processor Control Enable? Read/Write

More information

CSEN 601: Computer System Architecture Summer 2014

CSEN 601: Computer System Architecture Summer 2014 CSEN 601: Computer System Architecture Summer 2014 Practice Assignment 5 Solutions Exercise 5-1: (Midterm Spring 2013) a. What are the values of the control signals (except ALUOp) for each of the following

More information

The von Neumann Architecture. IT 3123 Hardware and Software Concepts. The Instruction Cycle. Registers. LMC Executes a Store.

The von Neumann Architecture. IT 3123 Hardware and Software Concepts. The Instruction Cycle. Registers. LMC Executes a Store. IT 3123 Hardware and Software Concepts February 11 and Memory II Copyright 2005 by Bob Brown The von Neumann Architecture 00 01 02 03 PC IR Control Unit Command Memory ALU 96 97 98 99 Notice: This session

More information

Reconfigurable Computing Systems ( L) Fall 2012 Tiny Register Machine (TRM)

Reconfigurable Computing Systems ( L) Fall 2012 Tiny Register Machine (TRM) Reconfigurable Computing Systems (252-2210-00L) Fall 2012 Tiny Register Machine (TRM) L. Liu Department of Computer Science, ETH Zürich Fall semester, 2012 1 Introduction Jumping up a few levels of abstraction.

More information

RiSC-16 Sequential Implementation

RiSC-16 Sequential Implementation RiSC-16 Sequential Implementation ENEE 446: Digital Computer Design, Fall 2000 Prof. Bruce Jacob This paper describes a sequential implementation of the 16-bit Ridiculously Simple Computer (RiSC-16), a

More information

EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution

EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution Important guidelines: Always state your assumptions and clearly explain your answers. Please upload your solution document

More information

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming Computer Science 324 Computer Architecture Mount Holyoke College Fall 2007 Topic Notes: Data Paths and Microprogramming We have spent time looking at the MIPS instruction set architecture and building

More information

CSSE232 Computer Architecture I. Datapath

CSSE232 Computer Architecture I. Datapath CSSE232 Computer Architecture I Datapath Class Status Reading Sec;ons 4.1-3 Project Project group milestone assigned Indicate who you want to work with Indicate who you don t want to work with Due next

More information

2010 Summer Answers [OS I]

2010 Summer Answers [OS I] CS2503 A-Z Accumulator o Register where CPU stores intermediate arithmetic results. o Speeds up process by not having to store these results in main memory. Addition o Carried out by the ALU. o ADD AX,

More information

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Introduction to the MIPS Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Introduction to the MIPS The Microprocessor without Interlocked Pipeline Stages

More information

CS3350B Computer Architecture Winter 2015

CS3350B Computer Architecture Winter 2015 CS3350B Computer Architecture Winter 2015 Lecture 5.5: Single-Cycle CPU Datapath Design Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design, Patterson

More information

Basic Processor Design

Basic Processor Design Basic Processor Design Design Instruction Set Design Datapath Design Control Unit This lecture deals with Instruction Set Design. 1001 Instruction Set Terminology Mnemonic (Instruction Name) SUBI Syntax

More information

Mapping Control to Hardware

Mapping Control to Hardware C A P P E N D I X A custom format such as this is slave to the architecture of the hardware and the instruction set it serves. The format must strike a proper compromise between ROM size, ROM-output decoding,

More information

Computer Architecture

Computer Architecture CS3350B Computer Architecture Winter 2015 Lecture 4.2: MIPS ISA -- Instruction Representation Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design,

More information

CHAPTER SIX BASIC COMPUTER ORGANIZATION AND DESIGN

CHAPTER SIX BASIC COMPUTER ORGANIZATION AND DESIGN CHAPTER SIX BASIC COMPUTER ORGANIZATION AND DESIGN 6.1. Instruction Codes The organization of a digital computer defined by: 1. The set of registers it contains and their function. 2. The set of instructions

More information

PSIM: Processor SIMulator (version 4.2)

PSIM: Processor SIMulator (version 4.2) PSIM: Processor SIMulator (version 4.2) by Charles E. Stroud, Professor Dept. of Electrical & Computer Engineering Auburn University July 23, 2003 ABSTRACT A simulator for a basic stored program computer

More information

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization CS/COE0447: Computer Organization and Assembly Language Chapter 3 Sangyeun Cho Dept. of Computer Science Five classic components I am like a control tower I am like a pack of file folders I am like a conveyor

More information

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization Five classic components CS/COE0447: Computer Organization and Assembly Language I am like a control tower I am like a pack of file folders Chapter 3 I am like a conveyor belt + service stations I exchange

More information

Tailoring the 32-Bit ALU to MIPS

Tailoring the 32-Bit ALU to MIPS Tailoring the 32-Bit ALU to MIPS MIPS ALU extensions Overflow detection: Carry into MSB XOR Carry out of MSB Branch instructions Shift instructions Slt instruction Immediate instructions ALU performance

More information

1 5. Addressing Modes COMP2611 Fall 2015 Instruction: Language of the Computer

1 5. Addressing Modes COMP2611 Fall 2015 Instruction: Language of the Computer 1 5. Addressing Modes MIPS Addressing Modes 2 Addressing takes care of where to find data instruction We have seen, so far three addressing modes of MIPS (to find data): 1. Immediate addressing: provides

More information

LECTURE 4. Logic Design

LECTURE 4. Logic Design LECTURE 4 Logic Design LOGIC DESIGN The language of the machine is binary that is, sequences of 1 s and 0 s. But why? At the hardware level, computers are streams of signals. These signals only have two

More information

Inf2C - Computer Systems Lecture Processor Design Single Cycle

Inf2C - Computer Systems Lecture Processor Design Single Cycle Inf2C - Computer Systems Lecture 10-11 Processor Design Single Cycle Boris Grot School of Informatics University of Edinburgh Previous lectures Combinational circuits Combinations of gates (INV, AND, OR,

More information

Parallel logic circuits

Parallel logic circuits Computer Mathematics Week 9 Parallel logic circuits College of Information cience and Engineering Ritsumeikan University last week the mathematics of logic circuits the foundation of all digital design

More information

Chapter 4 The Processor

Chapter 4 The Processor Chapter 4 The Processor 4.1 Introduction 4.2 Logic Design Conventions 4.3 The Single-Cycle Design 4.4 The Pipelined Design (c) Kevin R. Burger :: Computer Science & Engineering :: Arizona State University
