Digital System Design


July, 22 9:55 vra235_ch Sheet number 1 Page number 605 black

Chapter 10: Digital System Design

In the previous chapters we showed how to design many types of simple circuits, such as multiplexers, decoders, flip-flops, registers, and counters, which can be used as building blocks. In this chapter we provide examples of more complex circuits that can be constructed using the building blocks as subcircuits. Such larger circuits form a digital system. For practical reasons our examples of digital systems will not be large, but the design techniques presented are applicable to systems of any size. After presenting several examples, we will discuss some practical issues, such as how to ensure reliable clocking of flip-flops in individual and multiple chips, how to deal with input signals that are not synchronized to the clock signal, and the like.

A digital system consists of two main parts, called the datapath circuit and the control circuit. The datapath circuit is used to store and manipulate data and to transfer data from one part of the system to another. Datapath circuits comprise building blocks such as registers, shift registers, counters, multiplexers, decoders, adders, and so on. The control circuit controls the operation of the datapath circuit. In Chapter 8 we referred to the control circuits as finite state machines.

10.1 Building Block Circuits

We will give several examples of digital systems and show how to design their datapath and control circuits. The examples use a number of the building block circuits that were presented in earlier chapters. Some building blocks used in this chapter are described below.

10.1.1 Flip-Flops and Registers with Enable Inputs

In many applications that use D flip-flops, it is useful to be able to prevent the data stored in the flip-flop from changing when an active clock edge occurs. We showed in Figure 7.6 how this capability can be provided by adding a multiplexer to the flip-flop. Figure 10.1 depicts the circuit. When E = 0, the flip-flop output cannot change, because the multiplexer connects Q to D. But if E = 1, then the multiplexer connects the R input to D.

Instead of using the multiplexer shown in the figure, another way to implement the enable feature is to use a two-input AND gate that drives the flip-flop's clock input. One input to the AND gate is the clock signal, and the other input is E. Then setting E = 0 prevents the clock signal from reaching the flip-flop's clock input. This method seems simpler than the multiplexer approach, but we will show in section 10.3 that it can cause problems in practical operation.

Figure 10.1 A flip-flop with an enable input.
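The multiplexer-based enable described above can be sketched structurally, so that the 2-to-1 multiplexer in front of the D input is explicit. This is only an illustration under stated assumptions: the module names ff_enable_mux and tb_ff_enable_mux are hypothetical, the reset input is omitted for brevity, and the small testbench merely checks the load and hold behavior.

```verilog
// Hypothetical structural sketch of a flip-flop with an enable input:
// a 2-to-1 multiplexer selects the new data R (E = 1) or the current
// output Q (E = 0). The clock itself is never gated.
module ff_enable_mux (R, Clock, E, Q);
  input R, Clock, E;
  output Q;
  reg Q;
  wire D;

  assign D = E ? R : Q;      // the multiplexer in front of the D input

  always @(posedge Clock)
    Q <= D;
endmodule

// Small self-checking testbench (illustrative only).
module tb_ff_enable_mux;
  reg R, Clock, E;
  wire Q;

  ff_enable_mux dut (R, Clock, E, Q);

  always #5 Clock = ~Clock;

  initial begin
    Clock = 0; R = 1; E = 1;
    @(posedge Clock); #1;    // E = 1: Q takes the value of R
    if (Q !== 1'b1) $display("FAIL: load with E = 1");
    R = 0; E = 0;
    @(posedge Clock); #1;    // E = 0: Q keeps its old value
    if (Q !== 1'b1) $display("FAIL: hold with E = 0");
    E = 1;
    @(posedge Clock); #1;    // E = 1 again: Q follows R
    if (Q !== 1'b0) $display("FAIL: reload with E = 1");
    $display("checks finished");
    $finish;
  end
endmodule
```

Because the enable acts on the data path rather than on the clock, every flip-flop still sees the same unmodified clock signal, which is the property the text relies on when it prefers this approach.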

We will prefer the multiplexer-based approach over gating the clock with an AND gate in this chapter. Verilog code for a D flip-flop with an asynchronous reset input and an enable input is given in Figure 10.2. We can extend the enable capability to registers with n bits by using n 2-to-1 multiplexers controlled by E. The multiplexer for each flip-flop, i, selects either the external data bit, R_i, or the flip-flop's output, Q_i. Verilog code for an n-bit register with an asynchronous reset input and an enable input is given in Figure 10.3.

    module rege (R, Clock, Resetn, E, Q);
      input R, Clock, Resetn, E;
      output Q;
      reg Q;

      always @(posedge Clock or negedge Resetn)
        if (Resetn == 0)
          Q <= 0;
        else if (E)
          Q <= R;
    endmodule

Figure 10.2 Code for a D flip-flop with enable.

    module regne (R, Clock, Resetn, E, Q);
      parameter n = 8;
      input [n-1:0] R;
      input Clock, Resetn, E;
      output [n-1:0] Q;
      reg [n-1:0] Q;

      always @(posedge Clock or negedge Resetn)
        if (Resetn == 0)
          Q <= 0;
        else if (E)
          Q <= R;
    endmodule

Figure 10.3 An n-bit register with an enable input.

10.1.2 Shift Registers with Enable Inputs

It is useful to be able to inhibit the shifting operation in a shift register by using an enable input, E. We showed in Figure 7.9 that shift registers can be constructed with a parallel-load capability, which is implemented using a multiplexer. Figure 10.4 shows how the enable feature can be added by using an additional multiplexer. If the parallel-load control input, L, is 1, the flip-flops are loaded in parallel. But if L = 0, the additional multiplexer selects new data to be loaded into the flip-flops only if the enable E is 1. Verilog code that represents the circuit in Figure 10.4 is given in Figure 10.5. When L = 1, the register is loaded in parallel from the R input. When L = 0 and E = 1, the data in the shift register is shifted in a right-to-left direction.

Figure 10.4 A shift register with parallel-load and enable control inputs.

    module shiftlne (R, L, E, w, Clock, Q);
      parameter n = 4;
      input [n-1:0] R;
      input L, E, w, Clock;
      output [n-1:0] Q;
      reg [n-1:0] Q;
      integer k;

      always @(posedge Clock)
      begin
        if (L)
          Q <= R;
        else if (E)
        begin
          Q[0] <= w;
          for (k = 1; k < n; k = k+1)
            Q[k] <= Q[k-1];
        end
      end
    endmodule

Figure 10.5 A right-to-left shift register with an enable input.

10.1.3 Static Random Access Memory (SRAM)

We have introduced several types of circuits that can be used to store data. Assume that we need to store a large number, m, of data items, each of which consists of n bits. One possibility is to use an n-bit register for each data item. We would need to design circuitry to control access to each register, both for loading (writing) data into it and for reading data out. When m is large, it is awkward to use individual registers to store the data. A better approach is to make use of a static random access memory (SRAM) block. An SRAM block is a two-dimensional array of SRAM cells, where each cell can store one bit of information.

If we need to store m items with n bits each, we can use an array of m × n SRAM cells. The dimensions of the SRAM array are called its aspect ratio.

An SRAM cell is similar to the storage cell that was shown in Figure 7.3. Since an SRAM block may contain a large number of SRAM cells, each cell must take as little space on an integrated circuit chip as possible. For this reason, the storage cell should use as few transistors as possible. One popular storage cell used in practice is depicted in Figure 10.6. It operates as follows. To store data into the cell, the Sel input is set to 1, and the data value to be stored is placed on the Data input. The SRAM cell may include a separate input for the complement of the data, indicated by the transistor shown in blue in the figure. For simplicity we assume that this transistor is not included in the cell. After waiting long enough for the data to propagate through the feedback path formed by the two NOT gates, Sel is changed to 0. The stored data then remains in the feedback loop indefinitely.

A possible problem is that when Sel = 1, the value of Data may not be the same as the value being driven by the small NOT gate in the feedback path. Hence the transistor controlled by Sel may attempt to drive the stored data to one logic value while the output of the small NOT gate has the opposite logic value. To resolve this problem, the NOT gate in the feedback path is built using small (weak) transistors, so that its output can be overridden with new data.

To read data stored in the cell, we simply set Sel to 1. In this case the Data node would not be driven to any value by external circuitry, so that the SRAM cell can place the stored data on this node. The Data signal is passed through a buffer, not shown in the figure, and provided as an output of the SRAM block.

Figure 10.6 An SRAM cell.

An SRAM block contains an array of SRAM cells. Figure 10.7 shows an array with two rows of two cells each. In each column of the array, the Data nodes of the cells are connected together. Each row, i, has a separate select input, Sel_i, that is used to read or write the contents of the cells in that row. Larger arrays are formed by connecting more cells to Sel_i in each row and by adding more rows.

Figure 10.7 A 2 × 2 array of SRAM cells.

The SRAM block must also contain circuitry that controls access to each row in the array. Figure 10.8 depicts a 2^m × n array of the type in Figure 10.7, which has a decoder that drives the Sel inputs in each row of the array. The inputs to the decoder are called Address inputs. This term derives from the notion that the location of a row in the array can be thought of as the address of the row. The decoder has m Address inputs and produces 2^m select outputs. If the Write control input is 1, then the data bits on the inputs d_{n-1}, ..., d_0 are stored in the cells of the row selected by the Address inputs. If the Read control input is 1, then the data stored in the row selected by the Address inputs appears on the outputs q_{n-1}, ..., q_0. In many practical applications the data inputs and data outputs are connected together. Thus the Write and Read inputs must never have the value 1 at the same time.

The design of memory blocks has been the subject of intensive research and development. We have described only the basic operation of one type of memory block. The reader can refer to books on computer organization for more information [1, 2].

10.1.4 SRAM Blocks in PLDs

Some PLDs contain SRAM blocks that can be used as part of circuits implemented in the chips. One popular chip has a number of SRAM blocks, each of which contains 2048 SRAM cells. The SRAM blocks can be configured to provide different aspect ratios, depending on the needs of the design being implemented. Aspect ratios from 256 × 8 to 2048 × 1 can be realized using a single SRAM block, and multiple blocks can be combined to form larger memory arrays. To include SRAM blocks in a circuit, designers use prebuilt modules that are provided in a library as part of the CAD tools.
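The read and write behavior of an SRAM block with connected data inputs and outputs can be approximated by a short behavioral model. The sketch below is not one of the prebuilt library modules mentioned above; the module name sram_sketch, its port list, and the level-sensitive write are assumptions made only for illustration, and the row decoder is implied by indexing the array with Address.

```verilog
// Hypothetical behavioral model of a 2^m x n SRAM block with a shared
// (bidirectional) data bus. Vendor library modules differ.
module sram_sketch (Address, Write, Read, Data);
  parameter m = 2, n = 8;
  input [m-1:0] Address;
  input Write, Read;
  inout [n-1:0] Data;

  reg [n-1:0] cells [0:(1<<m)-1];   // 2^m rows of n cells each

  // Write = 1: the row selected by Address stores the value on Data.
  always @(Write or Address or Data)
    if (Write && !Read)
      cells[Address] = Data;

  // Read = 1: the selected row drives Data; otherwise the bus floats.
  assign Data = (Read && !Write) ? cells[Address] : {n{1'bz}};
endmodule

// Small self-checking testbench (illustrative only).
module tb_sram_sketch;
  reg [1:0] Address;
  reg Write, Read;
  reg [7:0] drive;
  wire [7:0] Data;

  // The testbench drives the bus only while writing.
  assign Data = Write ? drive : 8'bz;

  sram_sketch dut (Address, Write, Read, Data);

  initial begin
    Write = 0; Read = 0; Address = 2'd0; drive = 8'h00;
    #1 Address = 2'd1; drive = 8'h3B;
    #1 Write = 1;
    #1 Write = 0;
    #1 Address = 2'd2; drive = 8'hC4;
    #1 Write = 1;
    #1 Write = 0;
    #1 Address = 2'd1; Read = 1;
    #1 if (Data !== 8'h3B) $display("FAIL: row 1");
    Address = 2'd2;
    #1 if (Data !== 8'hC4) $display("FAIL: row 2");
    Read = 0;
    $display("checks finished");
    $finish;
  end
endmodule
```

The model ignores Write and Read when both are 1, which mirrors the requirement in the text that the two control inputs must never be asserted at the same time on a shared bus.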

Figure 10.8 A 2^m × n SRAM block.

10.2 Design Examples

We introduced algorithmic state machine (ASM) charts in section 8.10 and showed how they can be used to describe finite state machines. ASM charts can also be used to describe digital systems that include both datapath and control circuits. We will illustrate how the ASM charts can be used as an aid in designing digital systems by giving several examples.

10.2.1 A Bit-Counting Circuit

Suppose that we wish to count the number of bits in a register, A, that have the value 1. Figure 10.9 shows pseudo-code for a step-by-step procedure, or algorithm, that can be used to perform the required task. It assumes that A is stored in a register that can shift its contents in the left-to-right direction. The answer produced by the algorithm is stored in the variable named B. The algorithm terminates when A does not contain any more 1s, that is when A = 0. In each iteration of the while loop, if the least-significant bit (LSB) of A is 1, then B is incremented by 1; otherwise, B is not changed. A is shifted one bit to the right at the end of each loop iteration.

    B = 0;
    while A ≠ 0 do
        if a0 = 1 then
            B = B + 1;
        end if;
        Right-shift A;
    end while;

Figure 10.9 Pseudo-code for the bit counter.

Figure 10.10 gives an ASM chart that represents the algorithm in Figure 10.9. The state box for the starting state, S1, specifies that B is initialized to 0. We assume that an input signal, s, exists, which is used to indicate when the data to be processed has been loaded into A, so that the machine can start. The decision box labeled s stipulates that the machine remains in state S1 as long as s = 0. The conditional output box with Load A written inside it indicates that A is loaded from external data inputs if s = 0 in state S1. When s becomes 1, the machine changes to state S2. The decision box below the state box for S2 checks whether A = 0. If so, the bit-counting operation is complete; hence the machine should change to state S3. If not, the FSM remains in state S2. The decision box at the bottom of the chart checks the value of a0. If a0 = 1, B is incremented, which is indicated in the chart as B ← B + 1. If a0 = 0, then B is not changed. In state S3, B contains the result, which is the number of bits in A that were 1. An output signal, Done, is set to 1 to indicate that the algorithm is finished; the FSM stays in S3 until s goes back to 0.

Figure 10.10 ASM chart for the pseudo-code in Figure 10.9.

10.2.2 ASM Chart Implied Timing Information

In section 8.10 we said that ASM charts are similar to traditional flowcharts, except that the ASM chart implies timing information. We can use the bit-counting example to illustrate this concept. Consider the ASM block for state S2, which is shaded in blue in Figure 10.10. In a traditional flowchart, when state S2 is entered, the value of A would first be shifted to the right. Then we would examine the value of A and if A's LSB is 1, we would immediately add 1 to B. But, since the ASM chart represents a sequential circuit, changes in A and B, which represent the outputs of flip-flops, take place after the active clock edge. The same clock signal that controls changes in the state of the machine also controls changes in A and B. Hence in state S2, the decision box that tests whether A = 0, as well as the box that checks the value of a0, check the bits in A before they are shifted. If A = 0, then the FSM will change to state S3 on the next clock edge (this clock edge also shifts A, which has no effect because A is already 0 in this case). On the other hand, if A ≠ 0, then the FSM does not change to S3, but remains in S2. At the same time, A is still shifted, and B is incremented if a0 has the value 1. These timing issues are illustrated in Figure 10.14, which represents a simulation result for a circuit that implements the ASM chart. We show how the circuit is designed in the following discussion.

Datapath Circuit

By examining the ASM chart for the bit-counting circuit, we can infer the type of circuit elements needed to implement its datapath. We need a shift register that shifts left-to-right to implement A. It must have the parallel-load capability because of the conditional output box in state S1 that loads data into the register. An enable input is also required because shifting should occur only in state S2. A counter is needed for B, and it needs a parallel-load capability to initialize the count to 0 in state S1. It is not wise to rely on the counter's reset input to clear B to 0 in state S1. In practice, the reset signal is used in a digital system for only two purposes: to initialize the circuit when power is first applied, or to recover from an error. The machine changes from state S3 to S1 as a result of s = 0; hence we should not assume that the reset signal is used to clear the counter.

The datapath circuit is depicted in Figure 10.11. The serial input to the shift register, w, is connected to 0, because it is not needed. The load and enable inputs on the shift register are driven by the signals LA and EA. The parallel input to the shift register is named Data, and its parallel output is A. An n-input NOR gate is used to test whether A = 0. The output of this gate, z, is 1 when A = 0. Note that the figure indicates the n-input NOR gate by showing a single input connection to the gate, with the label n attached to it. The counter has log2(n) bits, with parallel inputs connected to 0 and parallel outputs named B. It also has a parallel load input LB and enable input EB control signals.

Figure 10.11 Datapath for the ASM chart in Figure 10.10.

Control Circuit

For convenience we can draw a second ASM chart that represents only the FSM needed for the control circuit, as shown in Figure 10.12. The FSM has the inputs s, a0, and z and generates the outputs EA, LB, EB, and Done. In state S1, LB is asserted, so that 0 is loaded in parallel into the counter. Note that for the control signals, like LB, instead of writing LB = 1, we simply write LB to indicate that the signal is asserted. We assume that external circuitry drives LA to 1 when valid data is present at the parallel inputs of the shift register, so that the shift register contents are initialized before s changes to 1. In state S2, EA is asserted to cause a shift operation, and the count enable for B is asserted only if a0 = 1.
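The timing implied by state S2 can be seen in a few lines of Verilog. In the sketch below, which is only an illustration and not part of the bit-counting design itself, both registers are updated by nonblocking assignments on the same clock edge, so the test of the LSB uses the value of A from before the shift. The module names s2_timing_sketch and tb_s2_timing_sketch and the Load input are hypothetical.

```verilog
// Illustrative sketch: A and B change on the same clock edge, so the
// test of A[0] sees the value of A from before the shift.
module s2_timing_sketch (Clock, Load, Data, A, B);
  input Clock, Load;
  input [7:0] Data;
  output [7:0] A;
  output [3:0] B;
  reg [7:0] A;
  reg [3:0] B;

  always @(posedge Clock)
    if (Load) begin
      A <= Data;
      B <= 0;
    end
    else begin
      A <= A >> 1;            // shift right, as in state S2
      if (A[0])               // uses the pre-shift LSB
        B <= B + 1;
    end
endmodule

// Small self-checking testbench (illustrative only).
module tb_s2_timing_sketch;
  reg Clock, Load;
  reg [7:0] Data;
  wire [7:0] A;
  wire [3:0] B;

  s2_timing_sketch dut (Clock, Load, Data, A, B);

  always #5 Clock = ~Clock;

  initial begin
    Clock = 0; Load = 1; Data = 8'b00000101;
    @(posedge Clock); #1;     // A = 00000101, B = 0
    Load = 0;
    @(posedge Clock); #1;     // old a0 was 1: B = 1, A = 00000010
    if (A !== 8'b00000010 || B !== 4'd1) $display("FAIL: first shift");
    @(posedge Clock); #1;     // old a0 was 0: B stays 1, A = 00000001
    if (A !== 8'b00000001 || B !== 4'd1) $display("FAIL: second shift");
    @(posedge Clock); #1;     // old a0 was 1: B = 2, A = 0
    if (A !== 8'b00000000 || B !== 4'd2) $display("FAIL: third shift");
    $display("checks finished");
    $finish;
  end
endmodule
```

Had the shift been applied before the test, as a flowchart reading would suggest, the first bit of the data would never be counted.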

Figure 10.12 ASM chart for the bit counter datapath circuit.

Verilog Code

The bit-counting circuit can be described in Verilog code as shown in Figure 10.13. We have chosen to define A as an eight-bit vector and B as a 4-bit vector signal. The ASM chart in Figure 10.12 can be directly translated into code that describes the required control circuit. The signal y is used to represent the present state of the FSM, and Y represents the next state. The FSM is described with three always blocks: the block labeled State_table specifies the state transitions, the block labeled State_flipflops represents the state flip-flops, and the block labeled FSM_outputs specifies the generated outputs in each state. A default value is specified at the beginning of the FSM_outputs block for each output signal, and the individual output values are specified in the case statement. The fourth always block defines the up-counter that implements B. The shift register for A is instantiated at the end of the code, and the z signal is defined using the reduction NOR operator.

    module bitcount (Clock, Resetn, LA, s, Data, B, Done);
      input Clock, Resetn, LA, s;
      input [7:0] Data;
      output [3:0] B;
      output Done;
      wire [7:0] A;
      wire z;
      reg [1:0] Y, y;
      reg [3:0] B;
      reg Done, EA, EB, LB;

      // control circuit
      parameter S1 = 2'b00, S2 = 2'b01, S3 = 2'b10;

      always @(s or y or z)
      begin: State_table
        case (y)
          S1: if (!s) Y = S1;
              else Y = S2;
          S2: if (z == 0) Y = S2;
              else Y = S3;
          S3: if (s) Y = S3;
              else Y = S1;
          default: Y = 2'bxx;
        endcase
      end

      always @(posedge Clock or negedge Resetn)
      begin: State_flipflops
        if (Resetn == 0)
          y <= S1;
        else
          y <= Y;
      end

      ... continued in Part b.

Figure 10.13 Verilog code for the bit-counting circuit (Part a).

We implemented the code in Figure 10.13 in a chip and performed a timing simulation. Figure 10.14 gives the results of the simulation for A = 00111011. After the circuit is reset, the input signal LA is set to 1, and the desired data, (3B)_16, is placed on the Data input. When s changes to 1, the next active clock edge causes the FSM to change to state S2. In this state, each active clock edge increments B if a0 is 1, and shifts A. When A = 0, the next clock edge causes the FSM to change to state S3, where Done is set to 1 and B has the correct result, B = 5. To check more thoroughly that the circuit is designed correctly, we should try different values of input data.
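One way to obtain expected results for other input values is a small reference model of the bit-counting pseudo-code, written as a Verilog function. The sketch below is an assumption-laden illustration rather than part of the circuit: the module name bitcount_reference and the function name count_ones are hypothetical, and the function is intended for simulation only.

```verilog
// Hypothetical reference model of the bit-counting pseudo-code, usable
// for comparison with a simulation of the circuit.
module bitcount_reference;

  function [3:0] count_ones;
    input [7:0] value;
    reg [7:0] A;
    reg [3:0] B;
    begin
      A = value;
      B = 0;
      while (A != 0) begin
        if (A[0]) B = B + 1;  // count the LSB when it is 1
        A = A >> 1;           // right-shift A
      end
      count_ones = B;
    end
  endfunction

  initial begin
    if (count_ones(8'h3B) !== 4'd5) $display("FAIL: 3B");
    if (count_ones(8'h00) !== 4'd0) $display("FAIL: 00");
    if (count_ones(8'hFF) !== 4'd8) $display("FAIL: FF");
    if (count_ones(8'h80) !== 4'd1) $display("FAIL: 80");
    $display("checks finished");
  end
endmodule
```

The values 00 and FF exercise the two extremes (no iterations of the loop, and a count that needs all four bits of B), while 80 forces the maximum number of shifts for a single 1.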

      always @(y or A[0])
      begin: FSM_outputs
        // defaults
        EA = 0; LB = 0; EB = 0; Done = 0;
        case (y)
          S1: LB = 1;
          S2: begin
                EA = 1;
                if (A[0]) EB = 1;
                else EB = 0;
              end
          S3: Done = 1;
        endcase
      end

      // datapath circuit

      // counter B
      always @(negedge Resetn or posedge Clock)
        if (!Resetn)
          B <= 0;
        else if (LB)
          B <= 0;
        else if (EB)
          B <= B + 1;

      shiftrne ShiftA (Data, LA, EA, 0, Clock, A);
      assign z = ~| A;

    endmodule

Figure 10.13 Verilog code for the bit-counting circuit (Part b).

10.2.3 Shift-and-Add Multiplier

We presented a circuit that multiplies two unsigned n-bit binary numbers in Figure 5.36. The circuit uses a two-dimensional array of identical subcircuits, each of which contains a full-adder and an AND gate. For large values of n, this approach may not be appropriate because of the large number of gates needed. Another approach is to use a shift register in combination with an adder to implement the traditional method of multiplication that is done by hand. Figure 10.15a illustrates the manual process of multiplying two binary numbers. The product is formed by a series of addition operations. For each bit i in the multiplier that is 1, we add to the product the value of the multiplicand shifted to the left i times. This algorithm can be described in pseudo-code as shown in Figure 10.15b, where A is the multiplicand, B is the multiplier, and P is the product.

Figure 10.14 Simulation results for the bit-counting circuit.

(a) Manual method

                    Decimal      Binary
    Multiplicand         13        1101
    Multiplier           11        1011
    Product             143    10001111

(b) Pseudo-code

    P = 0;
    for i = 0 to n-1 do
        if b_i = 1 then
            P = P + A;
        end if;
        Left-shift A;
    end for;

Figure 10.15 An algorithm for multiplication.
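The shift-and-add pseudo-code can be checked with a short reference model written as a Verilog function. This is a sketch under stated assumptions: the names multiply_reference and shift_add are hypothetical, n is fixed at 8, and the multiplicand is widened to 2n bits inside the function so that left shifts do not discard bits.

```verilog
// Hypothetical reference model of the shift-and-add pseudo-code for
// n = 8. Intended for simulation only.
module multiply_reference;
  parameter n = 8;

  function [2*n-1:0] shift_add;
    input [n-1:0] multiplicand, multiplier;
    reg [2*n-1:0] A, P;
    integer i;
    begin
      A = multiplicand;        // zero-extended to 2n bits
      P = 0;
      for (i = 0; i < n; i = i + 1) begin
        if (multiplier[i]) P = P + A;   // add when bit i is 1
        A = A << 1;                     // left-shift A
      end
      shift_add = P;
    end
  endfunction

  initial begin
    if (shift_add(8'd13, 8'd11) !== 16'd143) $display("FAIL: 13 x 11");
    if (shift_add(8'h64, 8'h19) !== 16'h09C4) $display("FAIL: 64 x 19");
    if (shift_add(8'hFF, 8'hFF) !== 16'hFE01) $display("FAIL: FF x FF");
    $display("checks finished");
  end
endmodule
```

The first check repeats the manual example, 13 × 11 = 143, and the last one uses the largest 8-bit operands to confirm that a 2n-bit product register is wide enough.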

An ASM chart that represents the algorithm in Figure 10.15b is given in Figure 10.16. We assume that an input s is used to control when the machine begins the multiplication process. As long as s is 0, the machine stays in state S1 and the data for A and B can be loaded from external inputs. In state S2 we test the value of the LSB of B, and if it is 1, we add A to P. Otherwise, P is not changed. The machine moves to state S3 when B contains 0, because P has the final product in this case. For each clock cycle in which the machine is in state S2, we shift the value of A to the left, as specified in the pseudo-code in Figure 10.15b. We shift the contents of B to the right so that in each clock cycle b0 can be used to decide whether or not A should be added to P.

Figure 10.16 ASM chart for the multiplier.

Datapath Circuit

We can now define the datapath circuit. To implement A we need a right-to-left shift register that has 2n bits. A 2n-bit register is needed for P, and it must have an enable input because the assignment P ← P + A in state S2 is inside a conditional output box. A 2n-bit adder is needed to produce P + A. Note that P is loaded with 0 in state S1, and P is loaded from the output of the adder in state S2. We cannot assume that the reset input is used to clear P, because the machine changes from state S3 back to S1 based on the s input, not the reset input. Hence a 2-to-1 multiplexer is needed for each input to P, to select either 0 or the appropriate sum bit from the adder. An n-bit left-to-right shift register is needed for B, and an n-input NOR gate can be used to test whether B = 0.

Figure 10.17 shows the datapath circuit and labels the control signals for the shift registers. The input data for the shift register that holds A is named DataA. Since the shift register has 2n bits, the most-significant n data inputs are connected to 0. A single multiplexer symbol is shown connected to the register that holds P. This symbol represents 2n 2-to-1 multiplexers that are each controlled by the Psel signal.

Figure 10.17 Datapath circuit for the multiplier.

Control Circuit

An ASM chart that represents only the control signals needed for the multiplier is given in Figure 10.18. In state S1, Psel is set to 0 and EP is asserted, so that register P is cleared. When s = 0, parallel data can be loaded into shift registers A and B by an external circuit that controls their parallel load inputs LA and LB. When s = 1, the machine changes to state S2, where Psel is set to 1 and shifting of A and B is enabled. If b0 = 1, the enable for P is asserted. The machine changes to state S3 when z = 1, and then remains in S3 and sets Done to the value 1 as long as s = 1.

Figure 10.18 ASM chart for the multiplier control circuit.
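The role of Psel and EP can be illustrated in isolation. The sketch below models only the product register and its multiplexer, not the complete multiplier; the module name preg_sketch and the testbench are hypothetical, and the adder is placed in the testbench simply to close the loop around P.

```verilog
// Hypothetical sketch of the product register: a 2-to-1 multiplexer
// controlled by Psel feeds a register with enable EP, so the FSM can
// clear P without using the reset input.
module preg_sketch (Clock, Psel, EP, Sum, P);
  parameter n = 8;
  input Clock, Psel, EP;
  input [n+n-1:0] Sum;
  output [n+n-1:0] P;
  reg [n+n-1:0] P;
  wire [n+n-1:0] DataP;

  assign DataP = Psel ? Sum : {(n+n){1'b0}};   // select 0 or the sum

  always @(posedge Clock)
    if (EP)
      P <= DataP;
endmodule

// Small self-checking testbench (illustrative only).
module tb_preg_sketch;
  reg Clock, Psel, EP;
  reg [15:0] A;
  wire [15:0] P, Sum;

  assign Sum = P + A;         // stands in for the datapath adder
  preg_sketch dut (Clock, Psel, EP, Sum, P);

  always #5 Clock = ~Clock;

  initial begin
    Clock = 0; A = 16'd13;
    Psel = 0; EP = 1;         // as in state S1: clear P
    @(posedge Clock); #1;
    if (P !== 16'd0) $display("FAIL: clear");
    Psel = 1; EP = 1;         // as in state S2 with b0 = 1: add A to P
    @(posedge Clock); #1;
    if (P !== 16'd13) $display("FAIL: add");
    EP = 0;                   // as in state S2 with b0 = 0: P unchanged
    @(posedge Clock); #1;
    if (P !== 16'd13) $display("FAIL: hold");
    $display("checks finished");
    $finish;
  end
endmodule
```

Clearing P through the multiplexer means a new multiplication can start whenever the FSM returns to S1, without asserting the reset signal.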

July, 22 9:55 vra235_ch Sheet umber 9 Page umber 623 black.2 Desig Examples 623 Verilog Code Verilog code for the multiplier is give i Figure.9. The umber of bits i A ad B is set by the parameter. For registers that are 2 bits wide, the umber of bits is set to +. By chagig the value of the parameters, the code ca be used for umbers of ay size. The always blocks labeled State_table ad State_ flipflops defie the state trasitios ad state flip-flops, respectively. The cotrol circuit outputs are specified i the always block labeled FSM_outputs. The parallel data iput o the shift register A is 2 bits wide, but DataA is oly bits wide. Hece the cocateate operatio {{{ b}}, DataA} is used to preped zeros oto DataA for loadig ito the shift register. The multiplexer eeded for register P is defied usig a for loop that defies 2 2-to- multiplexers. Figure.2 gives a simulatio result for the circuit geerated from the code. After the circuit is reset, LA ad LB are set to, ad the umbers to be multiplied are placed o the DataA ad DataB iputs. After s is set to, the FSM (y) chages to state S2, where it remais util B =. For each clock cycle i state S2, A is shifted to the left, ad B is shifted to the right. I three of the clock cycles i state S2, the cotets of A are added to P, correspodig to the three bits i B that have the value. Whe B =, the FSM chages to state S3 ad P cotais the correct product, which is (64) 6 (9) 6 = (9C4) 6. The decimal equivalet of this result is 25 = 25. The umber of clock cycles that the circuit requires to geerate the fial product is determied by the left-most digit i B that is. It is possible to reduce the umber of clock cycles eeded by usig more complex shift registers for A ad B. If the two right-most bits i B are both, the both A ad B could be shifted by two bit positios i oe clock cycle. Similarly, if the three lowest digits i B are, the a three bit-positio shift ca be doe, ad so o. 
A shift register that can shift by multiple bit positions at once can be built using a barrel shifter. We leave it as an exercise for the reader to modify the multiplier to make use of a barrel shifter.

10.2.4 Divider

The preceding example implements the traditional method of performing multiplication by hand. In this example we will design a circuit that implements the traditional long-hand division. Figure 10.21a gives an example of long-hand division. The first step is to try to divide the divisor 9 into the first digit of the dividend, 1, which does not work. Next, we try to divide 9 into 14, and determine that 1 is the first digit in the quotient. We perform the subtraction 14 − 9 = 5, bring down the last digit from the dividend to form 50, and then determine that the next digit in the quotient is 5. The remainder is 50 − 45 = 5, and the quotient is 15. Using binary numbers, as illustrated in Figure 10.21b, involves the same process, with the simplification that each digit of the quotient can be only 0 or 1.

Given two unsigned n-bit numbers A and B, we wish to design a circuit that produces two n-bit outputs Q and R, where Q is the quotient A/B and R is the remainder. The procedure illustrated in Figure 10.21b can be implemented by shifting the digits in A to the left, one digit at a time, into a shift register R. After each shift operation, we compare R with B. If R ≥ B, a 1 is placed in the appropriate bit position in the quotient and B is subtracted from R. Otherwise, a 0 bit is placed in the quotient. This algorithm is described

    module multiply (Clock, Resetn, LA, LB, s, DataA, DataB, P, Done);
      parameter n = 8;
      input Clock, Resetn, LA, LB, s;
      input [n-1:0] DataA, DataB;
      output [n+n-1:0] P;
      output Done;
      wire z;
      // A and B are driven by module instances, so they are declared as wires
      wire [n+n-1:0] A, Sum;
      reg [n+n-1:0] DataP;
      reg [1:0] y, Y;
      wire [n-1:0] B;
      reg Done, EA, EB, EP, Psel;
      integer k;

      // control circuit
      parameter S1 = 2'b00, S2 = 2'b01, S3 = 2'b10;

      always @(s or y or z)
      begin: State_table
        case (y)
          S1: if (s == 0) Y = S1;
              else Y = S2;
          S2: if (z == 0) Y = S2;
              else Y = S3;
          S3: if (s == 1) Y = S3;
              else Y = S1;
          default: Y = 2'bxx;
        endcase
      end

      always @(posedge Clock or negedge Resetn)
      begin: State_flipflops
        if (Resetn == 0)
          y <= S1;
        else
          y <= Y;
      end

    ...continued in Part b.

Figure 10.19 Verilog code for the multiplier circuit (Part a).

      always @(s or y or B[0])
      begin: FSM_outputs
        // defaults
        EA = 0; EB = 0; EP = 0; Done = 0; Psel = 0;
        case (y)
          S1: EP = 1;
          S2: begin
                EA = 1; EB = 1; Psel = 1;
                if (B[0]) EP = 1;
                else EP = 0;
              end
          S3: Done = 1;
        endcase
      end

      // datapath circuit
      shiftrne ShiftB (DataB, LB, EB, 0, Clock, B);
        defparam ShiftB.n = 8;
      shiftlne ShiftA ({{n{1'b0}}, DataA}, LA, EA, 0, Clock, A);
        defparam ShiftA.n = 16;
      assign z = (B == 0);
      assign Sum = A + P;

      // define the 2n 2-to-1 multiplexers
      always @(Psel or Sum)
        for (k = 0; k < n+n; k = k+1)
          DataP[k] = Psel ? Sum[k] : 0;

      regne RegP (DataP, Clock, Resetn, EP, P);
        defparam RegP.n = 16;
    endmodule

Figure 10.19 Verilog code for the multiplier circuit (Part b).

using pseudo-code in Figure 10.21c. The notation R‖A is used to represent a 2n-bit shift register formed using R as the left-most n bits and A as the right-most n bits.

The pseudo-code for the multiplier in Figure 10.15b examines one digit, b_i, in each loop iteration. In the ASM chart in Figure 10.16, we shift B to the right so that b_0 always contains the digit needed. Similarly, in the long-division pseudo-code, each loop iteration results in setting a digit q_i to either 1 or 0. A straightforward way to accomplish this is to shift 1 or 0 into the least-significant bit of Q in each loop iteration. An ASM chart that

Figure 10.20 Simulation results for the multiplier circuit.

represents the divider circuit is shown in Figure 10.22. The signal C represents a counter that is initialized to n − 1 in the starting state S1. In state S2, both R and A are shifted to the left, and then in state S3, B is subtracted from R if R ≥ B. The machine changes to state S4 when C = 0.

Datapath Circuit

We need n-bit shift registers that shift right to left for A, R, and Q. An n-bit register is needed for B, and a subtractor is needed to produce R − B. We can use an adder module in which the carry-in is set to 1 and B is complemented. The carry-out, c_out, of this module has the value 1 if the condition R ≥ B is true. Hence the carry-out can be connected to the serial input of the shift register that holds Q, so that it is shifted into Q in state S3. Since R is loaded with 0 in state S1 and from the outputs of the adder in state S3, a multiplexer is needed for the parallel data inputs on R. The datapath circuit is depicted in Figure 10.23. Note that the down-counter needed to implement C and the NOR gate that outputs a 1 when C = 0 are not shown in the figure.

Control Circuit

An ASM chart that shows only the control signals needed for the divider is given in Figure 10.24. In state S3 the value of c_out determines whether or not the sum output of the adder is loaded into R. The shift enable on Q is asserted in state S3. We do not have to specify whether 1 or 0 is loaded into Q, because c_out is connected to Q's serial input in the datapath circuit. We leave it as an exercise for the reader to write Verilog code that represents the ASM chart in Figure 10.24 and the datapath circuit in Figure 10.23.
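Before attempting that exercise, the division algorithm itself can be tried out in a simulator. The following is a minimal behavioral sketch, not the FSM and datapath that the exercise asks for; the module name divide_sketch and the task name run are illustrative assumptions. It forms R − B as R + B' + 1 and uses the adder carry-out as the quotient bit, as described above. The sketch assumes that B is not 0 and that the most-significant bit of B is 0, so that the shifted remainder always fits in n bits.

```verilog
// Behavioral sketch of shift-and-subtract division (simulation only).
// Assumes B != 0 and B[n-1] == 0; names other than A, B, Q, R, n are hypothetical.
module divide_sketch;
  parameter n = 8;

  reg [n-1:0] A, B, Q, R;
  reg [n:0]   Sum;        // bit n holds the adder carry-out c_out
  integer i;

  task run;
    input [n-1:0] DataA, DataB;
    begin
      A = DataA; B = DataB;
      R = 0; Q = 0;
      for (i = 0; i < n; i = i + 1) begin
        {R, A} = {R, A} << 1;                  // left-shift R||A
        Sum = {1'b0, R} + {1'b0, ~B} + 1'b1;   // R - B formed as R + B' + 1
        Q = {Q[n-2:0], Sum[n]};                // c_out is shifted into Q
        if (Sum[n]) R = Sum[n-1:0];            // keep the difference only when R >= B
      end
    end
  endtask

  initial begin
    run(8'd140, 8'd9);
    $display("Q = %0d, R = %0d", Q, R);
  end
endmodule
```

With the operands of the long-hand example, 140 and 9, the model should report a quotient of 15 and a remainder of 5.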

(a) An example using decimal numbers: 140 divided by 9 gives the quotient 15 and the remainder 5.

(b) Using binary numbers (quantities labeled B, Q, A, and R).

(c) Pseudo-code

    R = 0;
    for i = 0 to n − 1 do
        Left-shift R‖A;
        if R ≥ B then
            q_i = 1;
            R = R − B;
        else
            q_i = 0;
        end if;
    end for;

Figure 10.21 An algorithm for division.

Enhancements to the Divider Circuit

Using the ASM chart in Figure 10.24 causes the circuit to loop through states S2 and S3 for 2n clock cycles. If these states can be merged into a single state, then the number of clock cycles needed can be reduced to n. In state S3, if c_out = 1, we load the sum output (result of the subtraction) from the adder into R, and (assuming z = 0) change to state S2. In state S2 we then shift R (and A) to the left. To combine S2 and S3 into a new state, called S2, we need to be able to place the sum into the left-most bits of R while at the same time shifting the MSB of A into the LSB of R. This step can be accomplished by using a separate flip-flop for the LSB of R. Let the output of this flip-flop be called rr_0. It is initialized to 0 when s = 0 in state S1. Otherwise, the flip-flop is loaded from the MSB of A. In state S2, if c_out = 0, R is shifted left and rr_0 is shifted into R. But if c_out = 1, R is loaded in parallel from the sum outputs of the adder.

Figure 10.25 illustrates how the division example from Figure 10.21b can be performed using n clock cycles. The table in the figure shows the values of R, rr_0, A, and Q in each step

Figure 10.22 ASM chart for the divider.

of the division. In the datapath circuit in Figure 10.23, we use a separate shift register for Q. This register is not actually needed, because the digits in the quotient can be shifted into the least-significant bit of the register used for A. In Figure 10.25 the digits of Q that are shifted into A are shown in blue. The first row in the table represents loading of initial data into registers A (and B) and clearing R and rr_0 to 0. In the second row of the table, labeled clock cycle 0, the diagonal blue arrow shows that the left-most bit of A (1) is shifted into rr_0. The number in R‖rr_0 is now 000000001, which is smaller than B (1001). In clock cycle 1, rr_0 is

Figure 10.23 Datapath circuit for the divider.

shifted into R, and the MSB of A is shifted into rr_0. Also, as shown in blue, a 0 is shifted into the LSB of Q (A). The number in R‖rr_0 is now 000000010, which is still smaller than B. Hence, in clock cycle 2 the same actions are performed as for clock cycle 1. These actions are also performed in clock cycles 3 and 4, at which point R‖rr_0 = 000010001. Since this is larger than B, in clock cycle 5 the result of the subtraction 000010001 − 1001 = 00001000 is loaded into R. The MSB of A (1) is still shifted into rr_0, and a 1 is shifted into Q. In clock cycles 6, 7, and 8, the number in R‖rr_0 is larger than B; hence in each of these cycles the result of the subtraction R‖rr_0 − B is loaded into R, and a 1 is loaded into Q. After clock cycle 8 the correct result, Q = 00001111 and R = 00000101, is obtained. The bit rr_0 is not a part of the final result.

An ASM chart that shows the values of the required control signals for the enhanced divider is depicted in Figure 10.26. The signal ER0 is used in conjunction with the flip-flop that has the output rr_0. When ER0 = 0, the value 0 is loaded into the flip-flop. When ER0 is set to 1, the MSB of shift register A is loaded into the flip-flop. In state S1, if s = 0, then LR is asserted to initialize R to 0. Registers A and B can be loaded with data from external inputs. When s changes to 1, the machine makes a transition to state S2 and at the same time shifts R‖R0‖A to the left. In state S2, if c_out = 1, then R is loaded in parallel from

Figure 10.24 ASM chart for the divider control circuit.

the sum outputs of the adder. At the same time, R0‖A is shifted left (rr_0 is not shifted into R in this case). If c_out = 0, then R‖R0‖A is shifted left. The ASM chart shows how the parallel-load and enable inputs on the registers have to be controlled to achieve the desired operation.

The datapath circuit for the enhanced divider is illustrated in Figure 10.27. As discussed for Figure 10.25, the digits of the quotient Q are shifted into register A. Note that one of the n-bit data inputs on the adder module is composed of the n − 1 least-significant bits in register R concatenated with bit rr_0 on the right.

Verilog Code

Figure 10.28 shows Verilog code that represents the enhanced divider. The parameter n sets the number of bits in the operands. The State_table, State_flipflops, and FSM_outputs

Figure 10.25 An example of division using n = 8 clock cycles. (The table lists R, rr_0, and A/Q for each clock cycle: load A and B; shift left in cycle 0; shift left and shift 0 into Q in cycles 1 to 4; subtract and shift 1 into Q in cycles 5 to 8.)

always blocks describe the control circuit, as in the previous examples. The shift registers and counters in the datapath circuit are instantiated at the bottom of the code. The signal rr_0 in Figure 10.25 is represented in the code by the signal R0. This signal is implemented as the output of the muxdff component; the code for this subcircuit is shown in Figure 7.52. Note that the adder that produces the Sum signal has one input defined as a concatenation of R with R0. The multiplexer needed for the input to R is represented by the DataR signal. This multiplexer is defined in the last statement of the code.

A simulation result for the circuit produced from the code is given in Figure 10.29. The data A = A6 and B = 8 is loaded, and then s is set to 1. The circuit changes to state S2 and concurrently shifts R, R0, and A to the left. The output of the shift register that holds A is labeled Q in the simulation results because this shift register contains the quotient when the division operation is complete. On the first three active clock edges in state S2, the number represented by R‖R0 is less than the number in B (8); hence R‖R0‖A is shifted left on each clock edge, and 0 is shifted into Q. In the fourth consecutive clock cycle for which the FSM has been in state S2, the contents of R are 00000101 = (5)_10, and R0 is 0; hence R‖R0 = 000001010 = (10)_10. On the next active clock edge, the output of the adder, which is 10 − 8 = 2, is loaded into R, and 1 is shifted into Q. After n clock cycles in state S2, the circuit changes to state S3, and the correct result, Q = 14 = (20)_10 and R = 6, is obtained.

10.2.5 Arithmetic Mean

Assume that k n-bit numbers are stored in a set of registers R_0, ..., R_{k−1}. We wish to design a circuit that computes the mean M of the numbers in the registers.
The pseudo-code for a suitable algorithm is shown in Figure 10.30a. Each iteration of the loop adds the contents of one of the registers, denoted R_i, to a Sum variable. After the sum is computed, M is obtained as Sum/k. We assume that integer division is used, so a remainder R, not shown in the code, is produced as well.
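The algorithm can be modeled behaviorally before any datapath is designed. The following is a minimal sketch for simulation only; the module name mean_sketch and the signal name Rem are illustrative assumptions, and the test data (the value i stored in R_i, with k = 16 and n = 8) is chosen to match the simulation example used later in this section. The loop index C counts down from k − 1, as a down-counter would.

```verilog
// Behavioral sketch of the mean computation (simulation only).
// mean_sketch and Rem are hypothetical names; Sum is assumed not to overflow n bits.
module mean_sketch;
  parameter n = 8, k = 16;

  reg [n-1:0] R [0:k-1];     // registers R0 ... R(k-1)
  reg [n-1:0] Sum, M, Rem;
  integer C;                 // stands in for the down-counter

  initial begin
    for (C = 0; C < k; C = C + 1)
      R[C] = C;              // test data: store i in Ri
    Sum = 0;
    for (C = k - 1; C >= 0; C = C - 1)
      Sum = Sum + R[C];      // one register is added per iteration
    M   = Sum / k;           // integer quotient
    Rem = Sum % k;           // remainder of the division
    $display("Sum = %h, M = %0d, remainder = %0d", Sum, M, Rem);
  end
endmodule
```

For this data the sum is 120, so the model should report Sum = 78 in hexadecimal, a mean of 7, and a remainder of 8.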

Figure 10.26 ASM chart for the enhanced divider control circuit.

Figure 10.27 Datapath circuit for the enhanced divider.

An ASM chart is given in Figure 10.30b. While the start input, s, is 0, the registers can be loaded from external inputs. When s becomes 1, the machine changes to state S2, where it remains while C ≠ 0, and computes the summation (C is a counter that represents i in Figure 10.30a). When C = 0, the machine changes to state S3 and computes M = Sum/k. From the previous example, we know that the division operation requires multiple clock cycles, but we have chosen not to indicate this in the ASM chart. After computing the division operation, state S4 is entered and Done is set to 1.

Datapath Circuit

The datapath circuit for this task is more complex than in our previous examples. It is depicted in Figure 10.31. We need a register with an enable input to hold Sum. For simplicity, assume that the sum can be represented in n bits without overflowing. A multiplexer is required on the data inputs on the Sum register, to select 0 in state S1 and the sum outputs of an adder in state S2. The Sum register provides one of the data inputs to the adder. The other input has to be selected from the data outputs of one of the k registers. One way to select among the registers is to connect them to the data inputs of a k-to-1 multiplexer that is connected to the adder. The select lines on the multiplexer can be controlled by the

    module divider (Clock, Resetn, s, LA, EB, DataA, DataB, R, Q, Done);
      parameter n = 8, logn = 3;
      input Clock, Resetn, s, LA, EB;
      input [n-1:0] DataA, DataB;
      output [n-1:0] R, Q;
      output Done;
      wire Cout, z;
      wire [n-1:0] DataR;
      wire [n:0] Sum;
      reg [1:0] y, Y;
      // A, B, Count, and R0 are driven by module instances, so they are wires
      wire [n-1:0] A, B;
      wire [logn-1:0] Count;
      wire R0;
      reg Done, EA, Rsel, LR, ER, ER0, LC, EC;
      integer k;

      // control circuit
      parameter S1 = 2'b00, S2 = 2'b01, S3 = 2'b10;

      always @(s or y or z)
      begin: State_table
        case (y)
          S1: if (s == 0) Y = S1;
              else Y = S2;
          S2: if (z == 0) Y = S2;
              else Y = S3;
          S3: if (s == 1) Y = S3;
              else Y = S1;
          default: Y = 2'bxx;
        endcase
      end

      always @(posedge Clock or negedge Resetn)
      begin: State_flipflops
        if (Resetn == 0)
          y <= S1;
        else
          y <= Y;
      end

    ...continued in Part b.

Figure 10.28 Verilog code for the divider circuit (Part a).

      always @(y or s or Cout or z)
      begin: FSM_outputs
        // defaults
        LR = 0; ER = 0; ER0 = 0; LC = 0; EC = 0; EA = 0;
        Rsel = 0; Done = 0;
        case (y)
          S1: begin
                LC = 1; ER = 1;
                if (s == 0)
                begin
                  LR = 1; ER0 = 0;
                end
                else
                begin
                  LR = 0; EA = 1; ER0 = 1;
                end
              end
          S2: begin
                Rsel = 1; ER = 1; ER0 = 1; EA = 1;
                if (Cout) LR = 1;
                else LR = 0;
                if (z == 0) EC = 1;
                else EC = 0;
              end
          S3: Done = 1;
        endcase
      end

    ...continued in Part c.

Figure 10.28 Verilog code for the divider circuit (Part b).

counter C. To compute the division operation, we can use the divider circuit designed in section 10.2.4.

The circuit in Figure 10.31 is based on k = 4, but the same circuit structure can be used for larger values of k. Note that the enable inputs on the registers R_0 through R_3 are connected to the outputs of a 2-to-4 decoder that has the two-bit input RAdd, which stands for register address. The decoder enable input is driven by the ER signal. All registers are loaded from the same input lines, Data. Since k = 4, we could perform the division operation simply by shifting Sum two bits to the right, which can be done in one clock cycle with a shift register that shifts by two digits. To obtain a more general circuit that works for any value of k, we use the divider circuit designed in section 10.2.4.
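The shortcut for k = 4 mentioned above can be illustrated in a few lines. This is a minimal sketch for simulation only; the module name mean_by_shift_sketch, the signal name Rem, and the test value 30 are illustrative assumptions. The quotient is Sum shifted right by two positions, and the two bits that are shifted out form the remainder.

```verilog
// Sketch of dividing by k = 4 with a two-position right shift (simulation only).
// mean_by_shift_sketch, Rem, and the test value are hypothetical.
module mean_by_shift_sketch;
  parameter n = 8;

  reg  [n-1:0] Sum;
  wire [n-1:0] M   = Sum >> 2;   // quotient Sum / 4
  wire [1:0]   Rem = Sum[1:0];   // remainder: the two bits shifted out

  initial begin
    Sum = 8'd30;                 // hypothetical accumulated sum
    #1 $display("M = %0d, remainder = %0d", M, Rem);
  end
endmodule
```

For Sum = 30 the model should report a mean of 7 and a remainder of 2. The same idea works for any k that is a power of 2, which is why a general divider is needed only for other values of k.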

      // datapath circuit
      regne RegB (DataB, Clock, Resetn, EB, B);
        defparam RegB.n = n;
      shiftlne ShiftR (DataR, LR, ER, R0, Clock, R);
        defparam ShiftR.n = n;
      muxdff FF_R0 (0, A[n-1], ER0, Clock, R0);
      shiftlne ShiftA (DataA, LA, EA, Cout, Clock, A);
        defparam ShiftA.n = n;
      assign Q = A;
      downcount Counter (Clock, EC, LC, Count);
        defparam Counter.n = logn;

      assign z = (Count == 0);
      // The subtraction is formed as R + B' + 1. The explicit zero-extension
      // keeps bit n of Sum equal to the adder carry-out; writing ~B + 1
      // directly would invert B only after it had been widened to the size
      // of the whole expression.
      assign Sum = {1'b0, R[n-2:0], R0} + {1'b0, ~B} + 1;
      assign Cout = Sum[n];

      // define the n 2-to-1 multiplexers
      assign DataR = Rsel ? Sum : 0;
    endmodule

Figure 10.28 Verilog code for the divider circuit (Part c).

Control Circuit

Figure 10.32 gives an ASM chart for the FSM needed to control the circuit in Figure 10.31. While in state S1, data can be loaded into registers R_0, ..., R_{k−1}. But no control signals have to be asserted for this purpose, because the registers are loaded under control of the ER and RAdd inputs, as discussed above. When s = 1, the FSM changes to state S2, where it asserts the enable ES on the Sum register and allows C to decrement. When the counter reaches 0 (z = 1), the machine enters state S3, where it asserts the LA and EB signals to load the Sum and k into the A and B inputs of the divider circuit, respectively. The FSM then enters state S4 and asserts the Div signal to start the division operation. When it is finished, the divider circuit sets zz = 1, and the FSM moves to state S5. The mean M appears on the Q and R outputs of the divider circuit. The Div signal must still be asserted in state S5 to prevent the divider circuit from reinitializing its registers. Note that in the ASM chart in Figure 10.30b, only one state is shown for computing M = Sum/k, but in Figure 10.32, states S3 and S4 are used for this purpose. It is possible to combine states S3 and S4, which we will leave as an exercise for the reader (problem 10.6).

Alternative Datapath Circuits

In Figure 10.31 registers R_0, ..., R_{k−1} are connected to the adder using a multiplexer.
Another way to achieve the desired connection is to add tri-state buffers to the outputs of the k registers and to connect all tri-state buffers for a given bit position to the corresponding

Figure 10.29 Simulation results for the divider circuit.

input of the adder. The down-counter C can be used to enable each tri-state buffer at the proper time (when the FSM is in state S2), by connecting a 2-to-4 decoder to the outputs of the counter and using one output of the decoder to enable each tri-state buffer. We will show an example of using tri-state buffers in this manner in Figure 10.42.

For large values of k, it is preferable to use an SRAM block with k rows and n columns, instead of using k registers. Predefined modules that represent SRAM blocks are usually provided by CAD tools. If the circuit being designed is to be implemented in a custom chip, then the CAD tools ensure that the desired SRAM block is included on the chip. Some PLDs include SRAM blocks that can be configured to implement various numbers of rows and columns. The CAD system that accompanies the book provides the lpm_ram_dq module, which is a part of the LPM standard library.

Figure 10.33 gives a schematic diagram for the arithmetic mean circuit, using the parameters k = 16 and n = 8. This schematic was created using the CAD tools that accompany the book. Four of the graphical symbols in the schematic represent subcircuits described using Verilog code, namely downcnt, regne, divider, and meancntl. The code for the divider subcircuit is shown in Figure 10.28. The meancntl subcircuit represents the FSM in Figure 10.32. The Verilog code for this FSM is not shown. The schematic also includes a multiplexer connected to the Sum register, an adder, and a NOR gate that detects when the counter C reaches 0. The outputs of the counter provide the address inputs to the SRAM block, called MReg. The SRAM block has 16 rows and eight columns.

In Figure 10.31 a decoder controls the loading of data into each of the k registers. To read the data from the registers, the counter C is used. To keep the schematic in Figure 10.33 simple, we have included the

(a) Pseudo-code

    Sum = 0;
    for i = k − 1 down to 0 do
        Sum = Sum + R_i
    end for;
    M = Sum ÷ k;

(b) ASM chart (S1: load registers, Sum ← 0, C ← k − 1; S2: Sum ← Sum + R_i, C ← C − 1; S3: M ← Sum ÷ k; S4: Done)

Figure 10.30 An algorithm for finding the mean of k numbers.

Figure 10.31 Datapath circuit for the mean operation.

Figure 10.32 ASM chart for the mean operation control circuit.

counter to read data from the SRAM block, but we have ignored the issue of writing data into the SRAM block. It is possible to modify the meancntl code to allow the counter C to address the SRAM block for loading the initial data, but we will not pursue this issue here. For simulation purposes we can use a feature of the CAD system that allows initial data to be stored in the SRAM block. We chose to store 0 in R_0 (row 0 of the SRAM block); 1 in R_1; ...; and 15 in R_15. The results of a timing simulation for the circuit implemented in an FPGA chip are shown in Figure 10.34. Only a part of the simulation, from the point

Figure 10.33 Schematic of the mean circuit with an SRAM block.

where C = 5, is shown in the figure. At this point the meancntl FSM is in state S2, and the Sum is being accumulated. When C reaches 0, Sum has the correct value, which is 0 + 1 + 2 + ... + 15 = 120 = (78)_16. The FSM changes to state S3 for one clock cycle and then remains in state S4 until the division operation is complete. The correct result, Q = 7 and R = 8, is obtained when the FSM changes to state S5.

10.2.6 Sort Operation

Given a list of k unsigned n-bit numbers stored in a set of registers R_0, ..., R_{k−1}, we wish to design a circuit that can sort the list (contents of the registers) in ascending order. Pseudo-code for a simple sorting algorithm is shown in Figure 10.35. It is based on finding the smallest number in the sublist R_i, ..., R_{k−1} and moving that number into R_i, for i = 0, 1, 2, ..., k − 2. Each iteration of the outer loop places the number in R_i into A. Each iteration of the inner loop compares this number to the contents of another register R_j. If

Figure 10.34 Simulation results for the mean circuit using SRAM.

    for i = 0 to k − 2 do
        A = R_i;
        for j = i + 1 to k − 1 do
            B = R_j;
            if B < A then
                R_i = B;
                R_j = A;
                A = R_i;
            end if;
        end for;
    end for;

Figure 10.35 Pseudo-code for the sort operation.

the number in R_j is smaller than A, the contents of R_i and R_j are swapped and A is changed to hold the new contents of R_i.

An ASM chart that represents the sorting algorithm is shown in Figure 10.36. In the initial state S1, while s = 0 the registers are loaded from external data inputs and a counter C_i that represents i in the outer loop is cleared. When the machine changes to state S2, A is loaded with the contents of R_i. Also, C_j, which represents j in the inner loop, is initialized to the value of i. State S3 is used to initialize j to the value i + 1, and state S4 loads the value of R_j into B. In state S5, A and B are compared, and if B < A, the machine moves to state S6. States S6 and S7 swap the values of R_i and R_j. State S8 loads A from R_i. Although this step is necessary only for the case where B < A, the flow of control is simpler if this operation is performed in both cases. If C_j is not equal to k − 1, the machine changes from S8 to S4, thus remaining in the inner loop. If C_j = k − 1 and C_i is not equal to k − 2, then the machine stays in the outer loop by changing to state S2.
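The pseudo-code translates almost line for line into a behavioral model, which is a convenient reference when checking the FSM later. The following is a minimal sketch for simulation only; the module name sort_sketch and the four test values are illustrative assumptions, with k = 4 and n = 8. The variables A and B play the same roles as the registers A and B in the ASM chart.

```verilog
// Behavioral sketch of the sort operation (simulation only).
// sort_sketch and the test data are hypothetical.
module sort_sketch;
  parameter n = 8, k = 4;

  reg [n-1:0] R [0:k-1];   // registers R0 ... R(k-1)
  reg [n-1:0] A, B;
  integer i, j;

  initial begin
    R[0] = 8'd3; R[1] = 8'd2; R[2] = 8'd4; R[3] = 8'd1;   // unsorted test data
    for (i = 0; i <= k - 2; i = i + 1) begin
      A = R[i];                        // state S2: A is loaded from Ri
      for (j = i + 1; j <= k - 1; j = j + 1) begin
        B = R[j];                      // state S4: B is loaded from Rj
        if (B < A) begin               // state S5: compare
          R[i] = B;                    // swap Ri and Rj
          R[j] = A;
          A = R[i];                    // state S8: reload A from Ri
        end
      end
    end
    $display("%0d %0d %0d %0d", R[0], R[1], R[2], R[3]);
  end
endmodule
```

For the test data 3, 2, 4, 1 the model should print the registers in ascending order, 1 2 3 4. Each pass of the outer loop leaves the smallest remaining value in R_i, so k − 1 passes are enough.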

Figure 10.36 ASM chart for the sort operation.

Datapath Circuit

There are many ways to implement a datapath circuit that meets the requirements of the ASM chart in Figure 10.36. One possibility is illustrated in Figures 10.37 and 10.38. Figure 10.37 shows how the registers R_0, ..., R_{k−1} can be connected to registers A and B using 4-to-1 multiplexers. We assume the value k = 4 for simplicity. Registers A and B are connected to a comparator subcircuit and, through multiplexers, back to the inputs of the registers R_0, ..., R_{k−1}. The registers can be loaded with initial (unsorted) data using the DataIn lines. The data is written (loaded) into each register by asserting the WrInit control signal and placing the address of the register on the RAdd input. The tri-state buffer driven by the Rd control signal is used to output the contents of the registers on the DataOut output.

Figure 10.37 A part of the datapath circuit for the sort operation.