8 Control and Datapath

Engineering attempts to develop design methods that break a problem into separate steps, both to simplify the design and to increase the likelihood of a correct solution. Digital system design is no different: it is frequently necessary to divide in order to conquer a design problem. A general divide-and-conquer approach for digital systems is to break the design into a datapath of interconnected registers (to hold results) and functional units (to perform logical and arithmetic operations), and a control unit that sequences the operations performed by the datapath to carry out the desired calculation.

[Figure 8.1 diagram: a DESIGN is divided into two parts.]

  DATAPATH      Consists of interconnected storage and functional units that hold results and perform logical and arithmetic operations.
  CONTROL UNIT  Controls the execution of operations using a finite state machine whose states control the datapath to generate the needed data flow.

Figure 8.1 Control and Datapath
Digital Engineering with VHDL

X times 3

A simple design problem is to build a digital system to calculate Z = 3 * X. This can be achieved via repeated addition, i.e. Z = X + X + X. There is rarely just one solution to a digital design; instead there are design decisions and constraints that drive the solution to a final implementation. In this case the constraints are the lack of a multiplier and the availability of a single adder. If a multiplier were available, the solution would be trivial. This example is contrived to demonstrate design concepts and would not be considered a practical circuit.

A datapath to calculate 3 * X must be able to accumulate the partial product of X. In this implementation we assume that the registers must be explicitly cleared; this is achieved with a multiplexer that loads register P with either zero or the output of the adder. If register P had a reset input, that could be used to clear the register instead. The multiplexer also demonstrates its general use as a datapath element for directing information through the circuit.
[Figure 8.2 diagram: input DIN feeds REGISTER X (load control LDX). A two-input MUX (control SEL) selects the constant "0" on input 0 or the ADDER output on input 1 and feeds REGISTER P (load control LDP). The ADDER sums register X and register P. Register P drives ZOUT. LDX, LDP, and SEL are the control signals entering the datapath.]

Figure 8.2 3 * X Datapath

The sequence of operations on this datapath to produce 3 * DIN is first to clear register P and load input DIN into register X. Next, register X (which now equals DIN) is added to the accumulated sum in register P and the result is stored back in P, so that P accumulates the partial product. This addition is performed three times to give P = 3 * X. The following table describes these steps not in English but in terms of the transfers that occur between registers.
  CLEAR P
  LOAD X with DIN
  P <= X + P
  P <= X + P
  P <= X + P
  HALT

Figure 8.3 3 * X Operations

From the table it can be seen that the P register could be cleared in parallel (simultaneously) with the loading of the X register. Although this simple speed-up won't be used in this exercise, it would save one clock cycle in five, a 20% reduction in the number of clock cycles.

In a Register Transfer Language (RTL) like notation, the steps to perform 3 * X can be written as follows. A new state T5 has been added to halt the computation. Halting is not done by stopping (gating) the clock, but simply by not loading the P register with new data.

  T0: P <= 0;
  T1: X <= DIN;
  T2: P <= X + P;
  T3: P <= X + P;
  T4: P <= X + P;
  T5: P <= P; X <= X;

Figure 8.4 3 * X RTL

At this point the control signals and their values for controlling the registers (LDX, LDP) and the multiplexer (SEL) must be determined in order to accomplish the sequence of actions in Figure 8.4. In addition, the next control state is explicitly listed.

  State   SEL   LDX   LDP   NEXT T
  T0      0     0     1     T1
  T1      -     1     0     T2
  T2      1     0     1     T3
  T3      1     0     1     T4
  T4      1     0     1     T5
  T5      -     0     0     T5

Figure 8.5 Control signal values

It is assumed that the registers have an LD (load) signal and the multiplexer has a SEL (select) signal. On control steps T1 and T5, SEL is a don't care (-): the multiplexer output is not used, since register P is not loaded during these control steps.
Note that state T5 halts the computation by disabling the load of registers X and P.

[Figure 8.6 diagram: the datapath of Figure 8.2 (DIN, constant "0", MUX with SEL, REGISTER X with LDX, REGISTER P with LDP, ADDER, ZOUT) together with a CONTROL FSM block, driven by CLOCK and RESET, that generates SEL, LDX, and LDP.]

Figure 8.6 3 * X Control and Datapath

Constructing the datapath is easy, assuming register, adder, and multiplexer primitives are available. Notice that the word length of the operands (8, 16, or 32 bits) has not been specified. There is no need to: it only affects the maximum value of the integer that can be multiplied, not the general design of the datapath or control logic. Do note, however, that multiplying two N-bit binary numbers can produce a result of up to 2N bits. The implementation will assume 4-bit operands (0 to 15) with an 8-bit result (the largest product of two 4-bit operands is 15 * 15 = 225). As such, the adder and register P must have a data width of 8 bits. Register X can be 4 bits wide, but the adder will require two 8-bit inputs.

The next step is the less trivial design of the control logic that generates the control signals. Using a schematic capture program, one is faced with a manual synthesis of a synchronous finite state machine (FSM), including the state diagram, the state table, and the generation of the next-state equations. An alternative approach is to build the control circuit from primitives such as a counter instead of a manual synthesis and gate-level design.
Noting that six states (T0, T1, T2, T3, T4, T5) are needed, a counter can be used to generate the six distinct states. The counter is the finite state machine in this design approach. These states can then be decoded with combinational logic to generate the control signals.

[Figure 8.7 diagram: CLOCK and RESET drive a COUNTER (0 to 5); the counter output feeds a Combinational Logic block that generates SEL, LDX, and LDP.]

Figure 8.7 3 * X Counter based FSM

This approach will work, but the boolean equations for the decoder still need to be generated manually, using K-maps or another method for minimizing a truth table. For large designs with many control signals, a large number of equations might have to be solved. If the design then changes, the equations have to be solved again, and then again with each new design iteration.

An alternative is to microcode the design: a ROM holds the control bits, and the counter's output is used to address the memory locations containing the bits for each control state. Remember that a ROM with N address lines can implement any boolean function of N variables. The simplification here is that the control bits can be entered directly into the ROM. While this is a trivial example of microcoding, the generalized technique has many advantages. The first is that if the ROM also holds the next address, branches can be implemented in the controller, treating the contents of the ROM as microcode instructions. Secondly, the microcode can be readily changed, and it corresponds directly to the control signals.
[Figure 8.8 diagram: CLOCK and RESET drive a COUNTER (0 to 5) whose output is the ROM ADDRESS; the ROM outputs are SEL, LDP, and LDX.]

  ROM address   SEL   LDP   LDX
  0             0     1     0
  1             0     0     1
  2             1     1     0
  3             1     1     0
  4             1     1     0
  5             0     0     0

Figure 8.8 3 * X Micro-code Implementation

With a bit of perseverance one could construct this circuit in a TTL schematic editor, but instead let's implement it in VHDL. This approach to design is not uncommon: the datapath is designed using predefined modules, possibly through a graphical interface, while the control logic, being much less structured and generally unique to the current design, is synthesized from an HDL-based description.
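As an illustration of how the microcoded controller of Figure 8.8 might be written in an HDL, the following is a minimal sketch rather than the book's own listing. The entity name control_rom, the architecture name, and the internal names rom_type, ROM, addr, and word are assumptions made for this example, as is the active-low synchronous reset. The ROM words use the bit order SEL, LDP, LDX from Figure 8.8, with the don't-care SEL values stored as 0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity name; the ports mirror the control unit described in the text.
entity control_rom is
  port (
    LDX, LDP, SEL : out std_logic;
    RESET, CLOCK  : in  std_logic);
end control_rom;

architecture microcode of control_rom is
  -- One ROM word per control state; bit order is SEL, LDP, LDX (Figure 8.8).
  type rom_type is array (0 to 5) of std_logic_vector(2 downto 0);
  constant ROM : rom_type := (
    0 => "010",   -- T0: P <= 0
    1 => "001",   -- T1: X <= DIN (SEL is a don't care, stored as 0)
    2 => "110",   -- T2: P <= X + P
    3 => "110",   -- T3: P <= X + P
    4 => "110",   -- T4: P <= X + P
    5 => "000");  -- T5: halt, nothing is loaded
  signal addr : integer range 0 to 5 := 0;
  signal word : std_logic_vector(2 downto 0);
begin
  -- Address counter: counts 0 to 5 and then holds at 5.
  process
  begin
    wait until CLOCK = '1';
    if RESET = '0' then        -- assumed active-low synchronous reset
      addr <= 0;
    elsif addr /= 5 then
      addr <= addr + 1;
    end if;
  end process;

  -- ROM lookup: the control bits are read directly from the addressed word.
  word <= ROM(addr);
  SEL  <= word(2);
  LDP  <= word(1);
  LDX  <= word(0);
end microcode;
```

Changing the control sequence in this sketch means editing only the constant table, which is the practical appeal of the microcoded approach.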
[Figure 8.9 schematic: a 74LS163A counter (pins /SR, CP, CEP, CET, /PE, P0 to P3, Q0 to Q3, TC, Vcc, GND) with +5, RESETBAR, and CLOCK connections. Counter outputs Q0 to Q3 drive ROM address lines A0 to A3, and ROM data outputs D0 to D2 supply the control signals. A note reads: "Disable count when counter reaches 5".]

Figure 8.9 TTL Control FSM Implementation

The VHDL entity can be described first. It won't change, as the inputs and outputs of the control unit are the same for any implementation that is chosen. There are many choices for the architecture. An obvious solution is to mimic the TTL circuit by implementing a counter and decode logic.
  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.std_logic_arith.all;
  use ieee.std_logic_unsigned.all;

  entity control is
    port (
      LDX, LDP, SEL: out std_logic;
      RESET, CLOCK:  in  std_logic);
  end control;

  architecture synthesis of control is
    signal cnt: std_logic_vector(3 downto 0);
  begin
    -- The counter: counts 0 to 5, then holds at 5
    process
    begin
      wait until clock = '1';
      if reset = '0' then
        cnt <= "0000";
      else
        if cnt /= 5 then
          cnt <= cnt + 1;
        else
          cnt <= cnt;
        end if;
      end if;
    end process;

    -- Decode logic (values follow Figure 8.5)
    ldx <= '1' when cnt = 1 else '0';
    ldp <= '0' when cnt = 1 or cnt = 5 else '1';
    sel <= '1' when cnt = 2 or cnt = 3 or cnt = 4 else '0';
  end synthesis;

Figure 8.10 VHDL Control FSM
[Figure 8.11 diagram: the counter-based FSM of Figure 8.7 with each block annotated by its VHDL from Figure 8.10. The COUNTER (0 to 5) block, driven by CLOCK and RESET, corresponds to the clocked counter process; the Combinational Logic block corresponds to the three conditional signal assignments that generate LDX, LDP, and SEL.]

Figure 8.11 VHDL Control Blocks

Alternatively the entire design, including both datapath and control, can be implemented in VHDL. The entity changes with the addition of DIN as an input and ZOUT as an output.

  library ieee;
  use ieee.std_logic_1164.all;
  use work.std_arith.all;

  entity PROD is
    port (
      DIN:   in  std_logic_vector(7 downto 0);
      ZOUT:  out std_logic_vector(7 downto 0);
      RESET: in  std_logic;
      CLK:   in  std_logic);
  end PROD;

Figure 8.12 3 * X Completely in VHDL Entity
  architecture behavioral of PROD is
  begin
    -- Purely combinational: the output is three times the input
    ZOUT <= DIN + DIN + DIN;
  end behavioral;

Figure 8.13 Trivial 3 * X VHDL Architecture

This first architecture is something of a cheat, as purely combinational adders have been specified. Instead, the design can be implemented as shown in the original block diagram, consisting of a datapath and a counter.
  architecture diagram of PROD is
    signal X, P: std_logic_vector(7 downto 0);
    signal CNT:  std_logic_vector(2 downto 0);
  begin
    process
    begin
      wait until clk = '1';
      if reset = '0' then
        CNT <= "000";
      else
        -- Behavioral control and dataflow
        if CNT = 0 then
          P <= "00000000";
        elsif CNT = 1 then
          X <= DIN;
        elsif CNT = 2 then
          P <= P + X;
        elsif CNT = 3 then
          P <= P + X;
        elsif CNT = 4 then
          P <= P + X;
        elsif CNT = 5 then
          P <= P;
        end if;
        -- Counter
        if CNT /= 5 then
          CNT <= CNT + 1;
        else
          CNT <= CNT;
        end if;
      end if;
    end process;

    ZOUT <= P;
  end diagram;

Figure 8.14 Direct RTL VHDL Implementation

That is not a bad solution; it follows the RTL description directly. A minor improvement would be to use a case statement for the individual control states, since a chain of if statements can generate extra logic to implement the implied priority of each branch. This design does not, however, really capture the block diagram of the original design. Otherwise there is nothing wrong with this behavioral implementation, assuming that it synthesizes and that the synthesis tool comes up with a good design: one shared adder, rather than one adder for every "+" operation.

Instead we will write a process for each module in the datapath so that there is a one-to-one correspondence between the block diagram and the VHDL: one process for each register, the multiplexer, the adder, and the control. Such a design is sometimes referred to as a dataflow design.
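Before turning to the per-module version, the case statement suggested above for the direct RTL implementation might look like the following. This is a sketch under stated assumptions, not the book's listing: the entity name prod_case and architecture name rtl_case are hypothetical, the standard ieee.numeric_std package is used in place of work.std_arith, and the counter is modeled as a small integer. The reset is active-low, as in Figure 8.14.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; the ports are the same as those of PROD.
entity prod_case is
  port (
    DIN   : in  std_logic_vector(7 downto 0);
    ZOUT  : out std_logic_vector(7 downto 0);
    RESET : in  std_logic;
    CLK   : in  std_logic);
end prod_case;

architecture rtl_case of prod_case is
  signal X, P : unsigned(7 downto 0);
  signal CNT  : integer range 0 to 5;
begin
  process
  begin
    wait until CLK = '1';
    if RESET = '0' then                          -- active-low, as in Figure 8.14
      CNT <= 0;
    else
      -- One mutually exclusive branch per control state, no implied priority.
      case CNT is
        when 0      => P <= (others => '0');     -- T0: clear P
        when 1      => X <= unsigned(DIN);       -- T1: load X
        when 2 to 4 => P <= P + X;               -- T2 to T4: accumulate
        when others => null;                     -- T5: hold
      end case;
      -- Counter: advance until the halt state is reached.
      if CNT /= 5 then
        CNT <= CNT + 1;
      end if;
    end if;
  end process;

  ZOUT <= std_logic_vector(P);
end rtl_case;
```

Because the choices of a case statement are mutually exclusive by construction, the synthesis tool has no priority chain to preserve, which is the point of the suggested improvement.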
  architecture dataflow of PROD is
    signal SEL, LDX, LDP: std_logic;
    signal XREG, PREG, TMP1, TMP2: std_logic_vector(7 downto 0);
    signal CNT: std_logic_vector(2 downto 0);
  begin
    -- Register X
    XREG_P: process
    begin
      wait until clk = '1';
      if ldx = '1' then
        XREG <= DIN;
      end if;
    end process XREG_P;

    -- Register P
    PREG_P: process
    begin
      wait until clk = '1';
      if ldp = '1' then
        PREG <= TMP1;
      end if;
    end process PREG_P;

    -- Multiplexer: zero or the adder output
    MUX: process (sel, TMP2)
    begin
      if sel = '1' then
        TMP1 <= TMP2;
      else
        TMP1 <= "00000000";
      end if;
    end process MUX;

    ADDER: process (XREG, PREG)
    begin
      TMP2 <= XREG + PREG;
    end process ADDER;

    COUNTER: process
    begin
      wait until clk = '1';
      if RESET = '1' then
        CNT <= "000";
      else
        if CNT /= 5 then
          CNT <= CNT + 1;
        else
          CNT <= CNT;
        end if;
      end if;
    end process COUNTER;

    ZOUT <= PREG;

    -- Control decode (values follow Figure 8.5)
    CONTROL: process (CNT)
    begin
      case CNT is
        when "000"  => LDP <= '1'; LDX <= '0'; SEL <= '0';
        when "001"  => LDP <= '0'; LDX <= '1'; SEL <= '0';
        when "010"  => LDP <= '1'; LDX <= '0'; SEL <= '1';
        when "011"  => LDP <= '1'; LDX <= '0'; SEL <= '1';
        when "100"  => LDP <= '1'; LDX <= '0'; SEL <= '1';
        when others => LDP <= '0'; LDX <= '0'; SEL <= '0';
      end case;
    end process CONTROL;
  end dataflow;

Figure 8.15 VHDL Dataflow 3 * X Implementation
This design is a direct implementation of the original block diagram. With a little work, both the function and the structure of the block diagram can be extracted from the VHDL. The block diagram describes the design in terms of primitives, including registers, an adder, a multiplexer, a counter, and a decoder. It is a structural interconnection of primitives. The VHDL can be mapped onto the block diagram as shown below.

[Figure 8.16 diagram: the block diagram of Figure 8.6 with each block annotated by the corresponding process from Figure 8.15. The counter and control decoder (inputs CLK and RESET) generate LDX, LDP, and SEL; DIN feeds register XREG; the adder produces TMP2 from XREG and PREG; the multiplexer produces TMP1 from TMP2 or zero; register PREG is loaded from TMP1.]

Figure 8.16 3 * X Structural VHDL

VHDL also supports true structural descriptions, which simplify design reuse by allowing entity/architecture pairs to be interconnected within another entity/architecture. These new entity/architecture pairs can be created by writing entities that incorporate the processes from the previous dataflow design into their respective architectures.
  -- Each entity is preceded by: library ieee; use ieee.std_logic_1164.all;

  entity reg is
    port (
      DIN:  in  std_logic_vector(7 downto 0);
      DOUT: out std_logic_vector(7 downto 0);
      LD:   in  std_logic;
      CLK:  in  std_logic);
  end reg;

  entity adder is
    port (
      A, B: in  std_logic_vector(7 downto 0);
      SUM:  out std_logic_vector(7 downto 0));
  end adder;

  entity mux is
    port (
      D0, D1: in  std_logic_vector(7 downto 0);
      SEL:    in  std_logic;
      DOUT:   out std_logic_vector(7 downto 0));
  end mux;

  entity con is
    port (
      LDX, LDP, SEL: out std_logic;
      RESET, CLK:    in  std_logic);
  end con;

Figure 8.17 Structural Entity Declarations

  library ieee;
  use ieee.std_logic_1164.all;

  entity reg is
    port (
      DIN:  in  std_logic_vector(7 downto 0);
      DOUT: out std_logic_vector(7 downto 0);
      LD:   in  std_logic;
      CLK:  in  std_logic);
  end reg;

  architecture synthesis of reg is
  begin
    -- Register: load DIN on the clock edge when LD is asserted
    process
    begin
      wait until clk = '1';
      if ld = '1' then
        DOUT <= DIN;
      end if;
    end process;
  end synthesis;

Figure 8.18 Register Entity/Architecture Pair

With the entities specified, the design's structure can be described in terms of the other entities. The advantage of this approach is that existing VHDL entities can be used from a library of components, exactly as is done with a schematic capture program.
  architecture structural of PROD is
    component reg
      port (
        DIN:  in  std_logic_vector(7 downto 0);
        DOUT: out std_logic_vector(7 downto 0);
        LD:   in  std_logic;
        CLK:  in  std_logic);
    end component;

    component adder
      port (
        A, B: in  std_logic_vector(7 downto 0);
        SUM:  out std_logic_vector(7 downto 0));
    end component;

    component mux
      port (
        D0, D1: in  std_logic_vector(7 downto 0);
        SEL:    in  std_logic;
        DOUT:   out std_logic_vector(7 downto 0));
    end component;

    component con
      port (
        LDX, LDP, SEL: out std_logic;
        RESET, CLK:    in  std_logic);
    end component;

    signal SEL, LDX, LDP: std_logic;
    signal XREG, PREG:    std_logic_vector(7 downto 0);
    signal TMP1, TMP2:    std_logic_vector(7 downto 0);
    signal ZERO:          std_logic_vector(7 downto 0);
  begin
    ZERO <= "00000000";

    U0: reg   port map (DIN, XREG, LDX, CLK);        -- register X
    U1: reg   port map (TMP1, PREG, LDP, CLK);       -- register P, loaded from the mux
    U2: mux   port map (ZERO, TMP2, SEL, TMP1);      -- zero or the adder output
    U3: adder port map (XREG, PREG, TMP2);
    U4: con   port map (LDX, LDP, SEL, RESET, CLK);  -- control unit

    ZOUT <= PREG;
  end structural;

Figure 8.19 3 * X Structural VHDL
X Times Y

The 3 * X circuit can be readily generalized into an X * Y circuit by replacing the fixed counter with a counter that is loaded with Y and then decremented to 0. X will then be summed Y times.

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.std_logic_unsigned.all;

  entity xtimesy is
    port (
      x, y:         in  std_logic_vector(3 downto 0);
      p:            out std_logic_vector(7 downto 0);
      reset, clock: in  std_logic);
  end xtimesy;

  architecture synthesis of xtimesy is
    signal cnt:        std_logic_vector(3 downto 0);
    signal prod, xreg: std_logic_vector(7 downto 0);
  begin
    process
    begin
      wait until clock = '1';
      if reset = '0' then
        -- Load the operands and clear the product
        cnt  <= y;
        xreg <= "0000" & x;
        prod <= "00000000";
      else
        -- Add X once per clock until the counter reaches zero
        if cnt /= "0000" then
          cnt  <= cnt - 1;
          prod <= prod + xreg;
        end if;
      end if;
    end process;

    p <= prod;
  end synthesis;

Figure 8.20 X * Y Behavioral VHDL
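To see the "summed Y times" behavior in simulation, the following self-contained sketch pairs a variant of the X * Y design with a small self-checking testbench. Several things here are assumptions for illustration: the names xtimesy_ns and xtimesy_tb, the use of the standard ieee.numeric_std package in place of std_logic_unsigned, the 10 ns clock period, and the chosen operands (5 and 3). With reset held low for one clock edge to load the operands, the product should be complete after Y further clock edges.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical numeric_std variant of the X * Y design.
entity xtimesy_ns is
  port (
    x, y         : in  std_logic_vector(3 downto 0);
    p            : out std_logic_vector(7 downto 0);
    reset, clock : in  std_logic);
end xtimesy_ns;

architecture synthesis of xtimesy_ns is
  signal cnt        : unsigned(3 downto 0);
  signal prod, xreg : unsigned(7 downto 0);
begin
  process
  begin
    wait until clock = '1';
    if reset = '0' then                  -- load operands, clear product
      cnt  <= unsigned(y);
      xreg <= resize(unsigned(x), 8);
      prod <= (others => '0');
    elsif cnt /= 0 then                  -- add X once per remaining count
      cnt  <= cnt - 1;
      prod <= prod + xreg;
    end if;
  end process;
  p <= std_logic_vector(prod);
end synthesis;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking testbench: computes 5 * 3.
entity xtimesy_tb is
end xtimesy_tb;

architecture sim of xtimesy_tb is
  signal x, y  : std_logic_vector(3 downto 0);
  signal p     : std_logic_vector(7 downto 0);
  signal reset : std_logic := '0';
  signal clock : std_logic := '0';
  signal done  : boolean   := false;
begin
  dut : entity work.xtimesy_ns
    port map (x => x, y => y, p => p, reset => reset, clock => clock);

  -- 10 ns clock that stops once the check has run.
  clock <= not clock after 5 ns when not done else '0';

  stimulus : process
  begin
    x <= "0101";                         -- 5
    y <= "0011";                         -- 3
    reset <= '0';
    wait until clock = '1';              -- operands are loaded on this edge
    wait for 1 ns;
    reset <= '1';
    for i in 1 to 3 loop                 -- one accumulate edge per count of Y
      wait until clock = '1';
    end loop;
    wait for 1 ns;
    assert unsigned(p) = 15
      report "expected 5 * 3 = 15" severity failure;
    done <= true;
    wait;
  end process;
end sim;
```

The loop bound in the testbench mirrors the value loaded into the counter, so the computation time of this design grows linearly with Y.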
Circuit Timing

While circuit timing is discussed elsewhere in this text, it is worth reviewing here, as it is frequently the most confusing aspect of control and datapath design and the one that most impedes understanding of how these designs work. Understanding the timing will allow you to generalize these simple design examples to problems of your own.

The assumption in all the designs described in this text is that the storage elements are edge-triggered flip-flops, all clocked by the same clock on the same clock edge. This is true both for the control unit flip-flops and for the datapath registers. Edge-triggered flip-flops transfer the data on their inputs to their outputs on the rising edge of the clock (for a positive edge-triggered flip-flop).

Because the outputs of the flip-flops change slightly after being clocked, control bits generated by the control unit do not take effect until the next clock edge. Hence control bits driving the datapath have nearly the entire clock period to act, and for data and signals to propagate and set up at the inputs of the clocked registers.

[Figure 8.21 timing diagram: waveforms CONTROL, DATA, and CLK. Annotation: "These control bits set up the datapath for clocking on the next clock edge."]

Figure 8.21 Control and Datapath Timing

What is more difficult to understand is how VHDL describes this behavior (assuming synthesis to edge-triggered flip-flops). One way to understand this is first to remember the general design of an FSM, with its next-state logic and memory elements. An important point to remember is that in VHDL a signal assignment statement such as X <= '1'; does not occur in zero time. So, just as with the physical flip-flop, the change on signal X does not occur until after the clock edge, and hence the change is not seen by other parts of the circuit (or by other VHDL processes) until the next clock edge. A signal assignment statement can take an after clause to specify this non-zero delay, as in X <= '1' after 5 ns;. In synthesis such clauses are ignored, as the delay is determined not by the VHDL but by the target flip-flop in the underlying technology.

[Figure 8.22 diagram: a fragment of a clocked process shown alongside its timing waveforms.]

  wait until clk = '1';
  case state is
    when fetch1 => state <= fetch2; reg1 <= din;
    when fetch2 => state <= comp;   reg2 <= din;
    when comp =>
      if reg1 < reg2 then
        state <= write1; we <= '1';
      else
        state <= inc;
      end if;

  Waveforms: DIN takes the values 10, 23, 65, 12; REG1 changes from 00 to 10; REG2 changes from 00 to 23; STATE steps through fetch1, fetch2, comp, write1; WE and CLK are also shown. Annotations: "flip-flops updated here" (at the clock edge), "during this period registers are compared", and "Based on the result of the comparison".

Figure 8.22 Combinational and Sequential Timing
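The deferred update of signals can be checked with a small simulation. The sketch below is an illustration only: the entity name timing_demo, the 10 ns clock period, and the initial values are assumptions, loosely echoing the DIN = 10 and REG1 values of Figure 8.22. Two registers are assigned in the same clocked process; because reg1 does not change until after the edge, reg2 receives the value reg1 held before the edge, exactly as two cascaded flip-flops would behave.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only entity (no ports).
entity timing_demo is
end timing_demo;

architecture sim of timing_demo is
  constant V00 : std_logic_vector(7 downto 0) := "00000000";  -- hex 00
  constant V10 : std_logic_vector(7 downto 0) := "00010000";  -- hex 10
  signal clk  : std_logic := '0';
  signal din  : std_logic_vector(7 downto 0) := V10;
  signal reg1 : std_logic_vector(7 downto 0) := V00;
  signal reg2 : std_logic_vector(7 downto 0) := V00;
  signal done : boolean := false;
begin
  -- 10 ns clock that stops once the checks have run.
  clk <= not clk after 5 ns when not done else '0';

  -- Both assignments are scheduled on the same edge;
  -- reg2 is given the value reg1 held BEFORE that edge.
  regs : process
  begin
    wait until clk = '1';
    reg1 <= din;
    reg2 <= reg1;
  end process;

  check : process
  begin
    wait until clk = '1';                -- first edge
    wait for 1 ns;
    assert reg1 = V10 and reg2 = V00
      report "after edge 1: reg1 should be new, reg2 still old"
      severity failure;
    wait until clk = '1';                -- second edge
    wait for 1 ns;
    assert reg2 = V10
      report "after edge 2: reg2 should now hold the value of reg1"
      severity failure;
    done <= true;
    wait;
  end process;
end sim;
```

The order of the two assignments inside the process does not matter here: swapping them gives the same result, because neither signal takes its new value until the process suspends.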