CSE 260 Digital Computers: Organization and Logical Design
Problem Set 10 Solutions
Jon Turner
thru 6.20

1. The diagram below shows a memory array containing 32 words of 2 bits each. Label each memory cell in the third row from the top, identifying the bit stored in that word (for example, bit 1 of word 23). Label each memory cell in the fifth column similarly. Circle the memory cells that contain word 17.

The labels below identify the word and the bit in the word. The cells containing word 17 are circled.

[Figure: 8 x 8 memory array with inputs d_in (d1, d0), addr and r/w, and output d_out (d1, d0). The row decoder uses address bits a4-a2 to select one of rows 0-7. Address bits a1,a0 select one of columns 0-3 within each bit's group of four columns.]

Third row from the top (row 2), left to right: 8,0  9,0  10,0  11,0  8,1  9,1  10,1  11,1
Fifth column, top to bottom: 0,1  4,1  8,1  12,1  16,1  20,1  24,1  28,1
Word 17 (address 10001): row 4, second column of each group, that is, cells 17,0 and 17,1.
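The labeling in Problem 1 can be checked with a short script. This is a minimal sketch, assuming the layout read off the figure labels: the high address bits pick the row, the low two bits pick a column within a group, and bit 0 occupies the left group of four columns. The function name cell_position is illustrative and not part of the course material.

```python
def cell_position(word, bit, col_bits=2):
    """Return (row, column) of one bit of one word.

    Assumed layout: the high address bits select the row, the low
    col_bits select a column inside a group, and the group for bit b
    starts at column b * 2**col_bits.
    """
    words_per_row = 1 << col_bits
    row = word >> col_bits
    column = bit * words_per_row + (word % words_per_row)
    return row, column


# Build the whole 8 x 8 array as a map from (row, column) to (word, bit).
grid = {cell_position(w, b): (w, b) for w in range(32) for b in range(2)}

third_row = [grid[(2, c)] for c in range(8)]      # row index 2 is the third row
fifth_column = [grid[(r, 4)] for r in range(8)]   # column index 4 is the fifth column
word_17 = [cell_position(17, b) for b in range(2)]

print(third_row)
print(fifth_column)
print(word_17)
```

Each pair in third_row and fifth_column is (word, bit), matching the "word,bit" labels used in the solution.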
2. Consider a 512 KByte SRAM. Assuming that the device reads and writes data in the form of 16 bit words, how many words can the device store? How many address bits are needed to address these words? Assuming that the central memory array has the same number of rows as it has columns, how many rows are there? How many of the address bits are used by the row decoder? How many by the column decoder? In what row and what column of the memory would you find bit 12 of the word with address 3a7d9? Assume that the row decoder uses the high order address bits and that the column decoder uses the low order address bits.

The memory will store 256 K words, or more precisely 2^18 = 262,144, so 18 address bits are needed. The memory stores a total of 4 Mbits, or 2^22 bits, so it will have 2^11 = 2048 rows and 2048 columns. This means that 11 of the 18 address bits will be used by the row decoder to select one of the 2048 rows and that the remaining 7 bits will be used by the column decoder. The memory stores 16 bit words, so each row contains 128 words (128 × 16 = 2048).

Since the row decoder uses the top 11 bits of the address, the word with address 3a7d9 = 11 1010 0111 1101 1001 will appear in row 111 0100 1111 = 74f. The low 7 address bits, 101 1001 = 59 hex, select the word's position within each bit's group of 128 columns. With the groups laid out as in Problem 1 (bit 0 in the first 128 columns, bit 1 in the next 128, and so on), the group for bit 12 starts at column 12 × 128 = 600 hex, so bit 12 would be in column 600 + 59 = 659 hex, or 110 0101 1001.

3. Consider a circuit with a clock rate of 150 MHz, using an asynchronous memory with an access time of 50 ns and a cycle time of 65 ns. Assuming that the memory enable, address and read/write signals change on rising clock edges, how many clock cycles are needed to complete a read operation? If the memory has a word size of 32 bits, how many bytes of data can the circuit read from the memory in 1 ms? How many clock cycles are needed for a write, assuming that the read/write signal must be held low for at least 60 ns? How many bytes can be written to the memory in 1 ms?
Each clock tick is 6.67 ns. If the enable and address lines are asserted on the same rising clock edge, the data will be valid 50 ns later. Since 50/6.67 = 7.5, we can latch the data on the eighth rising clock edge after the enable is asserted. If the circuit starts a new read as it's latching the data from the previous read, it can read a word every eight clock ticks, so that's four bytes every 53.33 ns. This means it can do 18,750 reads in 1 ms for a total of 75,000 bytes.

For the write, we must assert the address and enable signals before the read/write signal goes low, and since every signal transition must occur on a rising clock edge, this means that at least one full clock period is needed at the beginning of the write. The same is true at the end of the write. Since the cycle time is 65 ns and the period when the read/write signal must be low is 60 ns, one clock period at each end is sufficient to satisfy the timing requirements. The 60 ns low period takes 60/6.67 = 9 clock ticks, so altogether we need at least 2 + 9 = 11 clock ticks for the write. Since 11 clock ticks is 73.33 ns, we can write at most 4 × (1,000,000/73.33) = 54,545 bytes in 1 ms.
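The tick counts and byte totals for Problem 3 can be reproduced with exact fractions, which avoids rounding the 6.67 ns period. This is a minimal sketch: the helper names are illustrative, and the byte count follows the solution's own arithmetic (bytes per ns times 1,000,000 ns) rather than first rounding down to whole words.

```python
from fractions import Fraction
from math import ceil, floor

CLOCK_HZ = 150_000_000
TICK_NS = Fraction(10**9, CLOCK_HZ)   # exactly 20/3 ns, about 6.67 ns


def ticks_for(duration_ns):
    """Smallest whole number of clock ticks that covers duration_ns."""
    return ceil(Fraction(duration_ns) / TICK_NS)


def bytes_per_ms(ticks_per_op, word_bytes=4):
    """Bytes moved in 1 ms (1,000,000 ns) at one word per ticks_per_op ticks."""
    op_time_ns = ticks_per_op * TICK_NS
    return floor(word_bytes * Fraction(1_000_000) / op_time_ns)


read_ticks = ticks_for(50)        # 50 ns access time
write_ticks = 2 + ticks_for(60)   # one tick at each end plus the 60 ns low period

print(read_ticks, bytes_per_ms(read_ticks))     # 8 75000
print(write_ticks, bytes_per_ms(write_ticks))   # 11 54545
```

Counting only complete 32 bit writes would give 13,636 writes, or 54,544 bytes, one byte less than the figure above.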
4. Consider a circuit that has a clock rate of 40 MHz and uses a 4 bit wide external SRAM with a read access time of 30 ns. Design a circuit, in the form of a schematic, that reads data from the memory. Your circuit should have two inputs, a read_request and a 6 bit address of the word to be read from the memory. It must generate the memory control signals and must store the word received from memory in an on-chip register, once the data is valid. It should also assert a control signal called ready to indicate when the requested word is present in the on-chip register. Include a timing diagram for your circuit. It should show clearly how many clock ticks pass between the original request and the time the data is stored in the register. You should assume that the flip flops used by your circuit have a setup time of 2 ns, a hold time of 1 ns and a propagation delay that can range from 2 to 6 ns. Assume that the clock skew is limited to 1 ns. You may assume that simple gates have a propagation delay that ranges from 0.5 ns to 2 ns.

The schematic appears below.

[Figure: schematic of the read circuit.]

The set of flip flops at right forms a six bit address register. This is used to latch the address when a read request is received. Its inputs are the address bits supplied by the client and its outputs are the address bits going to the external memory.
The four flip flops at the lower left form a four bit data register. This is used to latch the data received from the memory after the memory has had time to supply the data on the data bus. The inputs to this register connect to the data bus and the outputs connect back to the client.

The top two flip flops and the associated gates form a state machine that controls the circuit. The state machine has an idle state (00) where it waits for requests. When a request is received it passes through two additional states (01 and 10) before returning to the idle state. These additional states allow time for the memory to operate, as indicated in the timing diagram shown below.

To check that the timing requirements are met, note the following:

- The m_en signal goes high at most 6 + 2 = 8 ns after the rising clock edge.
- The address signals to the memory are stable at most 6 ns after the rising clock edge.
- Since the memory has an access time of 30 ns, the data from the memory will be valid 30 ns after the m_en and address signals are stable, so at most 38 ns after the rising clock edge that starts the read.
- The data register has a set of multiplexors in front of the flip flops. Assuming that the multiplexors are implemented in the usual way, they will impose a delay of between 1 and 4 ns. So the flip flop inputs will be stable 42 ns after the rising clock edge that starts the read.
- The clock period is 25 ns. Since we're allowing two clock periods for the memory read, we have until 50 ns from the first clock edge. Of course, we also need to allow 2 ns for the flip flop setup time and 1 ns for clock skew, but this still leaves us with 50 - 42 - 2 - 1 = 5 ns to spare.

[Timing diagram: signals clk, read_request, state (00, 01, 10, 00), m_en, m_rw, load_areg, Adr_bus, load_dreg, data_out.]
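The timing check for Problem 4 is a simple worst-case delay budget, sketched below. The variable names are illustrative. The delays are the maximum values given in the problem statement, and the 4 ns multiplexor delay is the solution's own assumption.

```python
from math import ceil

CLOCK_MHZ = 40
period_ns = 1000 / CLOCK_MHZ    # 25 ns

# Worst-case delays in ns, taken from the problem statement and solution.
ff_prop_max = 6                 # flip flop clock-to-output delay
gate_max = 2                    # one simple gate
mem_access = 30                 # SRAM read access time
mux_max = 4                     # multiplexor in front of the data register (assumed)
ff_setup = 2
clock_skew = 1

m_en_stable = ff_prop_max + gate_max                     # 8 ns
addr_stable = ff_prop_max                                # 6 ns
data_valid = max(m_en_stable, addr_stable) + mem_access  # 38 ns
dreg_input_stable = data_valid + mux_max                 # 42 ns

# Clock periods that must pass before the data register can be clocked.
periods_needed = ceil((dreg_input_stable + ff_setup + clock_skew) / period_ns)
slack_ns = periods_needed * period_ns - ff_setup - clock_skew - dreg_input_stable

print(periods_needed, slack_ns)   # 2 5.0
```

A single clock period would leave only 25 - 3 = 22 ns, well short of the 42 ns needed, which is why the state machine passes through two extra states.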
5. A 64 Mbit DRAM array has to be refreshed every 128 ms. If the number of rows in the array is equal to the number of columns, what is the time between successive row refresh operations, assuming the refresh operations are distributed over the full 128 ms refresh interval?

A 64 Mbit (2^26 bit) square memory array has 2^13 = 8192 rows and columns. Since all 8192 rows must be refreshed every 128 ms, the time between successive refresh operations is 128 ms / 8192 = 15.6 µs, or about 16 µs.

If this memory can complete one operation every 100 ns (a read or a write), what fraction of the memory access bandwidth is not available because of refresh activity?

Each row refresh operation does a read and a write, using 200 ns out of every 16,000 ns. So we lose about 1.25% of the memory bandwidth, due to refresh activity.

6. In the simple processor, the controller determines when each component in the system is permitted to use the bus. In other types of systems, there may be several independent subsystems that share a common bus. In such situations a bus arbiter is used to determine which subsystem gets to use the bus. In this problem you are to design a bus arbiter that can support three bus users. For each user, there is a request input and a grant output. The arbiter is a sequential circuit, which keeps track of the state of the bus. If the bus is free and one or more of the request lines is high, the arbiter selects one of the users and raises the corresponding grant signal. When the user is done with the bus, it is required to drop its request signal for at least one clock tick. Design your arbiter so that if more than one user needs to use the bus multiple times, they take turns. Start by producing a state diagram for the arbiter, then design a VHDL module that implements the state diagram. Include an asynchronous reset input.

To provide equal access to the bus, the arbiter should give preference to users that haven't used it recently. The circuit below does this by maintaining three separate idle states.
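In idle0, user 0 is given top priority for access to the bus, followed by user 1 and user 2. In idle1, user 1 is given top priority, then users 2 and 0. In idle2, user 2 is given top priority, then users 0 and 1. Whenever a user releases the bus, the arbiter goes to the idle state that assigns the lowest priority to that user. The inputs are the request signals (r0, r1, r2) and the outputs are the grants (g0, g1, g2).

[State diagram, given here as a transition table. Each state is labeled state/g0 g1 g2 and each transition is labeled with the inputs r0 r1 r2, where x means don't care.]

State        Inputs   Next state
idle0/000    000      idle0
             1xx      busy0
             01x      busy1
             001      busy2
busy0/100    1xx      busy0
             0xx      idle1
idle1/000    000      idle1
             x1x      busy1
             x01      busy2
             100      busy0
busy1/010    x1x      busy1
             x0x      idle2
idle2/000    000      idle2
             xx1      busy2
             1x0      busy0
             010      busy1
busy2/001    xx1      busy2
             xx0      idle0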
library IEEE;
use IEEE.std_logic_1164.all;

entity arbiter is
  port (
    clk, reset: in STD_LOGIC;
    req: in STD_LOGIC_VECTOR (2 downto 0);
    grant: out STD_LOGIC_VECTOR (2 downto 0)
  );
end arbiter;

architecture arbiter_arch of arbiter is
  type state_type is (idle0, idle1, idle2, busy0, busy1, busy2);
  signal state: state_type;
begin
  process(clk,reset) begin
    if reset = '1' then
      state <= idle0;
    elsif clk'event and clk = '1' then
      -- idle0: priority order is user 0, 1, 2
      if state = idle0 and req(0) = '1' then state <= busy0;
      elsif state = idle0 and req(0) = '0' and req(1) = '1' then state <= busy1;
      elsif state = idle0 and req(0) = '0' and req(1) = '0' and req(2) = '1' then state <= busy2;
      -- idle1: priority order is user 1, 2, 0
      elsif state = idle1 and req(1) = '1' then state <= busy1;
      elsif state = idle1 and req(1) = '0' and req(2) = '1' then state <= busy2;
      elsif state = idle1 and req(1) = '0' and req(2) = '0' and req(0) = '1' then state <= busy0;
      -- idle2: priority order is user 2, 0, 1
      elsif state = idle2 and req(2) = '1' then state <= busy2;
      elsif state = idle2 and req(2) = '0' and req(0) = '1' then state <= busy0;
      elsif state = idle2 and req(2) = '0' and req(0) = '0' and req(1) = '1' then state <= busy1;
      -- busy: when the user drops its request, go to the idle state
      -- that gives that user the lowest priority
      elsif state = busy0 and req(0) = '0' then state <= idle1;
      elsif state = busy1 and req(1) = '0' then state <= idle2;
      elsif state = busy2 and req(2) = '0' then state <= idle0;
      end if;
    end if;
  end process;

  -- grant(i) is high while user i holds the bus
  grant <= "001" when state = busy0 else
           "010" when state = busy1 else
           "100" when state = busy2 else
           "000";
end arbiter_arch;
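To see the turn-taking behavior without running a VHDL simulator, the state machine can be modeled in a few lines of Python. This is only a behavioral sketch of the transitions described above, not a replacement for simulating the VHDL. The names next_state, grant and simulate are illustrative. Requests and grants are written as tuples (r0, r1, r2) and (g0, g1, g2), as in the state diagram, whereas the VHDL vectors put index 0 on the right.

```python
# Priority order used in each idle state.
IDLE_PRIORITY = {"idle0": (0, 1, 2), "idle1": (1, 2, 0), "idle2": (2, 0, 1)}
# Idle state entered when a user releases the bus (that user gets lowest priority).
AFTER_RELEASE = {0: "idle1", 1: "idle2", 2: "idle0"}


def next_state(state, req):
    """One clock tick of the arbiter; req is a tuple (r0, r1, r2)."""
    if state in IDLE_PRIORITY:
        for user in IDLE_PRIORITY[state]:
            if req[user]:
                return f"busy{user}"
        return state
    user = int(state[-1])
    return state if req[user] else AFTER_RELEASE[user]


def grant(state):
    """Grant outputs (g0, g1, g2) for a state."""
    return tuple(int(state == f"busy{u}") for u in range(3))


def simulate(requests, state="idle0"):
    """Return the grant outputs after each clock tick, starting from reset."""
    grants = []
    for req in requests:
        state = next_state(state, req)
        grants.append(grant(state))
    return grants


# Users 0 and 1 both want the bus repeatedly; each drops its request
# for one tick after being served.
trace = simulate([(1, 1, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0), (1, 1, 0)])
print(trace)
```

In this trace the bus goes to user 0, then user 1, then user 0 again, with one idle tick between grants.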
7. Write a program for the simple processor from section 6 of the notes that checks to see if a given ASCII character string is a palindrome. The inputs to your program are stored at locations 30 and 31 (hex). The value at location 30 is a pointer to the first character in the character string. The value in location 31 is the number of characters in the string. Your program should write 1 in location 32 if the string is a palindrome and 0 if it is not. Try loading your program in the memory for the simple processor and running a simulation that executes your program. Does your program work correctly?

The VHDL shown below loads the code into memory, along with some test input.

ram(0)  <= x"2030"; -- lo = start
ram(1)  <= x"402e";
ram(2)  <= x"1fff"; -- hi = start + length - 1
ram(3)  <= x"a030";
ram(4)  <= x"a031";
ram(5)  <= x"402f";
ram(6)  <= x"202f"; -- loop: if lo > hi then
ram(7)  <= x"0001"; --   (negate hi, then add lo)
ram(8)  <= x"a02e";
ram(9)  <= x"801a"; --   exit loop
ram(10) <= x"302e"; -- if *lo != *hi then
ram(11) <= x"0001";
ram(12) <= x"4032"; --   (store -(*lo) temporarily)
ram(13) <= x"302f";
ram(14) <= x"a032";
ram(15) <= x"7013";
ram(16) <= x"1000"; --   result = 0
ram(17) <= x"4032";
ram(18) <= x"0000"; --   quit
ram(19) <= x"1001"; -- lo = lo + 1
ram(20) <= x"a02e";
ram(21) <= x"402e";
ram(22) <= x"1fff"; -- hi = hi - 1
ram(23) <= x"a02f";
ram(24) <= x"402f";
ram(25) <= x"6006"; -- goto loop
ram(26) <= x"1001"; -- end: result = 1
ram(27) <= x"4032";
ram(28) <= x"0000"; -- quit

ram(46) <= x"0000"; -- lo
ram(47) <= x"0000"; -- hi
ram(48) <= x"0033"; -- start
ram(49) <= x"0005"; -- length
ram(50) <= x"0000"; -- result
ram(51) <= x"0061"; -- 'a'
ram(52) <= x"0062"; -- 'b'
ram(53) <= x"0063"; -- 'c'
ram(54) <= x"0062"; -- 'b'
ram(55) <= x"0061"; -- 'a'
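As a cross-check on the expected results, the same two-pointer algorithm can be written in Python against a memory laid out like the test input above. This is a reference sketch only, not a simulator for the simple processor. The helper names make_memory and check_palindrome and the 64 word memory size are assumptions made for illustration.

```python
START_PTR, LENGTH, RESULT = 0x30, 0x31, 0x32   # locations named in the problem


def make_memory(text, start=0x33, size=64):
    """Build a small memory image holding the string and its inputs."""
    mem = [0] * size
    mem[START_PTR] = start
    mem[LENGTH] = len(text)
    for i, ch in enumerate(text):
        mem[start + i] = ord(ch)
    return mem


def check_palindrome(mem):
    """Compare characters from both ends, moving lo up and hi down."""
    lo = mem[START_PTR]
    hi = mem[START_PTR] + mem[LENGTH] - 1
    while lo <= hi:                 # the program exits its loop when lo > hi
        if mem[lo] != mem[hi]:
            mem[RESULT] = 0
            return 0
        lo += 1
        hi -= 1
    mem[RESULT] = 1
    return 1


print(check_palindrome(make_memory("abcba")))   # 1
print(check_palindrome(make_memory("abcda")))   # 0
```

The first call uses the same string as the test input, so a simulation of the program should likewise leave 1 in location 32.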