ENGR 2031 Digital Design Laboratory
Lab 7 Background

What we will cover
- Overview of the Simple Computer (scomp) Architecture
- Register Flow Diagrams
- VHDL Implementation of scomp
- Lab 7

scomp Architecture

Figure 1, scomp Architecture for Labs 7 and 8

The scomp architecture for lab 7 is similar to, but not the same as, the one discussed in Chapter 9 of Rapid Prototyping of Digital Systems, SOPC Edition, by Hamblen, Hall, and Furman. The lab version has access to more memory and allows for external peripherals. The differences between the textbook version and the version discussed here and used in labs 7 and 8 are summarized below:
- The textbook allocates bits in the instruction word differently (8-bit opcode and 8-bit operand vs. 6-bit opcode and 10-bit operand).
- The textbook chooses the opcodes differently.
- The VHDL code for the scomp differs in many areas; for example, signal names are different, and computer memory is larger and implemented differently.
- The textbook does not have a specific I/O subsystem and I/O buses.
- The textbook does not consider subroutine calls and returns.

The simple computer (scomp) processor consists of the arithmetic logic unit (ALU), control unit (CU), and internal registers. The scomp processor has access to a bank of memory and to I/O peripherals, both of which are external to the processor itself. The scomp can access up to 1024 words of internal Cyclone memory, which can be used for either program or data storage.
- The accumulator (AC) is the only internal register for data storage local to scomp. The accumulator can be loaded with data from memory or I/O, or receive the result from the ALU.
- The program counter (PC) holds the address of the next instruction.
- The instruction register (IR) holds the most recently fetched instruction.
Instruction Format

The scomp has 16-bit instructions and a 16-bit data width (data and address widths vary among computer architectures). The instruction format is shown in Figure 2.

Figure 2, scomp Instruction Format

The 6-bit opcode specifies the operation (up to 64 different operations) and the 10-bit operand specifies the data to be operated on (the address of the data, immediate data, or not used). A 10-bit operand is necessary to hold the 10-bit memory address needed to access the 1024 words of scomp memory. Note that the textbook describes an 8-bit opcode and 8-bit operand.

ALU

The arithmetic logic unit (ALU) is a combinational circuit that performs arithmetic and logical operations. The data for the operation comes from the accumulator (AC) and from the contents of a specified memory location via the memory data register (MDR). The result of the ALU always goes to the accumulator (AC). The ALU's operation is controlled by the control unit.

Memory and I/O

All data comes from or goes to memory or I/O devices. The I/O registers, I/O buses, IO_ADDR, and IO_DATA will be covered in the next lecture. The memory (RAM) and memory buses are internal to the Cyclone chip. The memory is implemented as a single bank of random access memory (RAM) that can hold both instructions and data. The memory address bus is driven by either the program counter (PC) for an instruction fetch or the memory address register (MAR) for an operand (data) fetch.
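The choice between the PC and the MAR as the source of the memory address can be pictured as a two-input multiplexer. The following is only a minimal sketch of that idea: the entity name addr_mux, the select signal INSTR_FETCH, and the output name MEM_ADDR are hypothetical and are not taken from the lab's scomp files, which may drive the address differently.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical illustration only; these names do not come from scomp.vhd.
ENTITY addr_mux IS
    PORT(
        INSTR_FETCH : IN  STD_LOGIC;                     -- assumed control signal: '1' during an instruction fetch
        PC          : IN  STD_LOGIC_VECTOR(9 DOWNTO 0);  -- address of the next instruction
        MAR         : IN  STD_LOGIC_VECTOR(9 DOWNTO 0);  -- address of the operand (data)
        MEM_ADDR    : OUT STD_LOGIC_VECTOR(9 DOWNTO 0)   -- 10-bit memory address bus
    );
END addr_mux;

ARCHITECTURE a OF addr_mux IS
BEGIN
    -- The PC drives the address bus for an instruction fetch,
    -- the MAR drives it for an operand fetch.
    MEM_ADDR <= PC WHEN INSTR_FETCH = '1' ELSE MAR;
END a;
```

Both inputs are 10 bits wide because 10 address bits are enough to select any of the 1024 memory words.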
For a memory read operation:
- The address in the MAR or PC drives the address bus.
- Memory places the requested data on the data bus.
- The data must be latched into the memory data register (MDR) at the correct time by the control unit.

For a memory write operation:
- The address in the MAR drives the address bus.
- The data to be written comes from the accumulator (AC).
- The data is written to the location specified on the address bus.

Fetch, Decode, and Execute Cycle

There are three stages (after reset) in the operation of the simple computer (see Figure 3), and this cycle repeats forever. Each stage consists of one or more states. The fetch and decode stages are the same for all operations; the execute stage, however, has different states for each operation. The decode stage determines which execute states are used for the operation.
- During the fetch stage, the next instruction is obtained.
- During the decode stage, the instruction is decoded to determine which operations need to be done.
- During the execute stage, the operations specified by the instruction are performed.
- The reset takes the scomp to the fetch stage.
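The repeating three-stage cycle can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the lab's scomp: the entity cycle_demo, the STAGE output, and its 2-bit encoding are hypothetical, the reset is assumed to be active low, and the execute stage is collapsed into a single state even though real instructions may need more than one.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical illustration only; not part of the lab's scomp files.
ENTITY cycle_demo IS
    PORT(
        CLOCK  : IN  STD_LOGIC;
        RESETN : IN  STD_LOGIC;                     -- assumed active-low reset
        STAGE  : OUT STD_LOGIC_VECTOR(1 DOWNTO 0)   -- assumed encoding, for observation only
    );
END cycle_demo;

ARCHITECTURE a OF cycle_demo IS
    TYPE STAGE_TYPE IS (FETCH, DECODE, EXECUTE);
    SIGNAL S : STAGE_TYPE;
BEGIN
    PROCESS (CLOCK, RESETN)
    BEGIN
        IF RESETN = '0' THEN
            S <= FETCH;                      -- reset takes the machine to fetch
        ELSIF RISING_EDGE(CLOCK) THEN
            CASE S IS
                WHEN FETCH   => S <= DECODE;
                WHEN DECODE  => S <= EXECUTE;
                WHEN EXECUTE => S <= FETCH;  -- the cycle repeats forever
            END CASE;
        END IF;
    END PROCESS;

    -- Expose the current stage on the output port
    WITH S SELECT
        STAGE <= "00" WHEN FETCH,
                 "01" WHEN DECODE,
                 "10" WHEN EXECUTE;
END a;
```

In a simulation of this sketch, STAGE steps through "00", "01", "10" on successive rising clock edges once RESETN is released.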
Figure 3, Fetch, Decode, Execute Cycle

scomp Instructions

The following instructions (Table 1) for the scomp have already been implemented, will be implemented as part of labs 7 and 8, or can be implemented later as additional exercises. You could also modify the scomp to support instructions of your own design. The choice of opcodes is arbitrary. Mnemonics are often used in place of the opcodes for assembly language programming.

    Instruction Mnemonic   Operation                                  Opcode (Hex)
    NOP                    Do nothing (no operation)                  0x00
    LOAD Address           AC <= MEM(Address)                         0x01
    STORE Address          MEM(Address) <= AC                         0x02
    ADD Address            AC <= AC + MEM(Address)                    0x03
    SUB Address            AC <= AC - MEM(Address)                    0x04
    JUMP Address           PC <= Address                              0x05
    JNEG Address           If AC < 0, PC <= Address                   0x06
    JPOS Address           If AC > 0, PC <= Address                   0x07
    JZERO Address          If AC = 0, PC <= Address                   0x08
    AND Address            AC <= AC AND MEM(Address)                  0x09
    OR Address             AC <= AC OR MEM(Address)                   0x0A
    XOR Address            AC <= AC XOR MEM(Address)                  0x0B
    SHIFT Bits             If Bits > 0, AC <= AC shifted Bits left    0x0C
                           If Bits < 0, AC <= AC shifted Bits right
                           (Bits is in sign-magnitude format)
    ADDI Immediate         AC <= AC + Immediate (sign extended)       0x0D
    ILOAD Address          AC <= MEM(MEM(Address))                    0x0E
    ISTORE Address         MEM(MEM(Address)) <= AC                    0x0F
    CALL Address           Push PC onto stack, PC <= Address          0x10
    RETURN                 Pop PC off of stack                        0x11
    IN I/O Address         AC <= IO_IN                                0x12
    OUT I/O Address        IO_DATA <= AC                              0x13

Table 1, scomp Instructions
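To see how an opcode from Table 1 combines with a 10-bit operand to form a 16-bit instruction word, the following self-checking simulation sketch concatenates the two fields and compares the result with the expected hexadecimal value. The entity name encode_check, the constant names, and the operand addresses (0x010 to 0x012 and 0x003) are chosen here purely for illustration; only the opcodes come from Table 1.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical self-checking testbench (no ports); intended for simulation only.
ENTITY encode_check IS
END encode_check;

ARCHITECTURE sim OF encode_check IS
    -- 6-bit opcodes from Table 1
    CONSTANT OP_LOAD  : STD_LOGIC_VECTOR(5 DOWNTO 0) := "000001";  -- 0x01
    CONSTANT OP_STORE : STD_LOGIC_VECTOR(5 DOWNTO 0) := "000010";  -- 0x02
    CONSTANT OP_ADD   : STD_LOGIC_VECTOR(5 DOWNTO 0) := "000011";  -- 0x03
    CONSTANT OP_JUMP  : STD_LOGIC_VECTOR(5 DOWNTO 0) := "000101";  -- 0x05

    -- 10-bit operands (example addresses, assumed for this sketch)
    CONSTANT ADDR_010 : STD_LOGIC_VECTOR(9 DOWNTO 0) := "0000010000";  -- 0x010
    CONSTANT ADDR_011 : STD_LOGIC_VECTOR(9 DOWNTO 0) := "0000010001";  -- 0x011
    CONSTANT ADDR_012 : STD_LOGIC_VECTOR(9 DOWNTO 0) := "0000010010";  -- 0x012
    CONSTANT ADDR_003 : STD_LOGIC_VECTOR(9 DOWNTO 0) := "0000000011";  -- 0x003

    -- Instruction word = 6-bit opcode followed by 10-bit operand
    CONSTANT I_LOAD  : STD_LOGIC_VECTOR(15 DOWNTO 0) := OP_LOAD  & ADDR_011;
    CONSTANT I_ADD   : STD_LOGIC_VECTOR(15 DOWNTO 0) := OP_ADD   & ADDR_012;
    CONSTANT I_STORE : STD_LOGIC_VECTOR(15 DOWNTO 0) := OP_STORE & ADDR_010;
    CONSTANT I_JUMP  : STD_LOGIC_VECTOR(15 DOWNTO 0) := OP_JUMP  & ADDR_003;
BEGIN
    PROCESS
    BEGIN
        ASSERT I_LOAD  = x"0411" REPORT "LOAD encoding mismatch"  SEVERITY ERROR;
        ASSERT I_ADD   = x"0C12" REPORT "ADD encoding mismatch"   SEVERITY ERROR;
        ASSERT I_STORE = x"0810" REPORT "STORE encoding mismatch" SEVERITY ERROR;
        ASSERT I_JUMP  = x"1403" REPORT "JUMP encoding mismatch"  SEVERITY ERROR;
        WAIT;  -- stop after the checks
    END PROCESS;
END sim;
```

Because the opcode is 6 bits wide rather than 8, the hex digits of the opcode and operand do not line up with the hex digits of the instruction word: LOAD (0x01) with address 0x011 becomes 0x0411, not 0x0111.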
Sample Assembly Language Program, A = B + C

For now, ignore where the data is located.

    Start:  LOAD  B
            ADD   C
            STORE A
    Here:   JUMP  Here

Figure 4a, scomp Assembly Language Code (A = B + C)

    000 : 0411; -- Start: LOAD B
    001 : 0C12; --        ADD C
    002 : 0810; --        STORE A
    003 : 1403; -- Here:  JUMP Here
    010 : 0000; -- A
    011 : 0004; -- B
    012 : 0003; -- C

Figure 4b, scomp Machine Language Code (A = B + C)

The program first brings B into the accumulator, then adds C to the accumulator contents, and stores the result in A. The typical code listing of a machine language program shows the memory address (first column of Figure 4b), the memory contents (the column after the : in Figure 4b), and comments (the text after the --). The comments shown in Figure 4b are the assembly language mnemonics.

Instructions should start at address 0 in memory, as that is the value the PC is initialized to. Data can be placed anywhere in memory that does not contain instructions. In Figure 4b the data for A, B, and C are placed in memory locations 0x10, 0x11, and 0x12 respectively.

In line 000 of Figure 4b, the 6-bit opcode for the LOAD instruction is 0x01 (000001) and B refers to the 10-bit memory address 0x011 (0000010001). Thus the LOAD B instruction is 000001 0000010001 or 0x0411. Labels like B will be useful when we do assembly language programming. Note that in the hex shorthand, leading 0s are added to extend the bit string to a multiple of 4 bits. When generating the machine code by hand, you must use the exact number of bits for the opcode (6) and the operand (10) to generate the instruction.

Register Flow for A = B + C

The reset operation (Figure 5a) initializes the computer:

Figure 5a, scomp after Reset

Figure 5b, scomp in Fetch
During reset (note that the memory starts to show the contents of the specified address):

    MW <= 0
    PC <= 0x000
    AC <= 0x0000
    State <= fetch

Figure 5b shows the fetch stage. During the fetch stage:

    MW <= 0
    IR <= MDR
    MAR <= MDR(9 downto 0)
    PC <= PC + 1
    State <= decode

Since the IR is a clocked register, the contents of the data bus from the memory read are not latched in until the next state (decode). The MAR and PC are also registers, and their new values are not latched in until the start of the next state.

Figure 5c shows the decode stage. During the decode stage:

    CASE IR(15 downto 10) IS
        WHEN "000000" =>        -- No Op
        WHEN "000001" =>        -- LOAD
            STATE <= EX_LOAD;
        ...

Notice that the IR, MAR, and PC of Figure 5b now have the values assigned in the previous state. The control unit knows which execute state to go to based on the opcode field of the IR. During the decode state, memory has time to fetch the contents of the address loaded into the MAR during the previous state. This data is needed for instructions such as LOAD and ADD. If the instruction does not need the data, the data is simply not used.

Figure 5c, scomp in Decode

Figure 5d, scomp in Execute (LOAD)

Figure 5d shows the execute stage for a LOAD operation. The LOAD operation requires only one state for the execute stage. During the execute stage:

    WHEN EX_LOAD =>
        AC <= MDR;          -- Latch data from MDR to AC
        STATE <= FETCH;     -- Return to fetch for the next instruction
The execute stage varies depending upon the instruction. Some instructions, such as STORE, require more than one state for the execute stage. After being reset, the computer goes through the fetch, decode, execute cycle for as long as the clock is active.

VHDL Implementation of the Simple Computer

The VHDL framework for scomp is (outline only, details omitted):

    LIBRARY
    USE
    ENTITY SCOMP IS
    ARCHITECTURE a OF SCOMP IS
        declarations
    BEGIN
        other modules (like memory)
        assignments
        PROCESS (CLOCK, RESETN)
        BEGIN
            CASE STATE IS
                WHEN RESET_PC =>
                WHEN OTHERS =>
            END CASE;
        END PROCESS;
    END a;

The VHDL entity for scomp is:

    ENTITY SCOMP IS
        PORT(
            CLOCK, RESETN : IN    STD_LOGIC;
            PC_OUT        : OUT   STD_LOGIC_VECTOR( 9 DOWNTO 0);
            AC_OUT        : OUT   STD_LOGIC_VECTOR(15 DOWNTO 0);
            MDR_OUT       : OUT   STD_LOGIC_VECTOR(15 DOWNTO 0);
            MAR_OUT       : OUT   STD_LOGIC_VECTOR( 9 DOWNTO 0);
            IO_WRITE      : OUT   STD_LOGIC;
            IO_CYCLE      : OUT   STD_LOGIC;
            IO_ADDR       : OUT   STD_LOGIC_VECTOR( 7 DOWNTO 0);
            IO_DATA       : INOUT STD_LOGIC_VECTOR(15 DOWNTO 0)
        );
    END SCOMP;

The entity symbol block is shown in Figure 6.

Figure 6, scomp Entity
Assuming a program is loaded into scomp memory, the scomp needs only two inputs to operate: RESETN to initialize and CLOCK to run.

The VHDL architecture (beginning only) for scomp is:

    ARCHITECTURE a OF SCOMP IS
        TYPE STATE_TYPE IS (RESET_PC, FETCH, DECODE, EX_LOAD, EX_STORE,
                            EX_STORE2, EX_ADD, EX_JUMP, EX_AND);
        SIGNAL STATE       : STATE_TYPE;
        SIGNAL AC, IR, MDR : STD_LOGIC_VECTOR(15 DOWNTO 0);
        SIGNAL PC, MAR     : STD_LOGIC_VECTOR(9 DOWNTO 0);
        SIGNAL MW          : STD_LOGIC;

The VHDL for implementing the RESET and FETCH states is:

    CASE STATE IS
        WHEN RESET_PC =>
            MW <= '0';             -- Clear memory write flag
            PC <= "0000000000";    -- Reset PC
            AC <= x"0000";         -- Clear AC register
            STATE <= FETCH;        -- Reset takes the scomp to the fetch stage
        WHEN FETCH =>
            MW <= '0';             -- Clear memory write flag
            IR <= MDR;             -- Latch instruction into IR
            PC <= PC + 1;          -- Increment PC to next address
            STATE <= DECODE;

The VHDL for implementing the DECODE state is:

        WHEN DECODE =>
            CASE IR(15 downto 10) IS
                WHEN "000000" =>   -- No Operation (NOP)
                WHEN "000001" =>   -- LOAD
                    STATE <= EX_LOAD;
                WHEN "000010" =>   -- STORE
                    STATE <= EX_STORE;
                ...

The VHDL for implementing the ADD execute state is:

        WHEN EX_ADD =>
            AC <= AC + MDR;        -- Add MDR (memory contents) to AC
            STATE <= FETCH;        -- Return to fetch for the next instruction

The VHDL for implementing the STORE execute states is (see Figure 7 for when MW changes):

        WHEN EX_STORE =>
            MW <= '1';             -- Raise MW to write AC to MEM
            STATE <= EX_STORE2;
        WHEN EX_STORE2 =>
            MW <= '0';             -- Drop MW to end write cycle
            STATE <= FETCH;        -- Return to fetch for the next instruction

Figure 7, MW Timing for STORE
The memory bank is implemented using a Quartus parametric module, altsyncram. Using the module saves you from having to implement memory from scratch. The memory is 1024 words, each 16 bits wide. The VHDL to implement the bank of memory using altsyncram is:

    -- Use altsyncram component for unified program and data memory
    MEMORY : altsyncram
        GENERIC MAP (
            intended_device_family => "Cyclone",
            width_a                => 16,
            widthad_a              => 10,
            numwords_a             => 1024,
            operation_mode         => "SINGLE_PORT",
            outdata_reg_a          => "UNREGISTERED",
            indata_aclr_a          => "NONE",
            wrcontrol_aclr_a       => "NONE",
            address_aclr_a         => "NONE",
            outdata_aclr_a         => "NONE",
            init_file              => "example.mif",
            lpm_hint               => "ENABLE_RUNTIME_MOD=NO",
            lpm_type               => "altsyncram"
        )
        PORT MAP (
            wren_a    => MW,
            clock0    => NOT(CLOCK),
            address_a => MAR,
            data_a    => AC,
            q_a       => MDR
        );

The specifics for the scomp memory are shown in Figure 8a, and the memory block symbol created using the altsyncram module is shown in Figure 8b.

Figure 8a, Memory Details

Figure 8b, Memory Block
Lab 7

- Read Chapter 9 of Rapid Prototyping of Digital Systems. Note that there are some differences between the architecture described in the textbook and the one in the lab manual.
- You will not use the scomp files from the textbook CD; the scomp files should be obtained from the handouts page of the course website.
- Download the following from the handouts page of the course website: the scomp files, the simple computer assembler (SCASM), and Crimson editor. The example.asm file is part of the scomp files.
- Install Crimson editor and SCASM on your computer. Documentation on how to correctly install Crimson editor and SCASM is provided on the handouts page of the course website.

Additional Pointers for Lab 7

Lab step 1, Add instructions to scomp

The scomp for lab 7 has 6-bit opcodes and 10-bit operands for a 16-bit instruction. Bits 15..10 are the opcode and bits 9..0 are the operand. A 6-bit opcode means the scomp supports up to 64 instructions, and a 10-bit operand means the scomp can address up to 2^10 (1024) words of memory or use a 10-bit immediate data value.

For example, consider the following 16-bit instructions (four hexadecimal digits):

    0x0411 = 000001 0000010001
             opcode operand
    Opcode 1 is LOAD; the operand is the address 0x011.

    0x0C12 = 000011 0000010010
             opcode operand
    Opcode 3 is ADD; the operand is the address 0x012.

When hand assembling scomp code, make sure to use a 6-bit opcode, not an 8-bit opcode.

Immediate arithmetic and logic instructions require the operand to be sign extended to 16 bits. Signed integers are represented in two's complement, and sign extension can be done with concatenation.

Jump instructions require conditional logic: IF ... ELSE. To check whether a number is positive, negative, or zero, the >, <, and = operators will not work as you might expect, because the data in the accumulator is a 16-bit standard logic vector, not an arithmetic value. So it is not as simple as just comparing the accumulator to zero.
Instead, think of how one determines whether a number is positive or negative when it is represented in signed two's complement. Comparisons for the immediate add instructions will need to be handled similarly to implement sign extension.

Lab step 2, Simulation of test_code.mif

Notice that in the altsyncram LPM in the scomp VHDL there is the line

    init_file => "example.mif",

This mif file (memory initialization file) specifies what memory is initialized to. This is how your program and data get into memory. Any time you wish to use a different program, you must change the memory initialization file and recompile the project.

The scomp takes at least three clock cycles to perform an instruction: one clock cycle to fetch, one clock cycle to decode, and one or two clock cycles to execute. Thus simulating a program consisting of ten lines (with no loops) requires at least 30 clock cycles. Programs containing loops will require even longer simulations.
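Returning to the Lab step 1 pointers on sign extension and sign tests, the sketch below shows one possible way to express them by looking at individual bits rather than comparing arithmetic values. It is an illustration under stated assumptions, not the required lab solution: the entity sign_demo and all of its port names are hypothetical, and the zero test compares against the bit pattern x"0000" rather than the integer 0.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical illustration only; not part of the lab's scomp files.
ENTITY sign_demo IS
    PORT(
        IR      : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);  -- instruction word
        AC      : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);  -- accumulator value
        IMM16   : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);  -- sign-extended operand
        AC_NEG  : OUT STD_LOGIC;                      -- '1' when AC < 0
        AC_ZERO : OUT STD_LOGIC;                      -- '1' when AC = 0
        AC_POS  : OUT STD_LOGIC                       -- '1' when AC > 0
    );
END sign_demo;

ARCHITECTURE a OF sign_demo IS
    SIGNAL NEG, ZERO : STD_LOGIC;
BEGIN
    -- Sign extension by concatenation: the operand's sign bit (bit 9)
    -- is copied into the six upper bits of the 16-bit result.
    IMM16 <= IR(9) & IR(9) & IR(9) & IR(9) & IR(9) & IR(9) & IR(9 DOWNTO 0);

    -- In two's complement, the most significant bit is the sign bit.
    NEG  <= AC(15);

    -- Zero is the all-zeros bit pattern.
    ZERO <= '1' WHEN AC = x"0000" ELSE '0';

    AC_NEG  <= NEG;
    AC_ZERO <= ZERO;
    AC_POS  <= (NOT NEG) AND (NOT ZERO);  -- positive: neither negative nor zero
END a;
```

For example, a 10-bit operand of 1111111111 (-1) extends to x"FFFF", while 0000000101 (5) extends to x"0005". Inside a clocked process, the same bit tests could be written as IF conditions on AC(15) and on the all-zeros pattern.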
To get internal signals such as PC and IR in the simulation, you may need to change what the node finder looks for. A pins-only filter gives only the inputs and outputs defined in the PORT statement of the entity.

Lab step 8, Oscilloscope capture of scomp clock on DE2

The bandwidth of the oscilloscopes in the digital design lab is 100 MHz, and we will be capturing clock signals in the 50-70 MHz range. Make sure to restore the default settings on the oscilloscope (see the scope checkout procedure from Lab 2, lab step 13, page 32 of the DDLM). After the scope checkout is performed, make sure that the bandwidth limit (BW Limit) on the channel you are using to capture the clock signal is off, so that the full 100 MHz bandwidth of the scope is used. Also make sure that the probe you are using is set to 10x and that the probe attenuation for the channel you are using is set to 10x.

Pre Lab 7
- Pre lab step 3, modified example.asm and example.mif for computing A = (B + C) + D
- Pre lab step 5, screenshot of simple computer simulation results for example.mif of Pre lab step 3
- Pre lab step 6, total logic elements, total pins, and total memory bits from fitter summary results

Lab 7 Results
- Pre lab step 3, modified example.asm and example.mif
- Pre lab step 5, simulation results for example.mif of Pre lab step 3
- Lab step 2, simulation results for test_code.mif
- Lab steps 4-7, simple computer schematic for DE2 board
- Lab step 8, oscilloscope capture of clock waveform
- scomp.vhd VHDL code (can be as an appendix at the end of the results)