Sparse Notes on a MIPS Processor's Architecture and its Assembly Language

February 6, 2004

1 Introduction

In these notes we do not go into the details of the architecture of a MIPS processor; we deal only with very simple programs and techniques for using the MIPS Assembly language. We have three basic hardware elements: a processor, a bank of registers, and the memory. The Assembly language features a (reduced) set of operations, whose format is fixed. In order to help the programmer, the Assembly is equipped with: (i) a set of pseudo-operations, each of which is not atomic but is expanded into a small set of operations; (ii) a symbolic name for every operation and for every register.

One can imagine the process of executing an Assembly program in the following abstract way:

1. take the next operation from the memory;
2. execute it;
3. if it is not the end of the program, go back to point 1.

Clearly, this is just an abstraction of the real fetch-and-execute model for microcomputers, but it is sufficient for our purposes.

A basic concept in a MIPS processor that one has to learn is: operands must reside in registers. A register is a storage location of fixed size (32 bits, in our architecture) that can be directly addressed through its name. Symbolic names for registers are of the form $s0, $t1, and so on. Remember that instructions in MIPS have a fixed format:

    operation Dest, Source1, Source2

that is, the symbolic name of the instruction comes first, then the destination register, and then the source register(s) or constants.

Remark: studying an Assembly language is independent of studying a simulator (such as SPIM) that is able to execute Assembly programs. In order to execute programs by means of a simulator, one has to carry out some technical steps that depend on the particular simulator and operating system.

In this course we will be talking about programs; a program for us will be a sequence of operations:

    program     ::= instruction | instruction program
    instruction ::= label : | operation

The above abstract grammar simply says that a program is an arbitrary sequence of operations and labels. An operation is simply a basic Assembly unit, whose format is fixed: later we will see a subset of such units. Labels are mnemonic strings that allow the programmer to refer explicitly to a point of the program: their usage will become clear later.

1.1 Memory

The memory can be viewed as an array of bytes. Each element of this array has an address (of 32 bits) that uniquely identifies it. From an abstract point of view, one may use the memory as preferred. In practice, a computer must execute many programs at the same time (operating system, drivers, exception handlers, ...). For this reason, there are conventions in the memory usage (and in the register usage) that we have to respect in order to make our programs useful. Moreover, such conventions are effectively used by compilers during the compilation of high-level programs.

The first convention that we have to learn is that the area between (0)_16 and (3FFFFF)_16 is reserved. Programs are placed in memory starting from the address (400000)_16. Moreover, data are placed starting from (10000000)_16. Data must be separated into two conceptually different sets: the first one is conventionally called the set of static data, while the second one is the set of dynamic data.
Intuitively, the static portion of data is the one holding the data whose size is known to the program (constants, arrays, etc.).

A word of memory is a 32-bit quantity. Values in a 32-bit machine are usually placed in words (the size of a register is also 1 word). Nevertheless, the addressing mode allows one to address a single byte (8 bits) directly. If a register, for example $t0, contains an address in the memory, then 0($t0) is the content of the memory at the address contained in $t0. The 0 in front of the parenthesis is an offset in bytes from that address: 0 indicates that one wants to start from the first byte of the addressed word. So, suppose that starting from the address contained in $t0 one has a list of (word) values, let's say an array; the elements of such a list will be addressed by means of 0($t0), 4($t0), 8($t0), 12($t0), ...
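
As a minimal sketch of the offset addressing just described, the program below reads four consecutive words of an array. The label values, the directives .data, .text, .globl and .word, the main entry point and the final exit system call are assumptions based on the usual SPIM conventions; they are not defined in these notes. The operations la and lw are introduced in Section 3.

```mips
        .data
values: .word 10, 20, 30, 40     # hypothetical array of four words

        .text
        .globl main
main:   la   $t0, values         # $t0 = address of the first element
        lw   $t1, 0($t0)         # $t1 = first word  (bytes 0..3)
        lw   $t2, 4($t0)         # $t2 = second word (bytes 4..7)
        lw   $t3, 8($t0)         # $t3 = third word  (bytes 8..11)
        lw   $t4, 12($t0)        # $t4 = fourth word (bytes 12..15)
        li   $v0, 10             # SPIM system call 10: exit
        syscall
```

The offsets advance by 4 because each element is a word of 4 bytes, while addresses count single bytes.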

2 Arithmetic Operations in MIPS

In this section, we confine ourselves to integer (signed and unsigned) values. Suppose that $t0 and $t1 contain two integer values. The operation

    add $t2, $t0, $t1      # addition

sums the contents of $t0 and $t1 and puts the result into $t2. Similarly,

    addi $t2, $t0, Imm     # addition with immediate

sums the content of $t0 and the value Imm (immediate) and puts the result into $t2. The subtraction (which is a primitive operation) has a similar syntax:

    sub $t2, $t0, $t1      # subtraction

Exercise 1 Suppose that $t2, $t3, and $t4 contain three integers; write an Assembly program putting their sum into $t0, and the value of $t2 - $t3 - $t4 into $t1.

Solution. Clearly, $t2 - $t3 - $t4 = $t2 - ($t3 + $t4). So, we first calculate $t3 + $t4 and put the result into a temporary register $t5, and then we compute the desired results:

    start: add $t5, $t3, $t4
           add $t0, $t2, $t5
           sub $t1, $t2, $t5

As an aside, notice that we can (and sometimes later we will need to) add labels at certain points of our programs, as we have done above with start.

Remark. How does a real MIPS processor interpret instructions? The processor sees instructions as 32-bit strings with a fixed format. This is (from left to right): 6 bits for the opcode (operation code), 5 bits for the first source register operand, 5 bits for the second source register operand, 5 bits for the destination register operand, 5 bits for the shift amount (only used in shift operations), and 6 bits for the function (the variant of the operation we want to use). For example,

    add $t0, $s1, $s2

is the mnemonic code for

    000000 10001 10010 01000 00000 100000

that is:

    (000000)_2 = (0)_10    the opcode of the arithmetic instructions
    (10001)_2  = (17)_10   the $s1 register (first source)
    (10010)_2  = (18)_10   the $s2 register (second source)
    (01000)_2  = (8)_10    the $t0 register (destination)
    (00000)_2  = (0)_10    the shift amount (not used in this case)
    (100000)_2 = (32)_10   the function: addition between registers

Notice that the destination register comes first in the Assembly syntax but third among the register fields of the binary format.

Multiplication of integers has a similar syntax:

    mul $t2, $t0, $t1      # multiplication

Division of integers is a bit more complicated:

    div  $t0, $t1          # division
    divu $t0, $t1          # unsigned division

are two operations for division ($t0/$t1), the first one for signed integers, and the second one for unsigned integers. The result of an integer division is given by a quotient and a remainder, which are placed respectively into the (reserved) registers $lo and $hi. Such registers must be used only with the two operations mflo and mfhi (move from $lo and move from $hi), which move the results into general purpose registers.

The unsigned division algorithm simply treats the values as absolute (no sign is considered). The signed division algorithm negates the quotient if the signs of the operands are opposite, and makes the sign of a nonzero remainder match that of the dividend. Example: +7 / -2 gives quotient -3 and remainder +1, and -7 / -2 gives quotient +3 and remainder -1. The reason for such a choice is that we have to produce an algorithm that works in every case, and the (absolute) values of quotient and remainder must not depend on the signs.

Exercise 2 Suppose that $t0 and $t1 contain two positive integers, and $t3 contains the value 2; write an Assembly program that calculates their average and places the result into $t2.

Solution.

    add  $t2, $t1, $t0
    divu $t2, $t3
    mflo $t2
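
The sign rule for signed division can be checked with a short sketch that divides +7 by -2 and then -7 by -2. The operation li (load immediate) is introduced in Section 3; the registers and values chosen here, the .text and .globl directives, the main label and the exit system call are illustrative assumptions following the usual SPIM conventions.

```mips
        .text
        .globl main
main:   li   $t0, 7              # dividend +7
        li   $t1, -2             # divisor  -2
        div  $t0, $t1            # $lo = quotient, $hi = remainder
        mflo $t2                 # $t2 = -3 (opposite signs: quotient negated)
        mfhi $t3                 # $t3 = +1 (remainder has the sign of the dividend)

        li   $t0, -7             # dividend -7
        div  $t0, $t1            # divisor is still -2
        mflo $t4                 # $t4 = +3 (equal signs: quotient positive)
        mfhi $t5                 # $t5 = -1 (remainder has the sign of the dividend)

        li   $v0, 10             # SPIM system call 10: exit
        syscall
```

In both cases dividend = divisor * quotient + remainder holds: 7 = (-2)(-3) + 1 and -7 = (-2)(3) + (-1).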

3 Memory Load and Store Operations in MIPS

Even for the simplest examples and programs, it is easy to see that we need to use the (data fragment of the) memory. The operations that access it are called load and store. The basic operation is:

    lw $t0, Add

which loads the 32-bit quantity at the memory address Add into the register $t0. How is Add built up? Add represents an address in the memory, but we should remember that operands reside in registers; so, typically, Add is built up from a register (containing the base address) and a quantity (called offset).

Example 1 Consider the instruction

    lw $t0, 0($t1)         # load word

It copies the (word) value written in the 4 bytes starting at the address contained in $t1; this address must be a multiple of 4. So, if $t1 contains the quantity (10000008)_16, and memory[(10000008)_16] = 0, memory[(10000009)_16] = 0, memory[(1000000A)_16] = 0, memory[(1000000B)_16] = 3, then, after the execution, the value of $t0 is 3 (assuming a big-endian byte order, in which the most significant byte comes first).

The converse operation is:

    sw $t0, Add            # store word

by which the content of $t0 is stored in the memory at the address Add. Loading constants can be performed by:

    li $t0, Imm            # load immediate

Loading addresses (computed addresses) can be performed by:

    la $t0, Add            # load address

Exercise 3 Suppose that $t0 and $t1 contain two positive integers; write an Assembly program that computes their average and places the result into $t2.

Solution.

    add  $t2, $t1, $t0
    li   $t3, 2
    divu $t2, $t3
    mflo $t2

A note on the pseudo-instruction li: the SPIM simulator implements, for example, li $t0, 3 by ori $t0, $zero, 3, that is, by the bitwise logical OR with an immediate.

Exercise 4 Suppose that $t0 contains the starting address of an array of 3 words. Write an Assembly program that returns in $t1 the sum of all elements of the array.

Solution:

    lw  $t2, 0($t0)        # loads the first component of the array into $t2
    lw  $t3, 4($t0)
    add $t1, $t2, $t3      # sums the first two components of the array
    lw  $t2, 8($t0)
    add $t1, $t1, $t2      # adds the third component

Clearly there exists an alternative solution for the same problem; we may increment the base address instead of the offset:

Solution (bis):

    lw   $t2, 0($t0)       # loads the first component of the array into $t2
    addi $t0, $t0, 4       # increments the base address
    lw   $t3, 0($t0)
    add  $t1, $t2, $t3     # sums the first two components of the array
    addi $t0, $t0, 4       # increments the base address
    lw   $t2, 0($t0)
    add  $t1, $t1, $t2     # adds the third component

Notice that, unlike the first solution, this one modifies $t0, which no longer contains the starting address of the array at the end.

Exercise 5 Suppose that $t0 contains the starting address of an array of 3 words. Write an Assembly program that returns in $t1 the (integer part of the) average of all elements of the array.

Solution:

    lw   $t2, 0($t0)
    lw   $t3, 4($t0)
    add  $t1, $t2, $t3
    lw   $t2, 8($t0)
    add  $t1, $t1, $t2
    li   $t4, 3            # loads the constant 3
    div  $t1, $t4          # computes the average
    mflo $t1               # returns the result
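
The sw operation is described above but not used in the exercises. The following minimal sketch wraps the solution of Exercise 5 into a complete program and stores the result back into memory. The labels array and avg, the sample values, the directives .data, .text, .globl and .word, the main label and the exit system call are hypothetical and follow the usual SPIM conventions; they are not part of the exercise.

```mips
        .data
array:  .word 4, 8, 15           # hypothetical input array of 3 words
avg:    .word 0                  # hypothetical cell that receives the result

        .text
        .globl main
main:   la   $t0, array          # $t0 = starting address of the array
        lw   $t2, 0($t0)
        lw   $t3, 4($t0)
        add  $t1, $t2, $t3       # sum of the first two components (12)
        lw   $t2, 8($t0)
        add  $t1, $t1, $t2       # sum of all three components (27)
        li   $t4, 3
        div  $t1, $t4            # $lo = 27 / 3
        mflo $t1                 # $t1 = 9
        la   $t5, avg            # $t5 = address of the result cell
        sw   $t1, 0($t5)         # stores 9 into the word labelled avg
        li   $v0, 10             # SPIM system call 10: exit
        syscall
```

Here la plays the role that the phrase "$t0 contains the starting address" plays in the text of the exercises: it places the address of a labelled datum into a register, so that lw and sw can use it as a base.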

Finally, we should notice that in MIPS it is possible to move data from one register to another. Operations of this type are those that allow one to move the results of an integer division from the (reserved) registers $lo and $hi to general purpose registers. Another useful operation is:

    move $t0, $t1          # move

which copies the content of $t1 into $t0.

Exercise 6 How is the move pseudo-instruction expanded into a set of instructions?
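
One possible answer to Exercise 6, offered as a sketch and not as the expansion that every assembler is guaranteed to produce: since the register $zero always reads as 0, adding it to the source register copies the value. The operation addu (addition that never raises an overflow exception) is not otherwise covered in these notes; it is the form that SPIM typically shows for move.

```mips
        move $t0, $t1            # pseudo-instruction: $t0 = $t1
        addu $t0, $zero, $t1     # one real instruction with the same effect: $t0 = 0 + $t1
        add  $t0, $zero, $t1     # also equivalent here, since adding 0 cannot overflow
```

Each of the three lines leaves in $t0 a copy of $t1 and leaves $t1 unchanged, so a single real instruction suffices.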