CoE - ECE 0142 Computer Organization. Instructions: Language of the Computer

1 CoE - ECE 0142 Computer Organization. Instructions: Language of the Computer

2 The Stored Program Concept
The stored program concept says that the program is stored together with its data in the computer's memory. The computer can therefore manipulate a program as data: for example, load it from disk, move it in memory, and store it back on disk. It is the basic operating principle of every computer, so common that it is taken for granted. Without it, every instruction would have to be initiated manually.

3 The Fetch-Execute Cycle
[Figure: the fetch-execute cycle]

4 Machine, Processor, and Memory State
- The Machine State: contents of all registers in the system, accessible to the programmer or not
- The Processor State: registers internal to the CPU
- The Memory State: contents of the memory system
- "State" is used in the formal finite-state-machine sense
- Maintaining or restoring the machine and processor state is important to many operations, especially procedure calls and interrupts

5 Instruction set architecture (ISA)
[Figure: layer diagram with software on top, the ISA in the middle, and hardware at the bottom]

6 MIPS
- In this class, we'll use the MIPS instruction set architecture (ISA) to illustrate concepts in assembly language and machine organization
  - Of course, the concepts are not MIPS-specific
  - MIPS is just convenient because it is real, yet simple (unlike x86)
- The MIPS ISA is still used in many places today, primarily in embedded systems, such as:
  - Various routers from Cisco
  - Game machines like the Nintendo 64 and Sony Playstation 2
- You must become fluent in MIPS assembly: translate from C to MIPS and from MIPS to C

7 MIPS: register-to-register, three address
- MIPS is a register-to-register, or load/store, architecture.
  - The destination and sources must all be registers.
  - Special instructions, which we'll see later, are needed to access main memory.
- MIPS uses three-address instructions for data manipulation.
  - Each ALU instruction contains a destination and two sources.
  - For example, an addition instruction (a = b + c) has the form:
    add a, b, c      # operation: add; destination: a; sources: b, c

8 MIPS register names
- MIPS register names begin with a $. There are two naming conventions:
  - By number: $0 $1 $2 ... $31
  - By (mostly) two-character names, such as: $a0-$a3 $s0-$s7 $t0-$t9 $sp $ra
- Not all of the registers are equivalent:
  - E.g., register $0 or $zero always contains the value 0 (go ahead, try to change it)
- Other registers have special uses, by convention:
  - E.g., register $sp is used to hold the stack pointer
- You have to be a little careful in picking registers for your programs.

9 Policy of Use Conventions
    Name      Register number  Usage
    $zero     0                the constant value 0
    $at       1                assembler temporary
    $v0-$v1   2-3              values for results and expression evaluation
    $a0-$a3   4-7              arguments
    $t0-$t7   8-15             temporaries
    $s0-$s7   16-23            saved temporaries
    $t8-$t9   24-25            more temporaries
    $k0-$k1   26-27            reserved for OS kernel
    $gp       28               global pointer
    $sp       29               stack pointer
    $fp       30               frame pointer
    $ra       31               return address

10 Basic arithmetic and logic operations
- The basic integer arithmetic operations include the following: add sub mul div
- And here are a few logical operations: and or xor
- Remember that these all require three register operands; for example:
    add $t0, $t1, $t2    # $t0 = $t1 + $t2
    xor $s1, $s1, $a0    # $s1 = $s1 xor $a0

11 Immediate operands
- The ALU instructions we've seen so far expect register operands. How do you get data into registers in the first place?
  - Some MIPS instructions allow you to specify a signed constant, or immediate value, for the second source instead of a register. For example, here is the immediate add instruction, addi:
    addi $t0, $t1, 4     # $t0 = $t1 + 4
  - Immediate operands can be used in conjunction with the $zero register to write constants into registers:
    addi $t0, $0, 4      # $t0 = 4
  - Data can also be loaded into memory along with the executable file. You can then use load instructions to bring it into registers:
    lw $t0, 8($t1)       # $t0 = mem[8 + $t1]
- MIPS is considered a load/store architecture, because arithmetic operands cannot come from arbitrary memory locations. They must be either registers or constants embedded in the instruction.

12 We need more space: memory
- Registers are fast and convenient, but we have only 32 of them, and each one is just 32 bits wide.
  - That's not enough to hold data structures like large arrays.
  - We also can't access data elements that are wider than 32 bits.
- We need to add some main memory to the system!
  - RAM is cheaper and denser than registers, so we can add lots of it.
  - But memory is also significantly slower, so registers should be used whenever possible.
- In the past, using registers wisely was the programmer's job.
  - For example, C has a keyword register that marks commonly used variables which should be kept in the register file if possible.
  - However, modern compilers do a pretty good job of using registers intelligently and minimizing RAM accesses.

13 Memory review
Memory sizes are specified much like register files; here is a 2^k x n-bit RAM with a k-bit ADRS input, an n-bit DATA input, CS and WR control inputs, and an n-bit OUT output.
    CS  WR  Operation
    0   x   None
    1   0   Read selected address
    1   1   Write selected address
- A chip select input CS enables or disables the RAM.
- ADRS specifies the memory location to access.
- WR selects between reading from and writing to the memory.
  - To read from memory, WR should be set to 0. OUT will be the n-bit value stored at ADRS.
  - To write to memory, we set WR = 1. DATA is the n-bit value to store in memory.

14 MIPS memory
[Figure: memory block with ADRS, DATA, CS, and WR inputs and an 8-bit OUT]
- MIPS memory is byte-addressable, which means that each memory address references an 8-bit quantity.
- The MIPS architecture can support up to 32 address lines.
  - This results in a 2^32 x 8 RAM, which would be 4 GB of memory.
  - Not all actual MIPS machines will have this much!

15 Bytes and words: Word = 4 Bytes
- Remember to be careful with memory addresses when accessing words.
- For instance, assume an array of words begins at address 2000.
  - The first array element is at address 2000.
  - The second word is at address 2004, not 2001.
- For example, if $a0 contains 2000, then
    lw $t0, 0($a0)
  accesses the first word of the array, but
    lw $t0, 8($a0)
  would access the third word of the array, at address 2008.
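The address arithmetic for word arrays can be sketched in C. This is only an illustration: the function name word_address is hypothetical, and the base address 2000 used in the usage note is an example value.

```c
#include <stdint.h>

/* Byte address of element `index` in an array of 32-bit words
   that starts at byte address `base`. Each word occupies 4 bytes,
   so consecutive elements are 4 addresses apart, not 1. */
uint32_t word_address(uint32_t base, uint32_t index) {
    return base + 4u * index;
}
```

With a base of 2000, word_address(2000, 1) is 2004 and word_address(2000, 2) is 2008, which is the constant a lw instruction would carry as its offset (4 or 8) when the base register holds 2000.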

16 Loading and storing bytes
- The MIPS instruction set includes dedicated load and store instructions for accessing memory.
- MIPS uses indexed addressing.
  - The address operand specifies a signed constant and a register.
  - These values are added to generate the effective address.
- The MIPS load byte instruction lb transfers one byte of data from main memory to a register.
    lb $t0, 20($a0)      # $t0 = Memory[$a0 + 20]
  Question: what about the other 3 bytes in $t0? Sign extension!
- The store byte instruction sb transfers the lowest byte of data from a register into main memory.
    sb $t0, 20($a0)      # Memory[$a0 + 20] = $t0
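What sign extension does to the upper three bytes can be sketched in C. The function names below are hypothetical, and memory is modeled as a plain byte array. The unsigned variant corresponds to the MIPS lbu instruction, which these slides do not cover and which is shown only for contrast.

```c
#include <stdint.h>

/* Model of lb: fetch one byte and sign-extend it to 32 bits.
   If bit 7 of the byte is set, the upper 24 bits become 1s. */
int32_t load_byte_signed(const uint8_t *memory, uint32_t address) {
    uint8_t byte = memory[address];
    return (byte & 0x80u) ? (int32_t)byte - 256 : (int32_t)byte;
}

/* Model of lbu (not covered in the slides): zero-extend instead,
   so the upper 24 bits are always 0. */
uint32_t load_byte_unsigned(const uint8_t *memory, uint32_t address) {
    return (uint32_t)memory[address];
}
```

The byte 0xFF therefore loads as -1 with the signed version and as 255 with the unsigned one.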

17 Loading and storing words
- You can also load or store 32-bit quantities (a complete word instead of just a byte) with the lw and sw instructions.
    lw $t0, 20($a0)      # $t0 = Memory[$a0 + 20]
    sw $t0, 20($a0)      # Memory[$a0 + 20] = $t0
- Most programming languages support several 32-bit data types:
  - Integers
  - Single-precision floating-point numbers
  - Memory addresses, or pointers
- Unless otherwise stated, we'll assume words are the basic unit of data.

18 Computing with memory
- So, to compute with memory-based data, you must:
  1. Load the data from memory to the register file.
  2. Do the computation, leaving the result in a register.
  3. Store that value back to memory if needed.
- For example, let's say that you wanted to do the same addition, but the values were in memory. How can we do the following in MIPS assembly language using as few registers as possible?
    char A[4] = {1, 2, 3, 4};
    int result;
    result = A[0] + A[1] + A[2] + A[3];

19 Memory alignment
- Keep in mind that memory is byte-addressable, so a 32-bit word actually occupies four contiguous locations (bytes) of main memory.
[Figure: byte addresses 0 to 11 holding 8-bit data, grouped into Word 1, Word 2, and Word 3]
- The MIPS architecture requires words to be aligned in memory; 32-bit words must start at an address that is divisible by 4.
  - 0, 4, 8 and 12 are valid word addresses.
  - 1, 2, 3, 5, 6, 7, 9, 10 and 11 are not valid word addresses.
  - Unaligned memory accesses result in a bus error, which you may have unfortunately seen before.
- This restriction has relatively little effect on high-level languages and compilers, but it makes things easier and faster for the processor.
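The alignment rule reduces to a one-line test. A minimal sketch in C, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit word address is valid only if it is divisible by 4,
   i.e. its two low-order bits are both 0. */
bool is_word_aligned(uint32_t address) {
    return (address & 0x3u) == 0;
}
```

Checking the two low bits is equivalent to address % 4 == 0, and it hints at why alignment is cheap for hardware to verify.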

20 Endianness
- Endianness is the byte ordering used to store data.
- The typical case is the order in which the bytes of an integer value are stored in memory.
  - Big-endian
  - Little-endian

21 Comparison
- In little-endian, the least significant byte goes to the lowest memory address, consistent with computer convention.
- In big-endian, reading bytes from low address to high address is akin to left-to-right reading order in hexadecimal.
- For example, to store ABCD as one four-byte value, with A as the most significant byte (MSB) and D as the least significant byte (LSB), starting at address a:
    Address:        a    a+1  a+2  a+3
    Big-endian:     A    B    C    D
    Little-endian:  D    C    B    A
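The two byte orders can be made concrete with a small C sketch. The function names are hypothetical. The shifts pick out each byte explicitly, so the result does not depend on the endianness of the machine running the code.

```c
#include <stdint.h>

/* Store a 32-bit value with the most significant byte at the lowest address. */
void store_big_endian(uint8_t *mem, uint32_t value) {
    mem[0] = (uint8_t)(value >> 24);
    mem[1] = (uint8_t)(value >> 16);
    mem[2] = (uint8_t)(value >> 8);
    mem[3] = (uint8_t)value;
}

/* Store a 32-bit value with the least significant byte at the lowest address. */
void store_little_endian(uint8_t *mem, uint32_t value) {
    mem[0] = (uint8_t)value;
    mem[1] = (uint8_t)(value >> 8);
    mem[2] = (uint8_t)(value >> 16);
    mem[3] = (uint8_t)(value >> 24);
}
```

For the value 0x41424344 (the ASCII codes of A, B, C, D), the big-endian store leaves the bytes in the order A B C D and the little-endian store leaves them as D C B A.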

22 Exercise
Can we figure out the code?
    swap(int v[], int k)
    {
      int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }
Assume k is stored in $5 and the starting address of v[] is in $4.
    swap:                    # $5 = k, $4 = address of v[0]
      sll $2, $5, 2          # $2 = k * 4
      add $2, $4, $2         # $2 = address of v[k]
      lw  $15, 0($2)         # $15 = v[k]
      lw  $16, 4($2)         # $16 = v[k+1]
      sw  $16, 0($2)         # v[k] = $16
      sw  $15, 4($2)         # v[k+1] = $15
      jr  $31                # return to the caller

23 Pseudo-instructions
- MIPS assemblers support pseudo-instructions that give the illusion of a more expressive instruction set, but are actually translated into one or more simpler, real instructions.
- For example, you can use the li and move pseudo-instructions:
    li   $a0, 2          # Load immediate 2 into $a0
    move $a1, $t0        # Copy $t0 into $a1
- They are probably clearer than their corresponding MIPS instructions:
    addi $a0, $0, 2      # Initialize $a0 to 2
    add  $a1, $t0, $0    # Copy $t0 into $a1
- We'll see lots more pseudo-instructions this semester.
  - A core instruction set is given in the Green Card of the text (J. Hennessy and D. Patterson, 1st page).
  - Unless otherwise stated, you can always use pseudo-instructions in your assignments and on exams.

24 Control flow in high-level languages
- The instructions in a program usually execute one after another, but it's often necessary to alter the normal control flow.
- Conditional statements execute only if some test expression is true.
    // Find the absolute value of *a
    v = *a;
    if (v < 0)
      v = -v;            // This might not be executed
    v = v + v;
- Loops cause some statements to be executed many times.
    // Sum the elements of a five-element array a
    v = 0;
    t = 0;
    while (t < 5) {
      v = v + a[t];      // These statements will
      t++;               // be executed five times
    }

25 MIPS control instructions
- In this lecture, we introduced some of MIPS's control-flow instructions:
    j immediate                  # unconditional jump
    jr $r1                       # jump to the address stored in $r1
    bne / beq $r1, $r2, label    # conditional branches
    slt  $rd, $rs, $rt           # set if less than, register operands
    slti $rt, $rs, imm           # set if less than, immediate operand
- And how to implement loops
- Today, we'll talk about:
  - MIPS's pseudo-branches
  - if/else
  - case/switch

26 Pseudo-branches
- The MIPS processor only supports two branch instructions, beq and bne, but to simplify your life the assembler provides the following other branches:
    blt $t0, $t1, L1     # Branch if $t0 < $t1
    ble $t0, $t1, L2     # Branch if $t0 <= $t1
    bgt $t0, $t1, L3     # Branch if $t0 > $t1
    bge $t0, $t1, L4     # Branch if $t0 >= $t1
- Later this term we'll see how supporting just beq and bne simplifies the processor design.

27 Implementing pseudo-branches
- Most pseudo-branches are implemented using slt. For example, a branch-if-less-than instruction
    blt $a0, $a1, Label
  is translated into the following:
    slt $at, $a0, $a1    # $at = 1 if $a0 < $a1
    bne $at, $0, Label   # Branch if $at != 0
- All of the pseudo-branches need a register to save the result of slt, even though it's not needed afterwards.
  - MIPS assemblers use register $1, or $at, for temporary storage.
  - You should be careful in using $at in your own programs, as it may be overwritten by assembler-generated code.
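The same slt-plus-branch pattern covers all four pseudo-branches. The C sketch below models each expansion. The function names are hypothetical, and the expansions shown are the commonly used ones, so a particular assembler may emit something slightly different. Each function returns whether the branch would be taken.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of slt: result is 1 if rs < rt (signed), else 0. */
int32_t slt(int32_t rs, int32_t rt) {
    return rs < rt ? 1 : 0;
}

/* blt a, b, L  =>  slt $at, a, b ; bne $at, $0, L */
bool blt_taken(int32_t a, int32_t b) { return slt(a, b) != 0; }

/* bge a, b, L  =>  slt $at, a, b ; beq $at, $0, L */
bool bge_taken(int32_t a, int32_t b) { return slt(a, b) == 0; }

/* bgt a, b, L  =>  slt $at, b, a ; bne $at, $0, L  (operands swapped) */
bool bgt_taken(int32_t a, int32_t b) { return slt(b, a) != 0; }

/* ble a, b, L  =>  slt $at, b, a ; beq $at, $0, L  (operands swapped) */
bool ble_taken(int32_t a, int32_t b) { return slt(b, a) == 0; }
```

Swapping the slt operands and choosing between bne and beq is enough to express every comparison, which is why the hardware needs only the two real branches.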

28 Translating an if-then statement
- We can use branch instructions to translate if-then statements into MIPS assembly code.
    v = *a;                    lw  $t0, 0($a0)
    if (v < 0)                 bge $t0, $zero, label
      v = -v;                  sub $t0, $zero, $t0
    v = v + v;          label: add $t0, $t0, $t0
- Sometimes it's easier to invert the original condition.
  - In this case, we changed "continue if v < 0" to "skip if v >= 0".
  - This saves a few instructions in the resulting assembly code.

29 Translating an if-then-else statement
- If there is an else clause, it is the target of the conditional branch.
  - And the then clause needs a jump over the else clause.
    // increase the magnitude of v by one
    if (v < 0)                 bge $v0, $0, E
      v--;                     sub $v0, $v0, 1
                               j   L
    else
      v++;                 E:  add $v0, $v0, 1
    v1 = v;                L:  move $v1, $v0
- Dealing with else-if code is similar, but the target of the first branch will be another if statement.
- Drawing the control-flow graph can help you out.

30 Example of a Loop Structure
    for (i = N; i > 0; i--)
      x[i] = x[i] + h;
Assume the addresses of x[N] and x[0] are in $s1 and $s5 respectively, and h is in $s2.
    Loop: lw   $s0, 0($s1)       # $s1 = address of x[i], starting at x[N]
          add  $s3, $s0, $s2     # $s2 = h
          sw   $s3, 0($s1)
          addi $s1, $s1, -4      # step back one word
          bne  $s1, $s5, Loop    # $s5 = address of x[0]

31 Case/Switch statement
- Many high-level languages support multi-way branches, e.g.
    switch (two_bits) {
      case 0: break;
      case 1: /* fall through */
      case 2: count++; break;
      case 3: count += 2; break;
    }
- We could just translate the code to ifs, thens, and elses:
    if ((two_bits == 1) || (two_bits == 2)) {
      count++;
    } else if (two_bits == 3) {
      count += 2;
    }
- This isn't very efficient if there are many, many cases.

32 Case/Switch statement
- Alternatively, for the same switch (two_bits) statement, we can:
  1. Create an array of jump targets, called a jump table
  2. Load the entry indexed by the variable two_bits
  3. Jump to that address using the jump register, or jr, instruction:
    jr $r
- This is much easier to show than to tell.

33 Coding with jump table (sketch)
- Suppose the jump table is stored in memory, and its starting address is in $t0.
- If two_bits == 1, the branch should jump to the 2nd entry in the table, i.e., the target address is stored at $t0 + 4.
- Assume two_bits is in $t1:
    # test the range of two_bits
    blt $t1, $zero, Exit
    bge $t1, $a0, Exit       # $a0 == 4
    # multiply two_bits by 4 to get the byte offset
    sll $t1, $t1, 2
    # get the target address
    add $t1, $t1, $t0
    lw  $t2, 0($t1)
    # jump
    jr  $t2
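The same idea can be sketched in C with an array of function pointers standing in for the table of code addresses. All names below are hypothetical, and the handlers mirror the two_bits switch from the earlier slide. The range check plays the role of the two branches to Exit, and the indexed call plays the role of the lw followed by jr.

```c
/* Each table entry is the "address" of the code for one case. */
typedef void (*case_handler)(int *count);

static void case_nothing(int *count) { (void)count; }      /* case 0 */
static void case_add_one(int *count) { *count += 1; }      /* cases 1 and 2 */
static void case_add_two(int *count) { *count += 2; }      /* case 3 */

static const case_handler jump_table[4] = {
    case_nothing, case_add_one, case_add_one, case_add_two
};

/* Dispatch on two_bits through the jump table. */
void dispatch(int two_bits, int *count) {
    if (two_bits < 0 || two_bits >= 4) {
        return;                      /* out of range: go to Exit */
    }
    jump_table[two_bits](count);     /* load the entry, then jump through it */
}
```

The cost of the dispatch is the same no matter how many cases exist, which is the advantage over a chain of if/else tests.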

34 Homework
Let's write a program to count how many bits are one in a 32-bit word. Suppose the word is stored in register $t0.
C code:
    unsigned int input, bit, position;
    int i, counter;
    counter = 0;
    position = 1;
    for (i = 0; i < 32; i++) {
      bit = input & position;       /* isolate bit i */
      if (bit != 0)
        counter++;
      position = position << 1;     /* move the mask to the next bit */
    }

35 Function calls in MIPS
- We'll talk about the 3 steps in handling function calls:
  1. The program's flow of control must be changed.
  2. Arguments and return values are passed back and forth.
  3. Local variables can be allocated and destroyed.
- And how they are handled in MIPS:
  - New instructions for calling functions.
  - Conventions for sharing registers between functions.
  - Use of a stack.

36 Control flow in C
- Invoking a function changes the control flow of a program twice.
  1. Calling the function
  2. Returning from the function
- In this example the main function calls fact twice, and fact returns twice, but to different locations in main.
- Each time fact is called, the CPU has to remember the appropriate return address.
- Notice that main itself is also a function! It is called by the operating system when you run the program.
    int main()
    {
      ...
      t1 = fact(8);
      t2 = fact(3);
      t3 = t1 + t2;
      ...
    }
    int fact(int n)
    {
      int i, f = 1;
      for (i = n; i > 1; i--)
        f = f * i;
      return f;
    }

37 Control flow in MIPS
- MIPS uses the jump-and-link instruction jal to call functions.
  - jal saves the return address (the address of the next instruction) in the dedicated register $ra before jumping to the function.
  - Among the instructions covered in this course, jal is the only one that can access the value of the program counter, so it can store the return address PC+4 in $ra.
    jal Fact
- To transfer control back to the caller, the function just has to jump to the address that was stored in $ra.
    jr $ra
- Let's now add the jal and jr instructions that are necessary for our factorial example.

38 Changing the control flow in MIPS
    int main()
    {
      ...
      jal Fact;
      ...
      jal Fact;
      ...
      t3 = t1 + t2;
      ...
    }
    int fact(int n)
    {
      int i, f = 1;
      for (i = n; i > 1; i--)
        f = f * i;
      jr $ra;
    }

39 Data flow in C
- Functions accept arguments and produce return values.
- The calls fact(8) and fact(3) and the parameter n show the actual and formal arguments of the fact function.
- The return statement and the assignments to t1 and t2 deal with returning and using a result.
    int main()
    {
      ...
      t1 = fact(8);
      t2 = fact(3);
      t3 = t1 + t2;
      ...
    }
    int fact(int n)
    {
      int i, f = 1;
      for (i = n; i > 1; i--)
        f = f * i;
      return f;
    }

40 Data flow in MIPS
- MIPS uses the following conventions for function arguments and results.
  - Up to four function arguments can be passed by placing them in argument registers $a0-$a3 before calling the function with jal.
  - A function can return up to two values by placing them in registers $v0-$v1 before returning via jr.
- These conventions are not enforced by the hardware or assembler, but programmers agree to them so functions written by different people can interface with each other.
- Later we'll talk about handling additional arguments or return values.

41 Nested functions
- What happens when you call a function that then calls another function?
- Let's say A calls B, which calls C.
  - The arguments for the call to C would be placed in $a0-$a3, thus overwriting the original arguments for B.
  - Similarly, jal C overwrites the return address that was saved in $ra by the earlier jal B.
    A:   ...
         # Put B's args in $a0-$a3
         jal B          # $ra = A2
    A2:  ...
    B:   ...
         # Put C's args in $a0-$a3, erasing B's args!
         jal C          # $ra = B2
    B2:  ...
         jr $ra         # Where does this go???
    C:   ...
         jr $ra

42 Spilling registers
- The CPU has a limited number of registers for use by all functions, and it's possible that several functions will need the same registers.
- We can keep important registers from being overwritten by a function call by saving them before the function executes and restoring them after the function completes.
- But there are two important questions.
  - Who is responsible for saving registers: the caller or the callee?
  - Where exactly are the register contents saved?

43 Who saves the registers?
- Who is responsible for saving important registers across function calls?
  - The caller knows which registers are important to it and should be saved.
  - The callee knows exactly which registers it will use and potentially overwrite.
- However, in the typical "black box" programming approach, the caller and callee do not know anything about each other's implementation.
  - Different functions may be written by different people or companies.
  - A function should be able to interface with any client, and different implementations of the same function should be substitutable.
- So how can two functions cooperate and share registers when they don't know anything about each other?

44 The caller could save the registers
- One possibility is for the caller to save any important registers that it needs before making a function call, and to restore them after.
- But the caller does not know what registers are actually written by the function, so it may save more registers than necessary.
- In the example below, frodo wants to preserve $a0, $a1, $s0 and $s1 from gollum, but gollum may not even use those registers.
    frodo: li $a0, 3
           li $a1, 1
           li $s0, 4
           li $s1, 1
           # Save registers $a0, $a1, $s0, $s1
           jal gollum
           # Restore registers $a0, $a1, $s0, $s1
           add $v0, $a0, $a1
           add $v1, $s0, $s1
           jr $ra

45 ...or the callee could save the registers
- Another possibility is for the callee to save and restore any registers it might overwrite.
- For instance, a gollum function that uses registers $a0, $a2, $s0 and $s2 could save the original values first, and restore them before returning.
- But the callee does not know what registers are important to the caller, so again it may save more registers than necessary.
    gollum: # Save registers $a0, $a2, $s0, $s2
            li $a0, 2
            li $a2, 7
            li $s0, 1
            li $s2, 8
            ...
            # Restore registers $a0, $a2, $s0, $s2
            jr $ra

46 ...or they could work together
- MIPS uses conventions again to split the register spilling chores.
- The caller is responsible for saving and restoring any of the following caller-saved registers that it cares about:
    $t0-$t9   $a0-$a3   $v0-$v1
  In other words, the callee may freely modify these registers, under the assumption that the caller already saved them if necessary.
- The callee is responsible for saving and restoring any of the following callee-saved registers that it uses. (Remember that $ra is used by jal.)
    $s0-$s7   $ra
  Thus the caller may assume these registers are not changed by the callee.
  - $ra is tricky; it is saved by a callee who is also a caller.
- Be especially careful when writing nested functions, which act as both a caller and a callee!

47 Register spilling example
- This convention ensures that the caller and callee together save all of the important registers: frodo only needs to save registers $a0 and $a1, while gollum only has to save registers $s0 and $s2.
    frodo: li $a0, 3
           li $a1, 1
           li $s0, 4
           li $s1, 1
           # Save registers $a0 and $a1
           jal gollum
           # Restore registers $a0 and $a1
           add $v0, $a0, $a1
           add $v1, $s0, $s1
           jr $ra
    gollum: # Save $ra
            # Save registers $s0 and $s2
            li $a0, 2
            li $a2, 7
            li $s0, 1
            li $s2, 8
            ...
            # Restore registers $s0 and $s2
            # Restore $ra
            jr $ra

48 Where are the registers saved? Now we know who is responsible for saving which registers, but we still need to discuss where those registers are saved. It would be nice if each function call had its own private memory area. This would prevent other function calls from overwriting our saved registers. We could use this private memory for other purposes too, like storing local variables. 48

49 Function calls and stacks
- Notice that function calls and returns occur in a stack-like order: the most recently called function is the first one to return.
  1. Someone calls A
  2. A calls B
  3. B calls C
  4. C returns to B
  5. B returns to A
  6. A returns
- Here, for example, C must return to B before B can return to A.
    A:   ...
         jal B
    A2:  ...
         jr $ra
    B:   ...
         jal C
    B2:  ...
         jr $ra
    C:   ...
         jr $ra

50 Stacks and function calls
- It's natural to use a stack for function call storage. A block of stack space, called a stack frame, can be allocated for each function call.
  - When a function is called, it pushes a new frame onto the stack, which will be used for local storage.
  - Before the function returns, it must pop its stack frame, to restore the stack to its original state.
- The stack frame can be used for several purposes.
  - Caller- and callee-saved registers can be put in the stack.
  - The stack frame can also hold local variables, or extra arguments and return values.

51 The MIPS stack
- In MIPS machines, part of main memory is reserved for a stack.
  - The stack grows downward in terms of memory addresses.
  - The address of the top element of the stack is stored (by convention) in the stack pointer register, $sp ($29).
- MIPS does not provide push and pop instructions. Instead, they must be done explicitly by the programmer.
[Figure: memory from address 0x7fffffff at the top down to 0x00000000 at the bottom, with the stack near the top and $sp pointing at its lowest element]

52 Pushing elements
- To push elements onto the stack:
  - Move the stack pointer $sp down to make room for the new data.
  - Store the elements into the stack.
- For example, to push registers $t1 and $t2 onto the stack:
    sub $sp, $sp, 8
    sw  $t1, 4($sp)
    sw  $t2, 0($sp)
- An equivalent sequence is:
    sw  $t1, -4($sp)
    sw  $t2, -8($sp)
    sub $sp, $sp, 8
[Figure: before the push, the stack holds word 1 and word 2 with $sp at word 2; after the push, $t1 and $t2 sit below them with $sp at $t2]

53 Accessing and popping elements
- You can access any element in the stack (not just the top one) if you know where it is relative to $sp.
- For example, to retrieve the value of $t1:
    lw $s0, 4($sp)
- You can pop, or "erase," elements simply by adjusting the stack pointer upwards.
- To pop the value of $t2:
    addi $sp, $sp, 4
- Note that the popped data is still present in memory, but data past the stack pointer is considered invalid.
[Figure: before the pop, $sp points at $t2; after the pop, $sp points at $t1 and the old $t2 value lies just past the stack pointer]
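The push, access, and pop steps can be modeled in C with an array standing in for stack memory. This is a sketch under stated assumptions: the type and function names are hypothetical, the stack pointer is kept as a word index rather than a byte address, and there is no overflow or underflow checking.

```c
#include <stdint.h>

#define STACK_WORDS 16u

/* sp is the index of the top element. The stack grows toward index 0,
   so an empty stack has sp == STACK_WORDS. */
typedef struct {
    uint32_t words[STACK_WORDS];
    uint32_t sp;
} Stack;

void stack_init(Stack *s) { s->sp = STACK_WORDS; }

/* Push: move sp down by one word, then store (like sub $sp + sw). */
void stack_push(Stack *s, uint32_t value) {
    s->sp -= 1u;
    s->words[s->sp] = value;
}

/* Access any element relative to sp without popping (like lw $s0, 4($sp)). */
uint32_t stack_peek(const Stack *s, uint32_t word_offset) {
    return s->words[s->sp + word_offset];
}

/* Pop: read the top, then move sp up. The old data stays in memory. */
uint32_t stack_pop(Stack *s) {
    uint32_t value = s->words[s->sp];
    s->sp += 1u;
    return value;
}
```

After pushing two values and popping one, the popped word is still stored in the array below the new top, which matches the note that data past the stack pointer remains in memory but is considered invalid.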

54 Summary
- Today we focused on implementing function calls in MIPS.
  - We call functions using jal, passing arguments in registers $a0-$a3.
  - Functions place results in $v0-$v1 and return using jr $ra.
- Managing resources is an important part of function calls.
  - To keep important data from being overwritten, registers are saved according to conventions for caller-saved and callee-saved registers.
  - Each function call uses stack memory for saving registers, storing local variables and passing extra arguments and return values.
- Assembly programmers must follow many conventions. Nothing prevents a rogue program from overwriting registers or stack memory used by some other function.

55 Assembly vs. machine language
- So far we've been using assembly language.
  - We assign names to operations (e.g., add) and operands (e.g., $t0).
  - Branches and jumps use labels instead of actual addresses.
  - Assemblers support many pseudo-instructions.
- Programs must eventually be translated into machine language, a binary format that can be stored in memory and decoded by the CPU.
- MIPS machine language is designed to be easy to decode.
  - Each MIPS instruction is the same length, 32 bits.
  - There are only three different instruction formats, which are very similar to each other.
- Studying MIPS machine language will also reveal some restrictions in the instruction set architecture, and how they can be overcome.

56 Three MIPS formats
- Simple instructions, all 32 bits wide
- Very structured, no unnecessary baggage
- Only three instruction formats:
    R:  op (6 bits) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
    I:  op (6 bits) | rs (5) | rt (5) | 16-bit address (signed value)
    J:  op (6 bits) | 26-bit address
- R-type: ALU instructions (add, sub, ...), jump register (jr)
- I-type: immediate (addi, ...), loads (lw, ...), stores (sw, ...), conditional branches (bne, ...)
- J-type: jump (j), jump and link (jal)
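Packing the fields into a 32-bit word can be sketched in C. The function names are hypothetical. The field widths are the standard MIPS ones (6-bit op, 5-bit register numbers, 5-bit shamt, 6-bit funct, 16-bit immediate, 26-bit jump target), and the opcode and funct values in the usage note are the standard encodings for add and addi.

```c
#include <stdint.h>

/* R-type: op is 0 for ALU instructions; funct selects the operation. */
uint32_t encode_r_type(uint32_t rs, uint32_t rt, uint32_t rd,
                       uint32_t shamt, uint32_t funct) {
    return ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16) |
           ((rd & 0x1Fu) << 11) | ((shamt & 0x1Fu) << 6) | (funct & 0x3Fu);
}

/* I-type: the signed 16-bit immediate occupies the low half of the word. */
uint32_t encode_i_type(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | (uint32_t)(uint16_t)imm;
}

/* J-type: a 6-bit opcode followed by a 26-bit target field. */
uint32_t encode_j_type(uint32_t op, uint32_t target26) {
    return ((op & 0x3Fu) << 26) | (target26 & 0x03FFFFFFu);
}
```

For example, add $t0, $t1, $t2 (rs = 9, rt = 10, rd = 8, funct = 0x20) encodes as 0x012A4020, and addi $t0, $t1, 4 (op = 8, rs = 9, rt = 8) encodes as 0x21280004.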

57 Constants
- Small constants are used quite frequently (50% of operands), e.g.,
    A = A + 5;
    B = B + 1;
    C = C - 18;
- MIPS instructions:
    addi $29, $29, 4
    slti $8, $18, 10
    andi $29, $29, 6
    ori  $29, $29, 4

58 Larger constants
- Larger constants can be loaded into a register 16 bits at a time.
  - The load upper immediate instruction lui loads the highest 16 bits of a register with a constant, and clears the lowest 16 bits to 0s.
  - An immediate logical OR, ori, then sets the lower 16 bits.
- To load the 32-bit value 0000 0000 0011 1101 0000 1001 0000 0000:
    lui $s0, 0x003D        # $s0 = 003D 0000 (in hex)
    ori $s0, $s0, 0x0900   # $s0 = 003D 0900
- This illustrates the principle of making the common case fast.
  - Most of the time, 16-bit constants are enough.
  - It's still possible to load 32-bit constants, but at the cost of two instructions and one temporary register.
- Pseudo-instructions may contain large constants. Assemblers will translate such instructions correctly.
  - We used a lw instruction before.
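The lui/ori split can be checked with a short C sketch. The function names are hypothetical models of the two instructions, and the value 4,000,000 (0x003D0900) in the usage note is just a convenient example of a constant that does not fit in 16 bits.

```c
#include <stdint.h>

/* Model of lui: place imm in the upper 16 bits, clear the lower 16. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* Model of ori: OR a zero-extended 16-bit immediate into the register. */
uint32_t ori(uint32_t reg, uint16_t imm) {
    return reg | (uint32_t)imm;
}

/* The two 16-bit pieces an assembler would extract from a 32-bit constant. */
uint16_t upper_half(uint32_t value) { return (uint16_t)(value >> 16); }
uint16_t lower_half(uint32_t value) { return (uint16_t)(value & 0xFFFFu); }

/* Rebuild any 32-bit constant with the lui + ori pair. */
uint32_t load_constant(uint32_t value) {
    return ori(lui(upper_half(value)), lower_half(value));
}
```

For 4,000,000 the halves are 0x003D and 0x0900, and load_constant reproduces the original value. Because ori zero-extends its immediate, the lower half never disturbs the upper bits set by lui.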

59 Loads and stores
- The limited 16-bit constant can present difficulties for accesses to global data.
  - Let's assume the assembler puts a variable at address 0x10010004.
  - 0x10010004 is bigger than 32,767.
- In these situations, the assembler breaks the immediate into two pieces.
    lui $t0, 0x1001        # $t0 = 0x1001 0000
    lw  $t1, 0x0004($t0)   # Read from Mem[0x1001 0004]

60 Branches
- For branch instructions, the constant field is not an address, but an offset from the next program counter (PC+4) to the target address.
       beq $at, $0, L
       add $v1, $v0, $0
       add $v1, $v1, $v1
       j   Somewhere
    L: add $v1, $v0, $v0
- Since the branch target L is three instructions past the first add, the address field would contain 3. The field counts instructions; the hardware multiplies it by 4 to get the byte offset 3 x 4 = 12.
- The whole beq instruction would be stored as:
    op      rs     rt     address
    000100  00001  00000  0000 0000 0000 0011
- Why PC+4? This will become clear when we learn pipelining.
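The offset computation can be sketched in C. The function names and the example addresses in the usage note are hypothetical. The sketch assumes the standard MIPS rule that the 16-bit field holds a signed count of instructions relative to PC+4.

```c
#include <stdint.h>

/* Value stored in the 16-bit field of a branch located at branch_pc
   whose target is at byte address target. */
int16_t branch_offset_field(uint32_t branch_pc, uint32_t target) {
    int64_t byte_offset = (int64_t)target - ((int64_t)branch_pc + 4);
    return (int16_t)(byte_offset / 4);   /* count instructions, not bytes */
}

/* Inverse: the address the hardware jumps to when the branch is taken. */
uint32_t branch_target(uint32_t branch_pc, int16_t field) {
    return (uint32_t)((int64_t)branch_pc + 4 + (int64_t)field * 4);
}
```

If the beq sits at address 0x00400000 and L is four instructions later at 0x00400010, the field is 3, matching the example. A backward branch, as at the bottom of a loop, produces a negative field.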

61 Larger branch constants
- Empirical studies of real programs show that most branches go to targets less than 32,767 instructions away: branches are mostly used in loops and conditionals, and programmers are taught to make code bodies short.
- If you do need to branch further, you can use a jump with a branch. For example, if Far is very far away, then the effect of:
    beq $s0, $s1, Far
    ...
  can be simulated with the following actual code:
          bne $s0, $s1, Next
          j   Far
    Next: ...
- Again, the MIPS designers have taken care of the common case first.

62 Summary: Instruction Set Architecture (ISA)
- The ISA is the interface between hardware and software.
- The ISA serves as an abstraction layer between the HW and SW.
  - Software doesn't need to know how the processor is implemented.
  - Any processor that implements the ISA appears equivalent.
[Figure: software on top of the ISA, with Proc #1 and Proc #2 beneath it]
- An ISA enables processor innovation without changing software.
  - This is how Intel has made billions of dollars.
- Before ISAs, software was re-written for each new machine.

63 RISC vs. CISC
- MIPS was one of the first RISC architectures. It was started about 20 years ago by John Hennessy, one of the authors of our textbook.
- The architecture is similar to that of other RISC architectures, including Sun's SPARC, IBM and Motorola's PowerPC, and ARM-based processors.
- Older processors used complex instruction sets, or CISC architectures.
  - Many powerful instructions were supported, making the assembly language programmer's job much easier.
  - But this meant that the processor was more complex, which made the hardware designer's life harder.
- Many new processors use reduced instruction sets, or RISC architectures.
  - Only relatively simple instructions are available. But with high-level languages and compilers, the impact on programmers is minimal.
  - On the other hand, the hardware is much easier to design, optimize, and teach in classes.
- Even most current CISC processors, such as Intel 8086-based chips, are now implemented using a lot of RISC techniques.

64 RISC vs. CISC
Characteristics of ISAs:
    CISC                          RISC
    Variable length instruction   Single word instruction
    Variable format               Fixed-field decoding
    Memory operands               Load/store architecture
    Complex operations            Simple operations

65 A little ISA history
- 1964: IBM System/360, the first computer family
  - IBM wanted to sell a range of machines that ran the same software
- 1960's, 1970's: Complex Instruction Set Computer (CISC) era
  - Much assembly programming, compiler technology immature
  - Simple machine implementations
  - Complex instructions simplified programming, little impact on design
- 1980's: Reduced Instruction Set Computer (RISC) era
  - Most programming in high-level languages, mature compilers
  - Aggressive machine implementations
  - Simpler, cleaner ISAs facilitated pipelining, high clock frequencies
- 1990's: Post-RISC era
  - ISA complexity largely relegated to non-issue
  - CISC and RISC chips use same techniques (pipelining, superscalar, ...)
  - ISA compatibility outweighs any RISC advantage in general purpose
  - Embedded processors prefer RISC for lower power, cost
- 2000's: ??? EPIC? Dynamic Translation?

66 CoE/ECE 0142 Computer Organization: Pipelining
Instructor: Jun Yang. Slides are adapted from Zilles. 1998 Morgan Kaufmann Publishers

67 A relevant question
- Assuming you've got:
  - One washer (takes 30 minutes)
  - One drier (takes 40 minutes)
  - One folder (takes 20 minutes)
- It takes 90 minutes to wash, dry, and fold 1 load of laundry.
- How long do 4 loads take?

68 The slow way
[Figure: timeline from 6 PM to midnight with the four loads done one after another]
- If each load is done sequentially, it takes 6 hours.

69 Laundry Pipelining
- Start each load as soon as possible
  - Overlap loads
[Figure: timeline from 6 PM to midnight with the four loads overlapped]
- Pipelined laundry takes 3.5 hours.
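The 6-hour and 3.5-hour figures can be reproduced with a small C sketch. The function names are hypothetical, the stage times of 30, 40, and 20 minutes are the washer, drier, and folder times, and the pipelined model assumes a load may wait between stages until the next stage is free.

```c
/* Total minutes when every load finishes all stages before the next starts. */
unsigned sequential_minutes(const unsigned *stage, unsigned n_stages,
                            unsigned loads) {
    unsigned per_load = 0;
    for (unsigned i = 0; i < n_stages; i++) {
        per_load += stage[i];
    }
    return per_load * loads;
}

/* Total minutes when loads overlap: the first load takes the full time,
   and each later load finishes one slowest-stage interval after the
   previous one. */
unsigned pipelined_minutes(const unsigned *stage, unsigned n_stages,
                           unsigned loads) {
    unsigned per_load = 0;
    unsigned slowest = 0;
    for (unsigned i = 0; i < n_stages; i++) {
        per_load += stage[i];
        if (stage[i] > slowest) {
            slowest = stage[i];
        }
    }
    if (loads == 0) {
        return 0;
    }
    return per_load + (loads - 1) * slowest;
}
```

With four loads this gives 360 minutes sequentially and 90 + 3 x 40 = 210 minutes pipelined. The slowest stage (the drier) sets the rate, so the speedup is below the number of stages.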

70 Pipelining Lessons
- Pipelining doesn't help the latency of a single load; it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest pipeline stage
- Multiple tasks operate simultaneously using different resources
- Potential speedup = number of pipe stages
- Unbalanced lengths of pipe stages reduce speedup
- Time to fill the pipeline and time to drain it reduce speedup

71 Pipelining
- Pipelining is a general-purpose efficiency technique
  - It is not specific to processors
- Pipelining is used in:
  - Assembly lines
  - Fast food restaurants
- Pipelining gives the best of both worlds and is used in just about every modern processor.

72 Instruction execution review
- Executing a MIPS instruction can take up to five steps.
    Step                Name  Description
    Instruction Fetch   IF    Read an instruction from memory.
    Instruction Decode  ID    Read source registers and generate control signals.
    Execute             EX    Compute an R-type result or a branch outcome.
    Memory              MEM   Read or write the data memory.
    Writeback           WB    Store a result in the destination register.
- However, as we saw, not all instructions need all five steps.
    Instruction  Steps required
    beq          IF ID EX
    R-type       IF ID EX WB
    sw           IF ID EX MEM
    lw           IF ID EX MEM WB

73 Single-cycle datapath diagram [Figure: single-cycle datapath annotated with unit latencies: instruction memory 2ns, register file 1ns, ALU 2ns, data memory 2ns] How long does it take to execute each instruction?

74 Single-cycle review All five execution steps occur in one clock cycle. Each hardware element can only be used once per clock cycle. A lw or sw must access memory twice (in the IF and MEM stages), so there are separate instruction and data memories. There are multiple adders, since each instruction increments the PC (IF) and performs another computation (EX). On top of that, branches also need to compute a target address.

75 Review: Instruction Fetch (IF) Let's quickly review how lw is executed in the single-cycle datapath. We'll ignore PC incrementing and branching for now. In the Instruction Fetch (IF) step, we read the instruction memory. [Figure: single-cycle datapath, IF step]

76 Instruction Decode (ID) The Instruction Decode (ID) step reads the source register from the register file. [Figure: single-cycle datapath, ID step]

77 Execute (EX) The third step, Execute (EX), computes the effective memory address from the source register and the instruction's constant field. [Figure: single-cycle datapath, EX step]

78 Memory (MEM) The Memory (MEM) step involves reading the data memory, from the address computed by the ALU. [Figure: single-cycle datapath, MEM step]

79 Writeback (WB) Finally, in the Writeback (WB) step, the memory value is stored into the destination register. [Figure: single-cycle datapath, WB step]

80 A bunch of lazy functional units Notice that each execution step uses a different functional unit. In other words, the main units are idle for most of the 8ns cycle! The instruction RAM is used for just 2ns at the start of the cycle. Registers are read once in ID (1ns), and written once in WB (1ns). The ALU is used for 2ns near the middle of the cycle. Reading the data memory only takes 2ns as well. That's a lot of hardware sitting around doing nothing.
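To put a number on "idle for most of the cycle", the following sketch computes the idle share of each unit. The per-unit latencies are assumptions chosen to add up to the 8ns cycle (2ns instruction memory, 1ns register read, 2ns ALU, 2ns data memory, 1ns register write).

```python
# Assumed unit latencies in ns; they sum to the 8ns single cycle.
BUSY_NS = {
    "instruction memory": 2,
    "register read": 1,
    "ALU": 2,
    "data memory": 2,
    "register write": 1,
}
CYCLE_NS = sum(BUSY_NS.values())

def idle_fraction(unit):
    """Fraction of the single cycle during which `unit` does no work."""
    return 1 - BUSY_NS[unit] / CYCLE_NS

for unit in BUSY_NS:
    print(f"{unit:18s} busy {BUSY_NS[unit]}ns, idle {idle_fraction(unit):.0%}")
```

Under these assumptions every unit sits idle for at least three quarters of each cycle.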

81 Putting those slackers to work We shouldn't have to wait for the entire instruction to complete before we can re-use the functional units. For example, the instruction memory is free in the Instruction Decode step as shown below, so... [Figure: single-cycle datapath during Instruction Decode (ID), with the instruction memory marked idle]

82 Decoding and fetching together Why don't we go ahead and fetch the next instruction while we're decoding the first one? [Figure: datapath labeled "Fetch 2nd instruction" at the instruction memory and "Decode 1st instruction" at the register file]

83 Executing, decoding and fetching Similarly, once the first instruction enters its Execute stage, we can go ahead and decode the second instruction. But now the instruction memory is free again, so we can fetch the third instruction! [Figure: datapath labeled "Fetch 3rd", "Decode 2nd", and "Execute 1st"]

84 Making Pipelining Work We'll make our pipeline 5 stages long, to handle load instructions as they were handled in the multi-cycle implementation. The stages are IF, ID, EX, MEM, and WB. We want to support executing 5 instructions simultaneously: one in each stage.

85 Break datapath into 5 stages Each stage has its own functional units. Each stage can execute in 2ns, just like the multi-cycle implementation. [Figure: datapath divided into IF, ID, EXE, MEM, and WB stages, with latencies of 2ns, 1ns, 2ns, and 2ns marked under the units]

86 Pipelining Loads
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
lw $t1, 8($sp)          IF   ID   EX   MEM  WB
lw $t2, 12($sp)              IF   ID   EX   MEM  WB
lw $t3, 16($sp)                   IF   ID   EX   MEM  WB
lw $t4, 20($sp)                        IF   ID   EX   MEM  WB
A pipeline diagram shows the execution of a series of instructions. The instruction sequence is shown vertically, from top to bottom. Clock cycles are shown horizontally, from left to right. Each instruction is divided into its component stages. (We show five stages for every instruction, which will make the control unit easier.) This clearly indicates the overlapping of instructions. For example, there are three instructions active in the third cycle above. The lw $t0 instruction is in its Execute stage. Simultaneously, the lw $t1 is in its Instruction Decode stage. Also, the lw $t2 instruction is just being fetched.
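A pipeline diagram like this one can be generated mechanically, because instruction i simply starts one cycle after instruction i - 1. The sketch below is an illustration only; the function names are invented here, and the register names and offsets ($t0 to $t4, 4 to 20) are assumed readings of the five loads.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(instructions):
    """Return text rows; instruction i (0-based) enters IF in cycle i + 1."""
    total_cycles = len(instructions) + len(STAGES) - 1
    width = max(len(text) for text in instructions) + 2
    rows = [" " * width + "".join(f"{c:<5}" for c in range(1, total_cycles + 1))]
    for i, text in enumerate(instructions):
        cells = [""] * i + STAGES + [""] * (total_cycles - i - len(STAGES))
        rows.append(text.ljust(width) + "".join(f"{cell:<5}" for cell in cells))
    return rows

def active_in_cycle(instructions, cycle):
    """Map each instruction that is in the pipeline during `cycle` to its stage."""
    active = {}
    for i, text in enumerate(instructions):
        stage_index = cycle - 1 - i
        if 0 <= stage_index < len(STAGES):
            active[text] = STAGES[stage_index]
    return active

loads = ["lw $t0, 4($sp)", "lw $t1, 8($sp)", "lw $t2, 12($sp)",
         "lw $t3, 16($sp)", "lw $t4, 20($sp)"]
print("\n".join(pipeline_diagram(loads)))
print(active_in_cycle(loads, 3))
```

Asking for cycle 3 returns the three active instructions named on the slide: the first load in EX, the second in ID, and the third in IF.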

87 Pipelining terminology
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
lw $t1, 8($sp)          IF   ID   EX   MEM  WB
lw $t2, 12($sp)              IF   ID   EX   MEM  WB
lw $t3, 16($sp)                   IF   ID   EX   MEM  WB
lw $t4, 20($sp)                        IF   ID   EX   MEM  WB
The pipeline depth is the number of stages: in this case, five. In the first four cycles here, the pipeline is filling, since there are unused functional units. In cycle 5, the pipeline is full. Five instructions are being executed simultaneously, so all hardware units are in use. In cycles 6-9, the pipeline is emptying.

88 Pipelining Performance
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
lw $t1, 8($sp)          IF   ID   EX   MEM  WB
lw $t2, 12($sp)              IF   ID   EX   MEM  WB
lw $t3, 16($sp)                   IF   ID   EX   MEM  WB
lw $t4, 20($sp)                        IF   ID   EX   MEM  WB
Execution time on an ideal pipeline: time to fill the pipeline + one cycle per instruction. How long for N instructions? Compared to the single-cycle design, how much faster is pipelining for N=?
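One way to work through these questions: filling a five-stage pipeline takes four cycles, after which one instruction completes per cycle, so N instructions need 4 + N cycles. The sketch below assumes the 2ns pipelined cycle and the 8ns single-cycle clock used earlier; the values of N are arbitrary illustrations, not taken from the slide.

```python
PIPELINE_DEPTH = 5
PIPELINED_CYCLE_NS = 2   # assumed: each stage fits in 2ns
SINGLE_CYCLE_NS = 8      # assumed: one 8ns cycle per instruction

def pipelined_ns(n):
    """Fill the pipeline (depth - 1 cycles), then finish one instruction per cycle."""
    return (PIPELINE_DEPTH - 1 + n) * PIPELINED_CYCLE_NS

def single_cycle_ns(n):
    """Every instruction takes one full single-cycle clock period."""
    return n * SINGLE_CYCLE_NS

def speedup(n):
    return single_cycle_ns(n) / pipelined_ns(n)

for n in (5, 1000):      # illustrative instruction counts
    print(n, pipelined_ns(n), single_cycle_ns(n), round(speedup(n), 2))
```

As N grows, the fill time matters less and the speedup approaches 8ns / 2ns = 4 under these assumptions.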

89 Pipeline Datapath: Resource Requirements
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
lw $t1, 8($sp)          IF   ID   EX   MEM  WB
lw $t2, 12($sp)              IF   ID   EX   MEM  WB
lw $t3, 16($sp)                   IF   ID   EX   MEM  WB
lw $t4, 20($sp)                        IF   ID   EX   MEM  WB
We need to perform several operations in the same cycle. Increment the PC and add registers at the same time. Fetch one instruction while another one reads or writes data. What does that mean for our hardware?

90 Pipelining other instruction types R-type instructions only require 4 stages: IF, ID, EX, and WB. We don't need the MEM stage. What happens if we try to pipeline loads with R-type instructions?
Clock cycle        1    2    3    4    5    6    7    8    9
add $sp, $sp, -4   IF   ID   EX   WB
sub $v0, $a0, $a1       IF   ID   EX   WB
lw $t0, 4($sp)               IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   WB
lw $t1, 8($sp)                         IF   ID   EX   MEM  WB
The load uses the register file's write port during its 5th stage (cycle 7). The R-type instruction after it uses the register file's write port during its 4th stage (also cycle 7).

91 A solution: Insert NOP stages Enforce uniformity. Make all instructions take 5 cycles. Make them have the same stages, in the same order. Some stages will do nothing for some instructions.
R-type             IF   ID   EX   NOP  WB
Clock cycle        1    2    3    4    5    6    7    8    9
add $sp, $sp, -4   IF   ID   EX   NOP  WB
sub $v0, $a0, $a1       IF   ID   EX   NOP  WB
lw $t0, 4($sp)               IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   NOP  WB
lw $t1, 8($sp)                         IF   ID   EX   MEM  WB
Stores and branches have NOP stages, too.
store              IF   ID   EX   MEM  NOP
branch             IF   ID   EX   NOP  NOP
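The write-port conflict, and the way NOP stages remove it, can be checked by computing the cycle in which each instruction reaches WB. This is a sketch with invented names; it models only stage timing, for the five-instruction sequence of two R-type instructions, a load, an R-type instruction, and a load.

```python
from collections import defaultdict

FOUR_STAGE_R_TYPE = {
    "r-type": ["IF", "ID", "EX", "WB"],
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
}
UNIFORM_FIVE_STAGE = {
    "r-type": ["IF", "ID", "EX", "NOP", "WB"],
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
}

def write_port_conflicts(kinds, stages_by_kind):
    """Return {cycle: [instruction indices]} for cycles with more than one WB.

    Instruction i (0-based) is fetched in cycle i + 1.
    """
    users = defaultdict(list)
    for i, kind in enumerate(kinds):
        wb_cycle = i + 1 + stages_by_kind[kind].index("WB")
        users[wb_cycle].append(i)
    return {cycle: idx for cycle, idx in users.items() if len(idx) > 1}

program = ["r-type", "r-type", "lw", "r-type", "lw"]
print(write_port_conflicts(program, FOUR_STAGE_R_TYPE))   # conflict in cycle 7
print(write_port_conflicts(program, UNIFORM_FIVE_STAGE))  # no conflicts
```

With uniform five-stage instructions, instruction i always writes back in cycle i + 5, so two instructions can never need the write port in the same cycle.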

92 What we have so far Pipelining attempts to maximize instruction throughput by overlapping the execution of multiple instructions. Pipelining offers amazing speedup. In the best case, one instruction finishes on every cycle, and the speedup is equal to the pipeline depth. The pipeline datapath is much like the single-cycle one, but with added pipeline registers. Each stage needs its own functional units. Next we'll see the datapath and control, and walk through an example execution.

93 Pipelined Datapath and Control Last time we introduced the main ideas of pipelining. Today we'll see a basic implementation of a pipelined processor. The datapath and control unit share similarities with both the single-cycle and multicycle implementations that we already saw. An example execution highlights important pipelining concepts. In future lectures, we'll discuss several complications of pipelining that we're hiding from you for now.

94 Pipelining Concepts A pipelined processor allows multiple instructions to execute at once, and each instruction uses a different functional unit in the datapath. This increases throughput, so programs can run faster. One instruction can finish executing on every clock cycle, and simpler stages also lead to shorter cycle times.
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
sub $v0, $a0, $a1       IF   ID   EX   MEM  WB
and $t1, $t2, $t3            IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   MEM  WB
add $t5, $t6, $0                       IF   ID   EX   MEM  WB

95 Pipelined Datapath The whole point of pipelining is to allow multiple instructions to execute at the same time. We may need to perform several operations in the same cycle. Increment the PC and add registers at the same time. Fetch one instruction while another one reads or writes data.
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
sub $v0, $a0, $a1       IF   ID   EX   MEM  WB
and $t1, $t2, $t3            IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   MEM  WB
add $t5, $t6, $0                       IF   ID   EX   MEM  WB
Thus, like the single-cycle datapath, a pipelined processor will need to duplicate hardware elements that are needed several times in the same clock cycle.

96 One register file is enough We need only one register file to support both the ID and WB stages. [Figure: register file with inputs Read register 1, Read register 2, Write register, and Write data, and outputs Read data 1 and Read data 2] Reads and writes go to separate ports on the register file. We already took advantage of this property in our single-cycle CPU.

97 Single-cycle datapath, slightly rearranged [Figure: single-cycle datapath redrawn with the PC, instruction memory, register file, ALU, and data memory laid out left to right, with control signals PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemToReg]

98 Multiple cycles In pipelining, we also divide instruction execution into multiple cycles. Information computed during one cycle may be needed in a later cycle. The instruction read in the IF stage determines which registers are fetched in the ID stage, what constant is used for the EX stage, and what the destination register is for WB. The registers read in ID are used in the EX and/or MEM stages. The ALU output produced in the EX stage is an effective address for the MEM stage or a result for the WB stage. We added several intermediate registers to the multicycle datapath to preserve information between stages, as highlighted on the next slide.

99 Registers added to the multi-cycle [Figure: multicycle datapath with the added intermediate registers highlighted: instruction register, memory data register, A, B, and ALUOut]

100 Pipeline registers We'll add intermediate registers to our pipelined datapath too. There's a lot of information to save, however. We'll simplify our diagrams by drawing just one big pipeline register between each stage. The registers are named for the stages they connect: IF/ID, ID/EX, EX/MEM, and MEM/WB. No register is needed after the WB stage, because after WB the instruction is done.

101 Pipelined datapath [Figure: the rearranged single-cycle datapath with the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers inserted between stages]

102 Propagating values forward Any data values required in later stages must be propagated through the pipeline registers. The most extreme example is the destination register. The rd field of the instruction word, retrieved in the first stage (IF), determines the destination register. But that register isn't updated until the fifth stage (WB). Thus, the rd field must be passed through all of the pipeline stages, as shown in red on the next slide.
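The effect of carrying the destination register number through the pipeline registers can be sketched as a three-element shift register that advances once per clock. This is a simplified model, not the actual datapath: it tracks only the destination number, the function name is invented, and the register numbers in the usage line are hypothetical.

```python
def write_back_schedule(dest_regs):
    """Return (cycle, register) pairs, one per write-back.

    dest_regs[i] is the destination register number of instruction i,
    which is fetched in cycle i + 1 and decoded in cycle i + 2.
    """
    id_ex = ex_mem = mem_wb = None   # destination field held in each pipeline register
    writes = []
    decode_stream = list(dest_regs) + [None] * 3   # extra cycles to drain the pipeline
    for cycle, decoded in enumerate(decode_stream, start=2):
        if mem_wb is not None:       # WB uses the number latched at the last clock edge
            writes.append((cycle, mem_wb))
        # Clock edge: each pipeline register takes the value from the stage before it.
        mem_wb, ex_mem, id_ex = ex_mem, id_ex, decoded
    return writes

print(write_back_schedule([8, 2, 9]))   # hypothetical destination registers
```

Each number reaches the register file's write port three clock edges after it was decoded, which is why it has to ride along in ID/EX, EX/MEM, and MEM/WB.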

103 The destination register [Figure: pipelined datapath with the destination register field carried through the ID/EX, EX/MEM, and MEM/WB pipeline registers back to the register file's Write register input]

104 What about control signals? The control signals are generated in the same way as in the single-cycle processor: after an instruction is fetched, the processor decodes it and produces the appropriate control values. But just like before, some of the control signals will not be needed until some later stage and clock cycle. These signals must be propagated through the pipeline until they reach the appropriate stage. We can just pass them in the pipeline registers, along with the other data. Control signals can be categorized by the pipeline stage that uses them.
Stage  Control signals needed
EX     ALUSrc, ALUOp, RegDst
MEM    MemRead, MemWrite, PCSrc
WB     RegWrite, MemToReg
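The grouping by stage suggests a simple picture: the control unit produces all signals during decode, and each pipeline register forwards only the groups that later stages still need. The sketch below illustrates that idea; the helper names are invented, and the lw control values are assumed from the usual single-cycle settings rather than taken from the slide.

```python
STAGE_SIGNALS = {
    "EX": ("ALUSrc", "ALUOp", "RegDst"),
    "MEM": ("MemRead", "MemWrite", "PCSrc"),
    "WB": ("RegWrite", "MemToReg"),
}

def split_by_stage(control):
    """Group a flat dict of control values by the stage that consumes them."""
    return {stage: {name: control[name] for name in names}
            for stage, names in STAGE_SIGNALS.items()}

def advance(groups, consumed_stage):
    """Drop the group just used; the next pipeline register carries the rest."""
    return {stage: values for stage, values in groups.items()
            if stage != consumed_stage}

# Assumed control values for lw.
lw_control = {"ALUSrc": 1, "ALUOp": "add", "RegDst": 0,
              "MemRead": 1, "MemWrite": 0, "PCSrc": 0,
              "RegWrite": 1, "MemToReg": 1}

id_ex = split_by_stage(lw_control)   # written at the end of ID
ex_mem = advance(id_ex, "EX")        # EX signals are used up in EX
mem_wb = advance(ex_mem, "MEM")      # MEM signals are used up in MEM
print(list(id_ex), list(ex_mem), list(mem_wb))
```

The pipeline registers get narrower toward the end: MEM/WB only has to hold RegWrite and MemToReg.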

105 Pipelined datapath and control [Figure: pipelined datapath with the control unit added; the ID/EX register carries the WB, M, and EX control groups, EX/MEM carries WB and M, and MEM/WB carries WB]

106 Notes about the diagram The control signals are grouped together in the pipeline registers, just to make the diagram a little clearer. Not all of the registers have a write enable signal. Because the datapath fetches one instruction per cycle, the PC must also be updated on each clock cycle. Including a write enable for the PC would be redundant. Similarly, the pipeline registers are also written on every cycle, so no explicit write signals are needed.

107 An example execution sequence Here's a sample sequence of instructions to execute (addresses in decimal).
1000: lw $8, 4($29)
1004: sub $2, $4, $5
1008: and $9, $10, $11
1012: or $16, $17, $18
1016: add $13, $14, $0
We'll make some assumptions, just so we can show actual data values. Each register contains its number plus 100. For instance, register $8 contains 108, register $29 contains 129, and so forth. Every data memory location contains 99. Our pipeline diagrams will follow some conventions. An X indicates values that aren't important, like the constant field of an R-type instruction. Question marks ??? indicate values we don't know, usually resulting from instructions coming before and after the ones in our example.
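The values that flow through the pipeline in this example can be worked out ahead of time. The sketch below assumes the sequence reads lw $8, 4($29); sub $2, $4, $5; and $9, $10, $11; or $16, $17, $18; add $13, $14, $0, that every register holds its number plus 100 (with $0 fixed at zero, as always in MIPS), and that every memory location holds 99. The tuple layout and helper names are invented for this illustration.

```python
MEMORY_VALUE = 99   # assumed content of every data memory location

def reg(n):
    """Assumed initial register contents: number + 100, except $0, which is zero."""
    return 0 if n == 0 else n + 100

ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
}

# (operation, destination, first source, second source or constant offset)
PROGRAM = [
    ("lw", 8, 29, 4),
    ("sub", 2, 4, 5),
    ("and", 9, 10, 11),
    ("or", 16, 17, 18),
    ("add", 13, 14, 0),
]

def execute(instr):
    """Return (ALU result, value written back) for one instruction."""
    op, _dest, src, other = instr
    if op == "lw":
        address = reg(src) + other       # base register + constant field
        return address, MEMORY_VALUE     # a load writes back the memory value
    result = ALU_OPS[op](reg(src), reg(other))
    return result, result

for instr in PROGRAM:
    print(instr[0], execute(instr))
```

No instruction in the sequence reads a register written by an earlier one, so evaluating each against the initial register values gives the same results as running them in order.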

108 Cycle 1 (filling) IF: lw $8, 4($29) ID: ??? EX: ??? MEM: ??? WB: ??? [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

109 Cycle 2 IF: sub $2, $4, $5 ID: lw $8, 4($29) EX: ??? MEM: ??? WB: ??? [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

110 Cycle 3 IF: and $9, $10, $11 ID: sub $2, $4, $5 EX: lw $8, 4($29) MEM: ??? WB: ??? [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

111 Cycle 4 IF: or $16, $17, $18 ID: and $9, $10, $11 EX: sub $2, $4, $5 MEM: lw $8, 4($29) WB: ??? [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

112 Cycle 5 (full) IF: add $13, $14, $0 ID: or $16, $17, $18 EX: and $9, $10, $11 MEM: sub $2, $4, $5 WB: lw $8, 4($29) [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

113 Cycle 6 (emptying) IF: ??? ID: add $13, $14, $0 EX: or $16, $17, $18 MEM: and $9, $10, $11 WB: sub $2, $4, $5 [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

114 Cycle 7 IF: ??? ID: ??? EX: add $13, $14, $0 MEM: or $16, $17, $18 WB: and $9, $10, $11 [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

115 Cycle 8 IF: ??? ID: ??? EX: ??? MEM: add $13, $14, $0 WB: or $16, $17, $18 [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

116 Cycle 9 IF: ??? ID: ??? EX: ??? MEM: ??? WB: add $13, $14, $0 [Figure: pipelined datapath and control annotated with the values in flight during this cycle]

117 That's a lot of diagrams there
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
sub $v0, $a0, $a1       IF   ID   EX   MEM  WB
and $t1, $t2, $t3            IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   MEM  WB
add $t5, $t6, $0                       IF   ID   EX   MEM  WB
Compare the last nine slides with the pipeline diagram above. You can see how instruction executions are overlapped. Each functional unit is used by a different instruction in each cycle. The pipeline registers save control and data values generated in previous clock cycles for later use. When the pipeline is full in clock cycle 5, all of the hardware units are utilized. This is the ideal situation, and what makes pipelined processors so fast.

118 Performance Revisited Assuming the following functional unit latencies:
Inst mem   Reg Read   ALU   Data Mem   Reg Write
3ns        2ns        2ns   3ns        2ns
What is the cycle time of a single-cycle implementation? What is its throughput? What is the cycle time of an ideal pipelined implementation? What is its steady-state throughput? How much faster is pipelining?

119 Ideal speedup
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
sub $v0, $a0, $a1       IF   ID   EX   MEM  WB
and $t1, $t2, $t3            IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   MEM  WB
add $sp, $sp, -4                       IF   ID   EX   MEM  WB
In our pipeline, we can execute up to five instructions simultaneously. This implies that the maximum speedup is 5 times. In general, the ideal speedup equals the pipeline depth. Why was our speedup on the previous slide only 4 times? The pipeline stages are imbalanced: register file and ALU operations can be done in 2ns, but we must stretch that out to 3ns to keep the ID, EX, and WB stages synchronized with IF and MEM. Balancing the stages is one of the many hard parts in designing a pipelined processor.

120 The pipelining paradox
Clock cycle        1    2    3    4    5    6    7    8    9
lw $t0, 4($sp)     IF   ID   EX   MEM  WB
sub $v0, $a0, $a1       IF   ID   EX   MEM  WB
and $t1, $t2, $t3            IF   ID   EX   MEM  WB
or $s0, $s1, $s2                  IF   ID   EX   MEM  WB
add $sp, $sp, -4                       IF   ID   EX   MEM  WB
Pipelining does not improve the execution time of any single instruction. Each instruction here actually takes longer to execute than in a single-cycle datapath (15ns vs. 12ns)! Instead, pipelining increases the throughput, or the amount of work done per unit time. Here, several instructions are executed together in each clock cycle. The result is improved execution time for a sequence of instructions, such as an entire program.
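The numbers behind the speedup and the paradox follow directly from the unit latencies on the Performance Revisited slide (3ns, 2ns, 2ns, 3ns, 2ns). The sketch below assigns each latency to one pipeline stage, which is an assumption about how the units map to stages; the variable names are made up for this example.

```python
# Latency of the unit used in each stage, in ns.
LATENCY_NS = {"IF": 3, "ID": 2, "EX": 2, "MEM": 3, "WB": 2}

# Single-cycle: the clock must cover every unit in sequence.
single_cycle_time = sum(LATENCY_NS.values())

# Pipelined: the clock must cover the slowest stage.
pipelined_cycle_time = max(LATENCY_NS.values())

# Steady state: one instruction completes per cycle in both designs.
throughput_speedup = single_cycle_time / pipelined_cycle_time

# A single instruction still passes through all five stages.
pipelined_latency = len(LATENCY_NS) * pipelined_cycle_time

print(single_cycle_time, pipelined_cycle_time, throughput_speedup, pipelined_latency)
```

The speedup is 4 rather than 5 because the 2ns stages are stretched to the 3ns clock, and the same stretching is why one instruction takes 15ns in the pipeline against 12ns in the single-cycle design.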

121 Instruction set architectures and pipelining The MIPS instruction set was designed especially for easy pipelining. All instructions are 32 bits long, so the instruction fetch stage just needs to read one word on every clock cycle. Fields are in the same position in different instruction formats: the opcode is always the first six bits, rs is the next five bits, etc. This makes things easy for the ID stage. MIPS is a register-to-register architecture, so arithmetic operations cannot contain memory references. This keeps the pipeline shorter and simpler. Pipelining is harder for older, more complex instruction sets. If different instructions had different lengths or formats, the fetch and decode stages would need extra time to determine the actual length of each instruction and the position of the fields. With memory-to-memory instructions, additional pipeline stages may be needed to compute effective addresses and read memory before the EX stage.
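Fixed field positions mean that decoding is nothing more than a few shifts and masks that can be applied before the instruction type is even known. The sketch below uses the standard MIPS field layout (opcode in bits 31-26, rs in 25-21, rt in 20-16, rd in 15-11, shamt in 10-6, funct in 5-0, and a 16-bit immediate in 15-0); the function name and the two example words are illustrative.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,     # meaningful for R-type instructions
        "shamt": (word >> 6) & 0x1F,   # meaningful for R-type instructions
        "funct": word & 0x3F,          # meaningful for R-type instructions
        "imm": word & 0xFFFF,          # meaningful for I-type instructions
    }

lw_word = 0x8FA80004    # lw $8, 4($29)
sub_word = 0x00851022   # sub $2, $4, $5

print(decode(lw_word))
print(decode(sub_word))
```

The register file can start reading rs and rt right away for both words, because those two fields sit in the same bits in the R-type and I-type formats.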

122 Summary so far The pipelined datapath combines ideas from the single-cycle and multicycle processors that we saw earlier. It uses multiple memories and ALUs. Instruction execution is split into several stages. Pipeline registers propagate data and control values to later stages. The MIPS instruction set architecture supports pipelining with uniform instruction formats and simple addressing modes. Next, we'll start talking about hazards.

123 Welcome to Part 3: Memory Systems and I/O We've already seen how to make a fast processor. How can we supply the CPU with enough data to keep it busy? We will now focus on memory issues, which are frequently bottlenecks that limit the performance of a system. We'll start off by looking at memory systems in the remaining lectures. [Figure: block diagram of processor, memory, and input/output]

124 Cache introduction Today we'll answer the following questions. What are the challenges of building big, fast memory systems? What is a cache? Why do caches work? (Answer: locality.) How are caches organized? Where do we put things, and how do we find them?


More information

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture Computer Science 324 Computer Architecture Mount Holyoke College Fall 2009 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture. Idea:

More information

Instruction Set Architecture. "Speaking with the computer"

Instruction Set Architecture. Speaking with the computer Instruction Set Architecture "Speaking with the computer" The Instruction Set Architecture Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture Digital Design

More information

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Introduction to the MIPS Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Introduction to the MIPS The Microprocessor without Interlocked Pipeline Stages

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

Instructions: Language of the Computer

Instructions: Language of the Computer CS359: Computer Architecture Instructions: Language of the Computer Yanyan Shen Department of Computer Science and Engineering 1 The Language a Computer Understands Word a computer understands: instruction

More information

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set COMPSCI 313 S2 2018 Computer Organization 7 MIPS Instruction Set Agenda & Reading MIPS instruction set MIPS I-format instructions MIPS R-format instructions 2 7.1 MIPS Instruction Set MIPS Instruction

More information
