EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage dictionaries are allowed. No reference material is allowed. There are 180 possible marks on the eam. Every bo with a heavy otline reqires an answer. There is an appendi listing IPS instrctions. Topic arks 1 Inpt-Otpt, Performance, and lti Core 45 marks 2 Converting C into IPS Assembly 20 marks 3 IPS Addressing odes 15 marks 4 IPS Assembly Programming 10 marks 5 Single Cycle Data Path 30 marks 6 Pipelined Data Path 30 marks 7 emory Hierarchy 30 marks Note: arks are shown for each qestion as a whole and also for their parts.
Qestion 1. Inpt-Otpt, Performance, and lti-core StdentId [45 arks] For each of the following statements, say whether it is tre or false, 2 marks each qestion. a) DA directly transfers data between an I/O device and memory. b) A DA controller does not need to inform the processor when the data transfer has completed. c) Peripheral devices are considered to be part of the physical address space. d) emory bses are a common way of connecting I/O devices to processors driven by handshaking. e) A simple way to ensre that a ser program does not read or write a peripheral device, is to ensre that the ser program s virtal memory space is set p so that no page corresponds to physical addresses set aside for specific devices. f) DA operations can be triggered at bs cycle bondary while interrpts can only be triggered at instrction bondary. 2 contined
StdentId g) ost DA controllers have a Stats Register. This register spplies information to the CPU. h) lti-core processors are ID: Different cores eecte different threads (ltiple Instrctions), operating on different parts of memory (ltiple Data). i) In a mlti-core mltiprocessor, each core has its own memory. j) In a IPS processor, when one thread is waiting for a floating point operation to complete, another thread can se the integer nits. k) There are two kinds of bses: Synchronos bses are more fleible; asynchronos bses are faster. l) The operating system plays a major role in handling I/O, acting as an interface between program and hardware. m) The processor always ses different bses to address memory and I/O devices. 3 contined
StdentId n) In brst transfer, a DA controller steals a cycle before every CPU instrction. o) The operating system sally prevents ser programs from interacting directly with I/O devices. For the net five qestions, choose ONE correct answer. 3 arks each qestion. p) A disadvantage of DA is that: A. Compter system performance is improved by direct transfer of data between memory and I/O devices, bypassing the CPU. B. CPU is free to perform operations that do not se system bses. C. In case of brst mode data transfer, the CPU is rendered inactive for relatively long periods of time. D. None of above. q) To speed p the eection of a program, yo can: A. Use fewer instrctions B. Use fewer cycles per instrction C. Use faster clock D. All of above r) Synchronos bses: A. Sffer overhead of handshaking B. Are proof against clock skew (there is no clock) C. Are clocked: All devices connected mst be rn at the same clock rate D. ake longer bses possible 4 contined
s) Asynchronos bses: StdentId A. Work with fast and slow devices B. Are inepensive and fast C. Use a simpler protocol than synchronos bses D. All of above t) The time reqired to transmit a signal on a bs in the compter depends on A. inimm block size B. Total delay time of all the devices on the bs C. Time to get control of the bs D. All of above 5 contined
Qestion 2. Converting C into IPS Assembly a) [10 marks] Rewrite the following program into IPS Assembly if ( i > j ) A[3] = i + j ; else A[4] = A[i] j; StdentId [20 arks] 6 contined
StdentId b) [10 marks] Rewrite the following program into IPS Assembly sm = 0; for (i=1;i<n;i++){ sm=sm+1;} 7 contined
Qestion 3. IPS Addressing odes StdentId [15 arks] Briefly eplain each of the following IPS addressing modes, inclding how the addressing mode works and any limitations it has. For each mode list two commands which cold se the addressing mode. a) Register addressing b) PC-relative addressing c) Base addressing 8 contined
SPARE PAGE FOR EXTRA S StdentId Cross ot the rogh working that yo do not want marked. Specify the qestion nmber for work yo do want marked. 9 contined
Qestion 4. IPS Assembly Programming[10 arks] StdentId Give the console otpts of the IPS program below, for each of the following inpt sets: a) [3 arks] 25. b) [3 arks] -7. c) [4 arks] 5, and then 1, 2, 4, 5, 9. The IPS program ses several instrctions that were not covered in lectres. Yo will need to read the brief reference material provided after the program in order to nderstand the program. The same reference material is also given in the Appendi to this eam..data str1:.asciiz " Please give an integer from 1 to 20: " errormsg:.asciiz "Ot of range (1-20). \n" nline:.asciiz "\n" # new line.tet error: li $v0, 4 la $a0, errormsg syscall #print error msg j get.globl main main: addi $s0, $zero,21 #s0=21 get: li $v0, 4 la $a0, str1 syscall #print string1 li $v0, 5 syscall #read int slt $s1, $v0, $s0 beq $s1, $zero, error blez $v0, error move $t0, $v0 add $t1, $0, $0 loop: addi $t1, $t1, 1 li $v0, 1 move $a0, $t1 syscall #print int li $v0, 4 la $a0, nline syscall #print nline slt $t2, $t1, $t0 bnez $t2, loop li $v0, 10 #eit program syscall 10 contined
StdentId IPS Reference la register, variable Load address of variable into register li register, vale Load 32-bit vale into register blez register, target branch to target if register is less than or eqal to zero bnez register, target branch to target if register is not eqal to zero move register1, register2 copies contents of register2 to register 1 syscall invokes a system call, for eample, for reading or printing to the console. Usage of syscall: Step 1. Load the service nmber in register $v0. Step 2. Load argment vales, if any, in $a0, $a1, $a2, or $f12 as specified. Step 3. Isse the SYSCALL instrction. Step 4. Retrieve retrn vales, if any, from reslt registers as specified. Service nmbers and registers *print integer: Code 1 in $v0: $a0 = vale to print *print string: Code 4 in $v0: $a0 = address of nllterminated string to print *read integer: Code 5 in $v0: $v0 willcontain integer read Eample: display the vale stored in $t0 on the console li $v0, 1 add $a0, $t0, $zero register $a0 syscall # service 1 is print integer # load desired vale into argment 11 contined
SPARE PAGE FOR EXTRA S StdentId Cross ot the rogh working that yo do not want marked. Specify the qestion nmber for work yo do want marked. 12 contined
Qestion 5. Single Cycle Data Path StdentId [30 marks] a) [10 marks] Consider the data_path of a IPS processor. Describe in one or two sentences what is each of the following data_path components sed for: i) [2 marks] Program conter (PC). ii) [2 marks] Register File. iii) [2 marks] ALU. iv) [2 marks] Control Unit. v) [2 marks] ltipleer. 13 contined
StdentId b) [7 marks] Consider the diagram of the Single Cycle Data Path in Figre 1 on the facing page. On the diagram, highlight only those lines and components that will be sed to eecte a IPS sw instrction. Trace the lines and circle the components sing a colored pen. Do not pay attention to control signals when answering this qestion. c) [3 marks] Consider the diagram of the Single Cycle Data Path in Figre 1 on the facing page again. Assme a sw instrction is taken. On the diagram, highlight only those control signals that will be asserted dring the eection of the IPS sw instrction. Trace these control signals sing a colored pen. d) [10 marks] Opcode of the IPS bne (branch not eqal) instrction is 000101 2. Opcode of the IPS beq (branch on eqal) instrction is 000100 2. Opcode of the IPS j (jmp) instrction is 000010 2. Design combinational logic blocks that will act as parts of the Single Cycle Data_Path Control Unit and generate Zero&Branch and Jmp control signals in Figre 1. 14 contined
StdentId contined 15 RegWrite emwrite shift left 2 26 0 1 1 0 Instrction[25-0] 28 PC+4[31-28] JmpAddress[31-0] 4 32 32 Jmp emtoreg ALUSrc RegDest zero Zero&Branch Read address Instrction [31-0] 0 1 Sign et PC shift left 2 1 0 0 1 Add Instrction memory ALU Data memory Add 4 Instrction[25-21] Instrction[20-16] Instrction[15-11] Instrction[15-0] Address Write data Read data reslt Read register 1 Read register 2 Write register Write data Read data 1 Read data 2 16 32 Register s Figre 1.
StdentId Qestion 6: Pipelined Data Path [30 marks] a) [2 marks] What is pipelining and what is the epected benefit of pipelining? b) [2 marks] What is a data hazard? c) [2 marks] Sppose there is neither hardware detection nor prevention of hazards. How many nops are needed between any two sbseqent instrctions being involved in a hazard? 16
ALUSelB ALUSelA StdentId d) [8 marks] Assme yo rn a IPS assembly program on a processor whose data_path contains forwarding nit as those that we considered in lectres. i) [3 marks] Consider the following fragment of a IPS assembly program. add $s0, $t0, $t1 sw $t2, 0($s2) sb $s2, $s0, $t3 If yo think there shold be nops inserted between instrctions to avoid hazards, then insert the necessary nops and show the altered program fragment in the answer bo below. If yo think nops are not needed, jst copy the program fragment above in the answer bo below. ii) [5 marks] Consider the data_path with forwarding in Figre 2. below and the fragment of the IPS assembly program in the answer bo above. If yo think the forwarding nit still has to ndertake some actions to prevent hazards, write instrctions of the program fragment in the appropriate pipeline stages in Figre 2. Take a colored pen and trace only those lines and circle the components in Figre 2 that will be sed to perform the needed forwarding. Note: The data_path in Figre 2 is not complete, bt contains all components yo may need to answer the qestion. ID/EX EX/E E/WB Register File Data memory rt rs Forwarding nit rd rd Figre 2. 17
StdentId e) [8 marks] Assme yo rn a IPS assembly program on a processor whose data_path contains a forwarding nit as one considered in lectres. i) [3 marks] Consider the following fragment of a IPS assembly program. add $s0, $t0, $t1 sw $t2, 0($s0) sb $s2, $s2, $t3 If yo think there shold be nops inserted between instrctions to avoid hazards, then insert the necessary nops and show the altered program fragment in the answer bo below. If yo think nops are not needed, jst copy the program fragment above in the answer bo below. ii) [5 marks] Consider the data_path with forwarding in Figre 3 below and the fragment of the IPS assembly program in the answer bo above. If yo think the forwarding nit still has to ndertake some actions to prevent hazards, write instrctions of the program fragment in the appropriate pipeline stages in Figre 3, take a colored pen and trace only those lines and circle the components that will be sed to perform the needed forwarding. Note: The data_path in Figre 3 is not complete, bt contains all components yo may need to answer the qestion. ID/EX EX/E E/WB sign etend ALUSrc Data memory Forwarding nit rd rd Figre 3. 18
ALUSelB ALUSelA StdentId f) [8 marks] Assme yo rn a IPS assembly program on a processor whose data_path contains a forwarding nit as one considered in lectres. i) [3 marks] Consider the following fragment of a IPS assembly program. add $s0, $t0, $t1 lw $t2, 0($s1) sb $s2, $s1, $t2 If yo think there shold be nops inserted between instrctions to avoid hazards, then insert the necessary nops and show the altered program fragment in the answer bo below. If yo think nops are not needed, jst copy the program fragment above in the answer bo below. ii) [5 marks] Consider the data_path with forwarding in Figre 4 below and the fragment of the IPS assembly program in the answer bo above. If yo think the forwarding nit still has to ndertake some actions to prevent hazards, show instrctions of the program fragment in the appropriate pipeline stages in Figre 4, take a colored pen and trace only those lines and circle the components that will be sed to perform the needed forwarding. Note: The data_path in Figre 4 is not complete, bt contains all components yo may need to answer the qestion. ID/EX EX/E E/WB Register File Data memory rt rs Forwarding nit rt Figre 4. 19
StdentId Qestion 7. emory Hierarchy [30 marks] a) [10 marks] In the table provided for the answer, list the names of the memory hierarchy levels, their typical size (capacity) in bytes or their mltiples, a typical time to read or write, and a typical amont of data read or written dring one read or right operation in bytes or their mltiples. name size r/w time amont b) [10 marks] Determining of the word offset length, slot nmber length, and the tag length are crcial for a correct memory to cache mapping. All these lengths are epressed by the nmber of memory address bits. Figre 5 below displays a memory address of a bits with its tag, slot (or slot set) nmber, word offset, and byte offset fields. Figre 5. Let the cache size be n = 2 s slots (s > 0), the way of cache set associativeness be m = 2 k (k > 0 and n > m), and the memory block size be w = 2 i (i > 0). The length of the byte offset is a constant b = 2. Recall, for m = 1, the cache is direct mapped, and for m = n, the cache is flly associative. Otherwise it is m-way set associative. i) [5 marks] Epress the tag length t as a fnction of a, s, k, i, and b. tag slot (or slot set) nmber word offset byte offset a - 1 0 20
StdentId ii) [5 marks] Sppose a = 32, n = 2048, m = 8, and w =16. How mch is the tag length t? c) [10 marks] Consider an 8-way set associative cash having 2048 slots of 16 words per block. Assme the cache is initially empty. The processor isses the following reqests: 00000000 11111111 00000000 11000000 00000000 11111111 00000000 11111111 00000000 10101010 01000000 11010101 00000000 10101010 01000000 11101001 i) [8 marks] In the answer bo below, show the cache content after satisfying the for processor reqests above. In yor answer show the content of the set slot nmber, valid bit, and tag fields of only those slots that are pdated when satisfying processor reqests. Slot Set Nmber V Tag W 0 W 15 ii) [2 marks] How many hits and misses will prodce the processor reqests above? Nmber of Hits = Nmber of isses = ******************* 21