
COURSE: Architectures for Multimedia Systems - Code: (073335) - Prof. C. SILVANO
Exam of 3 September 2010

SURNAME                 NAME                 STUDENT ID (MATRICOLA)

EXERCISE 1 - PIPELINE

Given the following loop expressed in a high-level language:

    do {
        vetta[i] = vetta[i] + vettb[i];
        if ( vetta[i] >= 0 ) {
            vettb[i] = vettb[i] + K;
        }
        i++;
    } while (i != N)

The program has been compiled into the MIPS assembly code reported in the following table. Assume that registers $t6 and $t7 have been initialised to the values 0 and N respectively. The symbols VETTA, VETTB, VETTC are predefined 16-bit constants. The processor clock frequency is 1 GHz. Consider a generic iteration of the loop executed by the MIPS processor in 5-stage pipeline mode.

a) Assuming there are NO optimisations in the pipeline and that (vetta[i] >= 0) holds in 50% of the cases:
1. Identify the RAW (Read After Write) data hazards and the control hazards.
2. Identify the number of stalls to be inserted before each instruction (or between the IF and ID stages of each instruction) necessary to solve the hazards.

Num. stalls   INSTRUCTION                 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16   Hazard Type
    DO:  lw   $t2, VETTA($t6)    IF ID EX ME WB
         lw   $t3, VETTB($t6)    IF ID EX ME WB
         add  $t2, $t2, $t3      IF ID EX ME WB
         sw   $t2, VETTA($t6)    IF ID EX ME WB
         slt  $t0, $t2, $0       IF ID EX ME WB
         bne  $t0, $0, INC       IF ID EX ME WB
         addi $t3, $t3, K        IF ID EX ME WB
         sw   $t3, VETTB($t6)    IF ID EX ME WB
    INC: addi $t6, $t6, 4        IF ID EX ME WB
         bne  $t6, $t7, DO       IF ID EX ME WB
    END:                         IF ID EX ME WB

NOTE: slt $t0, $t2, $0   # if $t2 < $0 then set $t0 = 1, otherwise $t0 = 0

Average Instruction Count:                          IC_AVE =
Average Number of Stalls:                           STALL_AVE =
Asymptotic CPI (N → ∞):                             CPI_AS =
Asymptotic Throughput expressed in MIPS (N → ∞):    MIPS_AS =
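The requested figures follow from the standard pipeline relations. Below is a minimal C sketch with names matching the blanks above; the numbers in the final comment are purely illustrative placeholders, not the exam solution.

    /* Asymptotic per-iteration metrics for the pipelined loop (N -> infinity). */
    double cpi_as(double ic_ave, double stall_ave)
    {
        return (ic_ave + stall_ave) / ic_ave;    /* CPI_AS = 1 + STALL_AVE / IC_AVE */
    }

    double mips_as(double cpi, double f_clock_hz)
    {
        return f_clock_hz / (cpi * 1.0e6);       /* MIPS_AS = f_clock / (CPI_AS * 10^6) */
    }

    /* Example with hypothetical values: cpi_as(10.0, 5.0) = 1.5 and
       mips_as(1.5, 1.0e9) ≈ 666.7 MIPS at the 1 GHz clock of this exercise. */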

b) Assuming the following optimisations in the pipeline and that (vetta[i] >= 0) holds in 50% of the cases:
- In the Register File it is possible to read and write at the same address in the same clock cycle;
- Forwarding;
- Computation of the PC and of the TARGET ADDRESS for branch & jump instructions anticipated in the ID stage.
1. Identify the RAW (Read After Write) data hazards and the control hazards.
2. Identify the number of stalls to be inserted before each instruction (or between the IF and ID stages of each instruction) necessary to solve the hazards.
3. Identify in the last column the forwarding path used.

Num. stalls   INSTRUCTION                 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16   Hazard Type
    DO:  lw   $t2, VETTA($t6)    IF ID EX ME WB
         lw   $t3, VETTB($t6)    IF ID EX ME WB
         add  $t2, $t2, $t3      IF ID EX ME WB
         sw   $t2, VETTA($t6)    IF ID EX ME WB
         slt  $t0, $t2, $0       IF ID EX ME WB
         bne  $t0, $0, INC       IF ID EX ME WB
         addi $t3, $t3, K        IF ID EX ME WB
         sw   $t3, VETTB($t6)    IF ID EX ME WB
    INC: addi $t6, $t6, 4        IF ID EX ME WB
         bne  $t6, $t7, DO       IF ID EX ME WB
    END:                         IF ID EX ME WB

NOTE: slt $t0, $t2, $0   # if $t2 < $0 then set $t0 = 1, otherwise $t0 = 0

Average Instruction Count:                          IC_AVE =
Average Number of Stalls:                           STALL_AVE =
Asymptotic CPI (N → ∞):                             CPI_AS =
Asymptotic Throughput expressed in MIPS (N → ∞):    MIPS_AS =
Asymptotic SpeedUp with respect to the first case:  SpeedUp_AS =
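Since the instruction count and the clock frequency are the same as in case a), the asymptotic speedup reduces to a ratio of CPIs. A minimal C sketch under that assumption; it applies equally to the speedup questions of part c) and of the later exercises.

    /* Asymptotic speedup of an optimised configuration over a baseline one,
       valid when instruction count and clock frequency are unchanged. */
    double speedup_as(double cpi_as_baseline, double cpi_as_optimised)
    {
        return cpi_as_baseline / cpi_as_optimised;   /* SpeedUp_AS = CPI_AS(base) / CPI_AS(opt) */
    }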

c) Assuming the previous optimisations in the pipeline, plus static branch prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with a BRANCH TARGET BUFFER, and that (vetta[i] >= 0) holds in 50% of the cases:
1. Identify the RAW (Read After Write) data hazards and the control hazards.
2. Identify the number of stalls to be inserted before each instruction (or between the IF and ID stages of each instruction) necessary to solve the hazards.
3. Identify the static branch prediction (Taken / Not Taken) for each branch.

Num. stalls   INSTRUCTION                 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16   Hazard Type
    DO:  lw   $t2, VETTA($t6)    IF ID EX ME WB
         lw   $t3, VETTB($t6)    IF ID EX ME WB
         add  $t2, $t2, $t3      IF ID EX ME WB
         sw   $t2, VETTA($t6)    IF ID EX ME WB
         slt  $t0, $t2, $0       IF ID EX ME WB
         bne  $t0, $0, INC       IF ID EX ME WB
         addi $t3, $t3, K        IF ID EX ME WB
         sw   $t3, VETTB($t6)    IF ID EX ME WB
    INC: addi $t6, $t6, 4        IF ID EX ME WB
         bne  $t6, $t7, DO       IF ID EX ME WB
    END:                         IF ID EX ME WB

NOTE: slt $t0, $t2, $0   # if $t2 < $0 then set $t0 = 1, otherwise $t0 = 0

Average Instruction Count:                          IC_AVE =
Average Number of Stalls:                           STALL_AVE =
Asymptotic CPI (N → ∞):                             CPI_AS =
Asymptotic Throughput expressed in MIPS (N → ∞):    MIPS_AS =
Asymptotic SpeedUp with respect to the first case:  SpeedUp_AS =
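As a reminder of the BTFNT rule used here, a minimal sketch; the function name and its arguments are an illustration, not something defined in the exam text.

    /* BTFNT static prediction: a backward branch (target below the branch's own
       address, e.g. the loop-closing bne $t6, $t7, DO) is predicted taken, while
       a forward branch (e.g. bne $t0, $0, INC) is predicted not taken. The Branch
       Target Buffer supplies the predicted target address already at fetch time. */
    int btfnt_predict_taken(unsigned int branch_pc, unsigned int target_pc)
    {
        return target_pc < branch_pc;   /* backward => taken, forward => not taken */
    }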

EXERCISE 2: SCOREBOARD

Assume the program is executed by a CPU with dynamic scheduling based on a SCOREBOARD with:
- 2 LOAD/STORE units (LDU1, LDU2) with latency 4
- 2 ALU/BR/J units (ALU1, ALU2) with latency 2
- Check of structural hazards in the ISSUE phase
- Check of RAW hazards in the READ OPERANDS phase
- Check of WAR and WAW hazards in the WRITE BACK phase
- Forwarding
- Static branch prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer

1. Assuming the case (vetta[i] >= 0) and considering the first iteration of the DO loop, fill in the following table, assuming all data cache HITS.

INSTRUCTION                  ISSUE   READ OPERANDS   EXECUTION COMPLETE   WRITE BACK   HAZARDS TYPE   UNIT
DO:  lw   $t2, VETTA($t6)
     lw   $t3, VETTB($t6)
     add  $t2, $t2, $t3
     sw   $t2, VETTA($t6)
     slt  $t0, $t2, $0
     bne  $t0, $0, INC
     addi $t3, $t3, K
     sw   $t3, VETTB($t6)
INC: addi $t6, $t6, 4
     bne  $t6, $t7, DO
END:
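For reference, a minimal sketch of the per-functional-unit bookkeeping a classic scoreboard maintains; the field names follow the usual textbook (Hennessy & Patterson) convention and are not prescribed by the exam.

    /* One functional-unit status entry of a classic scoreboard
       (e.g. LDU1, LDU2, ALU1, ALU2 in this exercise). */
    typedef struct {
        int busy;         /* unit is currently assigned an instruction          */
        int op;           /* operation to perform (lw, sw, add, slt, bne, ...)  */
        int fi;           /* destination register number                        */
        int fj, fk;       /* source register numbers                            */
        int qj, qk;       /* functional units producing sources j and k         */
        int rj, rk;       /* flags: source j / k ready and not yet read         */
    } ScoreboardFU;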

EXERCISE 3: SCOREBOARD with DATA CACHE MISSES

Assume the program is executed by a CPU with dynamic scheduling based on a SCOREBOARD with:
- 2 LOAD/STORE units (LDU1, LDU2) with latency 4
- 2 ALU/BR/J units (ALU1, ALU2) with latency 2
- 1 DATA MEMORY with latency 10 in case of DATA CACHE MISS
- Check of structural hazards in the ISSUE phase
- Check of RAW hazards in the READ OPERANDS phase
- Check of WAR and WAW hazards in the WRITE BACK phase
- Forwarding
- Static branch prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer

Assuming the case (vetta[i] >= 0) and considering the first iteration of the DO loop, fill in the following table, assuming all data cache MISSES.

INSTRUCTION                  ISSUE   READ OPERANDS   EXECUTION COMPLETE   MEM ACCESS COMPLETE   WRITE BACK   HAZARDS TYPE   UNIT
DO:  lw   $t2, VETTA($t6)
     lw   $t3, VETTB($t6)
     add  $t2, $t2, $t3
     sw   $t2, VETTA($t6)
     slt  $t0, $t2, $0
     bne  $t0, $0, INC
     addi $t3, $t3, K
     sw   $t3, VETTB($t6)
INC: addi $t6, $t6, 4
     bne  $t6, $t7, DO
END:
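A hedged sketch of how the miss penalty can enter the load/store timing; the additive model below is an assumption made for illustration, it is not spelled out in the exercise text.

    /* Assumption, not stated in the exercise: on a data cache miss the access
       costs the load/store unit latency plus the 10-cycle data memory latency. */
    int ldst_cycles(int ldu_latency, int mem_miss_latency, int is_miss)
    {
        return is_miss ? ldu_latency + mem_miss_latency   /* miss: e.g. 4 + 10 */
                       : ldu_latency;                     /* hit:  e.g. 4      */
    }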

EXERCISE 4 - TOMASULO

Assume the program is executed by a CPU with dynamic scheduling based on the TOMASULO algorithm with:
- 2 RESERVATION STATIONS (RS1, RS2) + 1 LOAD/STORE UNIT (LDU1) with latency 4
- 2 RESERVATION STATIONS (RS3, RS4) + 1 ALU/BR/J UNIT (ALU1) with latency 2
- Check of structural hazards for the RS in the ISSUE phase
- Check of RAW hazards and of structural hazards for the FUs in the START EXECUTE phase
- WRITE RESULT in RS and RF
- Forwarding
- Static branch prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer

1. Assuming the case (vetta[i] >= 0) and considering the first iteration of the DO loop, fill in the following table, assuming all cache HITS.

INSTRUCTION                  ISSUE   START EXEC   WRITE RESULTS   HAZARDS TYPE   RSi   UNIT
DO:  lw   $t2, VETTA($t6)
     lw   $t3, VETTB($t6)
     add  $t2, $t2, $t3
     sw   $t2, VETTA($t6)
     slt  $t0, $t2, $0
     bne  $t0, $0, INC
     addi $t3, $t3, K
     sw   $t3, VETTB($t6)
INC: addi $t6, $t6, 4
     bne  $t6, $t7, DO
END:

Calculate the asymptotic Speedup obtained by the Tomasulo algorithm with respect to the Scoreboard:  Speedup =
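For reference, a minimal sketch of a Tomasulo reservation-station entry; the fields are the conventional textbook ones, not something defined by the exam.

    /* One reservation station (e.g. RS1..RS4 in this exercise). */
    typedef struct {
        int  busy;        /* entry holds an issued, not yet completed instruction */
        int  op;          /* operation to perform on the operands                 */
        long vj, vk;      /* operand values, once available                       */
        int  qj, qk;      /* reservation stations producing Vj / Vk (0 = ready)   */
        int  a;           /* immediate / effective address for loads and stores   */
    } ReservationStation;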

EXERCISE 5 - TOMASULO

Assume the program is executed by a CPU with dynamic scheduling based on the TOMASULO algorithm with:
- 4 RESERVATION STATIONS (RS1, RS2, RS3, RS4) + 1 LOAD/STORE UNIT (LDU1) with latency 4
- 4 RESERVATION STATIONS (RSA, RSB, RSC, RSD) + 1 ALU/BR/J UNIT (ALU1) with latency 2
- Check of structural hazards for the RS in the ISSUE phase
- Check of RAW hazards and of structural hazards for the FUs in the START EXECUTE phase
- WRITE RESULT in RS and RF
- Forwarding
- Static branch prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer

1. Assuming the case (vetta[i] >= 0) and considering the first iteration of the DO loop, fill in the following table, assuming all cache HITS.

INSTRUCTION                  ISSUE   START EXEC   WRITE RESULTS   HAZARDS TYPE   RSi   UNIT
DO:  lw   $t2, VETTA($t6)
     lw   $t3, VETTB($t6)
     add  $t2, $t2, $t3
     sw   $t2, VETTA($t6)
     slt  $t0, $t2, $0
     bne  $t0, $0, INC
     addi $t3, $t3, K
     sw   $t3, VETTB($t6)
INC: addi $t6, $t6, 4
     bne  $t6, $t7, DO
END:

Calculate the asymptotic Speedup obtained with respect to the previous Tomasulo case:  Speedup =

EXERCISE 6 - VLIW ARCHITECTURES

Give a definition of VLIW ARCHITECTURE.

Please explain the main advantages of VLIW architectures.

What are the limiting factors for using VLIW architectures?

VLIW vs. Superscalar architectures: please try to compare them in terms of advantages/disadvantages.
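Purely as an illustration of the idea (no specific machine is implied), a VLIW long instruction can be pictured as a fixed-format bundle whose slots the compiler fills statically.

    /* Illustrative 4-slot VLIW bundle: the compiler assigns one operation per
       functional-unit slot at compile time and inserts NOPs where no independent
       operation is available. */
    typedef struct {
        unsigned int alu_op0;     /* slot for the first integer ALU  */
        unsigned int alu_op1;     /* slot for the second integer ALU */
        unsigned int mem_op;      /* slot for the load/store unit    */
        unsigned int branch_op;   /* slot for the branch unit        */
    } VliwBundle;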
