The Evolution of Microprocessors. Per Stenström

Size: px

Start display at page:

Download "The Evolution of Microprocessors. Per Stenström"

Sheena Williams
5 years ago
Views:

1 The Evolution of Microprocessors Per Stenström

2 Processor (Core) Processor (Core) Processor (Core) L1 Cache L1 Cache L1 Cache L2 Cache Microprocessor Chip Memory

3 Evolution of Microprocessors Multicycle Pipelined Superscalar Multicore and beyond Time

4 Multicycle Computers

5 Fetch next instruction Decode/Fetch data Execute Store result

6 Opcode Assembly code Meaning Comments ADD,SUB, ADD R1,R2,R3 R1=(R2) + (R3) Integer operations ADDI, SUBI ADDI R1,R2,#3 R1=(R2)+3 Immediate AND, OR, XOR AND R1,R2,R3 R1=(R2).AND.(R3) bit-wise logical AND,OR, XOR SLT SLT R1,R2,R3 If (R2)<(R3) then R1=1 else R1=0 ADD.D, SUB.D, MUL.D, DIV.D Test R2, R3 outcome in R1 ADD.D F1,F2,F3 F1=(F2)+(F3) Floatingpoint operations

7 Opcode Assembly code Meaning Comments LB,LH,LW,LD LW R1,#20(R2) R1=MEM[20+(R2)] For bytes, half-words, words and double words SB,SH,SW,SD SW R1,#20(R2) MEM[20+(R2)]=R1 L.S, L.D L.S F0,#20(R2) F0=MEM[20+(R2)] Single/double float load S.S, S.D S.S F0,#20(R2) MEM[20+(R2)]=F0 Single/double float store

8 Opcode Assembly code Meaning Comments BEQZ, BNEZ BEQZ R1,Label If (R1)=0 then PC=Label BEQ, BNE BEQ R1,R2,Label If (R1)=(R2) PC=Label Conditional branch-equal 0/not equal 0 Conditional branchequal/not equal J J Target PC=Target Target is an immediate field JR JR R1 PC=(R1) Target is in register JAL JAL Target R31 = PC + 4; PC = Target Jump to target and save return address in R31

9 PC Read Address [31..0] memory

10 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rs data Rt addr Rt data Registers [15..0]

11 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rs data Rt addr Rt data Registers ALU [15..0]

12 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rs data Rt addr Rt data Registers ALU [15..0]

13 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data ALU Effective addresss Displacement + (Rs) [15..0] Sign Extend

14 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data ALU Wr addr Rd data Data memory [15..0] Sign Extend

15 ALU Memory transfer PC Control Read Address [31..0] memory [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data ALU Wr addr Rd data Data memory [15..0] Sign Extend

16 +4 ALU Memory transfer PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Control Fetching the next instruction ALU Wr addr Rd data Data memory [15..0] Sign Extend

17 +4 ALU Memory transfer PC Read Address [31..0] memory ADD BEQ R1,R2,offset [15..11] [25..21] [20..16] Rs addr Rs data Rt addr Rt data Registers Shift Left 2 ADD ALU ZERO Wr addr Rd data Data memory Control [15..0] Sign Extend

18 +4 PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend

19 +4 PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] RegDst [15..0] Rs addr Rt addr RegWrite Registers Sign Extend Rs data Rt data Shift Left 2 ALUSrc ADD ALU ALUOp MemRead PCSrc Wr addr Rd data Data memory MemWrite MemToReg

20 INSTRUCTION FETCH +4 DECODE /OPERAND FETCH EXECUTE MEMORY ACCESS WRITE BACK PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend

21 Pipelined Single-Cycle Computer

22 The Conveyor Belt

25 LABEL: LD R2, 0(R10) LD R3, 0(R11) +4 ADD R1,R2,R3 SD 0(R12), R1 SUBI R4,R4,#1 PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory ADDI R10,R10,#4 ADDI R11,R11,#4 ADDI R12,R12,#4 BNEZ R4, LABEL [15..0] Sign Extend SUBI R4,R4,#1 SD 0(R12),R1 ADD R1,R2,R3 LD R3,0(R11) LD R2,0(R10)

26 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend SUBI R4,R4,#1 SD 0(R12),R1 ADD R1,R2,R3 LD R3,0(R11) LD R2,0(R10)

27 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend SUBI R4,R4,#1 SD 0(R12),R1 UNUSED ADD R1,R2,R3 LD R3,0(R11)

28 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend SUBI R4,R4,#1 SD 0(R12),R1 UNUSED UNUSED ADD R1,R2,R3

29 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend UNUSED BEQ R1,offset

30 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend UNUSED UNUSED BEQ R1,offset

31 Structural hazards +4 Data hazards Control hazards PC Read Address [31..0] memory ADD [15..11] [25..21] [20..16] Rs addr Rt addr Registers Rs data Rt data Shift Left 2 ADD ALU Wr addr Rd data Data memory [15..0] Sign Extend NEXT INST. UNUSED UNUSED BEQ R1,offset

32 Superscalar Microprocessors

33 IF ID IS EX WB Fetch Decode/ Dispatch Issue/Read Operands Execute Write Back

34 IQ 1 EX 1 IF ID IQ 2 EX 2 WB IQ N EX N Fetch Decode/ Dispatch Read Operands Execute Write Back

35 I-Q INT IF ID FP-Q FP WB Cycle: 1 M-Q MEM L.S F0,0(R1) Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

36 I-Q INT IF ID FP-Q FP WB Cycle: 2 M-Q MEM L.S F1,0(R2) L.S F0,0(R1) Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

37 I-Q INT IF ID FP-Q FP WB Cycle: 3 M-Q L.S F0,0(R1) MEM ADD.S F2,F1,F0 L.S F1,0(R2) Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

38 I-Q INT IF ID FP-Q FP WB Cycle: 4 M-Q L.S F1,0(R2) MEM L.S F0,0(R1) S.S F2,0(R1) ADD.S F2,F1,F0 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

39 I-Q INT IF ID ADD.S FP-Q F2,F1,F0 FP WB Cycle: 5 M-Q MEM L.S F1,0(R2) ADDI R1,R1,#4 S.S F2,0(R1) L.S F0,0(R1) Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

40 I-Q INT IF ID FP-Q ADD.S FP F2,F1,F0 WB Cycle: 6 M-Q S.S F2,0(R1) MEM ADDI R2,R2,#4 ADDI R1,R1,#4 L.S F1,0(R2) Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

41 I-Q ADDI R1,R1,#4 INT IF ID FP-Q ADD.S FP F2,F1,F0 WB Cycle: 7 M-Q S.S F2,0(R1) MEM SUBI R3,R3,#1 ADDI R2,R2,#4 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

42 I-Q INT ADDI R2,R2,#4 Cycle: 8 ADDI R1,R1,#4 IF ID FP-Q ADD.S FP F2,F1,F0 WB M-Q S.S F2,0(R1) MEM BNEZ R3,Loop SUBI R3,R3,#1 The ADDI R1,R1,#4 is being executed Out-of-program-order with respect to the S.S F2,0(R1)! Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

43 I-Q SUBI R3,R3,#1 INT IF ID FP-Q ADD.S FP F2,F1,F0 WB ADDI R2,R2,#4 Cycle: 9 M-Q S.S F2,0(R1) MEM L.S F0,0(R2) BNEZ R3,Loop ADDI R1,R1,#4 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

44 I-Q BNEZ R3,Loop INT SUBI R3,R3,#1 IF ID FP-Q ADD.S FP F2,F1,F0 WB Cycle: 10 M-Q S.S F2,0(R1) MEM L.S F1,0(R1) L.S F0,0(R2) ADDI R2,R2,#4 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

45 I-Q INT BNEZ R3,Loop IF ID FP-Q ADD.S FP F2,F1,F0 WB Cycle: 11 M-Q S.S F2,0(R1) MEM L.S F1,0(R1) L.S F0,0(R2) SUBI R3,R3,#1 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

46 I-Q INT BNEZ R3,Loop IF ID FP-Q FP WB Cycle: 12 M-Q L.S F0,0(R2) MEM S.S F2,0(R1) ADD,S F2,F1,F0 L.S F1,0(R1) ADD.S F2,F1,F0 Loop L.S F0,0(R1) L.S F1,0(R2) ADD.S F2,F1,F0 S.S F2,0(R1) ADDI R1,R1,#4 ADDI R2,R2,#4 SUBI R3,R3,#1 BNEZ R3,Loop

47 Multicore Computers

48 Processor (Core) Processor (Core) Processor (Core) L1 Cache L1 Cache L1 Cache L2 Cache Microprocessor Chip Memory

49 Blended Learning Lectures on youtube 9 interactive sessions, flipped class-room style 5 problem solution sessions 1 design exploration project Textbook: Parallel Computer Organization and Design Dubois, Annavaram, Stenström

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts Prof. Sherief Reda School of Engineering Brown University S. Reda EN2910A FALL'15 1 Classical concepts (prerequisite) 1. Instruction