SAE5C Computer Organization and Architecture. Unit : I - V


UNIT-I
- Evolution of Pentium and PowerPC
- Evolution of Computer
- Components, functions
- Interconnection
- Bus
- Basics of PCI
- Memory: Characteristics, Hierarchy
- Cache Memory: Principles, Design, Locality of reference
SAE5C-Computer Architecture 1

Computer Evolution: Pentium and PowerPC
Pentium Evolution
- Pre-x86 series processors
- x86 series: 16-bit processors
- x86 series: 32-bit processors
- x86 64-bit series
PowerPC Evolution
- In 1975, IBM's 801 minicomputer project introduced RISC concepts.
- Then the Berkeley RISC I processor was introduced.
- In 1986, IBM introduced a commercial RISC workstation product, the RT PC, which was not commercially successful because competing products performed as well or better.
- In 1990, IBM introduced the IBM RISC System/6000, a RISC-like superscalar machine.
- This design is referred to as the POWER architecture.

PowerPC
- IBM formed an alliance with Motorola (maker of the 68000 microprocessors) and Apple (which used the 68000 in the Macintosh) and produced the PowerPC architecture, which was derived from the POWER architecture.
- Superscalar RISC
The PowerPC family:
- 601: Quickly to market. 32-bit machine.
- 603: 32-bit, low-end desktop and portable. Lower cost and more efficient implementation.
- 604: Desktop and low-end servers. 32-bit machine. Much more advanced superscalar design. Greater performance.
- 620: High-end servers. 64-bit architecture.
- 740/750: Also known as G3. Two levels of cache on chip.
- G4: Increases parallelism and internal speed.
- G5: Improvements in parallelism and internal speed. 64-bit organization.

Evolution of Computer Systems: Component Functions

Interconnection Structure: Bus

Memory: Characteristics, Hierarchy

Cache Memory: Principles, Design, Locality

Locality and Caching
- Memory hierarchies take advantage of memory locality.
- Memory locality is the principle that future memory accesses are near past accesses.
- Memories take advantage of two types of locality:
  - Temporal locality -- near in time: we will often access the same data again very soon
  - Spatial locality -- near in space/distance: our next access is often very close to our last access (or recent accesses)

Memory hierarchies exploit locality by caching (keeping close to the processor) data likely to be used again.
This is done because we can build large, slow memories and small, fast memories, but we can't build large, fast memories.
If it works, we get the illusion of SRAM access time with disk capacity.
- SRAM (static RAM) -- 1-5 ns access time
- DRAM (dynamic RAM) -- 40-60 ns
- disk -- access time measured in milliseconds, very cheap

Cache Fundamentals
- cache hit -- an access where the data is found in the cache.
- cache miss -- an access where it isn't.
- hit time -- time to access the upper-level cache.
- miss penalty -- time to move data from the lower level to the upper level, then to the CPU.
- hit ratio -- percentage of accesses for which the data is found in the upper-level cache.
- miss ratio -- (1 - hit ratio).
- cache block size or cache line size -- the amount of data that gets transferred on a cache miss.
- instruction cache -- cache that only holds instructions.
- data cache -- cache that only holds data.
- unified cache -- cache that holds both (a unified L1 cache corresponds to the Princeton architecture).
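The terms above can be tied together with a small simulation. The following is a minimal sketch, assuming a direct-mapped cache; the function names, the cache geometry (4 lines of 4 words) and the timing figures are illustrative assumptions, not values from these slides. It counts hits and misses for an address trace and then combines hit time, miss ratio and miss penalty into an average access time.

```python
def simulate_direct_mapped(addresses, num_lines=4, block_size=4):
    """Return (hits, misses) for a direct-mapped cache.

    num_lines and block_size (in words) are illustrative assumptions.
    """
    lines = [None] * num_lines           # each entry holds the tag of the cached block
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size       # memory block containing this word
        index = block % num_lines        # cache line the block maps to
        tag = block // num_lines         # distinguishes blocks sharing a line
        if lines[index] == tag:
            hits += 1
        else:
            misses += 1
            lines[index] = tag           # a miss brings in the whole block
    return hits, misses


def average_access_time(hit_ratio, hit_time_ns, miss_penalty_ns):
    """Hit time plus the expected extra cost of misses."""
    return hit_time_ns + (1.0 - hit_ratio) * miss_penalty_ns


# A sequential sweep benefits from spatial locality:
# one miss per 4-word block, then three hits.
hits, misses = simulate_direct_mapped(range(16))
hit_ratio = hits / (hits + misses)
print(hits, misses, hit_ratio)
print(average_access_time(hit_ratio, hit_time_ns=2, miss_penalty_ns=50))
```

Repeating the same address several times shows temporal locality in the same way: only the first access misses.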

Cache Characteristics
- Cache Organization
- Cache Access
- Cache Replacement
- Write Policy

UNIT-II
- Main Memory
- Types of ROM
- Memory Chip Organization
- Types of DRAM
- External Memory
- Memory Disk
- Basics of RAID
- Optical Memory
- Memory Tapes

Main Memory: RAM
- Misnamed, as all semiconductor memory is random access
- Read/Write
- Volatile (contents are lost when power is switched off)
- Temporary storage
- Static or dynamic
  - Dynamic RAM is based on capacitors, which leak charge, so it needs refresh
  - Static RAM is based on flip-flops, which do not leak, so it does not need refresh

Memory Chip Organization

Types of DRAM
- Simple DRAM
- Fast Page Mode (FPM) DRAM
- Extended Data Out (EDO) DRAM
- Burst Extended Data Out (BEDO) DRAM
- Synchronous DRAM (SDRAM)
- Rambus DRAM (RDRAM)
- Double Data Rate (DDR) DRAM
Memory cell
- Control logic
- One memory cell per bit
- Cell consists of one or more transistors
- Not really a latch made of logic gates, but logically equivalent

External Memory

Types of External Memory
- Magnetic Disk
  - RAID
  - Removable
- Optical
  - CD-ROM
  - CD-Writable (WORM)
  - CD-R/W
  - DVD
- Magnetic Tape


RAID
- Redundant Array of Independent Disks, also expanded as Redundant Array of Inexpensive Disks
- 6 levels in common use
- Not a hierarchy
- Set of physical disks viewed as a single logical drive by the O/S
- Data distributed across physical drives
- Can use redundant capacity to store parity information
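To illustrate how redundant capacity can hold parity information, here is a minimal sketch assuming XOR parity, as used by the parity-based RAID levels. The helper names `parity_block` and `rebuild_missing` and the sample byte strings are hypothetical; a real array works on disk stripes, not small byte strings.

```python
from functools import reduce


def parity_block(blocks):
    """XOR equal-length byte blocks column by column to form a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """XOR of the parity with all surviving blocks yields the lost block."""
    return parity_block(list(surviving_blocks) + [parity])


# Three data "disks" plus one parity "disk" (sample data is made up).
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = parity_block(data)

# Pretend the second disk failed and rebuild it from the others.
recovered = rebuild_missing([data[0], data[2]], parity)
print(parity.hex(), recovered == data[1])
```

The same XOR that produces the parity also recovers any single lost block, which is why one extra disk's worth of capacity is enough to survive one disk failure in this scheme.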


UNIT-III
- I/O: External Devices, I/O Module
- Programmed I/O
- Interrupt-Driven I/O
- Computer Arithmetic
- Floating-Point Representation and Arithmetic
- Addressing Modes

I/O EXTERNAL DEVICES
- Data Transfer
- External Devices
- I/O Modules
- Programmed I/O
- Interrupt-Driven I/O
- Direct Memory Access (DMA)
- I/O Channels and Processors

I/O Module
- Interface to CPU and memory
- Interface to one or more peripherals
[Figure: generic model of an I/O module]

I/O Module Functions
- Control & Timing
- CPU Communication
- Device Communication
- Data Buffering
- Error Detection
Programmed I/O Steps
- CPU checks I/O module device status
- I/O module returns status
- If ready, CPU requests data transfer
- I/O module gets data from device
- I/O module transfers data to CPU
- Variations for output, DMA, etc.
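The steps above can be sketched as a polling loop. This is only an illustration: `FakeIOModule`, its `read_status` and `read_data` methods, and the "READY"/"BUSY" status strings are hypothetical stand-ins for real status and data registers.

```python
class FakeIOModule:
    """Hypothetical I/O module that reports READY on every third status poll."""

    def __init__(self, data, polls_until_ready=3):
        self._data = list(data)
        self._polls_until_ready = polls_until_ready
        self._polls = 0

    def read_status(self):
        # The module returns its status to the CPU.
        self._polls += 1
        return "READY" if self._polls >= self._polls_until_ready else "BUSY"

    def read_data(self):
        # The module hands over one item fetched from the device.
        self._polls = 0
        return self._data.pop(0)


def programmed_read(module, count):
    """The CPU busy-waits on the status, then requests each transfer itself."""
    received = []
    status_checks = 0
    for _ in range(count):
        while True:
            status_checks += 1
            if module.read_status() == "READY":
                break
        received.append(module.read_data())
    return received, status_checks


print(programmed_read(FakeIOModule("AB"), 2))
```

The count of status checks makes the cost visible: the CPU does nothing useful while it polls, which is the motivation for interrupt-driven I/O and DMA.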

I/O Module Diagram
[Figure: the system bus interface (data lines, address lines) connects to the data register, status/control register and input/output logic; each external device interface logic block exposes data, status and control lines to its external device]

Three I/O Techniques
- Programmed I/O
- Interrupt-Driven I/O
- Direct Memory Access (DMA)

Interrupt-Driven I/O: Simple Interrupt

Multiple Interrupts
- Each interrupt line has a priority
- Higher priority lines can interrupt lower priority lines
- With bus mastering, only the current master can interrupt
Direct Memory Access
- Interrupt-driven and programmed I/O require active CPU intervention
  - Transfer rate is limited
  - CPU is tied up
- DMA is the answer

Arithmetic & Logic Unit
- Performs arithmetic and logic operations on data: everything that we think of as computing
- Everything else in the computer is there to service this unit
- All ALUs handle integers
- Some may handle floating point (real) numbers
- May be a separate FPU (math co-processor)
- FPU may be on the same chip (486DX and later)

Integer Representation
- We have the smallest possible alphabet: the symbols 0 & 1 represent everything
- No minus sign
- No period
- Signed-Magnitude
- Two's complement
Benefits of 2's complement:
- One representation of zero
- Arithmetic works easily (see later)
- Negating is fairly easy:
    3 = 00000011
    Boolean complement gives 11111100
    Add 1 to LSB 11111101 (= -3)
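The negation example above can be checked with a few lines of code. This is a small sketch assuming an 8-bit word; the helper names `to_bits` and `negate` are illustrative.

```python
def to_bits(value, width=8):
    """Render the low `width` bits of value as a binary string."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def negate(pattern, width=8):
    """Two's complement negation: Boolean complement, then add 1."""
    mask = (1 << width) - 1
    complemented = ~pattern & mask      # flip every bit within the word
    return (complemented + 1) & mask    # add 1, discarding any carry out


print(to_bits(3))            # 00000011
print(to_bits(~3))           # 11111100
print(to_bits(negate(3)))    # 11111101
```

Negating zero gives zero again, which reflects the single representation of zero listed among the benefits.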


2's complement negation
- Taking the 2's complement (complement and add 1) computes the arithmetic negation of a number
- Compute y = 0 - x
- Or compute y such that x + y = 0
Addition and Subtraction
- For addition, use normal binary addition:
    0 + 0 = sum 0, carry 0
    0 + 1 = sum 1, carry 0
    1 + 1 = sum 0, carry 1
- Monitor the MSB for overflow
- Overflow cannot occur when adding 2 operands with different signs
- If the 2 operands have the same sign and the result has a different sign, overflow has occurred
- Subtraction: take the 2's complement of the subtrahend and add it to the minuend, i.e. a - b = a + (-b)
- So we only need addition and complement circuits
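The overflow rule and the "subtract by adding the complement" idea can be sketched as follows. The 8-bit width and the helper names (`encode`, `decode`, `add`, `subtract`) are assumptions made for illustration. The sketch applies the addition overflow rule as stated above and does not treat the corner case of negating the most negative value specially.

```python
def encode(value, width=8):
    """Signed integer -> two's complement bit pattern."""
    return value & ((1 << width) - 1)


def decode(pattern, width=8):
    """Two's complement bit pattern -> signed integer."""
    sign = 1 << (width - 1)
    return pattern - (1 << width) if pattern & sign else pattern


def add(a, b, width=8):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a + b) & mask
    # Overflow only when both operands share a sign and the result does not.
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow


def subtract(a, b, width=8):
    """a - b computed as a + (-b), using only complement and add."""
    mask = (1 << width) - 1
    negated_b = (~b + 1) & mask
    return add(a, negated_b, width)


print(add(encode(100), encode(27)))            # fits in 8 bits
print(add(encode(100), encode(28)))            # same signs, result sign flips
result, overflow = subtract(encode(5), encode(7))
print(decode(result), overflow)
```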

Flowchart of Integer Addition and Subtraction

Multiplication
- A complex operation compared with addition and subtraction
- Many algorithms are used, especially for large numbers
- The simple algorithm is the same long multiplication taught in grade school
  - Compute a partial product for each digit
  - Add the partial products

Multiplication Example
        1011    Multiplicand (11 dec)
      x 1101    Multiplier (13 dec)
    --------
        1011    Partial products
       0000     Note: if the multiplier bit is 1, copy the
      1011      multiplicand (place value); otherwise zero
     1011
    --------
    10001111    Product (143 dec)
Note: need double length result

Multiplication Algorithm
Repeat n times:
- If Q0 = 1, add M into A and store the carry in CF
- Shift CF, A, Q right one bit so that:
    An-1 <- CF
    Qn-1 <- A0
    Q0 is lost
Note that during execution Q contains bits from both the product and the multiplier.
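A minimal sketch of this shift-and-add loop for unsigned operands is shown below. It assumes the usual starting state, which the slide does not spell out: A = 0, M holds the multiplicand and Q holds the multiplier. The function name and the choice of clearing CF when no addition takes place are also assumptions of this sketch.

```python
def multiply_unsigned(multiplicand, multiplier, n=4):
    """Shift-and-add multiplication of two n-bit unsigned numbers."""
    mask = (1 << n) - 1
    M = multiplicand & mask
    Q = multiplier & mask
    A = 0
    for _ in range(n):
        CF = 0
        if Q & 1:                        # Q0 = 1: add M into A
            total = A + M
            CF = total >> n              # carry out of the n-bit add
            A = total & mask
        # Shift CF, A, Q right one bit as a single unit.
        Q = (Q >> 1) | ((A & 1) << (n - 1))   # Qn-1 <- A0, old Q0 is lost
        A = (A >> 1) | (CF << (n - 1))        # An-1 <- CF
    return (A << n) | Q                  # 2n-bit product held in A,Q


print(format(multiply_unsigned(0b1011, 0b1101), "08b"))   # 11 x 13
```

With the operands from the worked example (1011 and 1101), the A and Q registers end up holding 1000 and 1111, i.e. the product 10001111.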

Booth's Algorithm: Registers and Setup
- Three n-bit registers (A, M, Q) and a 1-bit register logically to the right of Q (denoted Q-1)
- Register setup:
    Q register <- multiplier
    Q-1 <- 0
    M register <- multiplicand
    A register <- 0
    Count <- n
- Product will be 2n bits in the A and Q registers
Booth's Algorithm: Control Logic
- Bits of the multiplier are scanned one at a time (the current bit Q0)
- As each bit is examined, the bit to its right is also considered (the previous bit Q-1)
- Then, for the pair Q0 Q-1:
    00: Middle of a string of 0s, so no arithmetic operation.
    01: End of a string of 1s, so add the multiplicand to the left half of the product (A).
    10: Beginning of a string of 1s, so subtract the multiplicand from the left half of the product (A).
    11: Middle of a string of 1s, so no arithmetic operation.
- Then shift A, Q and the bit Q-1 right one bit using an arithmetic shift
- In an arithmetic shift, the MSB remains unchanged
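The register setup and control logic above can be sketched in code as follows. This is an illustrative sketch for n-bit two's complement operands; the function name is an assumption, and with n-bit registers the most negative multiplicand (for example -8 when n = 4) is a known corner case that the sketch does not handle.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Booth's algorithm on n-bit two's complement operands."""
    mask = (1 << n) - 1
    msb = 1 << (n - 1)
    M = multiplicand & mask              # M register <- multiplicand
    Q = multiplier & mask                # Q register <- multiplier
    A = 0                                # A register <- 0
    q_minus_1 = 0                        # Q-1 <- 0
    for _ in range(n):                   # Count <- n
        pair = (Q & 1, q_minus_1)        # (Q0, Q-1)
        if pair == (1, 0):               # beginning of a string of 1s
            A = (A - M) & mask
        elif pair == (0, 1):             # end of a string of 1s
            A = (A + M) & mask
        # Arithmetic shift right of A, Q, Q-1: the MSB of A is kept.
        q_minus_1 = Q & 1
        Q = (Q >> 1) | ((A & 1) << (n - 1))
        A = (A >> 1) | (A & msb)
    product = (A << n) | Q               # 2n-bit result in A,Q
    if product & (1 << (2 * n - 1)):     # interpret as a signed value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(7, 3))
print(booth_multiply(3, -8))
```

Because only the 10 and 01 pairs trigger arithmetic, a long run of 1s in the multiplier costs one subtraction and one addition rather than one addition per bit.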

Booth's Algorithm Flowchart

Booth's Algorithm Multiplication Example

Addressing Modes
https://www.youtube.com/watch?v=03fhijh6e2w
https://www.youtube.com/watch?v=p9wxyix-j-c

Unit : IV
- CPU: Organization of Processors and Registers
- Instruction pipelining
- RISC: Characteristics
- Large Register files
- RISC vs CISC
- Characteristics of Pipeline

Organization of Processors and Registers: Instruction Pipelining
To improve the performance of a CPU we have two options:
1) Improve the hardware by introducing faster circuits.
2) Arrange the hardware such that more than one operation can be performed at the same time.
Pipelining: Pipelining is the arrangement of the hardware elements of the CPU such that its overall performance is increased. In a pipelined processor, more than one instruction is in execution at the same time.

Let us see a real-life example that works on the concept of pipelined operation. Consider a water bottle packaging plant. Let there be 3 stages that a bottle should pass through: inserting the bottle, filling water in the bottle, and sealing the bottle. Let us consider these stages as stage 1, stage 2 and stage 3 respectively. Let each stage take 1 minute to complete its operation.
In a non-pipelined operation, a bottle is first inserted in the plant, and after 1 minute it is moved to stage 2, where water is filled. Meanwhile, nothing is happening in stage 1. Similarly, when the bottle moves to stage 3, both stage 1 and stage 2 are idle.
In a pipelined operation, when the bottle is in stage 2, another bottle can be loaded at stage 1. Similarly, when the bottle is in stage 3, there can be one bottle each in stage 1 and stage 2. So, once the pipeline is full, we get a new bottle at the end of stage 3 every minute. Hence, the average time taken to manufacture 1 bottle is:
Without pipelining = 1 min + 1 min + 1 min = 3 minutes
With pipelining = 1 minute (once the pipeline is full)
Thus, pipelined operation increases the throughput of a system.
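The bottle example can be turned into a small calculation. The sketch below assumes k equal-length stages and counts the time to finish n items, including the time needed to fill the pipeline; the function name and the figure of 100 bottles are illustrative.

```python
def total_time(items, stages=3, stage_time=1, pipelined=True):
    """Minutes needed to finish `items` bottles."""
    if pipelined:
        # The first bottle needs all stages; each later one finishes
        # one stage time after its predecessor.
        return (stages + items - 1) * stage_time
    # Without pipelining, every bottle occupies the whole plant in turn.
    return items * stages * stage_time


bottles = 100
slow = total_time(bottles, pipelined=False)
fast = total_time(bottles, pipelined=True)
print(slow, fast, round(slow / fast, 2))
```

For 100 bottles this gives 300 minutes against 102, so the speedup approaches the number of stages (3) as the number of bottles grows, but never quite reaches it because of the fill time.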

Design of a basic pipeline
- In a pipelined processor, a pipeline has two ends, the input end and the output end.
- Between these ends, there are multiple stages/segments such that the output of one stage is connected to the input of the next stage, and each stage performs a specific operation.
- Interface registers are used to hold the intermediate output between two stages. These interface registers are also called latches or buffers.
- All the stages in the pipeline, along with the interface registers, are controlled by a common clock.

Execution in a pipelined processor
The execution sequence of instructions in a pipelined processor can be visualized using a space-time diagram, with the stages on one axis and the clock cycles on the other. For example, consider a processor having 4 stages and let there be 2 instructions to be executed.
[Figure: space-time diagrams for this example]
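As a rough illustration of such a diagram, the sketch below builds one for an ideal pipeline with no stalls, using the 4 stages and 2 instructions from the example. The function name and the labels (`S1`, `I1`, `--` for an idle slot) are assumptions made for this sketch.

```python
def space_time_diagram(num_stages=4, num_instructions=2):
    """Rows are stages, columns are clock cycles; '--' marks an idle stage."""
    cycles = num_stages + num_instructions - 1
    rows = []
    for stage in range(num_stages):
        row = []
        for cycle in range(cycles):
            instr = cycle - stage        # instruction occupying this stage now
            if 0 <= instr < num_instructions:
                row.append(f"I{instr + 1}")
            else:
                row.append("--")
        rows.append(row)
    return rows


for number, row in enumerate(space_time_diagram(), start=1):
    print(f"S{number}: " + " ".join(row))
```

Each instruction moves one stage to the right per cycle, so the two instructions finish in 4 + 2 - 1 = 5 cycles instead of 8.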

Pipeline Stages
A RISC processor has a 5-stage instruction pipeline to execute all the instructions in the RISC instruction set. The 5 stages of the RISC pipeline and their respective operations are:
- Stage 1 (Instruction Fetch): the CPU reads the instruction from the memory address held in the program counter.
- Stage 2 (Instruction Decode): the instruction is decoded and the register file is accessed to get the values of the registers used in the instruction.
- Stage 3 (Instruction Execute): ALU operations are performed.
- Stage 4 (Memory Access): memory operands named in the instruction are read from or written to memory.
- Stage 5 (Write Back): the computed or fetched value is written back to the register named in the instruction.

Dependencies in a pipelined processor
There are mainly three types of dependencies possible in a pipelined processor:
1) Structural Dependency
2) Control Dependency
3) Data Dependency
These dependencies may introduce stalls in the pipeline.
Stall: A stall is a cycle in the pipeline without new input.
Structural dependency
This dependency arises due to a resource conflict in the pipeline. A resource conflict is a situation in which more than one instruction tries to access the same resource in the same cycle. A resource can be a register, memory, or the ALU.

Control Dependency (Branch Hazards)
This type of dependency occurs with transfer-of-control instructions such as BRANCH, CALL, JMP, etc. On many instruction set architectures, the processor will not know the target address of these instructions at the time it needs to insert the next instruction into the pipeline. Because of this, unwanted instructions are fed into the pipeline.
Data Hazards
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Hazards cause delays in the pipeline. There are mainly three types of data hazards:
1) RAW (Read after Write) [Flow dependency]
2) WAR (Write after Read) [Anti-dependency]
3) WAW (Write after Write) [Output dependency]
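The three data hazard types can be told apart by comparing the registers that two instructions read and write. The sketch below is a simplified illustration: the dictionary layout, the function name and the sample instructions are hypothetical, and it only reports potential hazards between an earlier and a later instruction without modelling pipeline timing.

```python
def classify_hazards(first, second):
    """Potential data hazards when `first` precedes `second` in program order.

    Each instruction is a dict with 'reads' and 'writes' sets of register names.
    """
    hazards = []
    if first["writes"] & second["reads"]:
        hazards.append("RAW")            # second reads what first writes
    if first["reads"] & second["writes"]:
        hazards.append("WAR")            # second overwrites what first reads
    if first["writes"] & second["writes"]:
        hazards.append("WAW")            # both write the same register
    return hazards


# ADD R1, R2, R3  followed by  SUB R4, R1, R5
add_instr = {"reads": {"R2", "R3"}, "writes": {"R1"}}
sub_instr = {"reads": {"R1", "R5"}, "writes": {"R4"}}
print(classify_hazards(add_instr, sub_instr))
```

Here SUB needs the value of R1 that ADD has not yet written back, which is the read-after-write (flow) dependency.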

https://www.youtube.com/watch?v=axgfev568c8
https://www.youtube.com/watch?v=_klfqh3dktm
https://www.youtube.com/watch?v=bg7ew15aevo

RISC vs CISC
https://www.youtube.com/watch?v=mdrukjovtau
https://www.youtube.com/watch?v=07cpxbfy7ji&t=2s
https://www.youtube.com/watch?v=4dhhkxeds-a

Unit : V
- Control Unit: Micro-operations, control of processors
- Hardwired implementation
- Micro-programmed control concepts
- Micro-instructions
- Sequencing
- General micro-instruction execution

CONTROL UNIT DESIGN
An important component of the CPU is the controller.
- A computer executes a program
- Fetch/execute cycle
- Each cycle has a number of steps (see pipelining)
- These steps are called micro-operations
- Each step does very little
- Atomic operation of the CPU

Fetch: 4 Registers
- Memory Address Register (MAR)
  - Connected to address bus
  - Specifies address for read or write operation
- Memory Buffer Register (MBR)
  - Connected to data bus
  - Holds data to write or last data read
- Program Counter (PC)
  - Holds address of next instruction to be fetched
- Instruction Register (IR)
  - Holds last instruction fetched

Fetch Sequence
- Address of next instruction is in PC
- Address is copied to MAR and placed on the address bus
- Control unit issues READ command
- Result (data from memory) appears on data bus
- Data from data bus copied into MBR
- PC incremented by 1 (in parallel with data fetch from memory)
- Data (instruction) moved from MBR to IR
- MBR is now free for further data fetches
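The fetch sequence can be sketched as a series of register transfers. In this illustration the CPU registers are held in a dictionary and memory is a plain list; the instruction strings and the function name are made up, and the steps that happen in parallel in hardware are simply written one after another.

```python
def fetch(cpu, memory):
    """Perform one instruction fetch as a series of micro-operations."""
    cpu["MAR"] = cpu["PC"]               # address of next instruction to MAR
    cpu["MBR"] = memory[cpu["MAR"]]      # READ: memory word arrives in MBR
    cpu["PC"] += 1                       # PC incremented by 1
    cpu["IR"] = cpu["MBR"]               # instruction moved from MBR to IR
    return cpu


memory = ["LOAD 5", "ADD 6", "STORE 7"]  # hypothetical instruction words
cpu = {"PC": 0, "MAR": None, "MBR": None, "IR": None}
fetch(cpu, memory)
print(cpu)
```

After one fetch, IR holds the instruction that was at address 0 and PC already points at the next instruction.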

Micro-programmed Control
- Use sequences of instructions (see earlier notes) to control complex operations
- Called micro-programming or firmware
- All the control unit does is generate a set of control signals
- Each control signal is on or off
- Represent each control signal by a bit
- Have a control word for each micro-operation
- Have a sequence of control words for each machine code instruction
- Add an address to specify the next micro-instruction, depending on conditions
Today's large microprocessors
- Many instructions and associated register-level hardware
- Many control points to be manipulated
- This results in a control memory that:
  - Contains a large number of words, corresponding to the number of instructions to be executed
  - Has a wide word width, due to the large number of control points to be manipulated
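To make the idea of "one bit per control signal" concrete, here is a small sketch. The four signal names, their bit positions and the three-word fetch micro-program are invented for illustration, and the next-address field is unconditional here, whereas a real micro-instruction may choose it depending on conditions.

```python
from enum import IntFlag


class Signal(IntFlag):
    """Hypothetical control signals, one bit each."""
    PC_TO_MAR = 0b0001
    MEM_READ = 0b0010
    PC_INCREMENT = 0b0100
    MBR_TO_IR = 0b1000


# Each micro-instruction: (control word, address of next micro-instruction).
FETCH_MICROPROGRAM = [
    (Signal.PC_TO_MAR, 1),
    (Signal.MEM_READ | Signal.PC_INCREMENT, 2),
    (Signal.MBR_TO_IR, 0),
]


def active_signals(control_word):
    """Names of the signals whose bit is set in the control word."""
    return [signal.name for signal in Signal if control_word & signal]


def trace(microprogram, start=0, steps=3):
    """Follow the next-address fields and list what each word switches on."""
    address = start
    history = []
    for _ in range(steps):
        word, next_address = microprogram[address]
        history.append((address, format(int(word), "04b"), active_signals(word)))
        address = next_address
    return history


for entry in trace(FETCH_MICROPROGRAM):
    print(entry)
```

With only four signals the control word is 4 bits wide; a processor with many control points needs a correspondingly wide word, which is the point made in the last bullets above.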

Control Unit, Micro-programmed Concept, Micro-instruction
https://www.youtube.com/watch?v=nknlnj-sq-k
https://www.youtube.com/watch?v=81v7jqlbtmi
https://www.youtube.com/watch?v=c1_9ur6a6ny
https://www.youtube.com/watch?v=cegpirqiei0

Sequencing, General Micro-instruction Execution
https://www.youtube.com/watch?v=ywyvcozp8ws