Chapter 5: Computer Systems Organization

Invitation to Computer Science, 6th Edition, modified by SJF, fall 2012

Objectives
In this chapter, you will learn about:
- The components of a computer system
- Putting all the pieces together: the Von Neumann architecture
- The future: non-Von Neumann architectures

Objectives
After studying this chapter, students will be able to:
- Enumerate the characteristics of the Von Neumann architecture
- Describe the components of a RAM system, including how fetch and store operations work, and the use of cache memory to speed up access time
- Explain why mass storage devices are important, and how DASDs like hard drives or DVDs work
- Diagram the components of a typical ALU and illustrate how the ALU data path operates

Objectives (continued)
- Describe the control unit's Von Neumann cycle, and explain how it implements a stored program
- List and explain the types of instructions and how they are encoded
- Diagram the components of a typical Von Neumann machine
- Show the sequence of steps in the fetch, decode, and execute cycle performed for a typical instruction
- Describe non-Von Neumann parallel processing systems

Introduction
- This chapter changes the level of abstraction
- Focus on functional units and computer organization
- A hierarchy of abstractions hides unneeded details
- Change focus from transistors to gates to circuits as the basic unit
The Components of a Computer System
Von Neumann architecture: the foundation for nearly all modern computers. Its characteristics:
- Central Processing Unit (CPU): arithmetic/logic unit and control unit
- Memory
- Input/output
- Stored program concept
- Sequential execution of instructions

Memory Hierarchy
From fast, expensive, and small (registers, cache, volatile RAM) to slow, cheap, and large (nonvolatile mass storage).

Memory and Cache
Information is stored in and fetched from memory:
- Read-Only Memory (ROM): built-in memory that can only be read, not written
- Random Access Memory (RAM): maps addresses to memory locations
- Cache memory: keeps values currently in use in faster memory to speed up access times

Memory and Cache (continued)
Memory: the functional unit where data is stored and retrieved.
- Read-Only Memory (ROM) can only be read
- Random Access Memory (RAM):
  - Organized into cells, each given a unique address
  - Equal time to access any cell
  - Cell values may be read and changed
  - Cell size/memory width is typically 8 bits
  - Maximum memory size/address space is 2^N, where N is the length of the address
  - The largest memory address is 2^N - 1
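The last two bullets can be checked directly; a minimal sketch (the function name is illustrative, not from the text):

```python
# Address space: with an N-bit address, memory holds 2**N cells,
# and the largest address is 2**N - 1.
def address_space(n_bits: int) -> tuple:
    """Return (number of addressable cells, largest address)."""
    size = 2 ** n_bits
    return size, size - 1

# A 16-bit address reaches 65,536 cells, numbered 0 through 65535.
print(address_space(16))   # (65536, 65535)
```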
Memory and Cache (continued)
Parts of the memory subsystem:
- Fetch/store controller
  - Fetch: retrieve a value from memory (load operation)
  - Store: write a value into memory
- Memory address register (MAR)
- Memory data register (MDR)
- Memory cells, with decoder(s) to select individual cells

Memory and Cache (continued)
- Fetch: retrieve from memory (nondestructive fetch)
- Store: write to memory (destructive store)
- Memory access time: the time required to fetch/store; modern RAM requires 5-10 nanoseconds
- The MAR holds the memory address to access
- The MDR receives data from a fetch and holds data to be stored

Memory and Cache (continued)
Memory system circuits: decoder and fetch/store controller
- The decoder converts the MAR contents into a signal to one specific memory cell
- One-dimensional versus two-dimensional memory organization
- The fetch/store controller acts as a traffic cop for the MDR:
  - Takes in a signal indicating fetch or store
  - Routes the data flow between the memory cells and the MDR

Memory and Cache (continued)
Fetch operation (read):
1. Load the address into the MAR
2. Decode the address in the MAR
3. Copy the contents of that memory location into the MDR

Store operation (write):
1. Place the address of the cell where the value should go in the MAR
2. Load the new value into the MDR
3. Decode the address in the MAR
4. Store the contents of the MDR into that memory location
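The fetch and store sequences above can be sketched as a toy memory subsystem; the class and method names here are illustrative, not from the text:

```python
class MemorySubsystem:
    """Toy RAM with a MAR, an MDR, and fetch/store operations."""

    def __init__(self, size: int):
        self.cells = [0] * size   # memory cells, selected by the decoder
        self.mar = 0              # memory address register
        self.mdr = 0              # memory data register

    def fetch(self, address: int) -> int:
        self.mar = address                 # 1. load address into MAR
        self.mdr = self.cells[self.mar]    # 2-3. decode, copy cell into MDR
        return self.mdr                    # nondestructive: cell unchanged

    def store(self, address: int, value: int) -> None:
        self.mar = address                 # 1. destination address into MAR
        self.mdr = value                   # 2. new value into MDR
        self.cells[self.mar] = self.mdr    # 3-4. decode, write MDR to cell

mem = MemorySubsystem(256)
mem.store(99, 42)
print(mem.fetch(99))   # 42
```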
Memory and Cache (continued)
Memory register:
- A very fast memory location
- Given a name, not an address (e.g., R1, PC, IR, MAR)
- Serves some special purpose (e.g., accumulator)
- Modern computers have dozens or even hundreds of registers

Memory and Cache (continued)
- RAM speeds have increased more slowly than CPU speeds, so the fastest memory is used to keep up with the CPU
- Registers are fastest but very expensive
- Cache memory is fast but expensive
- Principle of locality: values close to recently accessed memory are more likely to be accessed soon
- So load the neighbors of accessed values into the cache, and keep recently used values there
- Cache hit rate: the percentage of accesses for which the value is found in the cache

Input/Output and Mass Storage
Input/output (I/O) connects the processor to the outside world:
- Humans: keyboard, monitor, etc.
- Data storage: hard drive, DVD, flash drive
- Other computers: network

- RAM = volatile memory (contents are gone without power)
- Mass storage systems = nonvolatile memory
  - Direct access storage devices (DASDs)
  - Sequential access storage devices (SASDs)
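The payoff of a high cache hit rate shows up in the standard weighted-average access-time formula; the sketch below uses made-up times purely for illustration:

```python
def effective_access_time(hit_rate: float, cache_ns: float, ram_ns: float) -> float:
    """Average memory access time (ns), weighting cache hits against RAM accesses."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns

# With a 95% hit rate, a 1 ns cache in front of 10 ns RAM behaves,
# on average, like memory roughly 1.45 ns fast.
print(effective_access_time(0.95, 1.0, 10.0))
```

Note how sensitive the average is to the hit rate: dropping from 95% to 90% nearly doubles the time spent waiting on RAM.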
Input/Output and Mass Storage (continued)
Volatile storage: information disappears when the power is turned off (examples: RAM, cache, registers).
Nonvolatile storage: information does not disappear when the power is turned off (examples: mass storage devices such as flash drives, disks, and tapes).

Input/Output and Mass Storage (continued)
Direct access storage devices:
- Data is stored on a spinning disk
- The disk is divided into concentric rings (tracks) and subdivisions of tracks (sectors)
- The read/write head moves from one track to another while the disk spins
- Access time depends on:
  - Time to move the head to the correct track (seek time)
  - Time for the sector to spin to the data location (rotational latency)

I/O and Mass Storage (continued)
DASDs: hard drives, CDs, and DVDs contain disks
- Tracks: concentric rings around the disk surface
- Sectors: fixed-size segments of tracks; the unit of retrieval
- Time to retrieve data is based on: seek time, latency, transfer time
- Other, non-disk DASDs: flash memory, optical

Access time
- Seek time: time to position the read/write head over the correct track
- Latency: time for the correct sector to rotate under the read/write head
- Transfer time: time to read or write the data

Example (times in milliseconds):

                Best    Worst   Average
Seek time       0       19.98   10.00
Latency         0        8.33    4.17
Transfer time   0.13     0.13    0.13
Total           0.13    28.44   14.30

(The average seek crosses half the tracks; the average latency is half a revolution.)

I/O and Mass Storage (continued)
- DASDs and SASDs are orders of magnitude slower than RAM (microseconds or milliseconds)
- An I/O controller manages the data transfer with slow I/O devices, freeing the processor to do other work
- The controller sends an interrupt signal to the processor when an I/O task is done
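The average row of the access-time table follows from the half-the-tracks and half-a-revolution rules; a sketch using roughly the table's numbers (the function name is illustrative):

```python
def avg_access_time_ms(full_seek_ms: float, revolution_ms: float,
                       transfer_ms: float) -> float:
    """Average DASD access time: half the worst seek, plus half a
    revolution of latency, plus the transfer time."""
    avg_seek = full_seek_ms / 2       # head travels half the tracks on average
    avg_latency = revolution_ms / 2   # waits half a revolution on average
    return avg_seek + avg_latency + transfer_ms

# ~20 ms full seek, ~8.33 ms per revolution, 0.13 ms transfer
# gives an average of about 14.3 ms, matching the table's total.
print(avg_access_time_ms(20.0, 8.33, 0.13))
```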
The Arithmetic/Logic Unit (ALU)
The ALU performs arithmetic and logical operations and is located on the processor chip. The ALU includes:
- The registers
- The ALU circuitry
- The interconnections
Together, these components are called the data path.

The Arithmetic/Logic Unit
The ALU is part of the central processing unit (CPU):
- Contains circuits for arithmetic: addition, subtraction, multiplication, division
- Contains circuits for comparison and logic: equality, AND, OR, NOT
- Contains registers: super-fast, dedicated memory connected to the circuits
- Data path: how information flows in the ALU, from registers to circuits and from circuits back to registers

The Arithmetic/Logic Unit (ALU)
The actual computations are performed by primitive operation circuits:
- Arithmetic (ADD, etc.)
- Comparison (CE, etc.)
- Logic (AND, etc.)
Data inputs and results are stored in registers.
The ALU (continued)
How to choose which operation to perform?
- Option 1: a decoder signals exactly one circuit to run
- Option 2: run all the circuits, and let a multiplexor select one output from among them
In practice, option 2 is usually chosen.

Information flow:
- Data comes in from outside the ALU to the registers
- A signal tells the multiplexor which operation's result to select
- The result goes back to a register, and from there to the outside

The Arithmetic/Logic Unit (continued)
ALU process:
1. The values for the operation are copied into the ALU's input registers
2. All circuits compute results for those inputs
3. The multiplexor selects the one desired result from all the values
4. The result value is copied into the desired result register

Multiplexor
A multiplexor is a control circuit that selects one of its input lines and passes that line's value through to the output. To do so, it uses selector lines to indicate which input line to select.

A multiplexor has 2^N input lines, N selector lines, and one output. The N selector lines are set to 0s and 1s; when their values are interpreted as a binary number, they give the number of the input line that must be selected. With N selector lines you can represent numbers between 0 and 2^N - 1. The single output is the value on the input line whose number is formed by the selector lines.

Multiplexor Circuit
The 2^N input lines (numbered 0, 1, 2, ..., 2^N - 1) enter the multiplexor circuit; the N selector lines are interpreted as a binary number; there is a single output. Suppose the selector lines hold the value 00...01, which is equal to 1.
For this example, the output is the value on the input line numbered 1.

Multiplexor Circuit (continued)
A multiplexor with N = 1 has two input lines (0 and 1) and one selector line. A multiplexor can be built with AND-OR-NOT gates.
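The selector-line rule can be sketched directly (the function name is illustrative):

```python
def multiplexor(inputs: list, selectors: list) -> int:
    """Select one of 2**N input lines using N selector bits.

    The selector bits, read as a binary number, give the index of
    the input line whose value becomes the output."""
    n = len(selectors)
    assert len(inputs) == 2 ** n, "need 2**N input lines for N selector lines"
    index = int("".join(str(bit) for bit in selectors), 2)
    return inputs[index]

# Four input lines, two selector lines; selectors 0,1 read as binary 01 = 1,
# so the output is the value on input line 1.
print(multiplexor([7, 3, 9, 5], [0, 1]))   # 3
```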
The Control Unit
The control unit manages stored program execution, the fundamental characteristic of the Von Neumann architecture. Its task (called the machine cycle):
- Fetch from memory the next instruction to be executed
- Decode: determine what is to be done
- Execute: issue the appropriate commands to the ALU, memory, and I/O controllers

The Control Unit
Stored program characteristic:
- Programs are encoded in binary and stored in the computer's memory
- The control unit fetches instructions from memory, decodes them, and executes them
- Instructions are encoded as:
  - An operation code (op code) that tells which operation to perform
  - Addresses that tell which memory addresses/registers to operate on

Recall: the arithmetic/logic unit uses a multiplexor. (Figure: registers R, AL1, and AL2 feed the ALU circuits; the multiplexor's selector lines choose one output; the condition code register holds GT, EQ, LT. In the lab you will see where this goes.)

Machine Language Instructions
- All instructions are binary
- They can be decoded and executed by the control unit
Parts of an instruction:
- Operation code (op code): a unique unsigned-integer code assigned to each machine language operation
- Address field(s): the memory addresses of the values on which the operation will work
The Control Unit (continued)
Machine language:
- Binary strings that encode instructions
- Instructions can be carried out by hardware
- Sequences of instructions encode algorithms
Instruction set:
- The instructions implemented by a particular chip
- Each kind of processor speaks a different language

Machine Language Instruction
ADD X, Y (add the contents of addresses X and Y and put the sum back in Y). Assume that the op code for ADD is 9 and that X and Y correspond to addresses 99 and 100 (in decimal). This is how the instruction would appear in memory:

          Op code     Address X           Address Y
Binary    00001001    0000000001100011    0000000001100100
Decimal   9 (ADD)     99 (X)              100 (Y)
Bits      8           16                  16

Instruction Set
The set of all operations that can be executed by a processor. There is no general agreement on its contents; this is why a Power Mac cannot directly run programs designed for an Intel Pentium 4.
- RISC (reduced instruction set computers) have limited, simple instruction sets (30-50 instructions); each instruction runs fast
- CISC (complex instruction set computers) are more complex, more expensive, and more difficult to build (300-500 instructions)
- Most modern machines use a combination

The Control Unit (continued)
RISC machines and CISC machines:
- Reduced instruction set computers (RISC): small instruction sets; each instruction highly optimized; hardware easy to design
- Complex instruction set computers (CISC): large instruction set; a single instruction can do a lot of work; hardware complex to design
- Modern hardware combines some RISC and some CISC

The Control Unit (continued)
Kinds of instructions:
- Data transfer, e.g., move data from memory to a register
- Arithmetic, e.g., ADD, but also AND
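The ADD X, Y layout above (an 8-bit op code followed by two 16-bit address fields) can be reproduced with a little bit-packing; the function name is illustrative, but op code 9 and addresses 99 and 100 are taken from the slide's example:

```python
def encode(op_code: int, addr1: int, addr2: int) -> str:
    """Pack an 8-bit op code and two 16-bit addresses into a 40-bit string."""
    return f"{op_code:08b}{addr1:016b}{addr2:016b}"

# ADD X, Y with op code 9, X at address 99, Y at address 100.
word = encode(9, 99, 100)
print(word)        # 0000100100000000011000110000000001100100
print(len(word))   # 40
```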
- Comparison: compare two values
- Branch: change to a non-sequential instruction
Branching allows for conditional and loop forms. E.g., JUMPLT a = if the previous comparison of A and B found A < B, then jump to the instruction at address a.

Operations Available
- Arithmetic operations: load, store, clear, add, increment, subtract, decrement
- I/O operations: in, out
- Logic/control operations: compare, jump, jumpgt, jumpeq, jumplt, jumpneq, halt
There is a different operation code (op code) for each operation.
Machine Language Instructions (continued)
Operations of the machine language (the instruction set for our machine):
- Data transfer: move values to and from memory and registers
- Arithmetic/logic: perform ALU operations that produce numeric values
- Compares: set the bits of the compare register to hold the result
- Branches: jump to a new memory address to continue processing

The Control Unit (continued)
The control unit contains:
- Program counter (PC) register: holds the address of the next instruction
- Instruction register (IR): holds the encoding of the current instruction
- Instruction decoder circuit:
  - Decodes the op code of the instruction and signals helper circuits, one per instruction
  - The helpers send addresses to the proper circuits
  - The helpers signal the ALU, the I/O controller, and memory
Putting the Pieces Together---the Von Neumann Architecture
Combining the previous pieces yields a Von Neumann machine:
- Subsystems are connected by a bus (wires that permit data transfer)
- The machine repeats the Fetch-Decode-Execute cycle (also called the Von Neumann cycle) until a HALT instruction or an error:
  - Fetch phase: get the next instruction from memory
  - Decode phase: the instruction decoder extracts the op code
  - Execute phase: different for each instruction

Putting the Pieces Together---the Von Neumann Architecture (continued)
Notation for the computer's behavior:
- CON(A): the contents of memory cell A
- A -> B: send the value in register A to register B (special registers: PC, MAR, MDR, IR, ALU, R, GT, EQ, LT, +1)
- FETCH: initiate a memory fetch operation
- STORE: initiate a memory store operation
- ADD: instruct the ALU to select the output of the adder circuit
- SUBTRACT: instruct the ALU to select the output of the subtract circuit

Putting the Pieces Together---the Von Neumann Architecture (continued)
Fetch phase:
1. PC -> MAR      Send the address in the PC to the MAR
2. FETCH          Initiate a fetch; the data goes to the MDR
3. MDR -> IR      Move the instruction in the MDR to the IR
4. PC + 1 -> PC   Add one to the PC
Decode phase:
1. IR(op code) -> instruction decoder

The Control Unit (figure walkthrough):
1. After a fetch, the PC has been incremented by 1.
2. The fetch places the instruction in the IR.
3. The op code is sent to the decoder.
4. The decoder sends out the signals for execution (e.g., op code 1001 enables line 9, ADD).
5. All instructions except HALT use the address field; the address is sent to either the MAR or the PC, depending on the instruction.
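The fetch phase and the per-instruction execute steps can be combined into a toy Von Neumann cycle. The sketch below follows the slides' model (PC, IR, accumulator R; LOAD, STORE, ADD, HALT), but the op code numbers are made up, and instructions are stored as (op, address) pairs rather than binary, a deliberate simplification:

```python
LOAD, STORE, ADD, HALT = 1, 2, 3, 15   # illustrative op codes

def run(memory: list) -> list:
    """Repeat the fetch-decode-execute cycle until HALT, then return memory."""
    pc, r = 0, 0                  # program counter and accumulator R
    while True:
        ir = memory[pc]           # fetch: PC -> MAR, FETCH, MDR -> IR
        pc += 1                   # PC + 1 -> PC
        op, addr = ir             # decode: split op code and address field
        if op == LOAD:            # CON(X) -> R
            r = memory[addr]
        elif op == STORE:         # R -> CON(X)
            memory[addr] = r
        elif op == ADD:           # R + CON(X) -> R
            r += memory[addr]
        elif op == HALT:
            return memory

# Program in cells 0-3, data in cells 4-6:
# load cell 4, add cell 5, store the sum in cell 6, halt.
mem = [(LOAD, 4), (ADD, 5), (STORE, 6), (HALT, 0), 20, 22, 0]
print(run(mem)[6])   # 42
```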
Putting the Pieces Together---the Von Neumann Architecture (continued)
LOAD X, meaning CON(X) -> R:
1. IR(addr) -> MAR   Send address X to the MAR
2. FETCH             Initiate a fetch; the data goes to the MDR
3. MDR -> R          Move the data in the MDR to R

STORE X, meaning R -> CON(X):
1. IR(addr) -> MAR   Send address X to the MAR
2. R -> MDR          Send the data in R to the MDR
3. STORE             Initiate a store of the MDR into X
(See the animated examples at the end.)

ADD X, meaning R + CON(X) -> R:
1. IR(addr) -> MAR   Send address X to the MAR
2. FETCH             Initiate a fetch; the data goes to the MDR
3. MDR -> ALU        Send the data in the MDR to the ALU
4. R -> ALU          Send the data in R to the ALU
5. ADD               Select the ADD circuit as the result
6. ALU -> R          Copy the selected result into R

JUMP X, meaning get the next instruction from X:
1. IR(addr) -> PC    Send address X to the PC

COMPARE X, meaning:
  if CON(X) > R then GT = 1 else 0
  if CON(X) = R then EQ = 1 else 0
  if CON(X) < R then LT = 1 else 0
1. IR(addr) -> MAR   Send address X to the MAR
2. FETCH             Initiate a fetch; the data goes to the MDR
3. MDR -> ALU        Send the data in the MDR to the ALU
4. R -> ALU          Send the data in R to the ALU
5. SUBTRACT          Evaluate CON(X) - R; this sets GT, EQ, and LT

JUMPGT X, meaning if GT = 1 then jump to X, else continue to the next instruction:
1. IF GT = 1 THEN IR(addr) -> PC

Non-Von Neumann Architectures
Physical limits on speed:
- The problems to solve are always getting larger
- Computer chip speeds no longer increase exponentially
- Reducing feature size puts gates closer together, which is faster, but the speed of light limits signals through wire, and gates cannot be put much closer together
- Heat production increases too fast
Von Neumann bottleneck: the inability of sequential machines to handle larger problems.
Non-Von Neumann Architectures (continued)
Non-Von Neumann architectures: other ways to organize computers. Most are experimental or theoretical, EXCEPT parallel processing.
Parallel processing: use many processing units operating at the same time
- Supercomputers (in the past)
- Desktop multi-core machines (in the present)
- The cloud (in the future)

Non-Von Neumann Architectures (continued)
SIMD parallel processing: Single Instruction stream/Multiple Data streams
- The processor contains one control unit but many ALUs
- Each ALU operates on its own data
- All ALUs perform exactly the same instruction at the same time
- Used in older supercomputers for vector operations (sequences of numbers)

Non-Von Neumann Architectures (continued)
MIMD parallel processing, or clusters: Multiple Instruction streams/Multiple Data streams
- Replicate whole processors, each doing its own thing in parallel
- Communication over an interconnection network
- Off-the-shelf processors work fine
- Scalable: can always add more processors cheaply
- Communication costs can slow performance

MIMD
- MIMD computing is also scalable: scalable means that (theoretically) it is possible to match the number of processors to the size of the problem
- Massively parallel computers contain 10,000+ processors
- Grid computing: combining computing resources at multiple locations to solve a single problem
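The SIMD idea, one control unit issuing a single instruction that many ALUs apply to their own data in lockstep, can be mimicked in miniature (purely illustrative; real SIMD hardware does this in parallel, not in a Python loop):

```python
def simd_step(instruction, data: list) -> list:
    """Apply one instruction to every element, as if each 'ALU'
    held one element and all executed the same operation at once."""
    return [instruction(x) for x in data]

# One instruction ("add 1") applied across a whole vector of data.
vector = [10, 20, 30, 40]
print(simd_step(lambda x: x + 1, vector))   # [11, 21, 31, 41]
```

By contrast, MIMD would let each processing unit run a different instruction stream on its own data.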
Non-Von Neumann Architectures (continued)
Varieties of MIMD systems:
- Special-purpose systems: newer supercomputers
- Cluster computing: standard machines communicating over a LAN or WAN
- Grid computing: machines of varying power, over large distances/the Internet
  - Example: the SETI project (setiathome.ssl.berkeley.edu)
- Hot research area: parallel algorithms, needed to take advantage of all this processing power
- Cloud computing: Internet-based computing where shared resources are provided on demand

Quantum Computing
Quantum computing is based on the properties of matter at the atomic and subatomic level. It uses qubits (quantum bits) to represent data and perform operations. Research is being done to use these computers for security and cryptanalysis. http://en.wikipedia.org/wiki/quantum_computer

Summary of Level 2
Focus on how to design and build computer systems:
- Chapter 4: binary codes, transistors, gates, circuits
- Chapter 5: the Von Neumann architecture; shortcomings of the sequential model of computing; parallel computers

Summary
- We must abstract in order to manage system complexity
- The Von Neumann architecture is the standard for modern computing
- Von Neumann machines have memory, I/O, an ALU, and a control unit; programs are stored in memory; execution is sequential unless the program says otherwise
- Memory is organized into addressable cells; data is fetched and stored based on the MAR and MDR, using a decoder and a fetch/store controller

Summary (continued)
- Mass data storage is nonvolatile; disks store and fetch sectors of data stored in tracks
- I/O is slow and needs a dedicated controller to free the CPU
- The ALU performs computations, moving data to and from dedicated registers
- The control unit fetches, decodes, and executes instructions; instructions are written in machine language
- Parallel processing architectures can perform multiple instructions at one time
Some Interesting Resources
Massively parallel systems:
- http://www.top500.org
Grid computing (SETI@home):
- http://setiathome.ssl.berkeley.edu
Using the grid to benefit humanity:
- http://www.worldcommunitygrid.org/
- http://fightaidsathome.scripps.edu/
- http://cleanenergy.harvard.edu/
- http://www.worldcommunitygrid.org/research/cep2/overview.do
- http://en.wikipedia.org/wiki/list_of_distributed_computing_projects
Cloud computing:
- http://www.youtube.com/watch?v=ijcs7mun9e&feature=channel
Quantum computing:
- http://en.wikipedia.org/wiki/quantum_computer

Examples of instructions (animated slides; figures not reproduced): LOAD and STORE, then ADD and INCREMENT, showing the data flow through ALU1 and ALU2.
(Animated slides continue: COMPARE and JUMP, JUMPLT with the different condition code settings, and IN and OUT.)