Latches. IT 3123 Hardware and Software Concepts. Registers. The Little Man has Registers. Data Registers. Program Counter


Transcription:

IT 3123 Hardware and Software Concepts
CPU and Memory, June 11
Copyright 2005 by Bob Brown
Notice: This session is being recorded.

Latches
- Can store one bit of data.
- Can be ganged together to store more bits; e.g., an 8-bit latch is really eight one-bit latches.
- [Figure: a latch between an input bus and an output bus; labels: Clock, Read Enable, Latch, Out, D-In, Write]

Registers
- Small, fast storage within the CPU.
- Dedicated to a particular purpose.
- Sizes are bits or bytes.
- At least conceptually, composed of the proper number of latches.
- The LMC has two explicit registers: the program counter and the calculator display (accumulator).

The Little Man has Registers
The Little Man could remember things; in a real computer, registers do that.
- The address of the next instruction: the Program Counter
- The calculator result: the Accumulator
- The memory location to read or write: the Memory Address Register
- The data transferred to or from memory: the Memory Data Register
- An instruction read from memory: the Instruction Register

Program Counter
- Holds the address of the next instruction.
- Updated shortly after an instruction is fetched.
- Can be changed (by the CPU) to implement branching.

Data Registers
- Generally thought of as part of the ALU.
- Number: between one and around a hundred; 16 or 32 is typical.
- The LMC has one data register: the display of the calculator.
- When a computer has only one data register, it is called an accumulator. (That's the A in LDA!)
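The Latches and Registers slides say that a register is, at least conceptually, the proper number of one-bit latches ganged together. The following is a minimal sketch of that idea, not a hardware description: the class names `Latch` and `Register` and the `write` signal are illustrative assumptions, and real latches are driven by clock edges rather than by method calls.

```python
class Latch:
    """One bit of storage; the stored bit changes only when write is asserted."""

    def __init__(self):
        self.bit = 0

    def clock(self, d_in, write):
        if write:
            self.bit = d_in & 1
        return self.bit


class Register:
    """A register modeled as `width` one-bit latches ganged together."""

    def __init__(self, width=8):
        self.latches = [Latch() for _ in range(width)]

    def write(self, value):
        # Bit i of the value goes to latch i; bits beyond the width are dropped.
        for i, latch in enumerate(self.latches):
            latch.clock((value >> i) & 1, write=True)

    def read(self):
        return sum(latch.bit << i for i, latch in enumerate(self.latches))


reg = Register(width=8)
reg.write(0b10100101)
print(reg.read())  # 165
```

Writing a value wider than the register simply loses the high bits, which is one way to see why register size matters.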

Memory Registers
- The memory address register holds an address in memory: the location from which to read data or to which to write data.
- The memory data register holds data: data read from memory or data to be written to memory.

Instruction Register
- Holds the instruction fetched from the memory location pointed to by the Program Counter.
- The Decode phase examines the contents of the instruction register to decide what operation to perform.
- The instruction was something the Little Man remembered; a real computer uses the Instruction Register.

Register Operations
- Store data values temporarily.
- Receive the results of arithmetic operations (addition, subtraction, etc.).
- Receive the results of logical operations (shift data, AND data).
- Test their contents for conditions such as zero or positive.

A Closer Look at the Architecture
- [Figure: CPU containing the PC, IR, control unit, MAR, MDR, accumulator, and ALU; a command line from the CPU to memory; memory mailboxes 00 through 99; I-O]

Conversations with Memory: The Memory Address Register (MAR)
- Holds one memory address.
- If memory is 128 bytes, how big must the MAR be? What if memory is 16 MB?
- The address in the MAR determines what memory location will be read or written. (Only one location can be read or written at a time.)

Conversations with Memory: The Memory Data Register (MDR)
- Holds one memory word.
- If memory words are 32 bits, how big must the MDR be?
- The MDR receives one word of data from memory on a read.
- The MDR holds one word of data to be transferred to memory on a write.
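The MAR sizing questions on the slide can be checked with a short calculation. This sketch assumes byte-addressable memory and takes 16 MB to mean 16 × 2^20 bytes; the function name `address_bits` is made up for illustration.

```python
def address_bits(cells):
    """Smallest number of bits that gives each of `cells` locations its own address."""
    if cells < 1:
        raise ValueError("memory must have at least one cell")
    return max(1, (cells - 1).bit_length())


print(address_bits(128))          # 7
print(address_bits(16 * 2**20))   # 24
```

Under those assumptions a 128-byte memory needs a 7-bit MAR and a 16 MB memory needs a 24-bit MAR. The MDR question is simpler: it holds one word, so 32-bit words need a 32-bit MDR.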

Reading from Memory
1. Place a memory address in the MAR.
2. Send a command (electronic signal) to memory to read.
3. The memory places the selected data word on a bus that is connected to the MDR.
4. The control unit can transfer the data from the MDR.

Writing to Memory
1. Place a memory address in the MAR.
2. Place the data to be written in the MDR.
3. The memory bus is connected to the MDR.
4. Send a command (electronic signal) to memory to write.
5. The memory stores data from the bus into the selected location.

Operation of Memory
- [Figure: operation of memory]

Operation of Memory: Example
- [Figure: individual memory bits; visual analogy for reading or writing; the least significant bit (lsb) is marked]
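The read and write sequences above can be sketched as a small model in which the CPU only ever touches the MAR and MDR, and a command moves data between the MDR and the selected cell. The class and attribute names here are illustrative assumptions, and the bus is modeled implicitly as a plain assignment.

```python
class Memory:
    def __init__(self, size=100):
        self.cells = [0] * size
        self.mar = 0  # memory address register: selects one cell
        self.mdr = 0  # memory data register: one word in transit

    def read(self):
        """Read command: the selected cell's contents arrive in the MDR."""
        self.mdr = self.cells[self.mar]

    def write(self):
        """Write command: the MDR's contents are stored in the selected cell."""
        self.cells[self.mar] = self.mdr


memory = Memory()

# Writing: address into the MAR, data into the MDR, then the write command.
memory.mar, memory.mdr = 99, 137
memory.write()

# Reading: address into the MAR, then the read command; the data appears in the MDR.
memory.mdr = 0
memory.mar = 99
memory.read()
print(memory.mdr)  # 137
```

Because there is a single MAR, only one location can be read or written per command, which matches the note on the MAR slide.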

Memory Capacity
- Limited by two factors:
  - Size of the MAR: k address bits can address 2^k cells.
  - Size of the address portion of an instruction (in the LMC, it is two digits).
- The amount of physical memory is important for performance.

Random Access Memory (RAM)
- Called random access because any cell may be addressed as fast as any other.
- Dynamic RAM (DRAM): loses contents when power is removed (volatile); must be refreshed thousands of times per second.
- Static RAM (SRAM): more expensive, faster, no refresh needed; loses contents when power is removed.

Read Only Memory (ROM)
- Non-volatile memory to hold software that is not expected to change over the life of the system.
- EEPROM: Electrically Erasable Programmable ROM; slower and less flexible than Flash ROM.
- Flash ROM: faster than disks but more expensive.
- Uses: BIOS (initial boot instructions and diagnostics); digital cameras, music players, thumb drives, etc.
- CMOS: very low power read-write memory; holds clock and configuration info.

The Instruction Cycle

The von Neumann Instruction Cycle
- Fetch: Get an instruction from the memory location pointed to by the program counter and advance the program counter.
- Decode: Determine what operation code is present, and what data to use.
- Execute: Perform the commanded operation.

Register Transfer
- Basic operation of the execute part of the instruction cycle: send the contents of one or two registers through the ALU.
- The result is stored in a register, possibly the same as one of the sending registers.
- Data are transformed according to the command (add, shift, etc.) given to the ALU.
- A "no operation" command can move data without changing it.
- Register operations are described with RTL.

Register Transfer Example
- [Figure: register transfer example; labels: PC, MAR, Write, A, IR, MDR, Enable]
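The fetch, decode, and execute phases of the von Neumann instruction cycle can be sketched as a loop. This is a simplified illustration, not the course's simulator: it assumes the conventional LMC numbering (0 = halt, 1xx = ADD, 3xx = STO, 5xx = LDA), implements only those four op codes, and ignores the three-digit limit of the calculator.

```python
def run(memory, max_steps=1000):
    """Run a tiny LMC-style program; returns the accumulator when the program halts."""
    pc = 0
    accumulator = 0
    for _ in range(max_steps):
        # Fetch: read the instruction the program counter points to, then advance it.
        instruction = memory[pc]
        pc += 1
        # Decode: first digit is the op code, last two digits are the address.
        opcode, address = divmod(instruction, 100)
        # Execute: perform the commanded operation.
        if opcode == 0:      # HLT
            return accumulator
        elif opcode == 1:    # ADD
            accumulator += memory[address]
        elif opcode == 3:    # STO
            memory[address] = accumulator
        elif opcode == 5:    # LDA
            accumulator = memory[address]
        else:
            raise ValueError(f"op code {opcode} not implemented in this sketch")
    raise RuntimeError("program did not halt")


memory = [0] * 100
memory[0:4] = [598, 199, 397, 0]   # LDA 98; ADD 99; STO 97; HLT
memory[98], memory[99] = 20, 22
print(run(memory))   # 42
print(memory[97])    # 42
```

The two-digit address field is why this memory has exactly 100 mailboxes: the address portion of the instruction limits capacity, as the Memory Capacity slide notes.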

The Complete Datapath
- [Figure: datapath diagram; labels: A, PC, IR, MAR, MDR, and the ALU ("The real ALU!") with Function and Status lines]

The Instruction Cycle
Fetch
- Read the program counter: PC → MAR (read memory)
- Get contents of indicated mailbox: MDR → IR
- Increment the program counter: PC+1 → PC
Decode
- Check op code: it's a STORE
- Determine next operation
- Read instruction address field: IR[add] → MAR
- Done
Execute
- Get calculator display value: A → MDR
- Store calculator value there (write memory)
- Done

The STO Instruction
1. PC → MAR: Transfer the address from the PC to the MAR.
2. PC+1 → PC: Program Counter incremented; memory read completes.
3. MDR → IR: Transfer the instruction to the IR.
4. IR[address] → MAR: Address portion of the instruction loaded in MAR.
5. A → MDR*: Accumulator copies its data into MDR, write memory.
*Notice how Step #5 differs for LOAD and STORE.

[Figures: datapath snapshots for the first three STO steps. Memory holds the instructions 399 (Store) and 199; the accumulator holds 137 throughout.]
- PC → MAR; Read: PC = 01, MAR = 01; a Read command goes to memory.
- PC+1 → PC; Memory Read Completes: PC = 02, MAR = 01, MDR = 399.
- MDR → IR: PC = 02, IR = 3 99, MAR = 01, MDR = 399.

[Figures: datapath snapshots for the last two STO steps.]
- IR[address] → MAR: PC = 02, IR = 3 99, accumulator = 137, MAR = 99, MDR = 399.
- A → MDR; Write: PC = 02, IR = 3 99, accumulator = 137, MAR = 99, MDR = 137; a Write command goes to memory, which now holds 137.

The LDA Instruction
1. PC → MAR: Transfer the address from the PC to the MAR, read memory.
2. PC+1 → PC: Program Counter incremented. (This is in a different place in Englander's diagram.)
3. MDR → IR: Transfer the instruction to the IR.
4. IR[address] → MAR: Address portion of the instruction loaded in MAR, read memory.
5. MDR → A: Actual data copied into the accumulator.

The ADD Instruction
1. PC → MAR: Transfer the address from the PC to the MAR.
2. PC+1 → PC: Program Counter incremented.
3. MDR → IR: Transfer the instruction to the IR.
4. IR[address] → MAR: Address portion of the instruction loaded in MAR, read memory.
5. A + MDR → A: Contents of MDR added to contents of accumulator.

Buses
- The physical connection that makes it possible to transfer data from one location in the computer system to another.
- Group of electrical conductors for carrying signals from one location to another.
- Line: each conductor in the bus.
- 4 kinds of signals: data (binary numbers: alphanumeric, numerical, instructions); addresses; control signals; power (sometimes).

Buses
- Connect CPU and memory.
- I/O peripherals: on same bus as CPU/memory or on a separate bus.
- Physical packaging commonly called backplane or motherboard.
- Also called system bus or external bus.
- Example of a broadcast bus.
- Part of the printed circuit board, called the motherboard, that holds the CPU and related components.
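The five register-transfer steps listed for STO, LDA, and ADD can be traced in code. The sketch below is an illustration under assumptions: registers live in a plain dictionary, op codes follow the conventional LMC numbering (1 = ADD, 3 = STO, 5 = LDA), and the function name `step` is made up. It starts from the state in the STO snapshots: PC = 01, accumulator = 137, and the instruction 399 in mailbox 01.

```python
def step(regs, memory):
    """Run one instruction as register transfers; returns the list of transfers made."""
    trace = []
    regs["MAR"] = regs["PC"]
    trace.append("PC -> MAR")
    regs["MDR"] = memory[regs["MAR"]]   # memory read completes
    regs["PC"] += 1
    trace.append("PC+1 -> PC")
    regs["IR"] = regs["MDR"]
    trace.append("MDR -> IR")
    opcode, address = divmod(regs["IR"], 100)
    regs["MAR"] = address
    trace.append("IR[address] -> MAR")
    if opcode == 3:                     # STO: step 5 writes memory
        regs["MDR"] = regs["A"]
        memory[regs["MAR"]] = regs["MDR"]
        trace.append("A -> MDR")
    elif opcode == 5:                   # LDA: step 5 reads memory
        regs["MDR"] = memory[regs["MAR"]]
        regs["A"] = regs["MDR"]
        trace.append("MDR -> A")
    elif opcode == 1:                   # ADD: step 5 reads memory and adds
        regs["MDR"] = memory[regs["MAR"]]
        regs["A"] += regs["MDR"]
        trace.append("A + MDR -> A")
    else:
        raise ValueError(f"op code {opcode} not implemented in this sketch")
    return trace


memory = [0] * 100
memory[1] = 399                         # STO 99
regs = {"PC": 1, "IR": 0, "MAR": 0, "MDR": 0, "A": 137}
print(step(regs, memory))
print(regs["PC"], regs["MAR"], regs["MDR"], memory[99])  # 2 99 137 137
```

Steps 1 through 4 are identical for all three instructions; only step 5 differs, which is the point of the footnote on the STO slide.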

Bus Characteristics
- Point to point vs. multipoint.
- Protocol: documented agreement for communication; a specification that spells out the meaning of each line and each signal on each line.
- Throughput, i.e., data transfer rate in bits per second.
- Data width in bits carried simultaneously.

Motherboard
- [Figure: motherboard]

Instructions
- Instruction: direction given to a computer; causes electrical signals to be sent through specific circuits for processing.
- Instruction set: the collection of instructions a given computer can perform. (The LMC has ten instructions; the list of them is its instruction set.)

Instruction Set Design
- Defines the functions performed by the processor.
- Differentiates computer architectures by the:
  - Number of instructions
  - Complexity of operations performed by individual instructions
  - Data types supported
  - Format (layout, fixed vs. variable length)
  - Use of registers
  - Addressing (size, modes)

Elements of an Instruction
- Operation code (op-code): tells the control unit and the ALU what to do.
- Operands: tell the location of the data to be used in the instruction.
  - Source operand: where to get the data.
  - Result operand: where to put the result. (Also called the destination operand.)
- The operands are (usually) addresses.

Operand Addresses
- Addresses may be explicit or implicit.
  - Explicit: encoded in the instruction. (The LMC memory address is explicit.)
  - Implicit: implied by the nature of the operand. (The LMC uses the calculator display implicitly.)
- Addresses may refer to memory or to registers.

General Form of an Instruction
- [Figure: instruction layout with OP-CODE, Source Operand, and Result Operand fields; widths shown: 4 bits, 20 bits]

Instruction Format
- Specific to a particular family of computers (architecture).
- Specifies the length of the op-code and the size and number of operand fields.
- A single computer may have several different instruction formats.

Complex Instruction-Set Computers
- Many different kinds of instructions.
- Many different instruction formats.
- Several different instruction lengths.
- A few different operation code lengths.
- Often things done in high-level languages can be performed in one instruction.
- Emphasis is on flexibility.

CISC Instruction Formats
- [Figure: CISC instruction formats]

Reduced Instruction-Set Computers
- A few kinds of instructions.
- A small number of formats.
- All instructions are the same length.
- All operation codes are the same length.
- High-level language statements generally require several instructions.
- Emphasis is on speed.
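The Instruction Format slide says a format fixes the length of the op-code and the size and number of operand fields. As an illustration only, the sketch below packs and unpacks a made-up fixed-length format: a 4-bit op-code followed by two 14-bit operand fields in one 32-bit word. The field widths and function names are hypothetical and do not describe any real architecture or the layout in the slide's figure.

```python
OPCODE_BITS = 4
OPERAND_BITS = 14  # hypothetical width for each of the two operand fields


def encode(opcode, source, result):
    """Pack op-code, source operand, and result operand into one 32-bit word."""
    fields = ((opcode, OPCODE_BITS), (source, OPERAND_BITS), (result, OPERAND_BITS))
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (opcode << (2 * OPERAND_BITS)) | (source << OPERAND_BITS) | result


def decode(word):
    """Split a 32-bit word back into (opcode, source, result)."""
    operand_mask = (1 << OPERAND_BITS) - 1
    opcode = (word >> (2 * OPERAND_BITS)) & ((1 << OPCODE_BITS) - 1)
    source = (word >> OPERAND_BITS) & operand_mask
    result = word & operand_mask
    return opcode, source, result


word = encode(3, 100, 200)
print(decode(word))  # (3, 100, 200)
```

With a fixed format like this, decoding is a few shifts and masks, which is part of why RISC designs favor instructions of one length. A 4-bit op-code also caps the instruction set at 16 operations, showing how format choices constrain the design.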

RISC Instruction Formats
- [Figure: RISC instruction formats]

Categories of Instructions
- Data transfer instructions
- Arithmetic instructions
- Logical operations
- Program control
- Stack manipulation
- I-O and machine control
- Multiple-data instructions

Data Transfer Instructions
- Move data between registers in the CPU.
- Transfer data from memory to a CPU register (load).
- Transfer data from a CPU register to memory (store).
- Size of a single transfer: generally the size of a data register; a word.
- Words are 8, 16, 32, 64, or 128 bits.
- 32-bit words are currently most common.

Arithmetic Instructions
- The usual suspects: +, -, /, *
- Separate instructions for integer and floating point operands.
- Shift and rotate instructions:
  - A one-bit shift left multiplies by two.
  - A one-bit shift right divides by two.
  - Rotate: bits shifted out one end are used as replacement bits at the other end.
- Increment, complement, etc.

Shift and Rotate Instructions
- [Figure: shift and rotate instructions]

Logical Operations
- Logical AND and OR of two operands.
- Sometimes others: XOR, NOR, NOT.
- Relational operations: >, <, =
- Testing for zero, positive, negative.
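The shift and rotate behavior described under Arithmetic Instructions can be tried directly. This sketch assumes unsigned values in an 8-bit register (the width is an arbitrary choice for illustration), so a bit shifted out of the left end is lost and a right shift fills with zero.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def shift_left(value):
    """One-bit shift left: multiplies by two, dropping any bit shifted out."""
    return (value << 1) & MASK


def shift_right(value):
    """One-bit shift right: divides by two, discarding the remainder."""
    return value >> 1


def rotate_left(value):
    """One-bit rotate left: the bit shifted out the top re-enters at the bottom."""
    return ((value << 1) | (value >> (WIDTH - 1))) & MASK


print(shift_left(21))            # 42
print(shift_right(42))           # 21
print(rotate_left(0b10000001))   # 3, i.e. 0b00000011
```

The multiply-by-two rule holds only while the result still fits in the register; `shift_left(200)` gives 144, not 400, because the top bit is lost.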

Program Control
- Branch instructions: conditional and unconditional.
- Call instructions (save the program counter someplace).

Stack Manipulation
- Special instructions for dealing with LIFO data structures. (A stack is a good place to store program counters for subroutine linkage!)
- Push
- Pop

I-O and Machine Control
- Transfers from registers to I-O devices.
- Direct memory access (DMA) I-O: the I-O device communicates with memory independent of the CPU.
- Machine state switching (privileged instructions).
- Interrupt control.
- State saving.
- Halt.

Multiple-Data Instructions
- Perform the same operation on multiple data items simultaneously. (Example: Intel MMX)
- Commonly used in vector and array processing.
- SIMD: single-instruction, multiple data.

Multiple Data Instructions
- [Figure: multiple data instructions]

CISC Architecture
- Examples: Intel x86, IBM Z-Series mainframes, older CPU architectures.
- Characteristics:
  - Few general purpose registers (perhaps 16)
  - Many addressing modes
  - Large number of specialized, complex instructions
  - Instructions are of varying sizes
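The Stack Manipulation slide notes that a stack is a good place to store program counters for subroutine linkage. A minimal sketch of that idea follows; the names `ReturnStack`, `call`, and `ret` are illustrative assumptions, and the addresses are arbitrary. A call pushes the address of the instruction after the call and branches; a return pops it back into the program counter.

```python
class ReturnStack:
    """LIFO storage for saved program counters."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.pop()


def call(stack, return_address, target):
    """Save where to resume, then branch: the new PC is the subroutine's address."""
    stack.push(return_address)
    return target


def ret(stack):
    """Return: the new PC is the most recently saved address."""
    return stack.pop()


stack = ReturnStack()
pc = call(stack, 11, 50)   # main calls a subroutine at 50; resume at 11
pc = call(stack, 53, 80)   # that subroutine calls another at 80; resume at 53
pc = ret(stack)
print(pc)                  # 53
pc = ret(stack)
print(pc)                  # 11
```

Because the stack is last-in, first-out, nested calls return in the reverse of the order they were made without any extra bookkeeping.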

Limitations of CISC Architecture
- Some instructions are infrequently used by programmers and compilers.
- Memory references, loads and stores, are slow and account for a significant fraction of all instructions.
- Only a few of the many instructions are used frequently.
- Procedure and function calls are a major bottleneck:
  - Passing arguments
  - Storing and retrieving values in registers

RISC Features
- Limited and simple instruction set.
- Fixed-length, fixed-format instruction words enable pipelining and parallel fetches and executions.
- Limited addressing modes reduce complicated hardware.
- Register-oriented instruction set reduces memory accesses.
- Large bank of registers:
  - Reduces memory accesses
  - Efficient procedure calls

CISC vs. RISC Processing
- [Figure: CISC vs. RISC processing]

Speeding Up Procedure Calls
- Procedure calls help modularize programs.
- They cause major overhead at execution time:
  - Saving state
  - Setting up parameters
  - Retrieving results
- What if we could make procedure calls without moving data around?

Circular Register Buffer (RISC)
- [Figure: circular register buffer]
- (It isn't really a circle; it's a linear space that wraps around.)

Circular Register Buffer - After Procedure Call
- [Figure: circular register buffer after a procedure call]
- The caller's "out" becomes the procedure's "in."
- No data was moved; a single pointer was changed by a fixed amount.
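The circular register buffer can be sketched as a flat list of registers plus one pointer. The sizes used here (groups of 8 "in", 8 "local", and 8 "out" registers, four windows) and all names are assumptions chosen for illustration; the slides do not give the dimensions. A call advances the pointer by a fixed amount so that the caller's "out" registers are the very same storage as the procedure's "in" registers.

```python
class RegisterWindows:
    GROUP = 8  # hypothetical: 8 "in", 8 "local", and 8 "out" registers per window

    def __init__(self, windows=4):
        # Each call moves the window past the caller's "in" and "local" groups.
        self.step = 2 * self.GROUP
        self.registers = [0] * (windows * self.step)
        self.pointer = 0

    def _index(self, group, n):
        offsets = {"in": 0, "local": self.GROUP, "out": 2 * self.GROUP}
        # Modulo makes the linear register file wrap around.
        return (self.pointer + offsets[group] + n) % len(self.registers)

    def get(self, group, n):
        return self.registers[self._index(group, n)]

    def set(self, group, n, value):
        self.registers[self._index(group, n)] = value

    def call(self):
        self.pointer = (self.pointer + self.step) % len(self.registers)

    def ret(self):
        self.pointer = (self.pointer - self.step) % len(self.registers)


rw = RegisterWindows()
rw.set("out", 0, 7)       # caller places an argument in its "out" register
rw.call()                 # only the pointer changes
print(rw.get("in", 0))    # 7: the procedure sees the argument as its "in"
rw.set("in", 0, 99)       # procedure leaves a result in the same register
rw.ret()
print(rw.get("out", 0))   # 99: the caller finds the result in its "out"
```

This sketch does not handle running out of windows; a real design has to spill registers to memory when calls nest deeper than the buffer allows.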

CISC vs. RISC Performance Comparison
- RISC: simpler instructions → more instructions required for a program → more memory required to hold the program, maybe.
- CISC: more memory access for data, so more bus traffic and increased cache memory misses.
- More registers would improve CISC performance, but there was formerly no space available for them.
- Modern CISC and RISC architectures are becoming similar due to Moore's Law.

Memory Enhancements
- Memory is slow compared to CPU processing speeds!
- 2 GHz CPU = 1 cycle in ½ of a billionth of a second.
- 70 ns DRAM = 1 access in 70 billionths of a second (140 times slower!).

Improving Memory Access
- Wide path memory access: retrieve multiple bytes instead of one byte at a time.
- Memory interleaving: partition memory into subsections, each with its own address register and data register.
- Cache memory.

Memory Interleaving
- [Figure: memory interleaving]

Cache Memory
- A small, fast memory placed between the CPU and main memory.
- Works because memory locations used once are likely to be used again. (Locality of reference.)

Cache Terminology
- Blocks: amount of data transferred.
- Tags: point to a location in main memory.
- Cache controller: hardware that checks tags.
- Cache line: unit of transfer between storage and cache memory.
- Hit ratio: ratio of hits out of total requests.
- Synchronizing cache and memory: write through or write back.
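The cache terms above (blocks, tags, hits, hit ratio) can be made concrete with a toy model. The sketch below assumes a direct-mapped organization with 4 lines of 4 addresses each; the slides do not say how the cache is organized, so that choice and all names are illustrative. It tracks only tags, not the data itself.

```python
class DirectMappedCache:
    def __init__(self, num_lines=4, block_size=4):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines   # one tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, load the block and record its tag."""
        block = address // self.block_size
        line = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[line] == tag:
            self.hits += 1
            return True
        self.tags[line] = tag
        self.misses += 1
        return False

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = DirectMappedCache()
for _ in range(2):                 # a small loop touching the same 16 addresses twice
    for address in range(16):
        cache.access(address)
print(cache.hits, cache.misses)    # 28 4
print(cache.hit_ratio())           # 0.875
```

The only misses are the first touch of each block; every later reference hits, which is locality of reference at work.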

Step-by-Step Use of Cache: Hit
- [Figure: step-by-step use of cache on a hit]

Step-by-Step Use of Cache: Miss
- [Figure: step-by-step use of cache on a miss]

Performance Advantages
- Hit ratios of 90% are common.
- 50%+ improved execution speed.

Locality of Reference
- Most memory references are confined to a small region of memory at any given time.
- A well-written program runs in a small loop, procedure, or function.
- Data is likely in an array.
- Variables are stored together.

Two-level Caches
- [Figure: two-level caches]

Current and Emerging Trends
- CISC and RISC are re-converging because of greater chip densities.
- Multi-core chips: two, four, and even more CPUs in a single integrated circuit package.
- Cluster computing: hundreds or thousands of commodity computers working together. Example: the Big Mac at Virginia Tech.
- Parallel computing.
- Experimental architectures.
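The Performance Advantages slide quotes hit ratios of 90%. A common way to see what that buys is a weighted average of cache and memory access times. This is a simplified model, and the 5 ns cache time is an assumed figure for illustration; the 70 ns DRAM time comes from the Memory Enhancements slide.

```python
def average_access_ns(hit_ratio, cache_ns, memory_ns):
    """Average access time when hits cost cache_ns and misses cost memory_ns."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


with_cache = average_access_ns(0.90, cache_ns=5.0, memory_ns=70.0)
print(round(with_cache, 2))   # about 11.5 ns, versus 70 ns with no cache
```

Under these assumptions the average access drops from 70 ns to roughly 11.5 ns. The model ignores the time spent checking tags before going to memory on a miss, so real gains are somewhat smaller.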

VLIW Architecture
- Transmeta Crusoe CPU.
- 128-bit instruction bundle = molecule.
  - 4 32-bit atoms (atom = instruction)
  - Parallel processing of 4 instructions
- 64 general purpose registers.
- Code morphing layer:
  - Translates instructions written for other CPUs into molecules.
  - Instructions are not written directly for the Crusoe CPU.

EPIC Architecture
- Intel Itanium CPU.
- 128-bit instruction bundle:
  - 3 41-bit instructions
  - 5 bits to identify the type of instructions in the bundle
- 128 64-bit general purpose registers.
- 128 82-bit floating point registers.
- Intel x86 instruction set included.
- Programmers and compilers follow guidelines to ensure parallel execution of instructions.

Modern CPU Processing Methods
- Alternative CPU organization
- Separate fetch/execute units
- Pipelining
- Scalar processing
- Superscalar processing

Instruction Pipelining
- Assembly-line technique to allow overlapping between fetch-execute cycles of sequences of instructions.
- Only one instruction is being completed at a time.

More on Pipelining
- Scalar processing: average instruction execution is approximately equal to the clock speed of the CPU.
- Problems from stalling:
  - Instructions have different numbers of steps
  - Problems of data latency
  - Problems from branching
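The assembly-line idea behind instruction pipelining can be quantified with an idealized count of clock cycles. This sketch assumes every instruction takes the same number of one-cycle stages and that there are no stalls, so it ignores exactly the problems listed under More on Pipelining; the function names and the 5-stage figure are illustrative.

```python
def unpipelined_cycles(instructions, stages):
    """Each instruction finishes all its stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """The first instruction fills the pipeline; after that one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


n, k = 100, 5
print(unpipelined_cycles(n, k))   # 500
print(pipelined_cycles(n, k))     # 104
```

For a long run of instructions the ideal pipeline approaches one completed instruction per clock cycle, which is the scalar processing case described on the slide. Stalls from unequal step counts, data latency, and branches push the real figure above this ideal.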

Branch Problem Solutions
- Separate pipelines for both possibilities.
- Probabilistic approach.
- Requiring the following instruction to not be dependent on the branch.
- Instruction reordering.

Pipelining Example
- [Figure: pipelining example]

Superscalar Processing
- Process more than one instruction per clock cycle.
- Separate fetch and execute cycles as much as possible.
- Buffers for fetch and decode phases.
- Parallel execution units.

Superscalar CPU Block Diagram
- [Figure: superscalar CPU block diagram]

Scalar vs. Superscalar Processing
- [Figure: scalar vs. superscalar processing]

Superscalar Issues
- Out-of-order processing: dependencies (hazards).
- Data dependencies.
- Branch (flow) dependencies and speculative execution:
  - Parallel speculative execution or branch prediction
  - Branch History Table
- Register access conflicts:
  - Logical registers (register renaming)

Hardware Implementation
- Hardware implementation: operations are implemented using logic gates.
- Advantage: speed.
- RISC designs are simple and typically implemented in hardware.

Hardware and Software
- Hardware and software are logically equivalent. (But there has to be some hardware someplace!)
- So, computer designers have a choice of implementing with hardware or software.

Microprogrammed Implementation
- Microcode: programs stored in ROM that replace hardwired CPU instructions.
- Advantages:
  - More flexible
  - Easier to implement complex instructions
  - Can emulate other CPUs
  - Can be changed!
- Disadvantage: usually requires more clock cycles.

Questions