Summary of Computer Architecture


Summary CHAP 1: INTRODUCTION

Structure - Top Level
The computer consists of the central processing unit (CPU), main memory, input/output, and the systems interconnection. It connects to the outside world through peripherals and communication lines.

Structure - CPU
Within the computer, the CPU connects to I/O and memory over the system bus. The CPU itself contains registers, the arithmetic and logic unit (ALU), the control unit, and the internal CPU interconnection.

CPU
The CPU controls the operation of the computer. Its components are:
- Control unit: controls the operation of the CPU.
- Arithmetic and logic unit (ALU): performs the data processing functions, e.g. calculation.
- Internal CPU interconnection: provides communication between the control unit, the registers, and the ALU.

Structure - Control Unit
Within the CPU, the control unit connects to the ALU and the registers over the internal bus. The control unit itself contains sequencing logic, control unit registers and decoders, and control memory.

Summary CHAP 2: BUS

Bus system: expansion slots (PCI, PCIe, ...)

Function of Control Unit
For each operation a unique code is provided, e.g. ADD, MOVE. A hardware segment accepts the code and issues the control signals. We have a computer!
Fakulti Sains Komputer dan Technology Maklumat (FSKTM), UTHM. BIT20303-Computer Architecture

Components
The control unit and the arithmetic and logic unit (ALU) constitute the central processing unit (CPU). Data and instructions need to get into the system and results out: this is the job of input/output. Temporary storage of code and results is needed: this is the job of main memory.

Computer Components: Top Level View

How Is an Instruction Executed?
What is an instruction? An instruction specifies the action that the processor is supposed to take. The processing required for a single instruction is called an instruction cycle. An instruction cycle is made of two steps:
- Fetch: the processor reads the instruction from memory (also referred to as the fetch cycle).
- Execute: the processor carries out the instruction (also referred to as the execute cycle).

Fetch Cycle
- The program counter (PC) holds the address of the next instruction to fetch.
- The processor fetches the instruction from the memory location pointed to by the PC.
- The PC is incremented, unless told otherwise.
- The instruction is loaded into the instruction register (IR).
- The processor interprets the instruction and performs the required actions.

Execute Cycle
An instruction's execution (execute cycle) may involve one or a combination of these actions:
- Processor-memory: data transfer between the CPU and main memory.
- Processor-I/O: data transfer between the CPU and an I/O module.
- Data processing: some arithmetic or logical operation on data.
- Control: alteration of the sequence of operations.

Instruction Format
Assume both instructions and data are 16 bits (2 bytes) long. The instruction format provides 4 bits for the opcode, so there can be as many as 2^4 = 16 different opcodes, and the remaining 12 bits hold an address, so up to 2^12 = 4096 words of memory can be directly addressed. (The slide shows the instruction format and the integer format as figures.)
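The 16-bit format described above can be illustrated with a minimal sketch. The function name `decode` is an assumption made for this example; only the 4-bit opcode and 12-bit address split comes from the slide.

```python
def decode(instruction: int) -> tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, address).

    Assumes the layout from the slide: the top 4 bits are the opcode
    and the low 12 bits are a direct memory address.
    """
    opcode = (instruction >> 12) & 0xF   # top 4 bits: one of 16 opcodes
    address = instruction & 0xFFF        # low 12 bits: one of 4096 words
    return opcode, address


opcode, address = decode(0x1940)
print(hex(opcode), hex(address))  # 0x1 0x940
```

Each hexadecimal digit covers 4 bits, which is why the opcode can be read directly as the first hex digit and the address as the remaining three.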

What Are Word, Half-Word and Double Word?
A "word," in computing, is a standard memory size used for data storage. The most popular word sizes for modern computers are 16, 32, or 64 bits. Some systems or programming languages do not declare specific sizes for variables and instead use "word," "half-word" and "double word" to describe how much storage space is being allocated. This means that on a system with a 32-bit word size, declaring a double-word integer declares a 64-bit integer.

Example of Program Execution
Internal CPU registers:
- PC (program counter)
- AC (accumulator), a data register
- IR (instruction register)
Program to be executed: add the content of the memory word at address 940 to the content of the memory word at address 941 and store the result in the latter location. (Assume a word = 16 bits = 2 bytes.)

The program requires 3 fetch and 3 execute cycles. NOTE: the numbers used in this example are hexadecimal, e.g. 0x1940.

1. {1st fetch cycle} The PC contains 300, the address of the first instruction. This instruction (the value 1940 in hexadecimal) is loaded into the instruction register (IR) and the PC is incremented. Note that this process involves the use of a memory address register (MAR) and a memory buffer register (MBR). For simplicity these intermediate registers are ignored.

2. {1st execute cycle} The first 4 bits (first hexadecimal digit) in the IR indicate that the AC is to be loaded. The remaining 12 bits (3 hexadecimal digits) specify the address (940) from which data are to be loaded.

3. {2nd fetch cycle} The next instruction (5941) is fetched from location 301 and the PC is incremented.

4. {2nd execute cycle} The old content of the AC and the content of location 941 are added and the result is stored in the AC.

5. {3rd fetch cycle} The next instruction (2941) is fetched from location 302 and the PC is incremented.

6. {3rd execute cycle} The content of the AC is stored in location 941.
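The six steps above can be traced with a small simulation. This is a sketch under stated assumptions: the opcode meanings (1 = load AC, 5 = add to AC, 2 = store AC) follow the walkthrough, but the initial data values at addresses 940 and 941 (0003 and 0002) are not given in the transcript and are chosen here only for illustration. The function name `run` is also hypothetical.

```python
LOAD, STORE, ADD = 0x1, 0x2, 0x5  # opcodes as used in the walkthrough


def run(memory: dict, pc: int, steps: int) -> int:
    """Run `steps` fetch/execute cycles; return the final accumulator value."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]                       # fetch: instruction at PC goes to IR
        pc += 1                               # PC is incremented after the fetch
        opcode, address = ir >> 12, ir & 0xFFF
        if opcode == LOAD:                    # execute: load AC from memory
            ac = memory[address]
        elif opcode == ADD:                   # execute: add memory word to AC
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == STORE:                 # execute: store AC to memory
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
    return ac


memory = {
    0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,  # program from the example
    0x940: 0x0003, 0x941: 0x0002,                 # data values assumed for illustration
}
run(memory, pc=0x300, steps=3)
print(hex(memory[0x941]))  # 0x5
```

The MAR and MBR are left out, as in the walkthrough, so each memory access is a single dictionary lookup.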


Summary CHAP 3: MEMORY

Location
- Inside the CPU (e.g. registers)
- Internal: inside the computer (e.g. RAM, Level 1 or L1 cache, L2 cache, L3 cache)
- External: outside of the computer (e.g. hard disks, SSD, removable drives)

A Modern Memory Hierarchy (Memory Abstraction)
- Register file: 32 words, sub-nsec
  (manual/compiler register spilling)
- L1 cache: ~32 KB, ~nsec
- L2 cache: 512 KB ~ 1 MB, many nsec
- L3 cache, ...
  (automatic HW cache management)
- Main memory (DRAM): GB, ~100 nsec
  (automatic demand paging)
- Swap disk: 100 GB, ~10 msec

How to Access a Memory Location?
- Random (e.g. RAM): individual addresses identify locations exactly.
- Direct (e.g. hard disk): each block has a unique address; access is by jumping to a specific block plus a sequential search.
- Associative (e.g. cache): data is retrieved based on a portion of its contents rather than its address.
- Sequential (e.g. tape): start from the beginning of the tape; access time depends on the location of the data and the previous location.

RAM
Two types: static RAM (SRAM) and dynamic RAM (DRAM).

Memory Technology: DRAM
- Dynamic random access memory.
- The capacitor charge state indicates the stored value: whether the capacitor is charged or discharged indicates storage of 1 or 0.
- Each cell uses 1 capacitor and 1 access transistor.
- The capacitor leaks through the RC path, so a DRAM cell loses charge over time and needs to be refreshed.
(Figure labels: row enable, _bitline.)

Memory Technology: SRAM
- Static random access memory.
- Two cross-coupled inverters store a single bit; the feedback path enables the stored value to persist in the cell.
- Each cell uses 4 transistors for storage and 2 transistors for access.
(Figure labels: row select, bitline, _bitline.)

Memory Hierarchy
Fundamental tradeoff: fast memory is small; large memory is slow. Idea: use a memory hierarchy, from the CPU register file (RF) through the cache and main memory (DRAM) to the hard disk. The levels differ in latency, cost, size, and bandwidth.

Caching Basics: Exploit Temporal Locality
- Idea: store recently accessed data in automatically managed fast memory (called a cache).
- Anticipation: the data will be accessed again soon.
- Temporal locality principle: recently accessed data will be accessed again in the near future.
- This is what Maurice Wilkes had in mind: Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Trans. on Electronic Computers, 1965. "The use is discussed of a fast core memory of, say 32000 words as a slave to a slower core memory of, say, one million words in such a way that in practical cases the effective access time is nearer that of the fast memory than that of the slow memory."

Caching Basics: Exploit Spatial Locality
- Idea: store addresses adjacent to the recently accessed one in automatically managed fast memory. Logically divide memory into equal-size blocks and fetch the accessed block into the cache in its entirety.
- Anticipation: nearby data will be accessed soon.
- Spatial locality principle: nearby data in memory will be accessed in the near future, e.g. sequential instruction access, array traversal.
- This is what the IBM 360/85 implemented: a 16 Kbyte cache with 64-byte blocks. Liptay, "Structural aspects of the System/360 Model 85 II: the cache," IBM Systems Journal, 1968.

The Bookshelf Analogy
- The levels: book in your hand, desk, bookshelf, boxes at home, boxes in storage.
- Recently used books tend to stay on the desk (Comp Arch books, books for classes you are currently taking) until the desk gets full.
- Adjacent books on the shelf are needed around the same time, if I have organized/categorized my books well on the shelf.

Cache
- Cache hits vs. cache misses
- Cache types: direct-mapped cache, set-associative cache
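Hits, misses, and direct mapping can be made concrete with a toy simulator. This is a minimal sketch, not a model of any real cache: the function name, the 4-line cache, and the 4-byte block size are all assumptions picked to keep the numbers small. Each memory block maps to exactly one cache line (block number modulo the number of lines), and the tag records which block currently occupies that line.

```python
def simulate_direct_mapped(addresses, num_lines=4, block_size=4):
    """Count (hits, misses) for a sequence of byte addresses."""
    tags = [None] * num_lines            # one tag per cache line; None = empty
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size       # which memory block holds this address
        index = block % num_lines        # the only line this block may occupy
        tag = block // num_lines         # identifies the block within that line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag            # fetch the whole block into the line
    return hits, misses


# Sequential access benefits from spatial locality: one miss per block.
print(simulate_direct_mapped(range(16)))        # (12, 4)
# Two blocks that map to the same line keep evicting each other.
print(simulate_direct_mapped([0, 16, 0, 16]))   # (0, 4)
```

The second call shows the weakness of direct mapping that set-associative caches address: addresses 0 and 16 compete for the same line even though the other three lines are empty.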

Summary CHAP 4: INPUT OUTPUT

Input/Output Problems
- There is a wide variety of peripherals, delivering different amounts of data, at different speeds, in different formats.
- All are slower than the CPU and RAM.
- I/O modules are therefore needed.

Input/Output Module
- Interface to the CPU and memory
- Interface to one or more peripherals

Generic Model of I/O Module

External Devices
- Human readable: screen, printer, keyboard
- Machine readable: monitoring and control
- Communication: modem, network interface card (NIC)

External Device Block Diagram
- Control signals determine the function that the device will perform, such as sending data to the I/O module (INPUT or READ) or accepting data from the I/O module (OUTPUT or WRITE).
- Status signals indicate the state of the device, e.g. busy or idle.
- Data are transferred in the direction set by the control signal, either for READ or for WRITE.
- The buffer temporarily holds the data being transferred between the I/O module and the external environment.

I/O Module Functions
- Control and timing
- CPU communication
- Device communication
- Data buffering
- Error detection

Three Techniques for Input of a Block of Data
What are the differences between these techniques?

Programmed I/O
- The CPU has direct control over I/O: sensing status, issuing read/write commands, and transferring data.
- The CPU waits for the I/O module to complete the operation.
- This wastes CPU time.

Programmed I/O - detail
- The CPU requests an I/O operation.
- The I/O module performs the operation.
- The I/O module sets status bits.
- The CPU checks the status bits periodically.
- The I/O module does not inform the CPU directly.
- The I/O module does not interrupt the CPU.
- The CPU may wait or come back later.
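The polling behaviour listed above can be sketched in a few lines. Everything here is hypothetical: `SimulatedDevice` stands in for an I/O module whose status bit becomes ready after a fixed number of checks, and `programmed_read` plays the role of the CPU. The point is only to show where the CPU time goes.

```python
class SimulatedDevice:
    """Hypothetical I/O module that is busy for `latency` status checks."""

    def __init__(self, data: int, latency: int):
        self._data = data
        self._remaining = latency

    def ready(self) -> bool:
        # Each status check advances simulated time by one step.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read(self) -> int:
        return self._data


def programmed_read(device: SimulatedDevice) -> tuple[int, int]:
    """Busy-wait on the status bit, then read; return (data, wasted polls)."""
    polls = 0
    while not device.ready():   # the module never interrupts; the CPU must ask
        polls += 1              # CPU time spent doing no useful work
    return device.read(), polls


value, wasted = programmed_read(SimulatedDevice(data=0x41, latency=5))
print(hex(value), wasted)  # 0x41 5
```

With interrupt-driven I/O the loop disappears: the CPU would issue the command, do other work, and read the data only when the module signals that it is ready.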

Interrupt-Driven I/O: Basic Operation
- The CPU issues a read command.
- The I/O module gets data from the peripheral whilst the CPU does other work.
- The I/O module interrupts the CPU.
- The CPU requests the data.
- The I/O module transfers the data.

Simple Interrupt Processing

Direct Memory Access (DMA)
- Interrupt-driven and programmed I/O require active CPU intervention: the transfer rate is limited and the CPU is tied up.
- DMA is the answer.

DMA Operation
- The CPU tells the DMA controller: read or write, the device address, the starting address of the memory block for the data, and the amount of data to be transferred.
- The CPU carries on with other work.
- The DMA controller deals with the transfer.
- The DMA controller sends an interrupt when finished.
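The four pieces of information the CPU hands to the DMA controller can be pictured as a small request record. This is an illustrative sketch only: `DmaRequest`, its field names, and `dma_transfer` are hypothetical, memory is modelled as a Python list of words, and the final interrupt is represented simply by the function returning.

```python
from dataclasses import dataclass


@dataclass
class DmaRequest:
    read_from_device: bool  # direction: True = device to memory (read)
    device_address: int     # which device to talk to (unused in this toy model)
    memory_start: int       # starting address of the memory block
    word_count: int         # amount of data to be transferred


def dma_transfer(request: DmaRequest, device_words: list, memory: list) -> int:
    """Move the block one word at a time; return the number of bus cycles used."""
    bus_cycles = 0
    for offset in range(request.word_count):
        target = request.memory_start + offset
        if request.read_from_device:
            memory[target] = device_words[offset]
        else:
            device_words[offset] = memory[target]
        bus_cycles += 1     # one word per bus cycle taken by the controller
    return bus_cycles       # at this point the controller would interrupt the CPU


memory = [0] * 16
request = DmaRequest(read_from_device=True, device_address=0x3F8,
                     memory_start=4, word_count=3)
cycles = dma_transfer(request, device_words=[10, 20, 30], memory=memory)
print(cycles, memory[4:7])  # 3 [10, 20, 30]
```

The CPU's only involvement is building the request; the per-word loop runs in the controller.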

DMA Transfer: Cycle Stealing
- The DMA controller takes over the bus for a cycle and transfers one word of data.
- This is not an interrupt: the CPU does not switch context.
- The CPU is suspended just before it accesses the bus, i.e. before an operand or data fetch or a data write.
- This slows down the CPU, but not as much as the CPU doing the transfer itself.

Summary CHAP 5: COMPUTER ARITHMETIC

Unsigned Integer: Addition

Example 1 (4 bits): 5 + 2 = 7

      0101
    + 0010
    ------
      0111

Example 2 (8 bits): 90 + 17 = 107

      0101 1010
    + 0001 0001
    -----------
      0110 1011
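Column-by-column binary addition with a carry can be checked with a short sketch. The function name `add_bits` is an assumption for this example; it works on bit strings and returns the carry out of the top bit separately, so the result keeps the width of the operands.

```python
def add_bits(a: str, b: str) -> tuple[str, int]:
    """Add two unsigned binary strings; return (sum bits, carry out)."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    out = []
    # Work from the least significant bit (rightmost) to the most significant.
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))   # sum bit for this column
        carry = total // 2           # carry into the next column
    return "".join(reversed(out)), carry


print(add_bits("0101", "0010"))          # ('0111', 0)
print(add_bits("01011010", "00010001"))  # ('01101011', 0)
```

A carry out of 1 would mean the true sum does not fit in the given number of bits.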

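Unsigned Integer: Multiplication

Example: 0101 x 0110 (5 x 6 = 30). One partial product is written per multiplier bit, starting from the rightmost bit and shifting one place left each time:

         0101
       x 0110
       ------
         0000      (multiplier bit 0 = 0)
        0101       (multiplier bit 1 = 1)
       0101        (multiplier bit 2 = 1)
    + 0000         (multiplier bit 3 = 0)
    ---------
      0011110      = 30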
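The shift-and-add procedure for unsigned multiplication can be verified with a minimal sketch. The function name `multiply_unsigned` is hypothetical; it returns the product as a bit string wide enough to hold any result (the two operand widths added together) along with the list of partial products.

```python
def multiply_unsigned(multiplicand: str, multiplier: str) -> tuple[str, list]:
    """Shift-and-add multiplication of two unsigned binary strings."""
    m = int(multiplicand, 2)
    partials = []
    product = 0
    # Bit 0 of the multiplier is the rightmost character.
    for shift, bit in enumerate(reversed(multiplier)):
        partial = (m << shift) if bit == "1" else 0   # shifted copy or zero
        partials.append(partial)
        product += partial
    width = len(multiplicand) + len(multiplier)
    return format(product, f"0{width}b"), partials


bits, partials = multiply_unsigned("0101", "0110")
print(bits, int(bits, 2), partials)  # 00011110 30 [0, 10, 20, 0]
```

Only the two 1 bits of the multiplier contribute, giving 10 + 20 = 30.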

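Signed Integers (2's Complement): Negation and Range
To negate a number, reverse (invert) every bit, then add 1. With 7 bits, the minimum value is 1000000 = -64 and the maximum value is 0111111 = 63.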

Signed Integers (2's Complement): OVERFLOW RULE
If two numbers are added, and they are both positive or both negative, then OVERFLOW occurs if and only if the result has the opposite sign.
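The overflow rule can be tested directly. This sketch assumes a 7-bit two's complement width (range -64 to 63) purely to keep the numbers small; the function name and the default width are choices made for this example.

```python
def add_twos_complement(a: int, b: int, bits: int = 7) -> tuple[int, bool]:
    """Add two signed values in `bits`-bit two's complement.

    Returns (wrapped result, overflow flag). Both inputs are assumed to be
    within the representable range already.
    """
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    raw = (a + b) & mask                              # keep only the low `bits` bits
    result = raw - (1 << bits) if raw & sign_bit else raw
    same_sign = (a < 0) == (b < 0)
    overflow = same_sign and ((result < 0) != (a < 0))  # the rule from the slide
    return result, overflow


print(add_twos_complement(20, 30))    # (50, False)
print(add_twos_complement(63, 1))     # (-64, True)
print(add_twos_complement(-64, -1))   # (63, True)
print(add_twos_complement(5, -3))     # (2, False)
```

Operands of opposite sign can never overflow, which is why the rule only considers the same-sign case.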

Fixed-Point Number
0010.1010 = 2^1 + 2^-1 + 2^-3 = 2 + (1/2) + (1/8) = 2 + 0.5 + 0.125 = 2.625
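The positional weights of a fixed-point binary number can be summed with a short sketch. The function name `fixed_point_value` is an assumption; bits left of the point are read as an ordinary unsigned integer, and the bit at position k to the right of the point has weight 2^-k.

```python
def fixed_point_value(bits: str) -> float:
    """Evaluate an unsigned fixed-point binary string such as '0010.1010'."""
    integer_part, _, fraction_part = bits.partition(".")
    value = float(int(integer_part, 2)) if integer_part else 0.0
    for position, bit in enumerate(fraction_part, start=1):
        if bit == "1":
            value += 2.0 ** -position   # weights 1/2, 1/4, 1/8, ...
    return value


print(fixed_point_value("0010.1010"))  # 2.625
```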

Floating Point
FORMULA: value = (-1)^sign x 1.significand x 2^(exponent - bias)

8-bit format: sign (1 bit), exponent (3 bits), significand (4 bits), bias = 3.
Example: 0 010 0010
- Sign = 0
- Exponent = 010 = 2, so exponent - bias = 2 - 3 = -1
- Significand = 0010, so 1.0010 = 1 + 2^-3 = 1 + (1/8) = 1.125
- Value = (-1)^0 x 1.125 x 2^-1 = 1.125 x 0.5 = 0.5625

Single precision (32-bit) format: sign (1 bit), exponent (8 bits), significand (23 bits), bias = 127.
Example: 1 01111110 00100000000 000000000000
- Sign = 1
- Exponent = 01111110 = 126, so exponent - bias = 126 - 127 = -1
- Significand = 0010..., so 1.0010 = 1 + (1/8) = 1.125
- Value = (-1)^1 x 1.125 x 2^-1 = -1.125 x 0.5 = -0.5625

NOTE: With a 3-bit exponent, bias = 3 (exponents from -3 to 4); with an 8-bit exponent, bias = 127 (exponents from -127 to 128).
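The formula (-1)^sign x 1.significand x 2^(exponent - bias) can be applied mechanically to any bit pattern. The sketch below is a simplification: the function name is hypothetical, the bias is assumed to be 2^(k-1) - 1 for a k-bit exponent (which gives 3 for 3 bits and 127 for 8 bits, matching the note), and the special cases of real IEEE 754 formats (zero, subnormals, infinity, NaN) are deliberately ignored.

```python
def decode_float(bits: str, exponent_bits: int) -> float:
    """Decode sign | exponent | significand using the normalized formula only."""
    bits = bits.replace(" ", "")
    sign = int(bits[0])
    exponent_field = bits[1:1 + exponent_bits]
    fraction_field = bits[1 + exponent_bits:]
    bias = 2 ** (exponent_bits - 1) - 1               # 3 for 3 bits, 127 for 8 bits
    exponent = int(exponent_field, 2) - bias
    # The implicit leading 1 plus the fraction bits scaled to [0, 1).
    significand = 1 + int(fraction_field, 2) / 2 ** len(fraction_field)
    return (-1) ** sign * significand * 2.0 ** exponent


print(decode_float("0 010 0010", exponent_bits=3))                            # 0.5625
print(decode_float("1 01111110 00100000000 000000000000", exponent_bits=8))  # -0.5625
```

Both patterns have the same fraction (1/8) and the same unbiased exponent (-1), so they differ only in sign.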

Bias values
- 3-bit exponent: bias = 011 = 3
- 8-bit exponent: bias = 0111 1111 = 127

CPU Structure
The CPU must:
- Fetch instructions
- Interpret instructions
- Fetch data
- Process data
- Write data

Summary CHAP 7: CPU

CPU With Systems Bus

CPU Internal Structure

Registers
A small amount of storage available inside the CPU, faster than main memory.

Types of Registers
- General purpose
- Data
- Address: hold addresses that are used by instructions to access main memory (RAM)
- Control and status

How to Increase CPU Performance?
- Improve the organization, e.g. locate the cache nearer to the CPU, increase bus bandwidth.
- Increase the clock frequency, e.g. from 1 GHz to 5 GHz.
- Increase parallelism, e.g. pipelining, superscalar execution, simultaneous multithreading (SMT).

Thank You