Structure of Computer Systems

Similar documents
SYLLABUS. osmania university CHAPTER - 1 : REGISTER TRANSFER LANGUAGE AND MICRO OPERATION CHAPTER - 2 : BASIC COMPUTER

INDEX. 3 3DNow! multimedia extension, 164

Intel released new technology call P6P

Advanced Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Structure of Computer Systems

DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering

Advanced Processor Architecture

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University

Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources

Honorary Professor Supercomputer Education and Research Centre Indian Institute of Science, Bangalore

SAE5C Computer Organization and Architecture. Unit : I - V

Computer Organization and Design THE HARDWARE/SOFTWARE INTERFACE

Keywords and Review Questions

CS6303-COMPUTER ARCHITECTURE UNIT I OVERVIEW AND INSTRUCTIONS PART A

MaanavaN.Com CS1202 COMPUTER ARCHITECHTURE

CS2253 COMPUTER ORGANIZATION AND ARCHITECTURE 1 KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY

PowerPC 740 and 750

Computer Architecture. Fall Dongkun Shin, SKKU

Computer Systems Architecture I. CSE 560M Lecture 10 Prof. Patrick Crowley

Computer Organization and Design, 5th Edition: The Hardware/Software Interface

UNIT I BASIC STRUCTURE OF COMPUTERS Part A( 2Marks) 1. What is meant by the stored program concept? 2. What are the basic functional units of a

CS146 Computer Architecture. Fall Midterm Exam

Basics DRAM ORGANIZATION. Storage element (capacitor) Data In/Out Buffers. Word Line. Bit Line. Switching element HIGH-SPEED MEMORY SYSTEMS

INTELLIGENCE PLUS CHARACTER - THAT IS THE GOAL OF TRUE EDUCATION UNIT-I

Chapter 13 Reduced Instruction Set Computers

Pipelining and Vector Processing

Uniprocessors. HPC Fall 2012 Prof. Robert van Engelen

GRE Architecture Session

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

Superscalar Machines. Characteristics of superscalar processors

ECE 485/585 Microprocessor System Design

EC 513 Computer Architecture

Performance of Computer Systems. CSE 586 Computer Architecture. Review. ISA s (RISC, CISC, EPIC) Basic Pipeline Model.

anced computer architecture CONTENTS AND THE TASK OF THE COMPUTER DESIGNER The Task of the Computer Designer

Superscalar Processors

DC57 COMPUTER ORGANIZATION JUNE 2013

Computer Organization and Microprocessors SYLLABUS CHAPTER - 1 : BASIC STRUCTURE OF COMPUTERS CHAPTER - 3 : THE MEMORY SYSTEM

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING QUESTION BANK

Like scalar processor Processes individual data items Item may be single integer or floating point number. - 1 of 15 - Superscalar Architectures

CPE300: Digital System Architecture and Design

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

Superscalar Processors

omputer Design Concept adao Nakamura

ECE 341. Lecture # 15

Architectures & instruction sets R_B_T_C_. von Neumann architecture. Computer architecture taxonomy. Assembly language.

Computer Architecture

CS 654 Computer Architecture Summary. Peter Kemper

COMPUTER ORGANIZATION AND ARCHITECTURE

M (~ Computer Organization and Design ELSEVIER. David A. Patterson. John L. Hennessy. University of California, Berkeley. Stanford University

CHETTINAD COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ARCHITECURE- III YEAR EEE-6 TH SEMESTER 16 MARKS QUESTION BANK UNIT-1

ECE 341 Final Exam Solution

Computer Organization and Design

EECS 322 Computer Architecture Superpipline and the Cache

Chapter 06: Instruction Pipelining and Parallel Processing. Lesson 14: Example of the Pipelined CISC and RISC Processors

PART A (22 Marks) 2. a) Briefly write about r's complement and (r-1)'s complement. [8] b) Explain any two ways of adding decimal numbers.

Instruction Level Parallelism

UNIT I DATA REPRESENTATION, MICRO-OPERATIONS, ORGANIZATION AND DESIGN

Advanced issues in pipelining

Course web site: teaching/courses/car. Piazza discussion forum:

EE 4980 Modern Electronic Systems. Processor Advanced

Designing for Performance. Patrick Happ Raul Feitosa

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

Course Description: This course includes concepts of instruction set architecture,

COMPUTER PRINCIPLES AND DESIGN IN VERILOG HDL

Microprocessor Architecture Dr. Charles Kim Howard University

SRM ARTS AND SCIENCE COLLEGE SRM NAGAR, KATTANKULATHUR

T T T T T T N T T T T T T T T N T T T T T T T T T N T T T T T T T T T T T N.

Lecture Topics. Principle #1: Exploit Parallelism ECE 486/586. Computer Architecture. Lecture # 5. Key Principles of Computer Architecture

Advanced Computer Architecture

1.3 Data processing; data storage; data movement; and control.

By, Ajinkya Karande Adarsh Yoga

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

1. PowerPC 970MP Overview

COMPUTER PRINCIPLES AND DESIGN IN VERILOG HDL

CS425 Computer Systems Architecture

Topics in computer architecture

Lecture 1: Course Introduction and Overview Prof. Randy H. Katz Computer Science 252 Spring 1996

Multiple Instruction Issue. Superscalars

ECE 341 Midterm Exam

Main Memory. EECC551 - Shaaban. Memory latency: Affects cache miss penalty. Measured by:

Computer Organization and Architecture (CSCI-365) Sample Final Exam

SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR Siddharth Nagar, Narayanavanam Road QUESTION BANK (DESCRIPTIVE) UNIT-I

7/28/ Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc.

Portland State University ECE 588/688. Cray-1 and Cray T3E

Memory latency: Affects cache miss penalty. Measured by:

Computer Architecture Spring 2016

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.

Memory latency: Affects cache miss penalty. Measured by:

What is Superscalar? CSCI 4717 Computer Architecture. Why the drive toward Superscalar? What is Superscalar? (continued) In class exercise

Lecture 1: Introduction

Computer organization and architecture UNIT-I 2 MARKS

PREPARED BY: S.SAKTHI, AP/IT

CISC / RISC. Complex / Reduced Instruction Set Computers

UNIVERSITY OF MORATUWA CS2052 COMPUTER ARCHITECTURE. Time allowed: 2 Hours 10 min December 2018

Page 1. Multilevel Memories (Improving performance using a little cash )

CS Computer Architecture

Techniques for Mitigating Memory Latency Effects in the PA-8500 Processor. David Johnson Systems Technology Division Hewlett-Packard Company

Digital Semiconductor Alpha Microprocessor Product Brief

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Transcription:

Structure of Computer Systems

Structure of Computer Systems Baruch Zoltan Francisc Technical University of Cluj-Napoca Computer Science Department U. T. PRES Cluj-Napoca, 2002

CONTENTS PREFACE... xiii 1.INTRODUCTION...1 1.1. TAXONOMIES OF COMPUTER ARCHITECTURES...1 1.2. OVERVIEW OF COMPUTER ARCHITECTURES...3 1.2.1. Multiprocessors...3 1.2.2. Multicomputers...4 1.2.3. Multi-multiprocessors...5 1.2.4. Data Flow Architectures...6 1.2.5. Array Processors...7 1.2.6. Pipelined Vector Processors...8 1.2.7. Systolic Arrays...9 1.2.8. Hybrid Architectures...9 1.2.9. Artificial Neural Networks... 10 1.2.10. Fuzzy Logic Processors... 12 1.3. PERFORMANCE AND QUALITY MEASUREMENTS... 13 1.3.1. Execution time... 13 1.3.2. CPU Performance... 14 1.3.3. MIPS... 17 1.3.4. MFLOPS... 19 1.3.5. Other performance measurements... 20 1.3.6. Benchmark programs... 20 1.3.6.1. Comparing and Summarizing Performance... 21 1.3.6.2. The Evolution of Benchmark Programs... 23 1.3.6.3. CPU95... 24 1.3.6.4. CPU2000... 25 1.3.7. Quality factors... 26 1.4. QUANTITATIVE PRINCIPLES OF COMPUTER DESIGN... 27

vi Structure of Computer Systems 1.4.1. Amdahl s Law... 27 1.4.2. Locality of Reference... 29 1.5. PROBLEMS... 29 2. DESIGN REPRESENTATION AND METHODOLOGY...34 2.1. SYSTEM REPRESENTATION... 34 2.2. LEVELS OF DESCRIPTION... 35 2.3. DESIGN PROCESS... 38 2.3.1. System-Level Synthesis... 39 2.3.2. High-Level Synthesis... 40 2.3.3. Register-Transfer Level Synthesis... 40 2.3.4. Logic-Level Synthesis... 41 2.3.5. Technology Mapping... 41 2.4. VHDL HARDWARE DESCRIPTION LANGUAGE... 41 2.4.1. Hardware Description Languages... 41 2.4.2. Introduction to VHDL... 42 2.4.3. VHDL Styles of Description... 44 2.4.4. The Time Model in VHDL... 48 2.4.5. Simulation of a Model... 50 3. ARITHMETIC-LOGIC UNIT...52 3.1. ADDITION... 52 3.1.1. Full Adder... 52 3.1.2. Ripple Carry Adder... 54 3.1.3. Carry Lookahead Adder... 55 3.1.4. Carry Select Adder... 58 3.1.5. Carry Save Adder... 58 3.1.6. Serial Adder... 60 3.1.7. Binary-Coded Decimal Number Addition... 60 3.2. MULTIPLICATION... 61 3.2.1. Shift-and-Add Multiplication... 62 3.2.2. Booth s Technique... 66 3.2.3. Wallace Tree... 67 3.2.4. Shifting Over Zeros and Ones... 70 3.2.5. Array Multiplier... 71 3.3. DIVISION... 73 3.3.1. Restoring Division... 74 3.3.2. Nonrestoring Division... 78 3.3.3. SRT Division... 79 3.3.4. Other Fast Division Methods... 81 3.3.5. Array Divider... 82 3.3.6. Signed Division... 84 3.4. FLOATING-POINT NUMBERS... 84 3.4.1. Floating-Point Representation... 85 3.4.1.1. Principles... 85

Contents vii 3.4.1.2. IEEE 754 Floating-Point Standard... 88 3.4.2. Floating-Point Operations... 93 3.4.2.1. Floating-Point Addition and Subtraction... 94 3.4.2.2. Floating-Point Multiplication and Division... 97 3.4.2.3. Precision Considerations... 100 3.5. PROBLEMS... 101 4. MEMORY SYSTEMS... 105 4.1. MEMORY HIERARCHY... 105 4.2. MEMORY TYPES... 109 4.3. MEMORY PERFORMANCE MEASURES... 111 4.4. SEMICONDUCTOR MAIN MEMORY... 112 4.4.1. Memory Cell and Memory Unit... 112 4.4.2. Memory Organization... 115 4.4.3. Memory Design... 118 4.4.4. Example of a Commercial Memory Circuit... 119 4.4.5. Performance Parameters of DRAM Memories... 121 4.4.6. Technologies for DRAM Memories... 124 4.4.6.1. Categories of DRAM Memories... 124 4.4.6.2. FPM DRAM... 125 4.4.6.3. EDO DRAM... 126 4.4.6.4. BEDO DRAM... 127 4.4.6.5. SDRAM... 128 4.4.6.6. HSDRAM... 136 4.4.6.7. ESDRAM... 137 4.4.6.8. Virtual Channel Memory... 142 4.4.6.9. FCRAM... 145 4.4.6.10. DDR SDRAM... 148 4.4.6.11. DDR II SDRAM... 153 4.4.6.12. RDRAM and DRDRAM... 156 4.4.6.13. IRAM... 165 4.4.6.14. Memory Modules... 171 4.5. INTERLEAVED MEMORY... 173 4.6. ASSOCIATIVE MEMORY... 174 4.7. CACHE MEMORY... 179 4.7.1. Principle of Cache Memory... 179 4.7.2. Cache Memory Organization... 179 4.7.3. Cache Memory Operation... 181 4.7.4. Address Mapping... 184 4.7.4.1. Associative Mapping... 184 4.7.4.2. Direct Mapping... 185 4.7.4.3. Set-Associative Mapping... 189 4.7.5. Replacement Policies... 192 4.7.5.1. Random Replacement... 192 4.7.5.2. Least Frequently Used... 192

viii Structure of Computer Systems 4.7.5.3. Least Recently Used (LRU)... 192 4.7.6. Cache Memory Types... 193 4.7.7. Cache Memory Performance... 193 4.7.8. Cache Memory Coherence... 195 4.8. VIRTUAL MEMORY... 198 4.8.1. Principle of Virtual Memory... 198 4.8.2. Address Translation... 199 4.8.3. Paging... 202 4.8.4. Segmentation... 205 4.8.5. Paging and Segmentation... 206 4.8.6. Memory Allocation... 207 4.8.6.1. Non-preemptive Allocation... 208 4.8.6.2. Preemptive Allocation... 210 4.8.6.3. Replacement Policies... 211 4.8.7. Memory Management in the Intel Architecture... 216 4.8.7.1. Memory Management Overview... 216 4.8.7.2. Segmentation... 219 4.8.7.3. Paging... 220 4.9. PROBLEMS... 222 5. PIPELINING...228 5.1. PIPELINE STRUCTURE... 228 5.2. PIPELINE PERFORMANCE MEASURES... 229 5.3. PIPELINE TYPES... 230 5.4. INSTRUCTION PIPELINES... 232 5.4.1. Principle of Instruction Pipelines... 232 5.4.2. The Fetching Problem... 235 5.4.3. The Bottleneck Problem... 235 5.4.4. The Structural Hazard Problem... 236 5.4.5. The Data Hazard Problem... 236 5.4.5.1. Data Dependencies... 236 5.4.5.2. Tomasulo s Method... 239 5.4.5.3. Scoreboard Method... 241 5.4.6. The Control Hazard Problem... 244 5.4.6.1. Branch Instructions... 244 5.4.6.2. Branch Prediction... 246 5.4.6.3. Delayed Branching... 250 5.4.6.4. Multiple Prefetching... 251 5.4.7. The Intel Architecture Processors Pipeline... 251 5.4.7.1. Fetch/Decode Unit... 251 5.4.7.2. Instruction Pool... 253 5.4.7.3. Dispatch/Execute Unit... 253 5.4.7.4. Retirement Unit... 255 5.4.7.5. Bus Interface Unit... 255 5.4.8. Throughput Improvements of an Instruction Pipeline... 256

Contents ix 5.4.8.1. Superscalar Processing... 256 5.4.8.2. Superpipeline Processing... 260 5.4.8.3. Very Long Instruction Word... 261 5.4.8.4. Explicitly Parallel Instruction Computing... 263 5.4.8.5. Comparison of Throughput Improvement Methods... 269 5.5. ARITHMETIC PIPELINES... 270 5.5.1. Principle of Arithmetic Pipelines... 270 5.5.2. Design of an Arithmetic Pipeline... 271 6.5.3. Arithmetic Pipelines with Feedback... 274 5.5.4. Pipelined Multipliers... 277 5.5.5. Systolic Arrays... 280 5.6. PIPELINE CONTROL... 282 5.6.1. Scheduling... 282 5.6.2. Scheduling Static Pipelines... 283 5.6.3. Scheduling Dynamic Pipelines... 286 5.7. PROBLEMS... 288 6. RISC ARCHITECTURES...293 6.1. INTRODUCTION... 293 6.2. CAUSES FOR INCREASED ARCHITECTURAL COMPLEXITY... 294 6.3. ADVANTAGES OF RISC ARCHITECTURES... 294 6.4. THE USE OF A LARGE NUMBER OF REGISTERS... 296 6.5. CHARACTERISTICS OF RISC ARCHITECTURES... 298 6.6. COMPARISON BETWEEN RISC AND CISC ARCHITECTURES... 299 6.7. APPLICATIONS OF RISC PROCESSORS... 299 6.8. MIPS... 302 6.8.1. Introduction... 302 6.8.2. MIPS R2000... 302 6.8.3. MIPS R3000... 303 6.8.4. MIPS R3500... 303 6.8.5. MIPS R3001... 303 6.8.6. MIPS R4000... 303 6.8.7. MIPS R4300i... 304 6.8.8. MIPS R4400... 304 6.8.9. MIPS R4600, R4650 and R4700... 305 6.8.10. MIPS R6000... 306 6.8.11. MIPS-III Architecture... 306 6.8.11.1. Hardware Details... 306 6.8.11.2. Software Details... 308 6.8.11.3. Floating-Point Unit... 309 6.8.11.4. Cache Memories... 310 6.8.11.5. Memory Management... 311 6.8.11.6. Exceptions... 311 6.8.12. MIPS R8000 and R10000... 312 6.8.12.1. Introduction... 312

x Structure of Computer Systems 6.8.12.2. Hardware Details... 312 6.8.12.3. Software Details... 314 6.8.12.4. Floating-Point Unit... 314 6.8.12.5. Cache Memories... 314 6.8.12.6. Memory Management... 315 6.8.13. MIPS R5000... 315 6.8.13.1. Overview... 315 6.8.13.2. Increased 3D Graphics Performance... 316 6.8.13.3. Multi-Processing Support... 318 6.8.13.4. Secondary Cache Memory Support... 318 6.8.13.5. Flexible Clocking Mechanism... 319 6.8.14. Summary... 319 6.9. SPARC... 320 6.9.1. Introduction... 320 6.9.2. HyperSPARC... 322 6.9.3. SuperSPARC... 322 6.9.4. MicroSPARC and MicroSPARC-II... 323 6.9.5. SPARClite... 323 6.9.6. UltraSPARC-I... 324 6.9.7. UltraSPARC-II... 324 6.9.8. UltraSPARC-IIi... 325 6.9.8.1. Overview... 325 6.9.8.2. Block Diagram... 326 6.9.8.3. Prefetch and Dispatch Unit... 327 6.9.8.4. Integer Execution Unit... 328 6.9.8.5. Floating-point Unit... 328 6.9.8.6. I/O Memory Management Unit... 329 6.9.8.7. Memory Controller Unit... 329 6.9.8.8. Load-Store Unit... 330 6.9.8.9. Data and Instruction Cache Memories... 330 6.9.8.10. External Cache Unit... 330 6.9.8.11. Graphics Unit... 331 6.9.8.12. The Visual Instruction Set... 332 6.9.9. UltraSPARC-III... 333 6.9.10. MAJC... 333 6.9.11. Summary... 336 6.10. ALPHA... 337 6.10.1. Introduction... 337 6.10.2. Alpha 21064 and 21064A... 338 6.10.2.1. Overview... 338 6.10.2.2. Block Diagram... 338 6.10.2.3. Instruction Fetch/Decode Unit... 338 6.10.2.4. Integer Execution Unit... 340 6.10.2.5. Floating-Point Execution Unit... 340 6.10.2.6. Address Unit... 340

Contents xi 6.10.2.7. Branch Unit... 341 6.10.2.8. Cache Memory... 341 6.10.2.9. Software Details... 341 6.10.2.10. Memory Management... 343 6.10.3. Alpha 21066, 21066A and 21068... 343 6.10.4. Alpha 21164 and 21164PC... 344 6.10.4.1. Overview... 344 6.10.4.2. Block Diagram... 345 6.10.4.3. Instruction Fetch/Decode and Branch Unit... 346 6.10.4.4. Integer Execution Unit... 347 6.10.4.5. Floating-Point Execution Unit... 347 6.10.4.6. Memory Address Translation Unit... 348 6.10.4.7. Cache Control and Bus Interface Unit... 348 6.10.4.8. Cache Memories... 349 6.10.4.9. Serial Read-Only Memory Interface... 349 6.10.5. Alpha 21264... 349 6.10.5.1. Overview... 349 6.10.5.2. Block Diagram... 350 6.10.5.3. Instruction Fetch, Issue, and Retire Unit... 351 6.10.5.4. Integer Execution Unit... 356 6.10.5.5. Floating-Point Execution Unit... 356 6.10.5.6. Cache Memories... 357 6.10.5.7. Memory Reference Unit... 357 6.10.5.8. External Cache Memory and System Interface Unit... 358 6.10.5.9. SROM Interface... 359 6.10.6. Summary... 359 6.11. POWERPC... 359 6.11.1. Introduction... 359 6.11.2. PowerPC 601... 360 6.11.2.1. Overview... 360 6.11.2.2. Block Diagram... 361 6.11.2.3. Instruction Unit... 361 6.11.2.4. Execution Units... 361 6.11.2.5. Cache Memory... 363 6.11.2.6. Memory Management... 363 6.11.2.7. Software Details... 363 6.11.2.8. Exceptions... 364 6.11.3. PowerPC 602... 365 6.11.4. PowerPC 603 and 603e... 366 6.11.5. PowerPC 604 and 604e... 367 6.11.6. PowerPC 740 and 750... 368 6.11.6.1. Overview... 368 6.11.6.2. Block Diagram... 370 6.11.6.3. Instruction Unit... 370 6.11.6.4. Completion Unit... 372

xii Structure of Computer Systems 6.11.6.5. Integer Units... 372 6.11.6.6. Floating-Point Unit... 373 6.11.6.7. Load/Store Unit... 373 6.11.6.8. System Register Unit... 374 6.11.6.9. Memory Management Units... 374 6.11.6.10. On-Chip Cache Memories... 375 6.11.6.11. L2 Cache Memory... 375 6.11.6.12. Bus Interface Unit... 375 6.11.7. PowerPC 7400... 376 6.11.7.1. Overview... 376 6.11.7.2. AltiVec Vector Permute Unit... 377 6.11.7.3. AltiVec Vector Arithmetic-Logic Unit... 378 6.11.8. PowerPC 850 and 860... 379 6.11.8.1. Overview... 379 6.11.8.2. Block Diagram... 379 6.11.8.3. The PowerPC Core... 380 6.11.8.4. System Interface Unit... 381 6.11.8.5. PCMCIA Controller... 381 6.11.8.6. Communications Processor Module... 381 6.11.8.7. Differences between the MPC850 and MPC860 processors. 382 6.11.9. Summary... 382 6.12. PROBLEMS... 383 BIBLIOGRAPHY...384 INDEX... 388