Exam 3 Review COMP375 4/25/2011

Topics

- I/O controllers (chapter 7)
- Disk performance (sections 6.3-6.4)
- RAID (section 6.2)
- Pipelining (section 12.4)
- Superscalar (chapter 14)
- RISC (chapter 13)
- Parallel processors (chapter 18)
- Security

Topics not covered in class will not be on the exam.

RAID Types

- RAID 0 - Striping
- RAID 1 - Mirroring
- RAID 2 - Hamming code error recovery
- RAID 3 - Bit-interleaved parity
- RAID 4 - Block-level parity
- RAID 5 - Block-level distributed parity
- RAID 6 - Dual redundancy

RAID 0 (non-redundant)

- Sequential blocks of a file are written across multiple disks
- Improved transfer rate
- Decreased reliability
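To make striping concrete, here is a minimal sketch (illustrative C, not from the slides) of how a logical block number maps to a disk and a stripe when sequential blocks are spread across an array:

    #include <stdio.h>

    /* RAID 0 striping sketch: logical block b lands on disk (b mod N)
       at stripe (b / N). The 4-disk array size is an assumption. */
    int main(void) {
        int num_disks = 4;
        for (int block = 0; block < 8; block++) {
            printf("logical block %d -> disk %d, stripe %d\n",
                   block, block % num_disks, block / num_disks);
        }
        return 0;
    }

Reads and writes of consecutive blocks hit all of the disks in parallel, which is where the improved transfer rate comes from.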

RAID 1 (mirrored)

- Improved reliability
- Slightly slower writes, possibly faster reads
- Twice the disk space required

RAID 5 (distributed block parity)

- Striping improves read performance
- Parity improves reliability
- N+1 disks are required

RAID 6 (dual redundancy)

- Like RAID 5, but with two parity blocks for each data block
- Slow writes
- N+2 disks required

RAID Comparison

RAID | Disks | Reads           | Writes          | Survives failures
0    | N     | faster          | faster          | 0
1    | 2N    | slightly faster | slightly slower | 1
5    | N+1   | faster          | slightly slower | 1
6    | N+2   | faster          | slightly slower | 2
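The parity that RAID 5 relies on is a bitwise XOR across the data blocks of a stripe, so any one lost block can be rebuilt by XOR-ing the survivors. A minimal sketch (illustrative C; the two-disk stripe and block contents are assumptions):

    #include <stdio.h>

    #define BLOCK 4   /* tiny block size, for illustration only */

    int main(void) {
        unsigned char d0[BLOCK] = {1, 2, 3, 4};
        unsigned char d1[BLOCK] = {5, 6, 7, 8};
        unsigned char parity[BLOCK], rebuilt[BLOCK];

        /* The parity block is the XOR of the data blocks in the stripe. */
        for (int i = 0; i < BLOCK; i++)
            parity[i] = d0[i] ^ d1[i];

        /* If the disk holding d1 fails, rebuild it from d0 and parity. */
        for (int i = 0; i < BLOCK; i++)
            rebuilt[i] = d0[i] ^ parity[i];

        for (int i = 0; i < BLOCK; i++)
            printf("%d ", rebuilt[i]);   /* prints: 5 6 7 8 */
        printf("\n");
        return 0;
    }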

RAID Planning

- First calculate how many drives it takes to hold your data
- If high performance is the only goal, then use RAID 0 (striping) with N disks; there is no fault recovery
- If the data will fit on one physical hard drive, use RAID 1 with two drives
- If the data fits on N > 1 physical hard drives, use RAID 5 with N+1 hard drives
- If you need to survive the failure of two physical drives, use RAID 6 with N+2 drives

(These rules are sketched in code after the hazards overview below.)

Hazards

A hazard is a situation that reduces the processor's ability to pipeline instructions.

- Resource - different instructions want to use the same CPU resource
- Data - the data used in an instruction is modified by the previous instruction
- Control - a jump is taken, or anything else changes the sequential flow

Resource Hazards

For example, the operand fetch stage and the instruction fetch stage may both need the memory system at the same time. Hazards can cause pipeline stalls.
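As promised under RAID Planning, here is a minimal sketch of those rules in C (choose_raid is a hypothetical helper, not from the slides; data_drives is the N computed from the data size):

    #include <stdio.h>

    /* Hypothetical helper encoding the planning rules above. */
    const char *choose_raid(int data_drives, int failures_to_survive) {
        if (failures_to_survive >= 2) return "RAID 6 with N+2 drives";
        if (failures_to_survive == 0) return "RAID 0 with N drives (no fault recovery)";
        if (data_drives == 1)         return "RAID 1 with two drives";
        return "RAID 5 with N+1 drives";
    }

    int main(void) {
        printf("%s\n", choose_raid(1, 1));   /* RAID 1 with two drives */
        printf("%s\n", choose_raid(4, 1));   /* RAID 5 with N+1 drives */
        printf("%s\n", choose_raid(4, 2));   /* RAID 6 with N+2 drives */
        return 0;
    }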

Data Hazards

The data used by one instruction may be modified by a previous instruction. If the previous instruction has not completed and stored its results, the next instruction will use an incorrect value.

Data Hazard Resolution

- Register forwarding - the data from a previous instruction can be used by the next instruction before or while it is being written back
- Register locking - when a register is in use by an instruction, that register is locked to following instructions until the first instruction completes; this avoids incorrect results but introduces delays

Control Hazards

A jump or function call changes the sequential execution of instructions. The pipelined instruction fetch stage continually fetches sequential instructions. When a jump occurs, the previously fetched instructions should not be executed. Instructions in the pipe may have to be discarded before the write-back stage.

Using a Pipeline

The pipeline is transparent to the programmer.

- Disadvantage: a programmer who does not understand the pipeline can produce inefficient code
- Reason: the hardware automatically stalls the pipeline if items are not available
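Register forwarding amounts to comparing register fields held in the pipeline latches. The sketch below assumes a MIPS-style 5-stage pipeline with numbered registers (the course's examples use x86, so the structures and field names here are illustrative only, not from the slides):

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { bool reg_write; int rd; } ExMem;  /* instruction ahead of us */
    typedef struct { int rs, rt; } IdEx;               /* instruction now executing */

    /* Forward the ALU result if the earlier instruction will write a
       register that the current instruction wants to read. */
    bool forward_needed(ExMem exmem, IdEx idex) {
        return exmem.reg_write && exmem.rd != 0 &&
               (exmem.rd == idex.rs || exmem.rd == idex.rt);
    }

    int main(void) {
        ExMem prev = { true, 5 };   /* previous instruction writes r5 */
        IdEx  cur  = { 5, 7 };      /* current instruction reads r5 and r7 */
        printf("forward? %s\n", forward_needed(prev, cur) ? "yes" : "no");
        return 0;
    }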

Pipelining Guide (figure only; not reproduced in the transcription)

Pipeline Optimization

Assume a 3-stage pipeline.

- Compilers can rearrange the machine language instructions so that adjacent instructions avoid data hazards
- The same compiler optimization improves superscalar execution
- Hardware improvements, such as delayed branches, multi-access memory and multiple ALUs

Un-optimized code:

    again:  mov eax, dog
            add eax, pig
            mov cow, eax
            mov eax, cat
            add eax, 47
            mov goat, eax
            mov eax, rat
            sub eax, pig
            mov bird, eax
            cmp bird, 100
            jl again

Optimized code:

    again:  mov eax, rat
            mov ebx, dog
            mov ecx, cat
            sub eax, pig
            add ebx, pig
            add ecx, 47
            mov bird, eax
            cmp eax, 100
            mov cow, ebx
            mov goat, ecx
            jl again

RISC Design Principles

- Simple operations - simple instructions that can execute in one cycle
- Register-to-register operations - only load and store operations access memory; all other operations work register to register
- Simple addressing modes - only a few addressing modes (1 or 2)
- Large number of registers - needed to support register-to-register operations and to minimize procedure call and return overhead
- Fixed-length instructions - facilitates efficient instruction execution
- Simple instruction format - fixed boundaries for the various fields

RISC Design Principles (continued)

- Start an instruction every cycle
- Simple, fixed-length instructions are easy to pipeline
- Only two instructions have memory operands; all other operands are in registers
- Delayed branches

RISC Traits

- Pipelined
- Simple, uniform instructions
- Few instructions
- No microcode
- Few addressing modes
- Load/store architecture
- Many identical general-purpose registers
- Sliding register stack
- Delayed branches
- Fast

How Visible is Parallelism?

- Superscalar - the programmer never notices
- Multiple threads - the programmer must create multiple threads in the program (a threading sketch follows the Flynn classification below)
- Multiple processes
  - Different programs: the programmer never notices
  - Parts of the same program: the programmer must divide the work among different processes

Flynn's Parallel Classification

- SISD (Single Instruction, Single Data) - standard uniprocessors
- SIMD (Single Instruction, Multiple Data) - vector and array processors
- MISD (Multiple Instruction, Single Data) - systolic processors
- MIMD (Multiple Instruction, Multiple Data)
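For the "multiple threads" case above, the programmer divides the work and creates the threads explicitly. A minimal POSIX threads sketch (illustrative, not from the slides; compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>

    /* Each thread runs this function on its own share of the work. */
    void *worker(void *arg) {
        int id = *(int *)arg;
        printf("thread %d doing its share of the work\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        int id1 = 1, id2 = 2;

        /* The programmer must create the threads and wait for them. */
        pthread_create(&t1, NULL, worker, &id1);
        pthread_create(&t2, NULL, worker, &id2);

        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }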

Instructions at the Same Time

Usually a program executes one line of the program at a time. If this is all the computer can do, it is Single Instruction. If the computer has two or more CPUs and can execute two or more programs at the same time, it is Multiple Instruction.

Data per Instruction

The assembly programs you wrote operate on one pair of data values at a time:

    add eax, dog

If more than one of these instructions is executed at the same time, it is Multiple Data. If an instruction can operate on multiple data values at a time, it is also Multiple Data.

Flynn Examples

- SISD - the microcode computer model; old computers
- SIMD - vector processors; Intel instructions that operate on multiple integer or floating point numbers (see the sketch below)
- MISD - systolic processors (not generally available)
- MIMD (SMP, shared memory) - popular dual-core processors; computers with multiple CPU chips
- MIMD (separate memory) - supercomputers with many CPUs
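As a concrete SIMD example, the SSE2 intrinsic below maps to a single instruction (paddd) that adds four 32-bit integers at once. A minimal sketch with arbitrary values:

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stdio.h>

    int main(void) {
        /* _mm_set_epi32 takes its arguments high-to-low. */
        __m128i a = _mm_set_epi32(4, 3, 2, 1);
        __m128i b = _mm_set_epi32(40, 30, 20, 10);

        /* One instruction adds all four lanes: Multiple Data. */
        __m128i sum = _mm_add_epi32(a, b);

        int out[4];
        _mm_storeu_si128((__m128i *)out, sum);
        printf("%d %d %d %d\n", out[0], out[1], out[2], out[3]);  /* 11 22 33 44 */
        return 0;
    }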

Amdahl's Law

- P = fraction of the program that can be executed in parallel
- N = number of processors
- single = single-CPU execution time

multiCPU time = (1 - P) * single + P * (single / N)
multiCPU time = single * (1 - P + P/N)

(A worked example follows the question list below.)

Highly Probable Questions

- How long to read X bytes from a drive
- What RAID type to use
- Types of pipelining hazards
- Flynn's classification
- Amdahl's law
- Attributes of RISC processors
- Encryption types
- Use of digital signatures
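Here is the worked example promised above (a minimal sketch; the values of single, P, and N are arbitrary). With P = 0.9 and N = 8, time = single * (0.1 + 0.9/8) = 0.2125 * single, a speedup of about 4.7:

    #include <stdio.h>

    /* Amdahl's Law, straight from the slide's formula. */
    double multicpu_time(double single, double p, int n) {
        return single * (1.0 - p + p / n);
    }

    int main(void) {
        double single = 100.0;   /* assumed single-CPU execution time */
        double p = 0.9;          /* 90% of the program runs in parallel */
        for (int n = 1; n <= 16; n *= 2) {
            double t = multicpu_time(single, p, n);
            printf("N=%2d: time=%7.2f  speedup=%5.2f\n", n, t, single / t);
        }
        return 0;
    }

Note that the serial fraction caps the speedup at 1/(1 - P) = 10 here, no matter how many processors are added.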