Parallel Processors. Session 1 Introduction

1 Parallel Processors Session 1 Introduction

2 Applications of Parallel Processors
- Structural analysis
- Weather forecasting
- Petroleum exploration
- Fusion energy research
- Medical diagnosis
- Aerodynamics simulations
- Artificial intelligence and expert systems
- Industrial automation
- Remote sensing
- Military applications
- Genetic engineering
- Socioeconomics
- Encryption
- And many other applications
New applications demand more performance.
Common requirement: a high volume of processing and computations in a limited time.

3 Architecture
Application requirements and technological constraints both shape the architecture, and they are often at odds with each other!
Architecture therefore means making tradeoffs.
Architecture translates technology's gifts into performance and capability.

4 High Performance Computing
Achieving high performance depends on:
- Fast and reliable hardware (technology driven)
- Computer architecture (for example, the use of carry-lookahead addition increases the speed of operation)
- Processing techniques (better algorithms can result in higher speed of operation)
Yet another way: performing as many operations as possible simultaneously, concurrently, in parallel, instead of sequentially.
Cost is important! Cost-effective solutions for high performance computing involve:
- Advanced computer architectures
- Theories of parallel computing
- Optimal resource allocation
- Fast algorithms
- Efficient programming languages
This requires knowledge of algorithms, languages, software, hardware, performance evaluation, and computing alternatives.
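
As a concrete illustration of the carry-lookahead idea mentioned above, here is a minimal Python sketch (illustration only, not from the slides): the carries of an addition are derived from per-bit generate/propagate signals, so that in hardware they can all be produced by parallel logic instead of rippling bit by bit.

```python
# Minimal carry-lookahead sketch: carries are computed from generate (g)
# and propagate (p) signals rather than waiting for a full ripple.
def lookahead_carries(a_bits, b_bits, c0=0):
    g = [a & b for a, b in zip(a_bits, b_bits)]   # bit i generates a carry
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # bit i propagates a carry
    carries = [c0]
    for i in range(len(a_bits)):
        # c[i+1] = g[i] OR (p[i] AND c[i]); hardware expands these terms
        # so that all carries are evaluated by parallel logic.
        carries.append(g[i] | (p[i] & carries[i]))
    return carries

# 4-bit example, bit 0 first: 0b1011 + 0b0110
print(lookahead_carries([1, 1, 0, 1], [0, 1, 1, 0]))  # -> [0, 0, 1, 1, 1]
```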

5 Advanced Computer Architectures
- Pipelined computers
- Array processors
- Multiprocessor systems
- Hardware structure
- Software structure
- Parallel computing algorithms
- Optimal allocation of resources

6 Definition
Parallel processing provides a cost-effective solution for achieving high system performance through concurrent activities.
It is the method of organizing operations in a computing system so that more than one operation is performed simultaneously.

7 Scalability
Scalability is a major objective in the design of advanced parallel computers.
Scalability means a proportional increase in performance with increasing system resources.
System resources include processors, memory capacity, and I/O bandwidth.
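
A simple way to quantify this is to compare speedup and efficiency as processors are added. The sketch below is a minimal illustration; the processor counts and times are made-up numbers, not measurements from any real system.

```python
# Minimal sketch: speedup S = T1 / Tp and efficiency E = S / p.
# A well-scaling system keeps E close to 1 as p grows.
def speedup_and_efficiency(t1, tp, p):
    s = t1 / tp
    return s, s / p

t1 = 100.0                                        # hypothetical single-processor time
for p, tp in [(2, 52.0), (4, 28.0), (8, 16.0)]:   # hypothetical parallel times
    s, e = speedup_and_efficiency(t1, tp, p)
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```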

8 Course Description
Introductory graduate course for the computer group.
Prerequisite: computer organization and programming concepts.
Course work:
- Case study (due: week 8)
- Project (due: week 16)
- Possible homework assignments
- A final exam
References:
- Advanced Computer Architecture: Parallelism, Scalability, Programmability (Kai Hwang)
- Scalable Parallel Computing: Technology, Architecture, Programming (Kai Hwang and Zhiwei Xu)
- Parallel Computer Architecture: A Hardware/Software Approach (David Culler, J. P. Singh, Anoop Gupta)
- Introduction to Parallel Computing (Ted G. Lewis, Hesham El-Rewini)

9 Course Outline
- Introduction, history, applications, classification
- Principles of parallel processing and basic concepts
- Parallel computer models and structures
- Programming requirements
- Interconnection networks
- Performance and scalability
- Parallel and scalable architectures
- Parallel programming and models

10 History
Mechanical computers existed before 1945. Since then, electronic computers have gone through five generations:
1. Vacuum tubes, relay memories, fixed-point arithmetic, machine language (the age of dinosaurs!)
2. Discrete transistors, multiplexed memory access, floating-point arithmetic, high-level languages and compilers, batch processing
3. Integrated circuits, microprogramming, pipelining, cache, multiprogramming and time-sharing OS, multiuser applications
4. LSI/VLSI, semiconductor memory, multiprocessors, vector supercomputers, multicomputers, multiprocessor OS, languages, compilers, and environments for parallel processing
5. ULSI/VHSIC processors, memory, and switches, high-density packaging, scalable architectures, massively parallel processing, teraflops (10^12 floating-point operations per second)
Introduction of concurrency:
- In the early von Neumann model every operation was performed sequentially: the instruction is fetched, the operands are fetched, the operation is executed, the results are stored
- Prefetching introduced some degree of concurrency
- Extra ALUs allowed multiple execution units within the CPU capable of operating in parallel
- Pipelined operation was introduced in the third generation
- More CPUs were added to computers to be able to perform instructions in parallel and independently

11 Evolution
From a different perspective, the evolution of computers has gone through three waves:
- First wave: mainframes
- Second wave: minicomputers, high-performance supercomputers
- Third wave: personal computers, networked computers
Parallel computers are the next wave.

12 Levels of Parallelism
Concurrency is achieved at different levels:
- Job or program level: multiple jobs or programs are processed concurrently through multiprogramming, timesharing, and multiprocessing. Requires the development of parallel processable algorithms and efficient allocation of limited hardware and software resources to multiple programs.
- Task or procedure level: multiple procedures or tasks (program segments) within the same program are executed in parallel. Requires the decomposition of the program into multiple tasks.
- Interinstruction level: multiple instructions are executed concurrently. Requires data dependency analysis.
- Intrainstruction level: faster and concurrent operations are executed within each instruction.
Software involvement is highest at the first level and lowest at the last. Hardware involvement increases as its speed rises and its cost falls, while software is becoming more expensive.
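
As a rough illustration of task- or procedure-level parallelism, the sketch below decomposes one program into independent tasks and runs them on a pool of worker processes; the task itself (summing squares over a slice) is just a stand-in for a program segment.

```python
# Minimal sketch of task-level parallelism: one program decomposed into
# independent tasks that a pool of workers executes concurrently.
from multiprocessing import Pool

def task(chunk):
    return sum(x * x for x in chunk)         # stand-in for a program segment

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]  # decompose the work into 4 tasks
    with Pool(processes=4) as pool:
        partials = pool.map(task, chunks)    # tasks run in parallel
    print(sum(partials))
```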

13 Alternatives
- Parallelism in uniprocessor systems
- Multiprocessor systems
- Distributed computers
- Cluster (networked) computers
- Web computing
- Parallel computers with centralized computing facilities

14 Elements of Modern Computers
Computer architecture covers not only the structure of the hardware but also:
- Instruction set
- System software
- Application programs
- User interfaces
Depending on the nature of the problems, the solutions may require different computing resources, for example:
- Numerical problems require mathematical formulations and integer and floating-point operations (numerical computing)
- Alphanumerical problems require database management and information retrieval operations (transaction processing)
- Artificial intelligence problems require logic inferences and symbolic manipulations (logical reasoning)
The algorithms and data structures will differ accordingly. Mapping the system resources to the appropriate algorithms for specific computing problems is an objective of parallel computer design. Mapping includes processor scheduling, memory maps, and interprocessor communications.
(Figure: computing problem; algorithms and data structures; programming and high-level languages; mapping and binding (compile, load); operating system; hardware architecture; applications software; performance evaluation)

15 Elements of Modern Computers
The coordinated effort of hardware resources, operating system, and application software determines the power of a modern computer system. The operating system manages the allocation and deallocation of resources during the execution of user programs. The mapping of algorithmic and data structures onto the machine architecture relies on efficient compiler and operating system support.
Parallelism can be exploited at:
- Algorithm design
- Programming
- Compilation
- Run time
Techniques for exploiting parallelism at these levels form the core of parallel processing technology. Standard benchmark programs are needed for performance evaluation.
(Figure: computing problem; algorithms and data structures; programming and high-level languages; mapping and binding (compile, load); system architecture; operating system; hardware architecture; applications software; performance evaluation; processors, memory, I/O and peripheral devices)

16 Classification of Parallel Computers
Flynn's classification:
- Single Instruction, Single Data stream (SISD)
- Single Instruction, Multiple Data stream (SIMD)
- Multiple Instruction, Single Data stream (MISD)
- Multiple Instruction, Multiple Data stream (MIMD)

17 Example
Customers in a bank are serviced by bank tellers. Customers are tasks and tellers are processors.

18 SISD System
If there is one teller for all customers, they are all serviced in sequence, in a line. This is a conventional single-processor system, the SISD model. The total processing time is the sum of all the individual processing times.

19 Parallel System
If there are more tellers, they can serve the customers in parallel. With one teller per customer, all customers are served simultaneously and the total processing time is the largest individual processing time.
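
The difference between the one-teller (SISD) case and the one-teller-per-customer case can be stated in two lines; the service times below are arbitrary illustration values.

```python
# Hypothetical per-customer service times (minutes)
service_times = [5, 3, 8, 2, 6]

sequential_time = sum(service_times)   # one teller: customers served one after another
parallel_time = max(service_times)     # one teller per customer: all served at once
print(sequential_time, parallel_time)  # 24 vs 8
```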

20 Load Balancing
If there are more customers than tellers, customers must be assigned to tellers such that all tellers are utilized efficiently, the load is fairly distributed among them, and customers are served in the shortest possible time. Load balancing and scheduling become important.
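
A minimal sketch of one common load-balancing strategy, assigning each customer to the currently least-loaded teller (longest jobs first); the numbers are illustrative only.

```python
import heapq

def assign_customers(service_times, num_tellers):
    """Greedy load balancing: give each customer to the least-loaded teller."""
    tellers = [(0, t) for t in range(num_tellers)]     # (current load, teller id)
    heapq.heapify(tellers)
    schedule = {t: [] for t in range(num_tellers)}
    for job in sorted(service_times, reverse=True):    # longest jobs first
        load, t = heapq.heappop(tellers)
        schedule[t].append(job)
        heapq.heappush(tellers, (load + job, t))
    finish_time = max(load for load, _ in tellers)     # when the last teller frees up
    return schedule, finish_time

schedule, finish_time = assign_customers([5, 3, 8, 2, 6, 4], num_tellers=3)
print(schedule, finish_time)   # e.g. {0: [8, 2], 1: [6, 3], 2: [5, 4]} 10
```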

21 Dependency and Coordination
Assume customer A deposits some money into a joint account with teller #1. At the same time, customer B wants to withdraw some money from the same account with teller #2. Obviously both transactions cannot be completed at the same time. Coordination is required in parallel processes whenever a task depends on other tasks.
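
The conflict between A and B is the classic lost-update problem. The sketch below is illustrative only (the sleep simply forces the unlucky interleaving): two uncoordinated tellers update the same balance, and one update is lost.

```python
# Minimal sketch of why uncoordinated access to the joint account fails:
# both tellers read the old balance, so one of the updates is overwritten.
import threading, time

balance = 100

def teller(amount):
    global balance
    current = balance           # read the shared balance
    time.sleep(0.01)            # force the other teller to read the stale value
    balance = current + amount  # write back, clobbering the other update

a = threading.Thread(target=teller, args=(+50,))   # customer A deposits
b = threading.Thread(target=teller, args=(-30,))   # customer B withdraws
a.start(); b.start(); a.join(); b.join()
print(balance)   # 150 or 70 instead of the correct 120
```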

22 Pipelined Processing
The bank tellers are organized into a coordinated line of workers. Each teller is given a fine-grained task to perform rather than a whole transaction:
#1 gets the customer's account book
#2 uses the book to validate the customer and his account
#3 updates the account
#4 takes or returns cash or a check to or from the customer

23 Pipelined Processing
By overlapping the tasks, all tellers will always be busy as long as the line of customers is full. The customers are served in parallel, but in a different way compared to the case where each customer is serviced completely by one teller.
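
A minimal timing sketch of the four-teller pipeline, assuming every stage takes one time unit: once the pipeline is full, a customer finishes every unit instead of every four units.

```python
# Pipeline timing sketch: 4 stages, 1 time unit each, n customers.
def completion_times(n_customers, n_stages=4):
    sequential = [n_stages * (i + 1) for i in range(n_customers)]  # one teller does everything
    pipelined = [n_stages + i for i in range(n_customers)]         # stages overlap
    return sequential, pipelined

seq, pipe = completion_times(6)
print(seq)   # [4, 8, 12, 16, 20, 24]
print(pipe)  # [4, 5, 6, 7, 8, 9]
```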

24 SIMD or Data Parallel Processing
Data handling is the key in SIMD. Assume every teller needs to go to a shelf, get the account for the customer, update it, and return to his desk, for every customer. Instead, if all the accounts are given to all the tellers at the right time, this overhead is saved. A systematic mechanism is used:
- Phase 1: data is prepared and delivered to all tellers
- Phase 2: all tellers do the processing at the same time
This model is useful when there is a lot of data that needs to be processed similarly (vector/array processing).
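
A minimal data-parallel sketch of the two phases: the data (accounts and amounts) is first laid out for all processing elements, then the same operation is applied to every element. On a real SIMD machine the element-wise operation would be issued as one instruction for all elements at once; here an ordinary loop stands in for it.

```python
# SIMD-style sketch: the identical "add deposit" operation applied element-wise.
accounts = [120, 300, 45, 980]   # phase 1: data delivered to the processing elements
deposits = [10, 25, 5, 100]

# phase 2: every "teller" performs the same operation on its own data item
updated = [acct + dep for acct, dep in zip(accounts, deposits)]
print(updated)   # [130, 325, 50, 1080]
```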

25 SIMD or Data Parallel Processing
Coordination is still required for customers A and B from the previous example. Coordination can be done by a high-level supervisor. The supervisor can use a simple strategy, such as doing all deposits first and all withdrawals second. The performance depends on the number of deposits and the number of withdrawals; for the best results they should be equal. The total processing time depends on the number of tellers.

26 MISD
Assume there are several customers with a joint account, and all come to the bank for different types of transactions. Each customer goes to a different teller. The most efficient way of processing the customers is to pass the account file to one of the tellers and then circulate it among the tellers, one after the other.

27 MISD
If there are several similar cases waiting in line for transactions, the same procedure can be repeated in a pipeline for the next set of customers (in fact, for the next account file). The time to go and get the account file is then saved for all the tellers. This mechanism is good when there is a large amount of data and a fixed set of processes to be performed on each data item.
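
A rough sketch of the MISD/systolic idea, under the bank analogy: each "teller" applies a different operation, and the account file is handed from one teller to the next; successive files can follow each other through the chain.

```python
# Systolic-style sketch: a different operation per teller, one data item
# (the account file) flowing through all of them.
def validate(account):
    account["valid"] = True
    return account

def apply_transaction(account):
    account["balance"] += account["delta"]
    return account

def log(account):
    account["logged"] = True
    return account

tellers = [validate, apply_transaction, log]   # a different instruction stream per processor

def process(account):
    for teller in tellers:                     # the single data stream visits every teller
        account = teller(account)
    return account

print(process({"balance": 200, "delta": -50}))
# {'balance': 150, 'delta': -50, 'valid': True, 'logged': True}
```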

28 MIMD
The tellers perform different services on different customer files. There is no global synchronization; the tellers are on their own and do not interact with each other. This saves a lot of the overhead of accessing the files when each teller needs to perform several operations on each file.

29 MIMD
Simultaneous transactions for customers A and B are still prohibited. A simple lock mechanism synchronizes the transactions for A and B, which have a data dependency. Whenever a teller processes the file for A, he puts a lock on the file. The file cannot be processed by any other teller until the lock is removed by the teller who put it there.
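
A minimal sketch of this lock mechanism, using a mutual-exclusion lock so that only one teller at a time can hold customer A's file (compare with the uncoordinated version in slide 21):

```python
# Lock sketch: a teller must acquire the account's lock before touching it,
# so the read-modify-write on the shared balance is never interleaved.
import threading

balance = 100
account_lock = threading.Lock()

def teller(amount):
    global balance
    with account_lock:            # put the lock on the file
        current = balance
        balance = current + amount
    # lock released here; another teller may now process the file

a = threading.Thread(target=teller, args=(+50,))
b = threading.Thread(target=teller, args=(-30,))
a.start(); b.start(); a.join(); b.join()
print(balance)   # always 120
```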

30 Classification of Parallel Computers
Flynn's classification (revisited):
- Single Instruction, Single Data stream (SISD)
- Single Instruction, Multiple Data stream (SIMD)
- Multiple Instruction, Single Data stream (MISD)
- Multiple Instruction, Multiple Data stream (MIMD)

31 SISD
(Diagram: the Control Unit (CU) handles I/O and the Instruction Stream (IS); the IS drives the Processor Unit (PU), which exchanges the Data Stream (DS) with the Memory Unit (MU))
This is the basic single-processor or uniprocessor system. The von Neumann architecture is an SISD system. SISD also includes processors with multiple functional units and/or pipelining.

32 SIMD
(Diagram: a Control Unit (CU) broadcasts the Instruction Stream (IS) to Processing Elements PE 1 ... PE n; each PE exchanges a Data Stream (DS) with its own Local Memory (LM), which is loaded from the host)
A number of processors simultaneously execute the same instruction, transmitted by the control unit. Each instruction is executed on a different set of data, transmitted to each processing element from a local memory. The results are stored temporarily in the local memory. There is a bidirectional bus between the local memories and the main memory. The program is stored in the main memory and transmitted to the control unit. This system is also called an array processor or vector computer.

33 MISD
(Diagram: control units CU 1, CU 2, ..., CU n each send an Instruction Stream (IS) to processor units PU 1, PU 2, ..., PU n; a single Data Stream (DS) from the memory (program and data) passes through the processors in sequence and on to the I/O)
A sequence of data is transmitted to a series of processors. Each processor is controlled by a separate control unit and executes a separate instruction sequence. This is referred to as a systolic array and is used for pipelined execution of specific algorithms. It is different from pipelining within a processor, where the pipeline belongs to the same processor and is controlled by the same control unit.

34 MIMD
(Diagram: control units CU 1 ... CU n each send an Instruction Stream (IS) to processor units PU 1 ... PU n; each PU exchanges a Data Stream (DS) with a shared memory; I/O is attached to each control unit)
A set of n processors simultaneously execute different instruction sequences on different data sets. Multiprocessors and parallel computers are MIMD systems.

35 Coordination Mechanisms of Parallel Programs
Parallel computers can be grouped by their coordination mechanism:
- Synchronous: pipelining, SIMD (vector/array), MISD (systolic array)
- Asynchronous: MIMD
Different operations and tasks that are dependent on each other in any way must be coordinated. Coordination can be done using synchronous mechanisms built into the hardware, or asynchronous mechanisms.

36 Other Classifications
Flynn's approach is not the only classification, and it does not cover all possible configurations. Another widely accepted classification has been given by Enslow.

37 Enslow's Definition
A multiprocessor must satisfy the following four properties:
- It must contain two or more processors of approximately comparable capabilities
- All processors share access to a common memory. This does not preclude the existence of local memories for each or some of the processors
- All processors share access to I/O channels, control units, and devices. This does not preclude the existence of some local I/O interfaces and devices
- The entire system is controlled by one operating system

38 Enslow's Definition
(Diagram: processors P 1, P 2, ..., P n, memories M 1, M 2, ..., M i, and I/O units IO 1, IO 2, ..., IO j, all connected through a communication network)
A multiprocessor conforming to Enslow's definition is called a tightly coupled multiprocessor.

39 Enslow's Definition
A loosely coupled multiprocessor has fewer shared and more local resources. It is more likely to have an additional OS environment at each individual processor. A loosely coupled multiprocessor could be regarded as a computer network; such systems are also recognized as distributed systems.

40 Enslow's Definition
(Diagram: each processor P 1 ... P n has its own local bus connecting it to a local memory LM and local I/O; the processors communicate through a communication network)
A distributed computer system has:
- A multiplicity of general-purpose physical and logical resources that can be assigned to specific tasks on a dynamic basis
- A physical distribution of the above resources, interacting through a communication network
- A high-level operating system that unifies and integrates the control of the distributed components. Individual processors may have their own local OS.
- System transparency, which permits services to be requested by name only, without having to identify the serving resources
- Cooperative autonomy, which permits a serving resource to refuse or delay a request for service if it is busy processing another task. There is no hierarchy of control within the system.

41 Levels of Concurrency
- Job: the highest level; consists of one or more tasks
- Task: a unit of scheduling to be assigned to one or more processors; consists of one or more processes
- Process: a collection of program instructions, executed on one processor; an indivisible unit with respect to processor allocation
- Instruction: a simple unit of execution, at the lowest level
