
MCA (Revised) Term-End Examination
June, 2007

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) What is meant by 'Temporal Parallelism'? With the help of an example, explain how breaking a task into multiple tasks increases the speed of execution.
   (b) Write and explain the algorithm for solving the matrix multiplication problem using the parallel model.
   (c) State and explain Amdahl's law for measuring the speed-up performance of parallel systems. Also, list the outcomes of the analysis of Amdahl's law.
   (d) Define hyper-threading technology (HTT). Explain its salient features. Also, discuss the functionality of a hyper-threaded processor.
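Reference note : Amdahl's law, as asked for in 1(c), is commonly stated as

    S(p) = \frac{1}{(1 - f) + f/p}, \qquad \lim_{p \to \infty} S(p) = \frac{1}{1 - f}

where f is the parallelizable fraction of the work and p is the number of processors; the limit shows that the serial fraction bounds the achievable speed-up.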

2. (a) Consider a list of the following elements : 9, 6, 9, 6, 7, 5, 11. Sort it using odd-even transposition. Show the intermediate steps.
   (b) What is the concept of message passing libraries? List the salient features of the MPI-1 interface.

3. (a) Identify the types of the following vector processing instructions :
       (i)   C(I) = A(I) AND B(I)
       (ii)  C(I) = MAX (A(I), B(I))
       (iii) B(I) = A(I) / S   (S is a scalar)
       (iv)  B(I) = SIN (A(I))
       (v)   C(I) = SIN (A(I)) / COS (A(I))
   (b) What are the problems faced by superscalar architecture? How are these problems removed in VLIW architecture?
   (c) Explain the shared memory parallel programming model.

4. (a) Explain the Benes network. Show the interconnection of the Benes network for the following permutation P :

   (b) Elaborate Handler's classification of parallel computers.
   (c) Define a Parallel Virtual Machine (PVM) and list its salient features.

5. (a) "To implement any algorithm, selection of a proper data structure is very important." In this context, explain any two data structures which are used in parallel algorithms, with an example for each.
   (b) What is meant by vector processing? How is it different from scalar processing? List and discuss various vector instructions along with their function mappings.
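Reference note : a minimal sequential simulation of odd-even transposition sort, written in C for illustration of question 2(a) (the data values are those read from the scanned paper; in a genuinely parallel setting every comparison within a phase runs concurrently) :

    #include <stdio.h>

    /* One phase of odd-even transposition: compare-exchange the pairs
       (start, start+1), (start+2, start+3), ...  start is 0 for an even
       phase and 1 for an odd phase.  n such phases sort the list.        */
    static void phase(int a[], int n, int start)
    {
        for (int i = start; i + 1 < n; i += 2)
            if (a[i] > a[i + 1]) {
                int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
            }
    }

    int main(void)
    {
        int a[] = {9, 6, 9, 6, 7, 5, 11};        /* values as read from 2(a) */
        int n = sizeof a / sizeof a[0];

        for (int p = 0; p < n; p++) {
            phase(a, n, p % 2);                  /* alternate even, odd, ...  */
            for (int i = 0; i < n; i++)          /* print intermediate step   */
                printf("%d ", a[i]);
            printf("\n");
        }
        return 0;
    }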


MCA (Revised) Term-End Examination
December, 2007

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) State and explain the law which uses the notion of constant execution time. Explain with the help of an example.
   (b) Explain the concept of a permutation network with an example. Discuss Perfect Shuffle permutation and Butterfly permutation.
   (c) State and explain the different fundamental parameters required for the analysis of a parallel algorithm.
   (d) What are the problems faced in superscalar architecture? How can these problems be removed in VLIW architecture?

2. (a) Explain the parallel virtual machine and list its salient features.
   (b) Explain message passing and the issues decided by the system in the process of message passing.
   (c) Explain the Benes network as a non-blocking network. Show the interconnection of the Benes network for the following permutation P :

3. (a) Describe the property of the Merge Sort circuit sequence and sort the following lists of values in ascending order using an odd-even merging circuit consisting of a set of comparators. Also show the intermediate steps.
       A = (4, 6, 9, ...)   B = (2, 7, 9, 12)
   (b) Draw the instruction execution steps of Flynn's classification and discuss Flynn's classification based on instruction and data streams.

4. (a) What is the concept of message passing libraries? Explain two different types of message passing libraries with their merits and demerits.
   (b) Explain pipeline processing and describe the architecture of pipeline processing.
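Reference note : message passing libraries such as MPI (asked about in 4(a)) expose explicit send and receive calls; a minimal C sketch (the process ranks, message tag and data value are illustrative) :

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;                               /* data owned by process 0 */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("process 1 received %d\n", value); /* explicit receive        */
        }

        MPI_Finalize();
        return 0;
    }

Compiled with mpicc and launched as, e.g., mpiexec -n 2 ./a.out.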

5. (a) Discuss Block distribution and Cyclic distribution with their examples.
   (b) Explain the difference between the following :
       (i)  Tightly coupled system and Loosely coupled system
       (ii) Vector processing and Scalar processing
   (c) Explain the concept of a Thread with the basic methods in concurrent programming languages for creating and terminating threads. Also give the advantages that threads offer over processes.


MCA (Revised) Term-End Examination
June, 2007

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Define Array processing. Why are array processors called SIMD array computers? With the help of a block diagram, explain the architecture of an SIMD array processor.
   (b) State and explain Gustafson's law for measuring the speedup performance of parallel systems. Explain with the help of an example.
   (c) Elaborate the concept of a permutation network. What is butterfly permutation? How is it implemented? Discuss.
   (d) Define cluster computing. Explain the memory organisation in cluster computing. Give details of any one of the important projects based on cluster computing.

2. (a) With the help of a suitable example, explain control dependence.
   (b) List any three scientific applications and engineering applications of parallel computing.
   (c) With the help of an example for each, explain the following parallel programming models :
       (i)  Message passing
       (ii) Data parallel programming

3. (a) What are Bernstein conditions? Show the operation of Bernstein conditions on the following code :
       I1 : ...
       I2 : y = (b + c) * d
       I3 : z = ... + (a * e)
   (b) "Flynn's classification discusses the behavioural concept for classifying the parallel computers and doesn't take into consideration the computers' classification depending on their structure."

4. (a) What is the Synchronization Latency problem in multithreaded processors? How can it be handled?
   (b) Define the following asymptotic notations used for analysing functions :
       (i)   Theta notation
       (ii)  Big-O notation
       (iii) Omega notation

   (c) Describe the property of the bitonic sequence and sort the following list of values in ascending order using a combinational circuit consisting of a set of comparators. Also show the intermediate steps.
       3, 5, 19, 29, 2, 14, 0, 21, 4, 9

5. (a) Write short notes on the following :
       (i)  Parallel virtual machine
       (ii) Data parallel programming
   (b) Draw the structure and explain the following interconnection networks :
       (i)  Fat tree
       (ii) Systolic array
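Reference note : the three notations asked for in 4(b) are usually defined as

    f(n) = O(g(n))      \iff \exists\, c > 0,\ n_0 : 0 \le f(n) \le c \cdot g(n)\ \ \forall n \ge n_0
    f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : f(n) \ge c \cdot g(n) \ge 0\ \ \forall n \ge n_0
    f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and}\ f(n) = \Omega(g(n))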


MCA (Revised) Term-End Examination
June, 2008

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Explain two applications of parallelism.
   (b) Define the following terms :
       (i)  Running Time
       (ii) Speed up
   (c) Explain Amdahl's Law.
   (d) List the various search based tools used in performance analysis.
   (e) List some data parallelism features of a modern computer architecture.    5x8=40

2. (a) Define the terms Temporal parallelism and Data parallelism in detail. Also provide examples for both.
   (b) Explain the life cycle of a process in detail. What are the four actions for process creation? Explain each.

3. (a) What do you understand by Bernstein conditions? Find out the Bernstein conditions in the following example :
       A = B x C
       C = D + E
       C = A + B
       E = F - D
       H = I + J
   (b) What are the differences between Control flow and Data flow architecture?

4. (a) How will you classify parallel computers according to Handler's classification?
   (b) Discuss the design issues of interconnection networks. Explain the Crossbar network.

5. (a) Draw a Clos network for the permutation [ ... ].
   (b) Explain VLIW Architecture in detail. What is the condition for compacting the instructions in a VLIW instruction word?

MCA (Revised) Term-End Examination
June, 2008

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Discuss any two applications of parallel processing.
   (b) What are the primary attributes used to measure the performance of a parallel computer system?
   (c) Discuss the four levels of parallel processing.
   (d) Define Flynn's classification of parallelism.
   (e) Define the properties of interconnection networks.    5x8=40

2. (a) Define the Benes network and design it for the permutation P : [ ... ]
   (b) Give the classification of Vector instructions. Explain each.

3. (a) How are parallel algorithms analysed? Explain.
   (b) What types of data structures are used for parallel algorithms? Discuss in detail.

4. Discuss the four parallel programming models in use.

5. Discuss the following :    4x5=20
   (a) Grid computing
   (b) Cluster computing
   (c) Hyper threading
   (d) Parallel virtual machine

MCA (Revised) Term-End Examination
December, 2008

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Define the following in brief :
       (i)   Fine grain
       (ii)  Coarse grain
       (iii) Grain packing
       (iv)  Communication latency
       (v)   Node duplication
   (b) Write the salient features of the parallel computer series PARAM and MARK developed in India.

   (c) A workstation uses a 15 MHz processor with a claimed MIPS rating to execute a given program mix. Assume a 7 cycle delay for each memory access. Find the effective CPI of this computer.
   (d) Discuss the parameters used to analyse a generic parallel algorithm.
   (e) Write a parallel algorithm to rank the elements of a linearly linked list in terms of the distance from each node to the last element of the list.
   (f) Compare and contrast the following :
       (i)   Vector processing and Array processing
       (ii)  Instruction pipelines and Arithmetic pipelines
       (iii) Superscalar processor and Multithreaded processor

2. (a) Elaborate the following in the context of parallel algorithms :
       (i)  Merge Sort circuit
       (ii) CRCW and CREW
   (b) Give the output of the following instructions in HPF :
       (i)  !HPF$ PROCESSORS Q(s, r)
       (ii) !HPF$ PROCESSORS P1(5)
            !HPF$ TEMPLATE T1(22)
            !HPF$ DISTRIBUTE T1(CYCLIC) ONTO P1

   (c) What is the function of an OpenMP directive? Write the syntax and meaning of the following directives :
       (i)   First private clause
       (ii)  Private clause
       (iii) Work sharing construct
       (iv)  COPYIN clause

3. (a) Differentiate between the control flow computing concept and the data flow computing concept. Give an example of each.
   (b) What is Flynn's classification of parallel computer systems? List the salient features of all categories of parallel systems.

4. (a) List the various visualisation tools employed in performance analysis.
   (b) Briefly discuss the following laws to measure the speed-up performance :
       (i)  Amdahl's Law
       (ii) Gustafson's Law
   (c) (i)  Write a shared memory program for parallel systems, to add the elements of an array using two processors.
       (ii) Write a program for PVM (parallel virtual machine), to give a listing of the "slave" or spawned program.
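Reference note : a minimal sketch of the kind of shared memory program asked for in 4(c)(i), written with OpenMP and restricted to two threads (the array contents are illustrative, not taken from the paper) :

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int a[8] = {3, 1, 4, 1, 5, 9, 2, 6};   /* illustrative data           */
        int sum  = 0;

        /* Two threads share the array; reduction(+:sum) gives each thread a
           private partial sum and combines them, avoiding a race on sum.     */
        #pragma omp parallel for num_threads(2) reduction(+:sum)
        for (int i = 0; i < 8; i++)
            sum += a[i];

        printf("sum = %d\n", sum);
        return 0;
    }

Compile with, e.g., gcc -fopenmp sum.c.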

5. Elaborate any four of the following :    4x5=20
   (i)   Grid Computing
   (ii)  Intel Architecture IA-64
   (iii) Gantt Chart
   (iv)  Parallel Virtual Machine
   (v)   Multithreading
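Reference note : a minimal sketch of a "slave" program of the kind asked for in 4(c)(ii), using the PVM 3 library (the message tag and packed data are illustrative) :

    #include <stdio.h>
    #include <pvm3.h>

    int main(void)
    {
        int mytid = pvm_mytid();          /* enrol this process in PVM        */
        int ptid  = pvm_parent();         /* task id of the spawning master   */
        int data  = mytid;

        pvm_initsend(PvmDataDefault);     /* prepare the default send buffer  */
        pvm_pkint(&data, 1, 1);           /* pack one integer                 */
        pvm_send(ptid, 1);                /* send it to the master with tag 1 */

        pvm_exit();                       /* leave the virtual machine        */
        return 0;
    }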

MCA (Revised) Term-End Examination
June, 2009

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Explain Bernstein conditions for detection of parallelism.
   (b) What are the Asymptotic Notations? Explain with an example.
   (c) Write the Handler's classification of parallel computers.
   (d) Explain the permutation network with its properties.
   (e) Explain the following interconnection networks :
       (i)  Systolic Array
       (ii) Hypercube

2. (a) Explain the basic concepts of dataflow computing and describe various applications of parallel computing.
   (b) Describe the metrics involved in analysing the performance of a parallel algorithm for parallel computers; also explain the various factors causing parallel system overhead.

3. (a) With the help of a diagram, illustrate the concept of sorting using comparators for the unsorted list having the element values [3, 5, 8, 9, ..., 12, 14, 20, 95, 90, 60, 35, 23, 18, 0].
   (b) Explain pipeline processing and describe the pipeline processing architecture.

4. (a) A three stage network is set so that
       P(S1) = [ ... ]
       P(S2) = [ ... ]
       P(S3) = [ ... ]
       What permutation is realised by the network?

   (b) Write short notes on the following :
       (i)  Spin Lock Mechanism for synchronisation
       (ii) Synchronous and Asynchronous message passing

5. (a) Discuss Handler's classification based on three distinct levels of the computer.
   (b) Explain three work sharing constructs of OpenMP.


MCA (Revised) Term-End Examination
December

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Explain and write the algorithm for odd-even transposition with an example.    8
   (b) Explain the various properties associated with Interconnection Networks.    8
   (c) Explain and discuss all the performance measurement tools and list the various search based tools and visualisation tools used in performance analysis.    8
   (d) Explain the following :    8
       (i)  Vector processing
       (ii) Array processing
   (e) Explain two types of combinational comparators.    8

2. (a) Flynn's classification is based on the multiplicity of instruction streams and data streams observed by the CPU during program execution. Explain in detail.
   (b) Define Hyper-Threading Technology with its features and functionality.

3. (a) Explain the concept of message passing programming.
   (b) Define structural classification based on different computer organisations.

4. (a) Show the 8 x 8 network created from the Clos network by setting m = n = N/2 and k = 2 and recursively decomposing any switches greater than 2 x 2 in size. What is the hardware complexity of this network?
   (b) Describe at least seven recent parallel programming models with examples.

5. (a) Explain the following basic concepts :
       (i)   Program
       (ii)  Process
       (iii) Thread
       (iv)  Concurrency
       (v)   Granularity
   (b) Explain the concepts of multithreading and its uses in parallel computer architecture.

MCA (Revised) Term-End Examination
June

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) What do you understand by parallel processing? Discuss the various levels of parallelism.    8
   (b) Define the following asymptotic notations used for analysing functions.    8
   (c) Explain the visualisation method for evaluating the performance of parallel programs.    8
   (d) Explain the following terms :    8
       (i)  Single instruction and single data stream (SISD)
       (ii) Single instruction and multiple data stream (SIMD)
   (e) Which issues should be considered while designing an interconnection network?

2. (a) What are the various parallel programming models? Discuss each briefly.
   (b) Define Bitonic sequence. Discuss a Bitonic sorting algorithm. Further, using the algorithm, sort the following sequence :
       [15, 17, 19, 20, 25, 27, 29, 34, 37, 18, 16, 13, 8, 7, 6, 2]

3. (a) What are the problems faced by superscalar architecture? How are these problems removed in VLIW architecture?
   (b) Using Bernstein's conditions, detect the maximum parallelism between the instructions of the following code :
       P1 : X = Y * Z
       P2 : P = Q + X
       P3 : R = T + X
       P4 : X = S + P
       P5 : V = Q + Z

4. (a) Explain the different compiler directives in OpenMP in detail.
   (b) What do you mean by a tightly coupled system? Give its characteristics.
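Reference note : a sketch of how the check in 3(b) proceeds. Writing I_i and O_i for the input and output sets of P_i, two statements may run in parallel only if I_i ∩ O_j, O_i ∩ I_j and O_i ∩ O_j are all empty. For example, P1 writes X while P2 reads X, so P1 and P2 cannot execute in parallel; P1 (reads Y, Z; writes X) and P5 (reads Q, Z; writes V) share no such conflict, so they can.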

5. (a) Explain Gustafson's Law for measuring speed-up performance with the help of an example.
   (b) Explain the concept of a Permutation Network with an example. Discuss perfect shuffle permutation and Butterfly permutation.
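Reference note : Gustafson's law, asked for in 5(a), is usually written as

    S(N) = s + N(1 - s) = N - s\,(N - 1)

where s is the fraction of execution time spent in the serial part on the N-processor system; unlike Amdahl's law, the problem size is assumed to scale with N.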


MCA (Revised) Term-End Examination
June, 2011

MCSE-011 : PARALLEL COMPUTING

Time : 3 hours    Maximum Marks : 100

Note : Question number 1 is compulsory. Attempt any three questions from the rest.

1. (a) Explain the basic concepts of dataflow computing and describe various applications of parallel computing.    8
   (b) Explain the PRAM Model with its components.    8
   (c) Explain the Hypercube Network with its properties.    8
   (d) Explain Bernstein conditions for detection of parallelism.    8
   (e) Explain Amdahl's law for measuring speed-up performance with the help of an example.    8

2. (a) Flynn's classification is based on the multiplicity of instruction streams and data streams observed by the CPU during program execution. Explain in detail.
   (b) Discuss the following with respect to a parallel virtual machine :
       (i)  Compiling and running of a PVM program
       (ii) Creating and managing a Dynamic process group

3. (a) Explain the concept of multithreading and its use in parallel computer architecture.
   (b) Give the classification of vector instructions. Explain each.

4. (a) Define array processing. Why are array processors called SIMD array computers? With the help of a block diagram, explain the architecture of an SIMD array processor.
   (b) With the help of a diagram, illustrate the concept of sorting using comparators for the unsorted list having the element values (3, 5, 8, 9, ..., 12, 14, 20, 95, 90, 60, 35, 23, 18, 0).

5. (a) A three stage network is set so that
       P(S1) = ( ... )
       P(S2) = ( ... )
       P(S3) = ( ... )
       What permutation is realised by the network?
   (b) Define Cluster computing. Explain the memory organisation in cluster computing. Give details of any one of the important projects based on cluster computing.
