Parallel Models: Hypercube, Butterfly, Fully Connected, Other Networks; Shared Memory vs. Distributed Memory; SIMD vs. MIMD


1 Parallel Algorithms

2 Parallel Models
  Hypercube
  Butterfly
  Fully Connected
  Other Networks
  Shared Memory vs. Distributed Memory
  SIMD vs. MIMD

3 The PRAM Model
  PRAM: Parallel Random Access Machine
  All processors act in lock-step
  Number of processors is not limited
  All processors have local memory
  One global memory accessible to all processors
  Processors must read and write global memory

4 A PRAM Algorithm
  Every processor knows its own index (usually indicated by the variable i).
  Vector Sum:
    Read M[i] into x;
    Read M[i+n] into y;
    x := x + y;
    Write x into M[i];
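The vector sum can be tried out with a small sequential simulation. This is only a sketch: the function name `pram_vector_sum` and the explicit read and write phases are assumptions made for illustration, with a Python loop standing in for the processors that the PRAM runs simultaneously.

```python
def pram_vector_sum(memory, n):
    """Simulate the one-step PRAM vector sum from the slide.

    memory is 1-indexed like the slide: memory[0] is unused,
    memory[1..n] holds the first vector, memory[n+1..2n] the second.
    """
    # Read phase: processor i reads M[i] and M[i+n] into local x and y.
    x = {i: memory[i] for i in range(1, n + 1)}
    y = {i: memory[i + n] for i in range(1, n + 1)}
    # Compute and write phase: processor i writes x + y into M[i].
    for i in range(1, n + 1):
        memory[i] = x[i] + y[i]
    return memory


M = [None, 1, 2, 3, 10, 20, 30]
pram_vector_sum(M, 3)
print(M[1:4])  # [11, 22, 33]
```

Each processor touches only M[i] and M[i+n], so no two processors ever access the same cell.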

5 Binary Fan-In
  Read M[i] into Largest;
  Write M[i] into M[i+n];
  Delta := 1;
  For k := 1 to lg n
    Read M[i+Delta] into x;
    Largest := Maximum(x, Largest);
    Write Largest into M[i];
    Delta := Delta * 2;
  End For
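A hedged Python sketch of the binary fan-in follows. The name `pram_fan_in_max` is hypothetical, and the lock-step behaviour is imitated by finishing every read of a round before any write of that round; the round count is taken as the ceiling of lg n so that n need not be a power of two.

```python
def pram_fan_in_max(values):
    """Simulate the binary fan-in maximum with n simulated processors."""
    n = len(values)
    rounds = (n - 1).bit_length()  # ceiling of lg n for n >= 1
    # 1-indexed global memory of size 2n; M[i+n] is a copy of M[i].
    M = [None] + list(values) + list(values)
    largest = {i: M[i] for i in range(1, n + 1)}
    delta = 1
    for _ in range(rounds):
        # Read phase: all processors read M[i + delta] before anyone writes.
        x = {i: M[i + delta] for i in range(1, n + 1)}
        # Compute and write phase.
        for i in range(1, n + 1):
            largest[i] = max(x[i], largest[i])
            M[i] = largest[i]
        delta *= 2
    return M[1]  # M[1] now holds the maximum of the whole input


print(pram_fan_in_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```

The copy written into M[i+n] keeps every read of M[i+Delta] inside the array, since i + Delta never exceeds 2n.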

6 Parallel Addition
  Read M[i] into Total;
  Write 0 into M[i+n];
  Delta := 1;
  For k := 1 to lg n
    Read M[i+Delta] into x;
    Total := x + Total;
    Write Total into M[i];
    Delta := Delta * 2;
  End For
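The same doubling pattern can be simulated for addition. In this sketch the name `pram_parallel_sum` is an assumption, and each round is split into a read phase and a write phase to imitate lock-step execution; the zero padding in M[n+1..2n] plays the role of the identity element for addition.

```python
def pram_parallel_sum(values):
    """Simulate the lg n round parallel addition from the slide."""
    n = len(values)
    rounds = (n - 1).bit_length()  # ceiling of lg n for n >= 1
    # 1-indexed memory: M[1..n] is the input, M[n+1..2n] is zero padding.
    M = [None] + list(values) + [0] * n
    total = {i: M[i] for i in range(1, n + 1)}
    delta = 1
    for _ in range(rounds):
        # Read phase: every processor reads before any processor writes.
        x = {i: M[i + delta] for i in range(1, n + 1)}
        # Compute and write phase.
        for i in range(1, n + 1):
            total[i] = x[i] + total[i]
            M[i] = total[i]
        delta *= 2
    return M[1:n + 1]


result = pram_parallel_sum([1, 2, 3, 4, 5])
print(result[0])  # 15, the sum of all inputs
print(result)     # [15, 14, 12, 9, 5]
```

In this simulation M[1] ends up holding the full sum, and every other M[i] holds the sum of the inputs from position i to the end.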

7 Pointer Jumping
  Read M[i] into Total;
  For k := 1 to lg n
    Read Next[i] into Ptr;
    If Ptr ≠ 0 Then
      Read M[Ptr] into x;
      Total := Total + x;
      Write Total into M[i];
      Read Next[Ptr] into NewPtr;
      Write NewPtr into Next[i];
    End If
  End For

8 Initialization of Next[i]
  If i = n Then
    Write 0 into Next[i];
  Else
    Write i+1 into Next[i];
  End If
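Pointer jumping and the initialization of Next can be combined in one simulation. This is a sketch under stated assumptions: the name `pram_pointer_jumping` is hypothetical, the list is the simple chain 1, 2, ..., n from the initialization slide, and each lock-step read is completed for all processors before the matching write.

```python
def pram_pointer_jumping(values):
    """Simulate pointer jumping over the chain Next[i] = i + 1."""
    n = len(values)
    rounds = (n - 1).bit_length()  # ceiling of lg n for n >= 1
    M = [None] + list(values)      # 1-indexed global memory
    # Initialization of Next[i]: 0 marks the end of the list.
    Next = [None] + [0 if i == n else i + 1 for i in range(1, n + 1)]
    total = {i: M[i] for i in range(1, n + 1)}
    for _ in range(rounds):
        ptr = {i: Next[i] for i in range(1, n + 1)}
        active = [i for i in range(1, n + 1) if ptr[i] != 0]
        # All active processors read M[Ptr] before any of them writes M[i].
        x = {i: M[ptr[i]] for i in active}
        for i in active:
            total[i] += x[i]
            M[i] = total[i]
        # All active processors read Next[Ptr] before any writes Next[i].
        new_ptr = {i: Next[ptr[i]] for i in active}
        for i in active:
            Next[i] = new_ptr[i]
    return M[1:]


print(pram_pointer_jumping([1, 2, 3, 4, 5]))  # [15, 14, 12, 9, 5]
```

After the loop, M[i] holds the sum of the values from position i to the end of the list, and every pointer has reached 0.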

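9 Calculate Node Depth 1
  If there is a left child:
  [Figure: node elements labeled 1, -1, and 0, with arrows labeled "To 1 of Left Child" and "From -1 of Left Child".]

10 Calculate Node Depth 2
  If there is no left child:
  [Figure: node elements labeled 1, -1, and 0.]

11 Calculate Node Depth
  If there is a right child:
  [Figure: node element labeled 0, with arrows labeled "From -1 of Right Child" and "To 1 of Right Child".]

12 Calculate Node Depth
  If there is no right child:
  [Figure not recoverable.]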

13 Concurrent Reads & Writes
  EREW - Exclusive Read, Exclusive Write
  CREW - Concurrent Read, Exclusive Write
  CRCW - Concurrent Read, Concurrent Write
  Two ways to resolve concurrent writes to the same location:
    Common: all concurrent writes must write the same thing
    Priority: the highest-priority processor wins the contest
  CREW is more powerful than EREW
  CRCW is more powerful than CREW
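The two write-conflict rules can be made concrete with a small helper. Everything here is an illustrative assumption rather than part of the slides: the function name `resolve_concurrent_writes`, the `(processor, address, value)` request format, and the convention that a lower processor index means higher priority.

```python
def resolve_concurrent_writes(requests, rule):
    """Resolve one step of concurrent writes on a simulated CRCW PRAM.

    requests: list of (processor, address, value) tuples.
    rule: "common" or "priority" (assumed: lowest processor index wins).
    Returns a dict mapping each written address to the value it receives.
    """
    by_address = {}
    for proc, addr, value in requests:
        by_address.setdefault(addr, []).append((proc, value))

    result = {}
    for addr, writers in by_address.items():
        if rule == "common":
            values = {value for _, value in writers}
            if len(values) != 1:
                raise ValueError(f"conflicting common writes at address {addr}")
            result[addr] = values.pop()
        elif rule == "priority":
            # Pick the writer with the smallest processor index.
            result[addr] = min(writers, key=lambda w: w[0])[1]
        else:
            raise ValueError(f"unknown rule: {rule}")
    return result


print(resolve_concurrent_writes([(2, 0, "b"), (1, 0, "a"), (3, 5, "c")], "priority"))
# {0: 'a', 5: 'c'}
```

Under the common rule the same request list would raise an error, because processors 1 and 2 try to write different values to address 0.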

14 Finding Max
  Square array of processors indexed by i,j
    Write True into R[i];
    Read M[i] into x;
    Read M[j] into y;
    If x < y Then
      Write False into R[i];
    Else If y < x Then
      Write False into R[j];
    End If
  Afterwards, R[i] is still True only where M[i] is a maximum.
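A minimal sketch of the constant-time maximum, assuming a common-write CRCW PRAM with one processor per pair (i, j). The name `crcw_max` is hypothetical, indices are 0-based for brevity, and the nested loops stand in for the n * n processors that would all execute the same single step at once.

```python
def crcw_max(values):
    """Simulate the CRCW maximum; returns the indices still marked True."""
    n = len(values)
    M = list(values)
    R = [True] * n  # every processor (i, j) writes True into R[i]
    for i in range(n):
        for j in range(n):
            x, y = M[i], M[j]
            # Concurrent writes to one R cell all write False, so they agree.
            if x < y:
                R[i] = False
            elif y < x:
                R[j] = False
    return [i for i in range(n) if R[i]]


print(crcw_max([3, 9, 2, 9]))  # [1, 3]
```

M is never modified, so the order in which the simulated processors run does not affect the result; ties leave more than one entry of R set to True.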

15 CRCW vs. CREW
  CRCW Max runs in constant time
  CREW Max runs in lg n time
  With p processors, a CRCW algorithm can be at most a factor of lg p faster than an EREW algorithm

16 EREW vs. CREW
  Finding roots by shortcutting pointers
  CREW runs in lg lg n time
  EREW runs in lg n time

17 Optimal Parallel Algorithms
  NC: the class of algorithms that run in Θ(log^m n) time using Θ(n^k) processors
  General Boolean functions cannot be computed any faster than Θ(lg n)
  Θ(lg n) is optimal for computing the sum of n integers


Fundamental Algorithms

Fundamental Algorithms Fundamental Algorithms Chapter 4: Parallel Algorithms The PRAM Model Michael Bader, Kaveh Rahnema Winter 2011/12 Chapter 4: Parallel Algorithms The PRAM Model, Winter 2011/12 1 Example: Parallel Searching

More information

Parallel Random Access Machine (PRAM)

Parallel Random Access Machine (PRAM) PRAM Algorithms Parallel Random Access Machine (PRAM) Collection of numbered processors Access shared memory Each processor could have local memory (registers) Each processor can access any shared memory

More information

Fundamental Algorithms

Fundamental Algorithms Fundamental Algorithms Chapter 6: Parallel Algorithms The PRAM Model Jan Křetínský Winter 2017/18 Chapter 6: Parallel Algorithms The PRAM Model, Winter 2017/18 1 Example: Parallel Sorting Definition Sorting

More information

Real parallel computers

Real parallel computers CHAPTER 30 (in old edition) Parallel Algorithms The PRAM MODEL OF COMPUTATION Abbreviation for Parallel Random Access Machine Consists of p processors (PEs), P 0, P 1, P 2,, P p-1 connected to a shared

More information

CSE Introduction to Parallel Processing. Chapter 5. PRAM and Basic Algorithms

CSE Introduction to Parallel Processing. Chapter 5. PRAM and Basic Algorithms Dr Izadi CSE-40533 Introduction to Parallel Processing Chapter 5 PRAM and Basic Algorithms Define PRAM and its various submodels Show PRAM to be a natural extension of the sequential computer (RAM) Develop

More information

The PRAM (Parallel Random Access Memory) model. All processors operate synchronously under the control of a common CPU.

The PRAM (Parallel Random Access Memory) model. All processors operate synchronously under the control of a common CPU. The PRAM (Parallel Random Access Memory) model All processors operate synchronously under the control of a common CPU. The PRAM (Parallel Random Access Memory) model All processors operate synchronously

More information

The PRAM model. A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2

The PRAM model. A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2 The PRAM model A. V. Gerbessiotis CIS 485/Spring 1999 Handout 2 Week 2 Introduction The Parallel Random Access Machine (PRAM) is one of the simplest ways to model a parallel computer. A PRAM consists of

More information

CS256 Applied Theory of Computation

CS256 Applied Theory of Computation CS256 Applied Theory of Computation Parallel Computation IV John E Savage Overview PRAM Work-time framework for parallel algorithms Prefix computations Finding roots of trees in a forest Parallel merging

More information

Parallel Algorithms for (PRAM) Computers & Some Parallel Algorithms. Reference : Horowitz, Sahni and Rajasekaran, Computer Algorithms

Parallel Algorithms for (PRAM) Computers & Some Parallel Algorithms. Reference : Horowitz, Sahni and Rajasekaran, Computer Algorithms Parallel Algorithms for (PRAM) Computers & Some Parallel Algorithms Reference : Horowitz, Sahni and Rajasekaran, Computer Algorithms Part 2 1 3 Maximum Selection Problem : Given n numbers, x 1, x 2,, x

More information

PRAM Divide and Conquer Algorithms

PRAM Divide and Conquer Algorithms PRAM Divide and Conquer Algorithms (Chapter Five) Introduction: Really three fundamental operations: Divide is the partitioning process Conquer the the process of (eventually) solving the eventual base

More information

CS 598: Communication Cost Analysis of Algorithms Lecture 15: Communication-optimal sorting and tree-based algorithms

CS 598: Communication Cost Analysis of Algorithms Lecture 15: Communication-optimal sorting and tree-based algorithms CS 598: Communication Cost Analysis of Algorithms Lecture 15: Communication-optimal sorting and tree-based algorithms Edgar Solomonik University of Illinois at Urbana-Champaign October 12, 2016 Defining

More information

1. (a) O(log n) algorithm for finding the logical AND of n bits with n processors

1. (a) O(log n) algorithm for finding the logical AND of n bits with n processors 1. (a) O(log n) algorithm for finding the logical AND of n bits with n processors on an EREW PRAM: See solution for the next problem. Omit the step where each processor sequentially computes the AND of

More information

INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR Stamp / Signature of the Invigilator

INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR Stamp / Signature of the Invigilator INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR Stamp / Signature of the Invigilator EXAMINATION ( End Semester ) SEMESTER ( Autumn ) Roll Number Section Name Subject Number C S 6 0 0 2 6 Subject Name Parallel

More information

Complexity and Advanced Algorithms Monsoon Parallel Algorithms Lecture 2

Complexity and Advanced Algorithms Monsoon Parallel Algorithms Lecture 2 Complexity and Advanced Algorithms Monsoon 2011 Parallel Algorithms Lecture 2 Trivia ISRO has a new supercomputer rated at 220 Tflops Can be extended to Pflops. Consumes only 150 KW of power. LINPACK is

More information

DPHPC: Performance Recitation session

DPHPC: Performance Recitation session SALVATORE DI GIROLAMO DPHPC: Performance Recitation session spcl.inf.ethz.ch Administrativia Reminder: Project presentations next Monday 9min/team (7min talk + 2min questions) Presentations

More information

COMP Parallel Computing. PRAM (4) PRAM models and complexity

COMP Parallel Computing. PRAM (4) PRAM models and complexity COMP 633 - Parallel Computing Lecture 5 September 4, 2018 PRAM models and complexity Reading for Thursday Memory hierarchy and cache-based systems Topics Comparison of PRAM models relative performance

More information

Parallel Random-Access Machines

Parallel Random-Access Machines Parallel Random-Access Machines Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS3101 (Moreno Maza) Parallel Random-Access Machines CS3101 1 / 69 Plan 1 The PRAM Model 2 Performance

More information

Algorithms & Data Structures 2

Algorithms & Data Structures 2 Algorithms & Data Structures 2 PRAM Algorithms WS2017 B. Anzengruber-Tanase (Institute for Pervasive Computing, JKU Linz) (Institute for Pervasive Computing, JKU Linz) RAM MODELL (AHO/HOPCROFT/ULLMANN

More information

COMP Parallel Computing. PRAM (2) PRAM algorithm design techniques

COMP Parallel Computing. PRAM (2) PRAM algorithm design techniques COMP 633 - Parallel Computing Lecture 3 Aug 29, 2017 PRAM algorithm design techniques Reading for next class (Thu Aug 31): PRAM handout secns 3.6, 4.1, skim section 5. Written assignment 1 is posted, due

More information

SHARED MEMORY VS DISTRIBUTED MEMORY

SHARED MEMORY VS DISTRIBUTED MEMORY OVERVIEW Important Processor Organizations 3 SHARED MEMORY VS DISTRIBUTED MEMORY Classical parallel algorithms were discussed using the shared memory paradigm. In shared memory parallel platform processors

More information

What is Parallel Computing?

What is Parallel Computing? What is Parallel Computing? Parallel Computing is several processing elements working simultaneously to solve a problem faster. 1/33 What is Parallel Computing? Parallel Computing is several processing

More information

Advanced Computer Architecture. The Architecture of Parallel Computers

Advanced Computer Architecture. The Architecture of Parallel Computers Advanced Computer Architecture The Architecture of Parallel Computers Computer Systems No Component Can be Treated In Isolation From the Others Application Software Operating System Hardware Architecture

More information

: Parallel Algorithms Exercises, Batch 1. Exercise Day, Tuesday 18.11, 10:00. Hand-in before or at Exercise Day

: Parallel Algorithms Exercises, Batch 1. Exercise Day, Tuesday 18.11, 10:00. Hand-in before or at Exercise Day 184.727: Parallel Algorithms Exercises, Batch 1. Exercise Day, Tuesday 18.11, 10:00. Hand-in before or at Exercise Day Jesper Larsson Träff, Francesco Versaci Parallel Computing Group TU Wien October 16,

More information

PRAM (Parallel Random Access Machine)

PRAM (Parallel Random Access Machine) PRAM (Parallel Random Access Machine) Lecture Overview Why do we need a model PRAM Some PRAM algorithms Analysis A Parallel Machine Model What is a machine model? Describes a machine Puts a value to the

More information

Data Structures and Algorithms CSE 465

Data Structures and Algorithms CSE 465 Data Structures and Algorithms CSE 465 LECTURE 4 More Divide and Conquer Binary Search Exponentiation Multiplication Sofya Raskhodnikova and Adam Smith Review questions How long does Merge Sort take on

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2017S-06 Binary Search Trees David Galles Department of Computer Science University of San Francisco 06-0: Ordered List ADT Operations: Insert an element in the list

More information

Arrays aren t going to work. What can we do? Use pointers Copy a large section of a heap, with a single pointer assignment

Arrays aren t going to work. What can we do? Use pointers Copy a large section of a heap, with a single pointer assignment CS5-008S-0 Leftist Heaps 0-0: Leftist Heaps Operations: Add an element Remove smallest element Merge two heaps together 0-: Leftist Heaps Operations: Add an element Remove smallest element Merge two heaps

More information

Student Number: CSE191 Midterm II Spring Plagiarism will earn you an F in the course and a recommendation of expulsion from the university.

Student Number: CSE191 Midterm II Spring Plagiarism will earn you an F in the course and a recommendation of expulsion from the university. Plagiarism will earn you an F in the course and a recommendation of expulsion from the university. (1 pt each) For Questions 1-5, when asked for a running time or the result of a summation, you must choose

More information

IE 495 Lecture 3. Septermber 5, 2000

IE 495 Lecture 3. Septermber 5, 2000 IE 495 Lecture 3 Septermber 5, 2000 Reading for this lecture Primary Miller and Boxer, Chapter 1 Aho, Hopcroft, and Ullman, Chapter 1 Secondary Parberry, Chapters 3 and 4 Cosnard and Trystram, Chapter

More information

EE/CSCI 451 Spring 2018 Homework 8 Total Points: [10 points] Explain the following terms: EREW PRAM CRCW PRAM. Brent s Theorem.

EE/CSCI 451 Spring 2018 Homework 8 Total Points: [10 points] Explain the following terms: EREW PRAM CRCW PRAM. Brent s Theorem. EE/CSCI 451 Spring 2018 Homework 8 Total Points: 100 1 [10 points] Explain the following terms: EREW PRAM CRCW PRAM Brent s Theorem BSP model 1 2 [15 points] Assume two sorted sequences of size n can be

More information

Paradigms for Parallel Algorithms

Paradigms for Parallel Algorithms S Parallel Algorithms Paradigms for Parallel Algorithms Reference : C. Xavier and S. S. Iyengar, Introduction to Parallel Algorithms Binary Tree Paradigm A binary tree with n nodes is of height log n Can

More information

Chapter 2 Parallel Computer Models & Classification Thoai Nam

Chapter 2 Parallel Computer Models & Classification Thoai Nam Chapter 2 Parallel Computer Models & Classification Thoai Nam Faculty of Computer Science and Engineering HCMC University of Technology Chapter 2: Parallel Computer Models & Classification Abstract Machine

More information

CSCE 750, Spring 2001 Notes 3 Page Symmetric Multi Processors (SMPs) (e.g., Cray vector machines, Sun Enterprise with caveats) Many processors

CSCE 750, Spring 2001 Notes 3 Page Symmetric Multi Processors (SMPs) (e.g., Cray vector machines, Sun Enterprise with caveats) Many processors CSCE 750, Spring 2001 Notes 3 Page 1 5 Parallel Algorithms 5.1 Basic Concepts With ordinary computers and serial (=sequential) algorithms, we have one processor and one memory. We count the number of operations

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS5-008S-0 Leftist Heaps David Galles Department of Computer Science University of San Francisco 0-0: Leftist Heaps Operations: Add an element Remove smallest element Merge

More information

Parallel Models RAM. Parallel RAM aka PRAM. Variants of CRCW PRAM. Advanced Algorithms

Parallel Models RAM. Parallel RAM aka PRAM. Variants of CRCW PRAM. Advanced Algorithms Parallel Models Advanced Algorithms Piyush Kumar (Lecture 10: Parallel Algorithms) An abstract description of a real world parallel machine. Attempts to capture essential features (and suppress details?)

More information

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K.

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K. Fundamentals of Parallel Computing Sanjay Razdan Alpha Science International Ltd. Oxford, U.K. CONTENTS Preface Acknowledgements vii ix 1. Introduction to Parallel Computing 1.1-1.37 1.1 Parallel Computing

More information

Collection of priority-job pairs; priorities are comparable.

Collection of priority-job pairs; priorities are comparable. Priority Queue Collection of priority-job pairs; priorities are comparable. insert(p, j) max(): read(-only) job of max priority extract-max(): read and remove job of max priority increase-priority(i, p

More information

DO NOT REPRODUCE. CS61B, Fall 2008 Test #3 (revised) P. N. Hilfinger

DO NOT REPRODUCE. CS61B, Fall 2008 Test #3 (revised) P. N. Hilfinger CS6B, Fall 2008 Test #3 (revised) P. N. Hilfinger. [7 points] Please give short answers to the following, giving reasons where called for. Unless a question says otherwise, time estimates refer to asymptotic

More information

CSCE 750, Fall 2002 Notes 3 Page 2 with memory access time. And this is not easy Symmetric Multi Processors (SMPs) (e.g., Cray vector machines,

CSCE 750, Fall 2002 Notes 3 Page 2 with memory access time. And this is not easy Symmetric Multi Processors (SMPs) (e.g., Cray vector machines, CSCE 750, Fall 2002 Notes 3 Page 1 5 Parallel Algorithms These notes are a distillation from a number of different parts of Mike Quinn's book. 5.1 Basic Concepts With ordinary computers and serial (=sequential)

More information

Binary Heaps. CSE 373 Data Structures Lecture 11

Binary Heaps. CSE 373 Data Structures Lecture 11 Binary Heaps CSE Data Structures Lecture Readings and References Reading Sections.1-. //0 Binary Heaps - Lecture A New Problem Application: Find the smallest ( or highest priority) item quickly Operating

More information

each processor can in one step do a RAM op or read/write to one global memory location

each processor can in one step do a RAM op or read/write to one global memory location Parallel Algorithms Two closely related models of parallel computation. Circuits Logic gates (AND/OR/not) connected by wires important measures PRAM number of gates depth (clock cycles in synchronous circuit)

More information

CSL 730: Parallel Programming. Algorithms

CSL 730: Parallel Programming. Algorithms CSL 73: Parallel Programming Algorithms First 1 problem Input: n-bit vector Output: minimum index of a 1-bit First 1 problem Input: n-bit vector Output: minimum index of a 1-bit Algorithm: Divide into

More information

CSL 730: Parallel Programming

CSL 730: Parallel Programming CSL 73: Parallel Programming General Algorithmic Techniques Balance binary tree Partitioning Divid and conquer Fractional cascading Recursive doubling Symmetry breaking Pipelining 2 PARALLEL ALGORITHM

More information

Recursion. COMS W1007 Introduction to Computer Science. Christopher Conway 26 June 2003

Recursion. COMS W1007 Introduction to Computer Science. Christopher Conway 26 June 2003 Recursion COMS W1007 Introduction to Computer Science Christopher Conway 26 June 2003 The Fibonacci Sequence The Fibonacci numbers are: 1, 1, 2, 3, 5, 8, 13, 21, 34,... We can calculate the nth Fibonacci

More information

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics 1 Sorting 1.1 Problem Statement You are given a sequence of n numbers < a 1, a 2,..., a n >. You need to

More information

Priority Queues. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort.

Priority Queues. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort Philip Bille Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building

More information

The PRAM Model. Alexandre David

The PRAM Model. Alexandre David The PRAM Model Alexandre David 1.2.05 1 Outline Introduction to Parallel Algorithms (Sven Skyum) PRAM model Optimality Examples 11-02-2008 Alexandre David, MVP'08 2 2 Standard RAM Model Standard Random

More information

Priority queues. Priority queues. Priority queue operations

Priority queues. Priority queues. Priority queue operations Priority queues March 30, 018 1 Priority queues The ADT priority queue stores arbitrary objects with priorities. An object with the highest priority gets served first. Objects with priorities are defined

More information

A Parallel Algorithm for Relational Coarsest Partition Problems and Its Implementation

A Parallel Algorithm for Relational Coarsest Partition Problems and Its Implementation A Parallel Algorithm for Relational Coarsest Partition Problems and Its Implementation Insup Lee and S. Rajasekaran Department of Computer and Information Science University of Pennsylvania Philadelphia,

More information

Examination Questions Midterm 2

Examination Questions Midterm 2 CS1102s Data Structures and Algorithms 12/3/2010 Examination Questions Midterm 2 This examination question booklet has 6 pages, including this cover page, and contains 12 questions. You have 30 minutes

More information

Heaps. Heaps. A heap is a complete binary tree.

Heaps. Heaps. A heap is a complete binary tree. A heap is a complete binary tree. 1 A max-heap is a complete binary tree in which the value in each internal node is greater than or equal to the values in the children of that node. A min-heap is defined

More information

COMP4300/8300: Overview of Parallel Hardware. Alistair Rendell. COMP4300/8300 Lecture 2-1 Copyright c 2015 The Australian National University

COMP4300/8300: Overview of Parallel Hardware. Alistair Rendell. COMP4300/8300 Lecture 2-1 Copyright c 2015 The Australian National University COMP4300/8300: Overview of Parallel Hardware Alistair Rendell COMP4300/8300 Lecture 2-1 Copyright c 2015 The Australian National University 2.1 Lecture Outline Review of Single Processor Design So we talk

More information

Lecture 5: Sorting Part A

Lecture 5: Sorting Part A Lecture 5: Sorting Part A Heapsort Running time O(n lg n), like merge sort Sorts in place (as insertion sort), only constant number of array elements are stored outside the input array at any time Combines

More information

COMP4300/8300: Overview of Parallel Hardware. Alistair Rendell

COMP4300/8300: Overview of Parallel Hardware. Alistair Rendell COMP4300/8300: Overview of Parallel Hardware Alistair Rendell COMP4300/8300 Lecture 2-1 Copyright c 2015 The Australian National University 2.2 The Performs: Floating point operations (FLOPS) - add, mult,

More information

Comparisons. Θ(n 2 ) Θ(n) Sorting Revisited. So far we talked about two algorithms to sort an array of numbers. What is the advantage of merge sort?

Comparisons. Θ(n 2 ) Θ(n) Sorting Revisited. So far we talked about two algorithms to sort an array of numbers. What is the advantage of merge sort? So far we have studied: Comparisons Insertion Sort Merge Sort Worst case Θ(n 2 ) Θ(nlgn) Best case Θ(n) Θ(nlgn) Sorting Revisited So far we talked about two algorithms to sort an array of numbers What

More information

Parallel scan on linked lists

Parallel scan on linked lists Parallel scan on linked lists prof. Ing. Pavel Tvrdík CSc. Katedra počítačových systémů Fakulta informačních technologií České vysoké učení technické v Praze c Pavel Tvrdík, 00 Pokročilé paralelní algoritmy

More information

Chapter 6. Parallel Algorithms. Chapter by M. Ghaari. Last update 1 : January 2, 2019.

Chapter 6. Parallel Algorithms. Chapter by M. Ghaari. Last update 1 : January 2, 2019. Chapter 6 Parallel Algorithms Chapter by M. Ghaari. Last update 1 : January 2, 2019. This chapter provides an introduction to parallel algorithms. Our highlevel goal is to present \how to think in parallel"

More information

Comparisons. Heaps. Heaps. Heaps. Sorting Revisited. Heaps. So far we talked about two algorithms to sort an array of numbers

Comparisons. Heaps. Heaps. Heaps. Sorting Revisited. Heaps. So far we talked about two algorithms to sort an array of numbers So far we have studied: Comparisons Tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point Insertion Sort Merge Sort Worst case Θ(n ) Θ(nlgn) Best

More information

An NC Algorithm for Sorting Real Numbers

An NC Algorithm for Sorting Real Numbers EPiC Series in Computing Volume 58, 2019, Pages 93 98 Proceedings of 34th International Conference on Computers and Their Applications An NC Algorithm for Sorting Real Numbers in O( nlogn loglogn ) Operations

More information

Readings. Priority Queue ADT. FindMin Problem. Priority Queues & Binary Heaps. List implementation of a Priority Queue

Readings. Priority Queue ADT. FindMin Problem. Priority Queues & Binary Heaps. List implementation of a Priority Queue Readings Priority Queues & Binary Heaps Chapter Section.-. CSE Data Structures Winter 00 Binary Heaps FindMin Problem Quickly find the smallest (or highest priority) item in a set Applications: Operating

More information

Priority Queues and Heaps (continues) Chapter 13: Heaps, Balances Trees and Hash Tables Hash Tables In-class Work / Suggested homework.

Priority Queues and Heaps (continues) Chapter 13: Heaps, Balances Trees and Hash Tables Hash Tables In-class Work / Suggested homework. Outline 1 Chapter 13: Heaps, Balances Trees and Hash Tables Priority Queues and Heaps (continues) Hash Tables Binary Heaps Binary Heap is a complete binary tree, whose nodes are labeled with integer values

More information

Structure and Interpretation of Computer Programs Fall 2016 Midterm 2

Structure and Interpretation of Computer Programs Fall 2016 Midterm 2 CS 61A Structure and Interpretation of Computer Programs Fall 2016 Midterm 2 INSTRUCTIONS You have 2 hours to complete the exam. The exam is closed book, closed notes, closed computer, closed calculator,

More information

Algorithms Dr. Haim Levkowitz

Algorithms Dr. Haim Levkowitz 91.503 Algorithms Dr. Haim Levkowitz Fall 2007 Lecture 4 Tuesday, 25 Sep 2007 Design Patterns for Optimization Problems Greedy Algorithms 1 Greedy Algorithms 2 What is Greedy Algorithm? Similar to dynamic

More information

Transform & Conquer. Presorting

Transform & Conquer. Presorting Transform & Conquer Definition Transform & Conquer is a general algorithm design technique which works in two stages. STAGE : (Transformation stage): The problem s instance is modified, more amenable to

More information

ALGORITHM DESIGN DYNAMIC PROGRAMMING. University of Waterloo

ALGORITHM DESIGN DYNAMIC PROGRAMMING. University of Waterloo ALGORITHM DESIGN DYNAMIC PROGRAMMING University of Waterloo LIST OF SLIDES 1-1 List of Slides 1 2 Dynamic Programming Approach 3 Fibonacci Sequence (cont.) 4 Fibonacci Sequence (cont.) 5 Bottom-Up vs.

More information

A Many-Core Machine Model for Designing Algorithms with Minimum Parallelism Overheads

A Many-Core Machine Model for Designing Algorithms with Minimum Parallelism Overheads A Many-Core Machine Model for Designing Algorithms with Minimum Parallelism Overheads Sardar Anisul Haque Marc Moreno Maza Ning Xie University of Western Ontario, Canada IBM CASCON, November 4, 2014 ardar

More information

CSE 4351/5351 Notes 9: PRAM and Other Theoretical Model s

CSE 4351/5351 Notes 9: PRAM and Other Theoretical Model s CSE / Notes : PRAM and Other Theoretical Model s Shared Memory Model Traditional Sequential Algorithm Model RAM (Random Access Machine) Uniform access time to memory Arithmetic operations performed in

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Introduction to Parallel Computing George Karypis Sorting Outline Background Sorting Networks Quicksort Bucket-Sort & Sample-Sort Background Input Specification Each processor has n/p elements A ordering

More information

CS S-06 Binary Search Trees 1

CS S-06 Binary Search Trees 1 CS245-2008S-06 inary Search Trees 1 06-0: Ordered List T Operations: Insert an element in the list Check if an element is in the list Remove an element from the list Print out the contents of the list,

More information

CSE 4500 (228) Fall 2010 Selected Notes Set 2

CSE 4500 (228) Fall 2010 Selected Notes Set 2 CSE 4500 (228) Fall 2010 Selected Notes Set 2 Alexander A. Shvartsman Computer Science and Engineering University of Connecticut Copyright c 2002-2010 by Alexander A. Shvartsman. All rights reserved. 2

More information

Properties of a heap (represented by an array A)

Properties of a heap (represented by an array A) Chapter 6. HeapSort Sorting Problem Input: A sequence of n numbers < a1, a2,..., an > Output: A permutation (reordering) of the input sequence such that ' ' ' < a a a > 1 2... n HeapSort O(n lg n) worst

More information

Priority queues. Priority queues. Priority queue operations

Priority queues. Priority queues. Priority queue operations Priority queues March 8, 08 Priority queues The ADT priority queue stores arbitrary objects with priorities. An object with the highest priority gets served first. Objects with priorities are defined by

More information

Midterm solutions. n f 3 (n) = 3

Midterm solutions. n f 3 (n) = 3 Introduction to Computer Science 1, SE361 DGIST April 20, 2016 Professors Min-Soo Kim and Taesup Moon Midterm solutions Midterm solutions The midterm is a 1.5 hour exam (4:30pm 6:00pm). This is a closed

More information

CS 140 : Numerical Examples on Shared Memory with Cilk++

CS 140 : Numerical Examples on Shared Memory with Cilk++ CS 140 : Numerical Examples on Shared Memory with Cilk++ Matrix-matrix multiplication Matrix-vector multiplication Hyperobjects Thanks to Charles E. Leiserson for some of these slides 1 Work and Span (Recap)

More information
