Parallel optimal numerical computing (on GRID), Belgrade, October 4, 2006, Rodica Potolea


1 Parallel optimal numerical computing (on GRID), Belgrade, October 4, 2006, Rodica Potolea

2 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

3 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

4 Parallel solutions to problems. Why parallel solutions? Increase speed · Process huge amounts of data · Solve problems in real time

5 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

6 Efficiency
Parameters to be evaluated: time, memory.
Cases to be considered (they depend on the algorithm implementing the problem): best, worst, average.
Optimal algorithms: Ω() = O(). An algorithm is optimal if its worst-case running time is of the same order of magnitude as the Ω function of the problem to be solved.
Order of magnitude: a function which expresses the running time (and Ω, respectively). For polynomial functions, the polynomials should have the same maximum degree, regardless of the multiplicative constant. Ex: Ω() = 2n² and O() = 3n² are considered equal, written Ω(n²) = O(n²).
Algorithms have O() >= Ω in the worst case; O() <= Ω can hold in the best case, or even in the average case. Ex: sorting has Ω(n lg n), while several sorting algorithms have an O(n) best case (e.g., on already-sorted input).

7 Efficiency cont.
What kind of algorithms should be considered for parallel implementation?
Alg1: O(n^k). On machine M1, n^k operations take time T; on M2 (V times faster) they take T/V, so in time T machine M2 performs V·n^k operations. The largest size n2 solvable in time T satisfies n2^k = V·n^k, hence n2 = V^(1/k)·n. GOOD!
Alg2: O(2^n). In time T, machine M2 performs V·2^n operations, so 2^n2 = V·2^n = 2^(n + lg V), hence n2 = n + lg V. BAD!
Unhappy conclusion: no matter how many times we improve the speed of the machine, the maximum problem size for which we can run the exponential algorithm increases only by an additive constant!!!

8 Efficiency cont.
Parameters to be evaluated for parallel processing: time, memory, number of processors involved.
Cost = Time x Number of processors (c = t x n).
E = O()/cost (<= 1); E = 1 exactly when cost = O().
A parallel algorithm is optimal if the cost is of the same order of magnitude as the best known algorithm that solves the problem in the worst case.
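
Spelled out in standard notation (an assumption: the slide names only the cost and E), with T_s the best sequential time and T_p the parallel time on p processors:

    \[ S = \frac{T_s}{T_p}, \qquad \mathrm{cost} = p\,T_p, \qquad
       E = \frac{S}{p} = \frac{T_s}{p\,T_p} \le 1, \]

and the algorithm is cost-optimal exactly when p·T_p = Θ(T_s).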

9 Efficiency cont.
Parallel running time = computation time + communication time (data transfer, partial-results transfer, information communication).
Computation time: as the number of processors involved increases, the computation time decreases.
Communication time: quite the opposite!!!
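
A common first-order model (an illustration, not from the slides) makes the trade-off concrete. With total computation work T_comp divided among p processors and a communication overhead growing, say, linearly in p:

    \[ T_{par}(p) = \frac{T_{comp}}{p} + c\,p, \qquad
       \frac{dT_{par}}{dp} = 0 \;\Rightarrow\; p^{*} = \sqrt{T_{comp}/c}, \]

which is exactly the U-shaped curve of the next slide: the running time falls up to p*, then rises as communication dominates.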

10 Efficiency cont. [Figure: diagram of the parallel running time T as a function of the number of processors N]

11 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

12 Optimal Parallel Solution: Parallel Architecture
Flynn's taxonomy: classifies architectures according to the instruction/data flow (single/multiple).
SISD: the von Neumann architecture.
SIMD: a single control unit dispatches instructions to each processing unit.
MISD: the same data is analyzed from different points of view (multiple instructions).
MIMD: different instructions run on different data.

13 Optimal Parallel Solution: Parallel Algorithms
Data parallel: the input data is split to be processed by different CPUs.
Control parallel: different sequences are processed by different CPUs; the code has to be parallelized.
Conflicts should be avoided (removed where possible; otherwise the given sequence is not parallelized).

14 Optimal Parallel Solution: Parallel Algorithms cont.
Constraints within the data/code refer to precedence relations of computation that must be satisfied to solve the problem correctly.
Read/write conflicts: read before write, write before read, write before write.
They define dependencies that limit the parallel potential; some of them can be removed so as to increase the parallelism.

15 Parallel Algorithms cont. Dependencies analysis
Data flow dependence: indicates a write-before-read ordering of operations which must be satisfied. Can NOT be avoided; limits the possible parallelism.
Ex (assume the sequence is in a loop on i):
   S1: a(i) = x(i) * 3      // assign first
   S2: b(i) = a(i) / c(i)   // use (read) it second
a(i) must first be computed in S1 and only then used in S2. Therefore, S1 and S2 cannot be processed in parallel!!!

16 Parallel Algorithms cont. Dependencies analysis
Data antidependence: indicates a read-before-write ordering of operations which must be satisfied. Can be avoided (there are techniques to remove this type of dependence).
Ex (assume the sequence is in a loop on i):
   S1: a(i) = x(i) * 3        // assign it later
   S2: b(i) = c(i) + a(i+1)   // use (read) first
a(i+1) must first be used (with its former value) in S2, and only then recomputed by S1 at the next execution of the loop. Therefore, several iterations of the loop cannot be processed in parallel!

17 Parallel Algorithms cont. Dependencies analysis
Output dependence: indicates a write-before-write ordering of operations which must be satisfied. Can be avoided (there are techniques to remove this type of dependence).
Ex (assume the sequence is in a loop on i):
   S1: c(i+4) = b(i) + a(i+1)   // assigned first here
   S2: c(i+1) = x(i)            // and here, 3 executions of the loop later
c(i+4) is assigned in S1; after 3 executions of the loop, the same element of the array is assigned in S2. Therefore, several iterations of the loop cannot be processed in parallel.

18 Parallel Algorithms cont. Dependencies analysis
Each dependency defines a dependency vector, given by the distance (in loop iterations) between the statements that interfere:
   (data flow dep.)   S1: a(i) = x(i) * 3,          S2: b(i) = a(i) / c(i)      d = 0
   (data antidep.)    S1: a(i) = x(i) * 3,          S2: b(i) = c(i) + a(i+1)    d = 1
   (output dep.)      S1: c(i+4) = b(i) + a(i+1),   S2: c(i+1) = x(i)           d = 3
All dependency vectors within a sequence define a dependency matrix.
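
As a concrete illustration (a sketch, not the course's code: array names follow slide 16, while the bounds and the "* 3" operator are assumptions), the d = 1 antidependence can be removed by renaming: S2 reads a snapshot of a taken before the loop, after which the iterations no longer interfere.

    #include <string.h>

    #define N 1000

    double a[N + 1], b[N], c[N], x[N];

    /* The loop of slide 16: S2 reads a[i+1] with its old value, and one
     * iteration later S1 overwrites that element (antidependence, d = 1),
     * so the iterations cannot naively run in parallel. */
    void original(void) {
        for (int i = 0; i < N; i++) {
            a[i] = x[i] * 3;            /* S1 */
            b[i] = c[i] + a[i + 1];     /* S2 reads the OLD a[i+1] */
        }
    }

    /* Renamed version: S2 reads a snapshot taken before the loop, so the
     * iterations are independent and may be distributed across CPUs. */
    void renamed(void) {
        static double a_old[N + 1];
        memcpy(a_old, a, sizeof a_old); /* one-time copy before the loop */
        for (int i = 0; i < N; i++) {   /* now safe to run in parallel */
            a[i] = x[i] * 3;
            b[i] = c[i] + a_old[i + 1];
        }
    }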

19 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

20 GRID "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." (Foster) It provides access to computational power.

21 GRID cont. Support for High Throughput Computing. Support = computing power + storage (TB).

22 GRID cont. [Diagram: the RO 09 UTCN site within SEE-GRID: a computing element ce01.mosigrid.utcluj.ro, a storage element se01.mosigrid.utcluj.ro, and work nodes W.N.01 to W.N.10]
CE = computing element; SE = storage element; WN = work node

23 GRID cont. [Figure: SEE-GRID infrastructure]

24 Outline: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations · Conclusions · Review

25 Parallel implementations - optimal approach
Matrix multiplication · Matrix power · Gaussian elimination · Sorting · Graph algorithms · FFT · Cryptographic algorithms · Randomness tests · I/O optimal approach · Remarks (general and particular)
Tests run by Alex Mascasan and Ioana Gligan

26 Parallel implementations - optimal approach
Matrix multiplication: C = A x B
   procedure MAT_MULT (A, B, C)
   begin
      for i := 0 to n - 1 do
         for j := 0 to n - 1 do
         begin
            C[i, j] := 0;
            for k := 0 to n - 1 do
               C[i, j] := C[i, j] + A[i, k] x B[k, j];
         endfor;
   end MAT_MULT
Sequential implementation: O(n³)

27 Parallel implementations - optimal approach
Matrix multiplication cont. Parallel solutions - data parallel:
1-D partitioning: A and B decomposed along one dimension (either rows or columns). Drawback: every processor needs one entire input matrix (a lot of broadcast, and system overload).
2-D partitioning: A and B partitioned into p blocks, each of dimension (n/√p) x (n/√p); block C[i,j] is computed from submatrix multiplications of the A[i,k] and B[k,j] blocks.
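
A minimal shared-memory sketch of the 1-D idea (an illustration only: the course's grid runs presumably used message passing, and this function is hypothetical). Rows of C are split among the workers; every worker reads the whole of B, which is exactly the broadcast/overload cost the slide warns about.

    #include <omp.h>

    /* 1-D row decomposition: each thread computes a block of rows of C.
     * Matrices are stored row-major in flat n*n arrays. */
    void mat_mult_1d(int n, const double *A, const double *B, double *C) {
        #pragma omp parallel for            /* rows of C split among threads */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double s = 0.0;
                for (int k = 0; k < n; k++) /* every thread reads all of B */
                    s += A[i * n + k] * B[k * n + j];
                C[i * n + j] = s;
            }
    }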

28 Parallel implementations - optimal approach
Matrix multiplication cont. Analysis 1D (absolute analysis). [Chart: matrix multiplication 1-D, running time vs. number of processors] Optimal solution: p = f(n).

29 Parallel implementations - optimal approach
Matrix multiplication cont. Analysis 1D (relative analysis). [Chart: matrix multiplication 1-D, time (%) vs. number of processors] Computation time (for small dimensions) is small compared to communication time; hence the 1-D parallel implementation is efficient only for very large matrix dimensions.

30 Parallel implementations - optimal approach
Matrix multiplication cont. Analysis 2D (absolute analysis). [Chart: matrix multiplication 2-D, running time vs. number of processors]

31 Parallel implementations - optimal approach
Matrix multiplication cont. Analysis 2D (relative analysis). [Chart: matrix multiplication 2-D, time (%) vs. number of processors]

32 Parallel implementations - optimal approach
Matrix multiplication cont. Comparison 1D - 2D. [Chart: 1-D vs. 2-D running times vs. number of processors, for matrix dimensions 100, 300, and 700]

33 Parallel implementations - optimal approach
Matrix power (derived from matrix multiplication). [Chart: matrix power, running time vs. number of processors, for several exponents (4 up to 12)]

34 Parallel implementations - optimal approach
Gaussian elimination. Solving a system of equations: A x = b
   procedure GAUSSIAN_ELIMINATION (A, b, y)
   begin
      for k := 0 to n - 1 do
      begin
         for j := k + 1 to n - 1 do
            A[k, j] := A[k, j] / A[k, k];
         y[k] := b[k] / A[k, k];
         A[k, k] := 1;
         for i := k + 1 to n - 1 do
         begin
            for j := k + 1 to n - 1 do
               A[i, j] := A[i, j] - A[i, k] x A[k, j];
            b[i] := b[i] - A[i, k] x y[k];
            A[i, k] := 0;
         endfor;
      endfor;
   end GAUSSIAN_ELIMINATION
O(n³)

35 Parallel implementations - optimal approach
Gaussian elimination cont.
1-D implementation: p processors (p < n), each receives n/p rows. Due to the uneven nature of the algorithm, most of the processors (actually more than half, for half of the time) are idle, having finished their work. t = n³/p without considering communication, hence cost = n³ (t_seq = 2n³/3): NOT cost optimal. To be used only when n >> p.
Cyclic 1-D mapping: balances the load of the processors (difference of at most one row); see the sketch below.
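
A tiny runnable comparison of the two mappings (an illustration: n, p and the step k are arbitrary). After the first k elimination steps only rows k..n-1 still carry work, and the cyclic mapping keeps that work spread almost evenly:

    #include <stdio.h>

    /* block mapping: contiguous chunks of ceil(n/p) rows per process */
    int owner_block(int i, int n, int p)  { return i / ((n + p - 1) / p); }
    /* cyclic mapping: row i goes to process i mod p */
    int owner_cyclic(int i, int n, int p) { (void)n; return i % p; }

    int main(void) {
        int n = 12, p = 4, k = 6;          /* after step k, rows k..n-1 remain */
        int work_b[4] = {0}, work_c[4] = {0};
        for (int i = k; i < n; i++) {
            work_b[owner_block(i, n, p)]++;
            work_c[owner_cyclic(i, n, p)]++;
        }
        for (int q = 0; q < p; q++)        /* block: 0 0 3 3 -- cyclic: 1 1 2 2 */
            printf("proc %d: block=%d cyclic=%d\n", q, work_b[q], work_c[q]);
        return 0;
    }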

36 Parallel implementations - optimal approach
Gaussian elimination cont. [Chart: Gaussian elimination, running time (ms) vs. number of processors]

37 Parallel implementations - optimal approach
Sorting: Ω(n lg n). QuickSort: O(n²) worst case, O(n lg n) average case. Data-parallel approach (data split evenly among p processors).
   procedure QUICKSORT (A, q, r)
   begin
      if q < r then
      begin
         x := A[q];
         s := q;
         for i := q + 1 to r do
            if A[i] <= x then
            begin
               s := s + 1;
               swap(A[s], A[i]);
            end if
         swap(A[q], A[s]);
         QUICKSORT (A, q, s);
         QUICKSORT (A, s + 1, r);
      end if
   end QUICKSORT

38 Parallel implementations - optimal approach
Sorting cont. Analysis QuickSort (absolute analysis). [Chart: QuickSort, running time vs. number of processors] Interpretation of results: after the optimal point is reached, the running time grows rapidly; there must be massive inter-process communication!

39 Parallel implementations - optimal approach
Sorting cont. Analysis QuickSort (relative analysis). [Chart: QuickSort, time (%) vs. number of processors] Improvement of one order of magnitude for large dimensions, at the optimal number of processors.

40 Parallel implementations - optimal approach
Sorting cont. Ω(n lg n). Enumeration Sort: for each element, determine its rank in the ordered array, then move it directly to its final position.
Data-parallel implementation: split the data among the processors, run the sequential code on each processor, merge the results.
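
A sketch of the sequential rank kernel (an assumption about the exact variant: the slide's scheme splits the data and merges, while below it is the rank loop itself that parallelizes directly). Ties are broken by index so that equal keys get distinct slots:

    /* Enumeration (rank) sort: rank of a[i] = number of elements that
     * must precede it in sorted order; a[i] then lands directly in its
     * final slot.  The outer i-loop can be split across processors. */
    void enumeration_sort(int n, const double *a, double *out) {
        for (int i = 0; i < n; i++) {       /* parallelize this loop */
            int rank = 0;
            for (int j = 0; j < n; j++)
                if (a[j] < a[i] || (a[j] == a[i] && j < i))
                    rank++;
            out[rank] = a[i];               /* direct final placement */
        }
    }

out must point to a buffer of n doubles distinct from a.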

41 Parallel implementations - optimal approach
Sorting cont. Absolute and relative analysis. [Charts: Enumeration Sort, time and time (%) vs. number of processors]
Parallel running time = 1.7% of the mono-processor running time, hence almost 2 orders of magnitude!!!!
Note: as opposed to QuickSort, the values do not grow so fast to the right!

42 Parallel implementations - optimal approach
Graph algorithms: Minimum Spanning Trees. [Figure: a graph with 7 vertices and several candidate spanning trees] The MST will have 6 edges with the minimum sum of weights. MST weight = 20.

43 Parallel implementations - optimal approach
Graph algorithms cont. Prim's algorithm:
   procedure PRIM_MST(V, E, w, r)
   begin
      VT := {r};
      d[r] := 0;
      for all v ∈ (V - VT) do
         if edge (r, v) exists then set d[v] := w(r, v);
         else set d[v] := ∞;
      while VT ≠ V do
      begin
         find a vertex u such that d[u] := min{ d[v] | v ∈ (V - VT) };
         VT := VT ∪ {u};
         for all v ∈ (V - VT) do
            d[v] := min{ d[v], w(u, v) };
      endwhile
   end PRIM_MST
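
The same loop as runnable C over an adjacency matrix (a sketch under assumptions: INF encodes a missing edge, N matches the 7-vertex example). In the parallel formulation, the min-reduction over d[] and the d[] update are the parts distributed across processors:

    #define N   7
    #define INF 1000000

    /* O(n^2) Prim: returns the total weight of the MST rooted at r.
     * w is a symmetric N x N weight matrix with INF for missing edges. */
    int prim_mst(int w[N][N], int r) {
        int d[N], in_tree[N] = {0}, total = 0;
        for (int v = 0; v < N; v++) d[v] = w[r][v];
        in_tree[r] = 1; d[r] = 0;
        for (int step = 1; step < N; step++) {
            int u = -1;
            for (int v = 0; v < N; v++)   /* min d[] outside the tree */
                if (!in_tree[v] && (u < 0 || d[v] < d[u])) u = v;
            in_tree[u] = 1;
            total += d[u];                /* edge weight joining u */
            for (int v = 0; v < N; v++)   /* relax edges from u */
                if (!in_tree[v] && w[u][v] < d[v]) d[v] = w[u][v];
        }
        return total;
    }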

44 Parallel implementations - optimal approach
Graph algorithms cont. MST absolute and relative analysis. [Charts: Minimum Spanning Tree, time and time (%) vs. number of processors]
Observe that the minimum value shifts to the right as the graph's dimension increases.
Observe the running-time improvement of one order of magnitude (compared to the sequential approach) for graphs with n = …

45 Parallel implementations - optimal approach
Graph algorithms cont. Transitive Closure: search for the existence of a path between any 2 vertices in a graph.
Possible solutions: the nth power of the adjacency matrix; Dijkstra's single-source shortest paths (O(n²) per source).
Parallel solutions: source-partitioned formulation; source-parallel formulation.

46 Parallel implementations - optimal approach
Graph algorithms cont. Transitive Closure: source-partitioned formulation. Each vertex vi is assigned to a process; each process computes the shortest paths from each vertex in its set to every other vertex in the graph, using the sequential algorithm (hence, data partitioning). Requires no inter-process communication. [Chart: transitive closure, source-partitioned, time vs. number of processors]
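
A sketch of the source-partitioned idea (assumptions: plain reachability via a stack-based search stands in for the Dijkstra runs the course used, and OpenMP stands in for message passing; since no communication is needed, the sources simply split across workers):

    #include <string.h>
    #include <omp.h>

    #define N 64

    /* reach[s][v] = 1 iff v is reachable from s in the directed graph adj. */
    void transitive_closure(int adj[N][N], int reach[N][N]) {
        #pragma omp parallel for          /* one independent source per task */
        for (int s = 0; s < N; s++) {
            int stack[N], top = 0;
            memset(reach[s], 0, sizeof(int) * N);
            stack[top++] = s; reach[s][s] = 1;
            while (top > 0) {             /* sequential search from source s */
                int u = stack[--top];
                for (int v = 0; v < N; v++)
                    if (adj[u][v] && !reach[s][v]) {
                        reach[s][v] = 1;  /* mark on push: each vertex once */
                        stack[top++] = v;
                    }
            }
        }
    }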

47 Parallel implementations - optimal approach
Graph algorithms cont. Transitive Closure: source-parallel formulation. Assigns each vertex to a set of processes and uses the parallel formulation of the single-source algorithm (similar to MST). Requires inter-process communication. [Chart: transitive closure, source-parallel, time vs. number of processors]

48 Parallel implementations - optimal approach
Graph algorithms cont. Graph Isomorphism: exponential running time! Searches whether 2 graphs match. Trivial solution: generate all permutations of one graph (n!) and match each against the other graph. Since n! = n x (n-1)!, each of the n processes is responsible for computing (n-1)! permutations (data-parallel approach).

49 Parallel implementations - optimal approach
Graph algorithms cont. [Chart, logarithmic time scale: graph isomorphism vs. number of processors, for graphs of 10 to 12 vertices]

50 Parallel implementations - optimal approach
FFT: based on the Cooley-Tukey algorithm. An n-point, one-dimensional, unordered, radix-2 FFT requires inter-process interaction only during the first d = log p of the log n iterations. [Chart: FFT, running time vs. number of processors]
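
For this (binary-exchange) formulation the standard cost model, following Grama et al. (an assumption that this is the analysis behind the chart), is

    \[ T_p = t_c\,\frac{n}{p}\log n + t_s \log p + t_w\,\frac{n}{p}\log p, \]

with t_c the per-element computation time, t_s the message startup time, and t_w the per-word transfer time; the two communication terms come from precisely those first log p iterations.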

51 Parallel implementations - optimal approach
Cryptographic algorithms

52 Parallel implementations - optimal approach
Cryptographic algorithms cont. [Chart: decryption time vs. number of processors, for several input sizes in MB]
The speed increase is not linear, especially for small input sizes. As the input size increases, a constancy can be detected: the 8 MB decryption takes half as long on 2 processors as on one, and approximately a quarter on 4 processors. The time remains approximately constant while doubling the dimension if we also double the number of processors.

53 Parallel implementations - optimal approach
Randomness tests. Most tests are based on counting the numbers of 0s and 1s; for a sequence to be random, their proportions should be close to ½.
Runs Test: how many runs of 0 and 1 bits there are in the input sequence. The focus of this test is the total number of runs in the sequence, where a run is an uninterrupted sequence of identical bits.
Implementation: at byte level. Instead of traversing each byte and counting the passes from 0 to 1 and from 1 to 0 (the original algorithm), the test uses the XOR operation, which returns 1 only when the two input bits have different values: a byte is shifted one position to the right and XOR-ed with itself.
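
A sketch of that byte-level trick (an illustration; the function names and the handling of byte boundaries are assumptions): in b ^ (b >> 1), each 1 bit marks a transition between two adjacent bits of b, so counting transitions reduces to popcounts.

    #include <stdio.h>
    #include <stddef.h>

    static int popcount8(unsigned char v) {
        int c = 0;
        for (; v; v >>= 1) c += v & 1;
        return c;
    }

    /* runs = bit transitions + 1; within a byte there are 7 adjacent
     * pairs, hence the 0x7F mask; boundaries between bytes are the
     * last bit of one byte vs. the first (MSB) bit of the next. */
    size_t count_runs(const unsigned char *buf, size_t n) {
        if (n == 0) return 0;
        size_t transitions = 0;
        for (size_t i = 0; i < n; i++) {
            unsigned char b = buf[i];
            transitions += popcount8((unsigned char)(b ^ (b >> 1)) & 0x7F);
            if (i + 1 < n)
                transitions += (b & 1) != (buf[i + 1] >> 7);
        }
        return transitions + 1;
    }

    int main(void) {
        unsigned char sample[] = {0xAC, 0x00};  /* bits 10101100 00000000 */
        printf("runs = %zu\n", count_runs(sample, sizeof sample)); /* 6 */
        return 0;
    }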

54 Parallel implementations - optimal approach
Randomness tests cont. [results chart]

55 Parallel implementations - optimal approach
Remarks (general and particular): Matrix multiplication · Matrix power · Gaussian elimination · Sorting · Graph algorithms · FFT · Cryptographic algorithms · Randomness tests · I/O optimal approach

56 Parallel implementations - optimal approach: I/O
Read/write buffer (optimal) size. Questions:
Which access method to use (local/SE)?
Which R/W buffer size gives the best results?
What network delays are involved?
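
The kind of micro-benchmark behind the buffer-size question (a sketch; the file name is a placeholder and the size range is an assumption): read the same file with different buffer sizes and time each pass; the optimal R/W buffer size is where the curve flattens.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void) {
        const char *path = "testfile.bin";      /* placeholder input file */
        for (size_t bufsz = 1 << 10; bufsz <= 1 << 22; bufsz <<= 2) {
            FILE *f = fopen(path, "rb");
            if (!f) { perror(path); return 1; }
            char *buf = malloc(bufsz);
            clock_t t0 = clock();
            size_t total = 0, got;
            while ((got = fread(buf, 1, bufsz, f)) > 0)  /* sequential read */
                total += got;
            double s = (double)(clock() - t0) / CLOCKS_PER_SEC;
            printf("buffer %8zu B: %zu bytes in %.3f s\n", bufsz, total, s);
            free(buf);
            fclose(f);
        }
        return 0;
    }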

57 Parallel implementations - optimal approach: I/O cont. [measurements figure]

58 Parallel implementations - optimal approach: I/O cont. [measurements figure]

59 Parallel implementations - optimal approach: I/O cont. Access method to use (local/global): as expected, local read is more efficient.

60 Parallel implementations - optimal approach: I/O cont. [measurements figure]

61 Parallel implementations - optimal approach: I/O cont.
Access method to use: local/SE.
R/W buffer size for best results: approximately … bytes.
Network delays involved: for 1-4 CPUs, … to … sec.; for 5-8 CPUs, … to … sec.

62 Conclusions
The parallelization process:
Parabolic shape of the running-time curve, more pronounced when communication is involved.
Shift of the optimal number of processors (to the right, as the problem dimension grows).
How the parallelization is done matters.
Time decrease of up to 2 orders of magnitude.

63 Review: Parallel solutions to problems · Efficiency · Optimal parallel solutions · GRID · Parallel implementations
