Assignment no. 5 Implementing the Parallel Quick-sort algorithm in parallel distributed memory environment

Similar documents
CS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017

MIPS Assembly: Quicksort. CptS 260 Introduction to Computer Architecture Week 4.1 Mon 2014/06/30

DIVIDE & CONQUER. Problem of size n. Solution to sub problem 1

CSCI 2132 Software Development Lecture 18: Implementation of Recursive Algorithms

Divide and Conquer Sorting Algorithms and Noncomparison-based

Introduction to Algorithms and Data Structures. Lecture 12: Sorting (3) Quick sort, complexity of sort algorithms, and counting sort

COMP2012H Spring 2014 Dekai Wu. Sorting. (more on sorting algorithms: mergesort, quicksort, heapsort)

Data Structures and Algorithms

COMP171 Data Structures and Algorithms Fall 2006 Midterm Examination

CSE 3101: Introduction to the Design and Analysis of Algorithms. Office hours: Wed 4-6 pm (CSEB 3043), or by appointment.

Quicksort. Repeat the process recursively for the left- and rightsub-blocks.

CS Sorting Terms & Definitions. Comparing Sorting Algorithms. Bubble Sort. Bubble Sort: Graphical Trace

Introduction to Fortran Programming. -Internal subprograms (1)-

Sorting. Sorting. Stable Sorting. In-place Sort. Bubble Sort. Bubble Sort. Selection (Tournament) Heapsort (Smoothsort) Mergesort Quicksort Bogosort

4. Sorting and Order-Statistics

Computer Science E-119 Practice Midterm

Data Structure Lecture#17: Internal Sorting 2 (Chapter 7) U Kang Seoul National University

Table ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum

PARALLEL XXXXXXXXX IMPLEMENTATION USING MPI

About this exam review

Arrays, Vectors Searching, Sorting

Lecture 6 Sorting and Searching

Divide-and-Conquer. Dr. Yingwu Zhu

Randomized Algorithms, Quicksort and Randomized Selection

CSC Design and Analysis of Algorithms

Quick Sort. CSE Data Structures May 15, 2002

CSC Design and Analysis of Algorithms. Lecture 6. Divide and Conquer Algorithm Design Technique. Divide-and-Conquer

Problem. Input: An array A = (A[1],..., A[n]) with length n. Output: a permutation A of A, that is sorted: A [i] A [j] for all. 1 i j n.

Advanced Pointer Topics

Quicksort (CLRS 7) We saw the divide-and-conquer technique at work resulting in Mergesort. Mergesort summary:

Lecture 16: Sorting. CS178: Programming Parallel and Distributed Systems. April 2, 2001 Steven P. Reiss

Review Functions Subroutines Flow Control Summary

Faster Sorting Methods

Module Contact: Dr Geoff McKeown, CMP Copyright of the University of East Anglia Version 1

Analysis of Algorithms. Unit 4 - Analysis of well known Algorithms

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Divide and Conquer

e-pg PATHSHALA- Computer Science Design and Analysis of Algorithms Module 10

Algorithms and Data Structures. Marcin Sydow. Introduction. QuickSort. Sorting 2. Partition. Limit. CountSort. RadixSort. Summary

COMP Data Structures

Divide and Conquer 4-0

Lecture 2: Divide&Conquer Paradigm, Merge sort and Quicksort

Make sure the version number is marked on your scantron sheet. This is Version 1

B-Trees. nodes with many children a type node a class for B-trees. an elaborate example the insertion algorithm removing elements

Chapter 4. Divide-and-Conquer. Copyright 2007 Pearson Addison-Wesley. All rights reserved.

Algorithms and Data Structures (INF1) Lecture 7/15 Hua Lu

Key question: how do we pick a good pivot (and what makes a good pivot in the first place)?

Divide-and-Conquer. The most-well known algorithm design strategy: smaller instances. combining these solutions

CSE 2320 Section 002, Fall 2015 Exam 2 Time: 80 mins

Unit-2 Divide and conquer 2016

CHAPTER 7 Iris Hui-Ru Jiang Fall 2008

Divide & Conquer. 2. Conquer the sub-problems by solving them recursively. 1. Divide the problem into number of sub-problems

NOTE: Answer ANY FOUR of the following 6 sections:

Topics Applications Most Common Methods Serial Search Binary Search Search by Hashing (next lecture) Run-Time Analysis Average-time analysis Time anal

BM267 - Introduction to Data Structures

Your favorite blog : (popularly known as VIJAY JOTANI S BLOG..now in facebook.join ON FB VIJAY

Sorting is a problem for which we can prove a non-trivial lower bound.

CENG3420 Computer Organization and Design Lab 1-2: System calls and recursions

Aryan College. Fundamental of C Programming. Unit I: Q1. What will be the value of the following expression? (2017) A + 9

CSE 143. Two important problems. Searching and Sorting. Review: Linear Search. Review: Binary Search. Example. How Efficient Is Linear Search?

Parallel quicksort algorithms with isoefficiency analysis. Parallel quicksort algorithmswith isoefficiency analysis p. 1

Cpt S 122 Data Structures. Sorting

CSC Design and Analysis of Algorithms. Lecture 6. Divide and Conquer Algorithm Design Technique. Divide-and-Conquer

Sorting. There exist sorting algorithms which have shown to be more efficient in practice.

CS125 : Introduction to Computer Science. Lecture Notes #38 and #39 Quicksort. c 2005, 2003, 2002, 2000 Jason Zych

The divide and conquer strategy has three basic parts. For a given problem of size n,

IS 709/809: Computational Methods in IS Research. Algorithm Analysis (Sorting)

Divide and Conquer. Algorithm Fall Semester

We can use a max-heap to sort data.

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018

Algorithms and Data Structures

Administrivia. HW on recursive lists due on Wednesday. Reading for Wednesday: Chapter 9 thru Quicksort (pp )

Introduction. Sorting. Definitions and Terminology: Program efficiency. Sorting Algorithm Analysis. 13. Sorting. 13. Sorting.

CS 171: Introduction to Computer Science II. Quicksort

Variable A variable is a value that can change during the execution of a program.

Sorting. Sorting in Arrays. SelectionSort. SelectionSort. Binary search works great, but how do we create a sorted array in the first place?

ITEC2620 Introduction to Data Structures

ECE368 Exam 2 Spring 2016

Sorting: Quick Sort. College of Computing & Information Technology King Abdulaziz University. CPCS-204 Data Structures I

8. Sorting II. 8.1 Heapsort. Heapsort. [Max-]Heap 6. Heapsort, Quicksort, Mergesort. Binary tree with the following properties Wurzel

Introduction. Sorting. Table of Contents

P.G.TRB - COMPUTER SCIENCE. c) data processing language d) none of the above

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation

Searching Elements in an Array: Linear and Binary Search. Spring Semester 2007 Programming and Data Structure 1

Programming Languages

CSCI 2132 Software Development. Lecture 19: Generating Permutations

Quicksort Using Median of Medians as Pivot

MPATE-GE 2618: C Programming for Music Technology. Syllabus

Abstract Data Types (ADTs) & Design. Selecting a data structure. Readings: CP:AMA 19.5, 17.7 (qsort)

Sorting. Popular algorithms: Many algorithms for sorting in parallel also exist.

Stacks, Queues (cont d)

Shared Memory Parallel Sorting

7. Sorting I. 7.1 Simple Sorting. Problem. Algorithm: IsSorted(A) 1 i j n. Simple Sorting

Array Basics: Outline. Creating and Accessing Arrays. Creating and Accessing Arrays. Arrays (Savitch, Chapter 7)

Sample Problems for Quiz # 2

L14 Quicksort and Performance Optimization

CS 61B Summer 2005 (Porter) Midterm 2 July 21, SOLUTIONS. Do not open until told to begin

Parallel Sorting Algorithms

Data Structures And Algorithms

CSC Design and Analysis of Algorithms. Lecture 5. Decrease and Conquer Algorithm Design Technique. Decrease-and-Conquer

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

Transcription:

Assignment no. 5 Implementing the Parallel Quick-sort algorithm in parallel distributed memory environment DCAMM PhD Course Scientific Computing DTU January 2008 1 Problem setting Given a sequence of n integer numbers, the task is to implement the Quick-sort algorithm in parallel using Fortran/C/C++ and MPI, and compare the parallel performance results for various pivot selection strategies. The implementation must be independent of n and the number of processors p to be used, thus, n and p must be input parameters. 2 Outline of the algorithm Algorithm 2.1 The Parallel Quick-sort algorithm 1 Divide the data into p equal parts, one per processor 2 Sort the data locally for each processor 3 Perform global sort 3.1 Select pivot element within each processor set 3.2 Locally in each processor, divide the data into two sets according to the pivot (smaller or larger) 3.3 Split the processors into two groups and exchange data pairwise between them so that all processors in one group get data less than the pivot and the others get data larger than the pivot. 3.4 Merge the two lists in each processor in one sorted list 4 Repeat 3.1-3.4 recursively for each half. The algorithm converges in log 2 p steps. For steps 2 and 3.4 you can use the qsort routine provided in the system libraries (see the appendix). 3 Data generation Generate your sequence of integers to be sorted using some random number generation procedure. Some examples follow. 1

Wichmann-Hill formula, x n+1 = (171x n )mod30269, will give uniformly distributed numbers between 0 and 30269. Exponentially distributed numbers can be computed using y n = 1 λ ln x n, where {x n are uniformly distributed between 0 and 1. Normally distributed numbers can be computed using the formula y n = µ + σ 2 ln x i cos(2πx i ), where x i and x j are two independent uniformly distributed random numbers between 0 and 1. You can also use the inbuilt random number generators in a proper way (see the appendix also). One of your test runs has to be on a backward sorted sequence as input. You can try also almost sorted input data. Clearly, one should use the same random number sequence for all pivot strategies in order to get fair comparisons. 4 Pivot strategies For the numerical tests implement the following pivot strategies: 1. Select the median in one processor in each group of processors. 2. Select the median of all medians in each processor group. 3. Select the mean value of all medians in each processor group. 4. (Non-compulsory) Select all pivots (p-1) at once: (a) each processor selects evenly l evenly distributed elements within its data set; (b) all lp elements are sorted globally; (c) choose p 1 evenly distributed elements in the above sequence and use as pivots. You are free to test other pivot selecting techniques as well. 5 Writing a report on the results The report has to have the following issues covered (see also the general instructions on the course web-page): 1. Describe the theoretical problem setting, the algorithm, the performance figures and expectations. 2. Describe the parallel implementation - the logical structure of the MPI machine you have in mind, data structures and other related issues. 3. Include table(s) of results for various values of n, varying number of processors p, corresponding speedup and efficiency figures. 2

4. Present plots of the speedup (show also the ideal speedup on the plot). The plots cam be done using Matlab, Maple or any other graphical tool you are familiar with. 5. Analyse the speedup and efficiency results in comparison with the theoretical expectations. Does the scalability of your formulation depend on the the architectural characteristics of the machine? Make relevant comparisons with the results you obtained from the experiments. 6. Add observations, comments how the particular computer architecture influences the parallel performance. Does the pair algorithm - computer system scales? Pay attention how to balance properly the workload per processor and the number of processors used. 7. Conclusions, ideas for possible optimizations The report must be written in English. A listing of the program code has to be attached to the report. Standard requirements are put on the design of the code, namely, structure and comments. You are encouraged to work in pairs. However, all topics in the assignment should be covered by each student. 6 Deadlines and credit points The full report (paper copy!) must be submitted no later than January 30, 2008. Assignments submitted after this date will not be approved. Success! Maya Neytcheva Any comments on the assignment will be highly appreciated and will be considered for further improvements. Thank you! 3

Appendix: Calling serial qsort From C: qsort() The qsort() function is an implementation of the quick-sort algorithm. It sorts a table of data in place. The contents of the table are sorted in ascending order according to the user-supplied comparison function. The base argument points to the element at the base of the table. The nel argument is the number of elements in the table. The width argument specifies the size of each element in bytes. The compar argument is the name of the comparison function, which is called with two arguments that point to the elements being compared. The function must return an integer less than, equal to, or greater than zero to indicate if the first argument is to be considered less than, equal to, or greater than the second argument. The contents of the table are sorted in ascending order according to the user supplied comparison function. Example 1 - Program sorts. The following program sorts a simple array: #include #include static int intcompare(const void *p1, const void *p2) { int i = *((int *)p1); int j = *((int *)p2); if (i > j) return (1); if (i < j) return (-1); return (0); int main() { int i; int a[10] = { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 ; size_t nelems = sizeof (a) / sizeof (int); qsort((void *)a, nelems, sizeof (int), intcompare); for (i = 0; i < nelems; i++) { (void) printf("%d ", a[i]); (void) printf("\n"); return (0); 4

From Fortran: qsort qsort,qsort64: Sort the Elements of a one-dimensional array qsort( array, len, isize, compar ) array array Input Contains the elements to be sorted len INTEGER*4 Input Number of elements in the array. isize INTEGER*4 Input Size of an element, typically: 4 for integer compar function name Input Name of a user-supplied INTEGER*2 function. The compar(arg1,arg2) arguments are elements of array, returning: Negative If arg1 is considered to precede arg2 Zero If arg1 is equivalent to arg2 Positive If arg1 is considered to follow arg2 demo% cat tqsort.f external compar integer*2 compar INTEGER*4 array(10)/5,1,9,0,8,7,3,4,6,2/, len/10/, isize/4/ call qsort( array, len, isize, compar ) write(*, (10i3) ) array end integer*2 function compar( a, b ) INTEGER*4 a, b if ( a.lt. b ) compar = -1 if ( a.eq. b ) compar = 0 if ( a.gt. b ) compar = 1 return end demo% f77 -silent tqsort.f demo% a.out 0 1 2 3 4 5 6 7 8 9 5

From Fortran90: qsort SUBROUTINE qsort(a, n, t)! NON-RECURSIVE STACK VERSION OF QUICKSORT FROM N.WIRTH S PASCAL! BOOK, ALGORITHMS + DATA STRUCTURES = PROGRAMS.! SINGLE PRECISION, ALSO CHANGES THE ORDER OF THE ASSOCIATED ARRAY T. IMPLICIT NONE INTEGER, INTENT(IN) :: n REAL, INTENT(INOUT) :: a(n) INTEGER, INTENT(INOUT) :: t(n)! Local Variables INTEGER REAL :: i, j, k, l, r, s, stackl(15), stackr(15), ww :: w, x s = 1 stackl(1) = 1 stackr(1) = n! KEEP TAKING THE TOP REQUEST FROM THE STACK UNTIL S = 0. 10 CONTINUE l = stackl(s) r = stackr(s) s = s - 1! KEEP SPLITTING A(L),..., A(R) UNTIL L >= R. 20 CONTINUE i = l j = r k = (l+r) / 2 x = a(k)! REPEAT UNTIL I > J. DO DO IF (a(i).lt.x) THEN i = i + 1 CYCLE ELSE EXIT END DO! Search from lower end 6

DO IF (x.lt.a(j)) THEN j = j - 1 CYCLE ELSE EXIT END DO IF (i.le.j) THEN w = a(i) ww = t(i) a(i) = a(j) t(i) = t(j) a(j) = w t(j) = ww i = i + 1 j = j - 1 IF (i.gt.j) EXIT ELSE EXIT END DO! Search from upper end! Swap positions i & j IF (j-l.ge.r-i) THEN IF (l.lt.j) THEN s = s + 1 stackl(s) = l stackr(s) = j l = i ELSE IF (i.lt.r) THEN s = s + 1 stackl(s) = i stackr(s) = r r = j IF (l.lt.r) GO TO 20 IF (s.ne.0) GO TO 10 RETURN END SUBROUTINE qsort 7

From C++: /* * QuickSort, algorithm by C.A.R. Hoare (1960) * * 01.06.1998, implemented by Michael Neumann (neumann@s-direktnet.de) */ # ifndef QUICKSORT_HEADER # define QUICKSORT_HEADER # include <algorithm> template <class itemtype, class indextype=int> void QuickSort(itemType a[], indextype l, indextype r) { static itemtype m; static indextype j; indextype i; if(r > l) { m = a[r]; i = l-1; j = r; for(;;) { while(a[++i] < m); while(a[--j] > m); if(i >= j) break; std::swap(a[i], a[j]); std::swap(a[i],a[r]); QuickSort(a,l,i-1); QuickSort(a,i+1,r); # endif Random number generation: In Fortran: r=irand(i) irand returns positive integers in the range 0 through 2147483647. Similarly in C. More can be found on http://docs.sun.com. 8