Internal Sorting by Comparison

Similar documents
Internal Sorting by Comparison

PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS

[ 11.2, 11.3, 11.4] Analysis of Algorithms. Complexity of Algorithms. 400 lecture note # Overview

Chapter 2: Complexity Analysis

Big-O-ology. Jim Royer January 16, 2019 CIS 675. CIS 675 Big-O-ology 1/ 19

Introduction to Data Structure

DS: CS Computer Sc & Engg: IIT Kharagpur 1. File Access. Goutam Biswas. ect 29

Asymptotic Analysis Spring 2018 Discussion 7: February 27, 2018

Complexity of Algorithms. Andreas Klappenecker

Internal Sort by Comparison II

Internal Sort by Comparison II

Complexity of Algorithms

UNIT 1 ANALYSIS OF ALGORITHMS

INTRODUCTION. Analysis: Determination of time and space requirements of the algorithm

CS:3330 (22c:31) Algorithms

CSE 373 APRIL 3 RD ALGORITHM ANALYSIS

Internal Sort by Comparison II

Big-O-ology 1 CIS 675: Algorithms January 14, 2019

Algorithm. Algorithm Analysis. Algorithm. Algorithm. Analyzing Sorting Algorithms (Insertion Sort) Analyzing Algorithms 8/31/2017

Module 1: Asymptotic Time Complexity and Intro to Abstract Data Types

Complexity Analysis of an Algorithm

IT 4043 Data Structures and Algorithms

Same idea, different implementation. Is it important?

CS S-02 Algorithm Analysis 1

Algorithm Analysis. Algorithm Efficiency Best, Worst, Average Cases Asymptotic Analysis (Big Oh) Space Complexity

Introduction to Computers & Programming

Why study algorithms? CS 561, Lecture 1. Today s Outline. Why study algorithms? (II)

Remember to also pactice: Homework, quizzes, class examples, slides, reading materials.

CSE Winter 2015 Quiz 1 Solutions

How much space does this routine use in the worst case for a given n? public static void use_space(int n) { int b; int [] A;

Data Structures and Algorithms Chapter 2

C Programming 1. File Access. Goutam Biswas. Lect 29

Big-O-ology. How do you tell how fast a program is? = How do you tell how fast a program is? (2nd Try) Answer? Run some test cases.

4.4 Algorithm Design Technique: Randomization

Data Structures and Algorithms. Part 2

Assignment 1 (concept): Solutions

Overview of Data. 1. Array 2. Linked List 3. Stack 4. Queue

Algorithmic Analysis. Go go Big O(h)!

asymptotic growth rate or order compare two functions, but ignore constant factors, small inputs

Test 1 SOLUTIONS. June 10, Answer each question in the space provided or on the back of a page with an indication of where to find the answer.

Data Structures Question Bank Multiple Choice

CS 360 Exam 1 Fall 2014 Name. 1. Answer the following questions about each code fragment below. [8 points]

Algorithms and Data Structures

Algorithm efficiency can be measured in terms of: Time Space Other resources such as processors, network packets, etc.

Algorithm Analysis. CENG 707 Data Structures and Algorithms

Choice of C++ as Language

Analysis of Algorithms

0.1 Welcome. 0.2 Insertion sort. Jessica Su (some portions copied from CLRS)

CS 261 Data Structures. Big-Oh Analysis: A Review

Algorithm Analysis. (Algorithm Analysis ) Data Structures and Programming Spring / 48

Scientific Computing. Algorithm Analysis

O(n): printing a list of n items to the screen, looking at each item once.

Asymptotic Analysis of Algorithms

Data Structures and Algorithms

Comparison of x with an entry in the array

1. Answer the following questions about each code fragment below. [8 points]

CS 360 Exam 1 Fall 2014 Solution. 1. Answer the following questions about each code fragment below. [8 points]

Analysis of Algorithm. Chapter 2

An algorithm is a sequence of instructions that one must perform in order to solve a wellformulated

L6,7: Time & Space Complexity

1. Answer the following questions about each code fragment below. [8 points]

CS240 Fall Mike Lam, Professor. Algorithm Analysis

CSE 146. Asymptotic Analysis Interview Question of the Day Homework 1 & Project 1 Work Session

THE UNIVERSITY OF WESTERN AUSTRALIA

Introduction to Analysis of Algorithms

CS141: Intermediate Data Structures and Algorithms Analysis of Algorithms

How fast is an algorithm?

User Defined Data: Product Constructor

Ceng 111 Fall 2015 Week 12b

CS 6402 DESIGN AND ANALYSIS OF ALGORITHMS QUESTION BANK

CS/ENGRD 2110 Object-Oriented Programming and Data Structures Spring 2012 Thorsten Joachims. Lecture 10: Asymptotic Complexity and

Algorithms: Efficiency, Analysis, techniques for solving problems using computer approach or methodology but not programming

INTRODUCTION. An easy way to express the idea of an algorithm (very much like C/C++, Java, Pascal, Ada, )

How do we compare algorithms meaningfully? (i.e.) the same algorithm will 1) run at different speeds 2) require different amounts of space

Data Structures Lecture 8

Math 501 Solutions to Homework Assignment 10

Algorithm Analysis. Performance Factors

Algorithm Analysis. Spring Semester 2007 Programming and Data Structure 1

CS141: Intermediate Data Structures and Algorithms Analysis of Algorithms

STA141C: Big Data & High Performance Statistical Computing

Data Structures, Algorithms & Data Science Platforms

Data Structure and Algorithm Homework #1 Due: 1:20pm, Tuesday, March 21, 2017 TA === Homework submission instructions ===

CSCI 104 Runtime Complexity. Mark Redekopp David Kempe

CSCI 104 Runtime Complexity. Mark Redekopp David Kempe Sandra Batista

Outline and Reading. Analysis of Algorithms 1

Sorting. Bubble Sort. Selection Sort

CS302 Topic: Algorithm Analysis. Thursday, Sept. 22, 2005

Computers in Engineering COMP 208. Where s Waldo? Linear Search. Searching and Sorting Michael A. Hawker

NCS 301 DATA STRUCTURE USING C

Sorting. Task Description. Selection Sort. Should we worry about speed?

Computational Complexity: Measuring the Efficiency of Algorithms

COT 5407: Introduction to Algorithms. Giri Narasimhan. ECS 254A; Phone: x3748

Data Structures and Algorithms CSE 465

CSE 101. Algorithm Design and Analysis Miles Jones Office 4208 CSE Building Lecture 2: Asymptotics

Solutions. (a) Claim: A d-ary tree of height h has at most 1 + d +...

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014

Data Structure Lecture#5: Algorithm Analysis (Chapter 3) U Kang Seoul National University

Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES

STA141C: Big Data & High Performance Statistical Computing

The Complexity of Algorithms (3A) Young Won Lim 4/3/18

Transcription:

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 1 Internal Sorting by Comparison

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 2 Problem Specification Consider the collection of data related to the students of a particular class. Each data consists of Roll Number: char rollno[9] Name: char name[50] cgpa: double cgpa It is necessary to prepare the merit list of the students.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 3 Roll No. Name CGPA 02ZO2001 V. Bansal 7.50 02ZO2002 P. K. Singh 8.00 02ZO2003 Imtiaz Ali 8.50 02ZO2004 S. P. Sengupta 8.25 02ZO2005 P. Baluchandran 9.25 02ZO2006 V. K. R. V. Rao 9.00 02ZO2007 L. P. Yadav 6.50 02ZO2008 A. Maria Watson 8.00 02ZO2009 S. V. Reddy 7.00 02ZO2010 D. K. Sarlekar 7.50

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 4 Sorting The merit list should be sorted on cgpa of a student.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 5 Roll No. Name CGPA 02ZO2005 P. Baluchandran 9.25 02ZO2006 V. K. R. V. Rao 9.00 02ZO2003 Imtiaz Ali 8.50 02ZO2004 S. P. Sengupta 8.25 02ZO2002 P. K. Singh 8.00 02ZO2008 A. Maria Watson 8.00 02ZO2001 V. Bansal 7.50 02ZO2010 D. K. Sarlekar 7.50 02ZO2009 S. V. Reddy 7.00 02ZO2007 L. P. Yadav 6.50

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 6 Problem Abstraction We only consider the cgpa field for discussion of sorting algorithms. Unsorted Data 7.5 8.0 8.5 8.25 9.25 9.0 6.5 8.0 7.0 7.5 Sorted Data 9.25 9.0 8.5 8.25 8.0 8.0 7.5 7.5 7.0 6.5

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 7 Simple Sorting Algorithms Selection Sort Insertion Sort Bubble Sort

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 8 Selection Sort The data is stored in an 1-D array and we sort them in non-ascending order. Let the number of data be n for i 0 to n 2 do maxindex indexofmax({a[i],, a[n-1]}) a[i] a[maxindex] #Exchange endfor

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 9 Unsorted Data 7.5 8.0 6.5 9.25 7.5 8.5 9.0 7.0 0 1 2 3 4 5 6 7 Index of Max After i=0 9.25 8.0 6.5 7.5 7.5 8.5 9.0 7.0 0 1 2 3 4 5 6 7 Index of Max After i=1 9.25 9.0 6.5 7.5 7.5 8.5 8.0 7.0 0 1 2 3 4 5 6 7 After i=2 Index of Max 9.25 9.0 8.5 7.5 7.5 6.5 8.0 7.0 0 1 2 3 4 5 6 7 Index of Max

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 10 C Program

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 11 int indexofmax(double cgpa[],int low,int high) { int max ; if(low == high) return low; max = indexofmax(cgpa,low+1,high); if(cgpa[low] > cgpa[max]) return low ; return max; } // selsort.c #define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z)) void selectionsort(double cgpa[], int noofstdnt) { int i ; for(i = 0; i < noofstdnt - 1; ++i) { int max = indexofmax(cgpa, i, noofstdnt-1);

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 12 double temp ; EXCH(cgpa[i], cgpa[max], temp); } } // selsort.c

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 13 Measure of Goodness of an Algorithm Correctness of the algorithm. Increase of execution time with the increase in the size of input. Increase of the requirement of extra space (other than the space required by the input data) with the increase in the size of input. Difficulty in coding the algorithm,

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 14 Execution Time The execution time of a program (algorithm) depends on many factors e.g. the machine parameters (clock speed, instruction set, memory access time etc.), the code generated by the compiler, other processes sharing time on the OS, data set, data structure and encoding of the algorithm etc.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 15 Execution Time Abstraction It is necessary to get an abstract view of the execution time, to compare different algorithms, that essentially depends on the algorithm and the data structure.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 16 Execution of selectionsort() If there are n data, the for-loop in the function selectionsort(), is executed (n 1) times ([i : 0 (n 2)]), so the number of assignments, array acess, comparison and call to indexofmax() are all approximately proportional to the data count, n a. a It is difficult to get the exact count of these operations from the high-level coding of the algorithm.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 17 Execution of indexofmax() For each value of i in the for-loop of selectionsort() there is a call to indexofmax() (low i) The call generates a sequence of n i recursive calls (including the first one). The total number of comparisons for each i inside indexofmax(), are 2(n i) 1. There are also (n i 1) assignments and 2(n i 1) array access.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 18 Execution Time The total number of function calls is n 2 i=0(n i) = (n+2)(n 1) 2. The total number of comparisons is n+ n 2 i=0 2(n i) 1 = n 2 +n 1. The total number of assignments is k(n 1)+ n 2 i=0(n i 1) = n(n 1) 2 +k(n 1). The total number of array access is n 2 i=0 2(n i 1) = n(n 1).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 19 Execution Time Different operations have different costs, that makes the execution time a complex function of n. But for a large value of n (data count), the number of each of these operations is approximately proportional to n 2.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 20 Execution Time If we assume identical costs for each of these operations (abstraction), the running time of selection sort is approximately proportional to n 2a. This roughly means that the running time of selection sort algorithm will be four times if the data count is doubled. a n is the number of data to be sorted.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 21 Time Complexity We say that the running time or the time complexity of selection sort is of order n 2, Θ(n 2 ). We shall define this notion precisely.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 22 Space Complexity In the recursive implementation of indexofmax(), the depth of nested calls may go upto n. Considering the space for the local variables e.g. max, the extra space requirement is proportional to n.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 23 Space Complexity The extra space requirement or the space complexity of this program is of order n, Θ(n). We can reduce the space requirement using a non-recursive function.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 24 Selection Sort without Recursion

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 25 #define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z)) void selectionsort(double cgpa[], int noofstdnt) { int i ; for(i = 0; i < noofstdnt - 1; ++i) { int max, j ; double temp ; temp = cgpa[i] ; max = i ; for(j = i+1; j < noofstdnt; ++j) if(cgpa[j] > temp) { temp = cgpa[j] ; max = j ;

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 26 } EXCH(cgpa[i], cgpa[max], temp); } } // selsort1.c

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 27 Space Complexity In the non-recursive version, the volume of extra space does not depends on the number of data elements. There are four (4) local variables and two (2) parameters. The space requirement is a constant and is expressed as Θ(1) (n 0 = 1).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 28 Note We shall introduce the notion of upper bound (O), lower bound (Ω) and order (Θ) of non-decreasing positive real-valued functions. These notations will be useful to talk about the running time and space usages of algorithms.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 29 Big O: Asymptotic Upper Bound Consider two functions f,g : N R +. We say f(n) is O(g(n)) (f(n) O(g(n)) or f(n) = O(g(n))), if there are two positive constants c and n 0 such that 0 f(n) cg(n), for all n n 0. g(n) is an upper bound of f(n).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 30 Ω: Asymptotic Lower Bound Consider two functions f,g : N R +. We say f(n) is Ω(g(n)) (f(n) Ω(g(n)) or f(n) = Ω(g(n))), if there are two positive constants c and n 0 such that 0 cg(n) f(n), for all n n 0. g(n) is a lower bound of f(n).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 31 cg(n) f(n) kh(n) n 1 n 0 n

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 32 kh(n) cg(n) f(n) n 0 n 1 n

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 33 Examples n 2 +n+5 = O(n 2 ): It is easy to verify that 2n 2 n 2 +n+5 for all n 3 i.e. c = 2 and n 0 = 3. n 2 +n+5 O(n) and n 2 +n+5 = O(n 3 ),O(n 4 ),.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 34 Examples n 2 +n+5 = Ω(n 2 ): It is easy to verify that n 2 2 < n2 +n+5 for all n i.e. c = 0.5 and n 0 = 0. n 2 +n+5 = Ω(n), Ω(nlogn), Ω(logn) and n 2 +n+5 Ω(n 3 ),Ω(n 4 ),.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 35 Big Θ: Asymptotically Tight Bound Consider two functions f,g : N R +. We say f(n) = Θ(g(n)), if there are three positive constants c 1, c 2 and n 0 such that 0 c 1 g(n) f(n) c 2 g(n), for all n n 0. g(n) is an asymptotically tight bound of f(n) or g(n) is of order f(n). f(n) = Θ(g(n)) is equivalent to f(n) = O(g(n)) and g(n) = O(f(n)).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 36 c g(n) 1 f(n) c g(n) 0 n 0 n

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 37 Examples g(n) = n 2 +n+5 = Θ(n 2 ), take c 1 = 1 3, c 2 = 1 and n 0 = 2. n 0 1 2 3 1 3 g(n) 5 3 7 3 11 3 17 3 n 2 0 1 4 9 But n 2 +n+5 Θ(n 3 ), Θ(n),.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 38 Selection Sort Running time of selection sort is Θ(n 2 ) and the space requirement is Θ(1) (no-recursive), where n is the number of data to sort.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 39 Note Let n be the size of the input. The worst case running time of an algorithm is Θ(n) implies that it takes almost double the time if the input size is doubled; Θ(n 2 ) implies that it takes almost four times the time if the input is doubled; Θ(logn) implies that it takes a constant amount of extra time if the input is doubled;

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 40 Insertion Sort

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 41 Unsorted Data 7.5 8.5 6.5 8.0 7.5 8.5 9.0 7.0 0 1 2 3 4 5 6 7 1st Step 1 7.5 8.5 6.5 8.0 7.5 8.5 9.0 7.0 0 2 3 4 5 6 7 8.5 temp 2nd Step 2 8.5 7.5 6.5 8.0 7.5 8.5 9.0 7.0 0 1 3 4 5 6 7 6.5 temp 2nd Step 1 3 8.5 7.5 6.5 8.0 7.5 8.5 9.0 7.0 0 2 4 5 6 7 8.0 temp After 2nd Step 3 4 8.5 8.0 7.5 6.5 7.5 8.5 9.0 7.0 0 1 2 3 4 5 6 7

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 42 Insertion Sort Algorithm for i 1 to noofstdnt 1 do temp cgpa[i] for j i-1 downto 0 do if cgpa[j] < temp cgpa[j+1] cgpa[j] else go out of loop endfor cgpa[j+1] temp endfor

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 43 C Program

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 44 void insertionsort(double cgpa[], int noofstdnt){ int i, j ; for(i=1; i < noofstdnt; ++i) { double temp = cgpa[i] ; for(j = i-1; j >= 0; --j) { if(cgpa[j]<temp) cgpa[j+1]=cgpa[j]; else break ; } cgpa[j+1] = temp ; } } // insertionsort.c

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 45 Execution Time Let n be the number of data. The outer for-loop will always be executed n 1 times. The number of times the inner for-loop is executed depends on data. It is entered at least once but the maximum number of execution may be i.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 46 Execution Time If for most of the values of i, 0 i < n, the inner loop is executed near the minimum value (for an almost sorted data), the execution time will be almost proportional to n i.e. linear in n.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 47 Worst Case Execution Time But in the worst case, The inner for-loop will be executed n 1 i=1 i = n(n 1) 2 = Θ(n 2 ) times. So the running time of insertion sort is O(n 2 ), the worst case running is Θ(n 2 ), the best case running time is Θ(n).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 48 Extra Space for Computation The extra space required for the computation of insertion sort does not depend on number of data. It is Θ(1) (so it is also O(1) and Ω(1)).

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 49 Bubble Sort

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 50 i = 0 j = 7 j = 6 j = 5 j = 4 j = 3 j = 2 j = 1 0 1 2 3 4 5 6 7 53 22 49 15 21 82 37 16 no exchange 0 1 2 3 4 5 6 7 53 22 49 15 21 82 37 16 no exchange 0 1 2 3 4 5 6 7 53 22 49 15 21 82 37 16 exchange 0 1 2 3 4 5 6 7 53 22 49 15 82 21 37 16 exchange 0 1 2 3 4 5 6 7 53 22 49 82 15 21 37 16 exchange 0 1 2 3 4 5 6 7 53 22 82 49 15 21 37 16 exchange 0 1 2 3 4 5 6 7 53 82 22 49 15 21 37 16 exchange 0 1 2 3 4 5 6 7 82 53 22 49 15 21 37 16

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 51 Bubble Sort Algorithm for i 0 to noofstdnt 2 do exchange = NO for j noofstdnt 1 downto i +1 do if (cgpa[j-1] < cgpa[j]) cgpa[j-1] cgpa[j] # Exchange exchange = YES endfor if (exchange == NO) break endfor

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 52 C Program

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 53 #define EXCHANGE 0 #define NOEXCHANGE 1 #define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z)) void bubblesort(double cgpa[], int noofstdnt) { int i, j, exchange, temp ; for(i=0; i < noofstdnt - 1; ++i) { exchange = NOEXCHANGE ; for(j = noofstdnt - 1; j > i; --j) if(cgpa[j-1] < cgpa[j]) { EXCH(cgpa[j-1], cgpa[j], temp); exchange = EXCHANGE ; } if(exchange) break ;

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 54 } } // bubblesort.c

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 55 Execution Time The number of times the outer for-loop is executed depends on the input data, as there is a conditional break. If the data is sorted in the desired order, there is no exchange, and in the best case the outer loop is executed only once. This makes the best running time of bubble sort approximately proportional to n.

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 56 Execution Time and Space But in the worst case the outer loop is executed n 1 times. The inner loop is executed (n 1) i times for every value of i. So in the worst case, the total number of times the inner loop is executed is times. n 1 i=0 (n 1) i = n(n 1) 2 = Θ(n 2 )

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur 57 Worst Case Complexity The running time of bubble sort (worst case time complexity) is O(n 2 ) (quadratic in n). The extra storage requirement does not depend on the size of data and the space complexity is Θ(1).