Algorithm Analysis. Big Oh


Algorithm Analysis with Big Oh
Data Structures and Design with Java and JUnit, Chapter 12
Rick Mercer

Algorithm Analysis

Objectives:
- Analyze the efficiency of algorithms
- Analyze a few classic algorithms: Linear Search, Binary Search, Selection Sort
- Know the differences between O(1), O(n), O(log n), and O(n²)
- Visualize runtime differences with experiments

Algorithms continued

Computer scientists focus on questions such as:
- How fast do algorithms run?
- How much memory does the process require?

Example applications:
- Making the Internet run faster: Degermark and Pink's routing algorithms
- Gene Myers determined the sequence of the human genome using his whole-genome shotgun algorithm

Analysis of Algorithms

We have ways to compare algorithms. Generally, the larger the problem, the longer the algorithm takes to complete:
- Sorting 100,000 elements can take much more time than sorting 1,000 elements, and often far more than 100 times longer
- The variable n suggests the "number of things"
- If an algorithm requires 0.025n² + 0.012n + 0.0005 seconds, just plug in a value for n
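For instance, a minimal sketch (the coefficients are the slide's; the class and method names are ours) that plugs values of n into that runtime model:

    // Hypothetical runtime model from the slide: f(n) = 0.025n² + 0.012n + 0.0005 seconds
    public class RuntimeModel {
        static double seconds(long n) {
            return 0.025 * n * n + 0.012 * n + 0.0005;
        }

        public static void main(String[] args) {
            for (long n : new long[] { 1_000, 10_000, 100_000 }) {
                System.out.printf("n = %,d -> %,.1f seconds%n", n, seconds(n));
            }
        }
    }

Note that going from n = 1,000 to n = 100,000 multiplies the predicted time by roughly 10,000, not 100: the n² term dominates.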

A Computational Model

To summarize algorithm runtimes, we can use a computer-independent model:
- Instructions are executed sequentially
- Count all assignments, comparisons, and increments
- There is infinite memory
- Every simple instruction takes one unit of time

Simple Instructions

Count the simple instructions:
- Assignments have a cost of 1
- Comparisons have a cost of 1
- Let's count all parts of the loop for (int j = 0; j < n; j++): j = 0 has a cost of 1, j < n executes n+1 times, and j++ executes n times, for a total cost of 2n+2
- Each statement in the repeated part of a loop has a cost equal to the number of iterations

Examples

                                             Cost
    sum = 0;                                 1
    sum = sum + next;                        1
                                             Total: 2

                                             Cost
    for (int i = 1; i <= n; i++)             1 + (n+1) + n = 2n+2
        sum = sum + i;                       n
                                             Total: 3n + 2

                                             Cost
    k = 0;                                   1
    for (int i = 0; i < n; i++)              2n+2
        for (int j = 0; j < n; j++)          n(2n+2) = 2n²+2n
            k++;                             n²
                                             Total: 3n² + 4n + 3
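As a sanity check, here is a small sketch of ours (not from the slides) that explicitly counts every assignment, comparison, and increment in the doubly nested loop and compares the count against the 3n² + 4n + 3 formula:

    // Count unit-cost operations in the nested loop; should match 3n² + 4n + 3
    public class CostCount {
        static long cost = 0;

        public static void main(String[] args) {
            int n = 50;
            int k = 0; cost++;                 // k = 0
            int i = 0; cost++;                 // i = 0
            while (true) {
                cost++;                        // comparison i < n (runs n+1 times)
                if (!(i < n)) break;
                int j = 0; cost++;             // j = 0 (runs n times)
                while (true) {
                    cost++;                    // comparison j < n (runs n(n+1) times)
                    if (!(j < n)) break;
                    k++; cost++;               // k++ (runs n² times)
                    j++; cost++;               // j++ (runs n² times)
                }
                i++; cost++;                   // i++ (runs n times)
            }
            System.out.println("counted: " + cost);
            System.out.println("formula: " + (3L * n * n + 4L * n + 3));  // both print 7703
        }
    }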

Total Cost of Sequential Search

                                                  Cost
    for (int index = 0; index < n; index++)       2n + 2
        if (searchID.equals(names[index]))        n
            return index;                         0 or 1
    return -1;  // if not found                   0 or 1

    Total cost = 3n + 3
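Assembled into a complete method, a minimal runnable sketch of that search (the class name and sample data are ours, not from the slides):

    // Sequential (linear) search: returns the index of searchID in names, or -1
    public class SequentialSearch {
        static int search(String[] names, String searchID) {
            for (int index = 0; index < names.length; index++)
                if (searchID.equals(names[index]))
                    return index;
            return -1;  // not found
        }

        public static void main(String[] args) {
            String[] names = { "Bob", "Carl", "Debbie", "Evan", "Harry" };
            System.out.println(search(names, "Debbie"));  // 2
            System.out.println(search(names, "Zoe"));     // -1
        }
    }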

Different Cases

- The total cost of sequential search is 3n + 3
- But is it always exactly 3n + 3 instructions? The last assignment does not always execute. But does one assignment really matter?
- How many times will the loop actually execute? That depends:
  - If searchID is found at index 0: 1 iteration, the best case
  - If searchID is found at index n-1: n iterations, the worst case

Typical Case of Sequential (Linear) Search

- The average describes the more typical case
- First, let the entire cost be simplified to n
- Assume the target has a 50/50 chance of being in the array, so the n-comparison worst case occurs half the time
- Assume that if the target is in the array, it is as likely to be at one index as another, so those searches average n/2 comparisons
- Expected cost: (1/2)n + (1/2)(n/2) = n/2 + n/4 = (3/4)n
- Half the time it is n comparisons, the other half it is n/2 comparisons, so the typical case is 3/4 n comparisons

The Essence of Linear Search

Plot the function: this is why sequential search is also called linear search. As n increases, runtime forms a straight line.

[Plot omitted: f(n) versus n, a straight line.]

Linear Search Continued

- The cost function is a polynomial: 3n + 3
- The fastest growing term is the high-order term
- The high-order term (which could be n² or n³ in other algorithms) represents the most important part of the analysis function
- We refer to this rate of growth as the order of magnitude

Rate of Growth

Imagine two functions:
    f(n) = 100n
    g(n) = n² + n

- When n is small, which is the bigger function?
- When n is big, which is the bigger function?

We can say: g(n) grows faster than f(n).
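A small sketch of ours that finds where g(n) overtakes f(n); with these definitions the crossover is at n = 100, since n² + n > 100n exactly when n > 99:

    // Find the first n where g(n) = n² + n overtakes f(n) = 100n
    public class Crossover {
        public static void main(String[] args) {
            for (long n = 1; ; n++) {
                long f = 100 * n;
                long g = n * n + n;
                if (g > f) {
                    System.out.println("g(n) exceeds f(n) from n = " + n);  // prints 100
                    break;
                }
            }
        }
    }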

Rate of Growth, Another View

Consider function growth, and the weight of each term as a percentage of all terms, as n increases for f(n) = n² + 80n + 500. At n = 100 the n² term already accounts for about 54% of the total; at n = 10,000 it accounts for over 99%.

Conclusion: consider the highest-order term with the coefficient dropped, and drop all lower-order terms.

Definition

- The asymptotic growth of an algorithm describes the relative growth of its runtime as n gets very large
- With speed and memory capacities doubling every two years, the asymptotic efficiency where n is very large is the thing to consider
- There are many sorting algorithms that are "on the order of" n² (roughly n·n instructions are executed)
- Other algorithms are "on the order of" n log₂ n, and this is a huge difference when n is very large

Constant Function

Some functions don't grow with n:
- If the sorting program initializes a few variables first, the time required does not change when n increases
- These statements run in constant time, e.g. constructing an empty List with capacity 20
- The amount of time can be described as a constant function f(n) = k, where k is a constant: it takes ~0.0003 seconds no matter the size of n

Big O

Linear search is "on the order of n", which can be written as O(n) to describe the upper bound on the number of operations. This is called big O notation.

Orders of magnitude:
    O(1)        constant (the size of n has no effect)
    O(log n)    logarithmic
    O(n)        linear
    O(n log n)  no other way to say it (John K's license plate)
    O(n²)       quadratic
    O(n³)       cubic
    O(2ⁿ)       exponential

Binary Search

We'll see that binary search can be a more efficient algorithm for searching a sorted array:
- If the element in the middle is the target, report that the target was found and the search is done
- If the key is smaller, search the half of the array to the left
- Otherwise, search the half of the array to the right
- This process repeats until the target is found or there is nothing left to search
- Each comparison narrows the search by half

Binary Search for the target "Harry"

    Data       Reference    Loop 1    Loop 2
    Bob        a[0]         left
    Carl       a[1]
    Debbie     a[2]
    Evan       a[3]
    Froggie    a[4]         mid
    Gene       a[5]                   left
    Harry      a[6]                   mid (found)
    Igor       a[7]
    Jose       a[8]         right     right
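A minimal runnable sketch of that loop (our own code, not from the slides), using the same sorted array of names:

    // Binary search over a sorted String array: returns index of target, or -1
    public class BinarySearch {
        static int search(String[] a, String target) {
            int left = 0, right = a.length - 1;
            while (left <= right) {
                int mid = (left + right) / 2;
                int cmp = target.compareTo(a[mid]);
                if (cmp == 0) return mid;           // found the target
                else if (cmp < 0) right = mid - 1;  // search the left half
                else left = mid + 1;                // search the right half
            }
            return -1;  // nothing left to search
        }

        public static void main(String[] args) {
            String[] a = { "Bob", "Carl", "Debbie", "Evan", "Froggie",
                           "Gene", "Harry", "Igor", "Jose" };
            System.out.println(search(a, "Harry"));  // 6, found in two passes
            System.out.println(search(a, "Alice"));  // -1
        }
    }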

How fast is Binary Search?

- Best case: 1 comparison
- Worst case: when the target is not in the array
- At each pass, the "live" portion of the array is narrowed to half the previous size
- The series proceeds like this: n, n/2, n/4, n/8, ...
- Each term in the series represents one comparison
- How long does it take to get to 1? That will be the number of comparisons

Binary Search (continued)

- Could start at 1 and double until we get to n:
      1, 2, 4, 8, 16, ..., 2ᶜ >= n
  or  2⁰, 2¹, 2², 2³, 2⁴, ..., 2ᶜ >= n
- The length of this series is c+1
- The question is: 2 to what power c is greater than or equal to n?
  - If n is 8, c is 3
  - If n is 1024, c is 10
  - If n is 16,777,216, c is 24
- Binary search runs in O(log n): logarithmic
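A quick sketch of ours that finds c by repeated doubling, mirroring the series above:

    // Find the smallest c with 2^c >= n by repeated doubling
    public class Log2 {
        static int halvings(long n) {
            int c = 0;
            long power = 1;
            while (power < n) {  // 1, 2, 4, 8, ..., until 2^c >= n
                power *= 2;
                c++;
            }
            return c;
        }

        public static void main(String[] args) {
            System.out.println(halvings(8));           // 3
            System.out.println(halvings(1024));        // 10
            System.out.println(halvings(16_777_216));  // 24
        }
    }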

Comparing O(n) to O(log n)

Rates of growth of logarithmic functions:

    Power of 2    n             log₂ n
    2⁴            16            4
    2⁸            256           8
    2¹²           4,096         12
    2²⁴           16,777,216    24

Graph Illustrating Relative Growth

[Plot omitted: f(n) versus n for log n, n, and n². The n² curve rises steeply, n forms a straight line, and log n stays nearly flat.]

Other Logarithm Examples

The guessing game: guess a number from 1 to 100.
- Try the middle; you could be right
- If it is too high, check near the middle of 1..49
- If it is too low, check near the middle of 51..100
- You should find the answer in a maximum of 7 tries
- If 1..250: a maximum of 8, since 2ᶜ >= 250 gives c == 8
- If 1..500: a maximum of 9, since 2ᶜ >= 500 gives c == 9
- If 1..1000: a maximum of 10, since 2ᶜ >= 1000 gives c == 10
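A sketch of ours of the guessing strategy against a hidden number, counting the guesses it takes:

    // Binary-search guessing: find 'secret' in 1..high, counting guesses
    public class GuessingGame {
        static int guesses(int secret, int high) {
            int low = 1, count = 0;
            while (true) {
                int guess = (low + high) / 2;  // try the middle
                count++;
                if (guess == secret) return count;
                if (guess > secret) high = guess - 1;  // too high
                else low = guess + 1;                  // too low
            }
        }

        public static void main(String[] args) {
            int worst = 0;
            for (int secret = 1; secret <= 100; secret++)
                worst = Math.max(worst, guesses(secret, 100));
            System.out.println("worst case for 1..100: " + worst);  // 7
        }
    }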

Logarithmic Explosion

Assume an infinitely large piece of paper that can be cut in half, layered, and cut in half again as often as you wish. How many times do you need to cut and layer until the paper thickness reaches the moon?

Assumptions:
- The paper is 0.002 inches thick
- The distance to the moon is 240,000 miles
- 240,000 miles * 5,280 feet per mile * 12 inches per foot = 15,206,400,000 inches to the moon

The answer is only 43 cuts: 0.002 * 2⁴³ is about 17.6 billion inches, which exceeds the distance.
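The same answer by brute force, in a sketch of ours:

    // Double paper thickness until it reaches the moon
    public class PaperToMoon {
        public static void main(String[] args) {
            double thickness = 0.002;              // inches
            double moon = 240_000.0 * 5_280 * 12;  // 15,206,400,000 inches
            int cuts = 0;
            while (thickness < moon) {
                thickness *= 2;                    // one cut-and-layer doubles the stack
                cuts++;
            }
            System.out.println(cuts + " cuts");    // 43
        }
    }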

Examples of Logarithmic Explosion

- The number of bits required to store a binary number is logarithmic; add 1 bit to get much larger ints
  - 8 bits store 256 values: log₂ 256 = 8
  - log₂ 2,147,483,648 = 31
- The inventor of chess asked the Emperor to be paid like this: 1 grain of rice on the first square, 2 on the next, doubling the grains on each successive square, up to 2⁶³ grains on the 64th square

Compare Sequential and Binary Search

Output from CompareSearches.java (1995), searching for 20000 objects:

    Binary Search
        #Comparisons: 267248
        Average: 13
        Run time: 20ms

    Sequential Search
        #Comparisons: 200010000
        Average: 10000
        Run time: 9930ms

    Difference in comparisons:  199742752
    Difference in milliseconds: 9910

[Chart omitted: run time in seconds versus n.]
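CompareSearches.java itself is not reproduced here, but a rough harness along these lines (our own sketch, not Mercer's original) reproduces the flavor of the experiment:

    import java.util.Arrays;

    // Rough timing comparison: n sequential searches versus n binary searches
    public class CompareDemo {
        public static void main(String[] args) {
            int n = 20_000;
            String[] a = new String[n];
            for (int i = 0; i < n; i++)
                a[i] = String.format("name%08d", i);  // zero-padded, so already sorted

            long t0 = System.nanoTime();
            for (int i = 0; i < n; i++) {             // n sequential searches
                String target = a[i];
                for (int j = 0; j < n; j++)
                    if (a[j].equals(target)) break;
            }
            long t1 = System.nanoTime();
            for (int i = 0; i < n; i++)               // n binary searches
                Arrays.binarySearch(a, a[i]);
            long t2 = System.nanoTime();

            System.out.println("sequential: " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("binary:     " + (t2 - t1) / 1_000_000 + " ms");
        }
    }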

O(n²) Quadratic

- O(n²) reads "on the order of n squared", or quadratic
- When n is small, rates of growth don't matter
- Quadratic algorithms are greatly affected by increases in n
- Consider the selection sort algorithm: find the largest, n-1 times

Actual Observed Data for an O(n²) Sort

[Chart omitted: time in seconds (0 to 400) required to sort an array of size n, for n from 1,000 to 40,000.]

Two O(n²) Algorithms

- Many known sorting algorithms are O(n²)
- Given n points, find the pair that are closest:
      Compare p1 with p2, p3, p4, p5   (4 comparisons)
      Compare p2 with p3, p4, p5       (3 comparisons)
      Compare p3 with p4, p5           (2 comparisons)
      Compare p4 with p5               (1 comparison)
- When n is 5, make 10 comparisons
- In general, #comparisons = n(n-1)/2 = n²/2 - n/2; the highest-order term is n², drop the 1/2 coefficient, and the runtime is O(n²)
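A brute-force sketch of that closest-pair loop (our own; points are (x, y) doubles, and we compare squared distances to avoid the square root):

    // Brute-force closest pair: n(n-1)/2 comparisons, so O(n²)
    public class ClosestPair {
        static double closestSquaredDistance(double[][] pts) {
            double best = Double.MAX_VALUE;
            for (int i = 0; i < pts.length - 1; i++) {
                for (int j = i + 1; j < pts.length; j++) {  // compare p_i with every later point
                    double dx = pts[i][0] - pts[j][0];
                    double dy = pts[i][1] - pts[j][1];
                    best = Math.min(best, dx * dx + dy * dy);
                }
            }
            return best;
        }

        public static void main(String[] args) {
            double[][] pts = { {0, 0}, {3, 4}, {1, 1}, {7, 2} };
            System.out.println(closestSquaredDistance(pts));  // 2.0, from (0,0) and (1,1)
        }
    }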

O(n³) Algorithms

Matrix multiplication (naïve), three nested loops over n:

    // multiply n-by-n matrices a and b into a separate result matrix
    int[][] result = new int[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                result[i][j] += a[i][k] * b[k][j];

Big O and Style Guidelines

- Big O is similar to saying the runtime is "less than or equal to" the given order; O(f) is an upper bound
- Don't use constants or lower-order terms. These are no-nos for now (you will use coefficients in C Sc 345):
  - O(n² + n) should be written O(n²)
  - O(5500n) should be written O(n)
  - O(2.5n) should be written O(n)

Properties of Big-O

Summarizing the two main properties:
- If f(n) is a sum of several terms, the one with the largest growth rate is kept and all others are omitted
- If f(n) is a product of several factors, any constants (factors that do not depend on n) are omitted, which means you can drop coefficients

Properties of Big-O

We can drop coefficients. Example: if f(n) = 100n, then f(n) is O(n).

Summation of Same Orders

This property is useful when an algorithm contains several loops of the same order. Example: if f₁(n) is O(n) and f₂(n) is O(n), then f₁(n) + f₂(n) is O(n) + O(n), which is O(n).

Summation of Different Orders

This property works because we are only concerned with the term of highest growth rate. Example: if f₁(n) is O(n²) and f₂(n) is O(n), then f₁(n) + f₂(n) = n² + n, which is O(n²).

Product

This property is useful for analyzing segments of an algorithm with nested loops. Example: if f₁(n) is O(n²) and f₂(n) is O(n), then f₁(n) x f₂(n) is O(n²) x O(n), which is O(n³).

Limitations of Big-Oh Analysis

- Constants sometimes make a difference: n log n may be faster than 10000n
- It doesn't differentiate between data in cache memory, main memory, and data on a disk; there is a huge time difference, as disk access can be thousands of times slower
- The worst case doesn't happen often, so it's an overestimate

Quick Analysis

- Can be less detailed
- The running time of nested loops is the product of each loop's number of iterations
- For several consecutive loops, take the longest-running loop (3n is O(n), after all)

Runtimes with for Loops

    int n = 1000;
    int[] x = new int[n];

O(n):

    for (int j = 0; j < n; j++)
        x[j] = 0;

O(n²):

    int sum = 0;
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
            sum += j * k;

Runtimes with for Loops (continued)

O(n³):

    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
            for (int l = 0; l < n; l++)
                sum += j * k * l;

O(n):

    for (int j = 0; j < n; j++)
        sum++;
    for (int j = 0; j < n; j++)
        sum--;

O(log n):

    for (int j = 1; j < n; j = 2 * j)
        sum += j;

Analyze this

    public void swap(int[] a, int left, int right) {
        int temp = a[left];
        a[left] = a[right];
        a[right] = temp;
    }

Analyze that

    for (int j = 0; j < n; j++)
        sum += j;
    for (int k = 0; k < n; k++)
        sum += k;
    for (int l = 0; l < n; l++)
        sum += l;

Analyze that

    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
            sum += j + k;
    for (int l = 0; l < n; l++)
        sum += l;

Analyze this

    for (int top = 0; top < n - 1; top++) {
        int smallestIndex = top;
        for (int index = top; index < n; index++) {
            if (a[index] < a[smallestIndex])
                smallestIndex = index;
        }
        // Swap smallest to the top index
        swap(a, top, smallestIndex);
    }