CS240, Fall 2014. Mike Lam, Professor. Algorithm Analysis

Algorithm Analysis. Topics: motivation (what and why); mathematical functions; comparative and asymptotic analysis; Big-O notation ("Big-Oh" in textbook).

Analyzing algorithms. We want efficient algorithms. What metric should we use? How should we normalize? How should we compare?

Empirical Analysis. "Run it and see": use the time.h library in C, vary the experiment parameters (input size, algorithm used, number of cores, etc.), and report running times in a graph or table:

Input Size   Algorithm A   Algorithm B
100          1 s           3 s
500          2 s           9 s
1000         3 s           27 s
5000         4 s           81 s
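
For concreteness, here is a minimal sketch of such a timing harness in C (our illustration, not from the slides; algorithm_a and the input sizes are placeholders):

#include <stdio.h>
#include <time.h>

/* Placeholder workload: a quadratic nested loop, just to have
 * something measurable. */
void algorithm_a(int n) {
    volatile long sum = 0;   /* volatile discourages the compiler from
                                optimizing the loop away */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            sum += i + j;
}

int main(void) {
    int sizes[] = { 100, 500, 1000, 5000 };
    for (int i = 0; i < 4; i++) {
        clock_t start = clock();              /* CPU time before the run */
        algorithm_a(sizes[i]);
        clock_t end = clock();                /* CPU time after the run */
        printf("n = %5d: %.3f s\n", sizes[i],
               (double)(end - start) / CLOCKS_PER_SEC);
    }
    return 0;
}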

Problems with Empirical Analysis. Hard to compare across environments: hardware and software differences. Hard to be comprehensive: how many experiments do we need to run? Did we test all relevant input sizes? And we need working code to run at all: we have to invest development time implementing algorithms that might not work out.

Case Study. Which is better?

Input Size   Algorithm A   Algorithm B
10           1 s           330 s
20           2 s           430 s
30           3 s           490 s
40           4 s           530 s

Case Study. Which is better? Extending the experiment to larger inputs:

Input Size   Algorithm A             Algorithm B
10           1 s                     330 s
20           2 s                     430 s
30           3 s                     490 s
40           4 s                     530 s
1,000        100 s                   997 s
10,000       1,000 s                 1,329 s
100,000      10,000 s                1,661 s
1,000,000    100,000 s (~28 hours)   1,993 s (~33 minutes)

Case Study. Which is better?

Input Size   Algorithm A   Algorithm B
10           1.2 s         1.1 s
100          2.0 s         1.9 s
1,000        3.4 s         3.3 s
10,000       4.5 s         4.7 s
100,000      5.9 s         5.9 s
1,000,000    7.0 s         6.8 s

Case Study. Which is better? The same experiment, now also measuring memory:

Input Size   Time A   Time B   Memory A   Memory B
10           1.2 s    1.1 s    1 MB       1 MB
100          2.0 s    1.9 s    2 MB       11 MB
1,000        3.4 s    3.3 s    3 MB       96 MB
10,000       4.5 s    4.7 s    4 MB       1 GB
100,000      5.9 s    5.9 s    5 MB       12 GB
1,000,000    7.0 s    6.8 s    6 MB       140 GB

Case Study. Which is better? Linear search:

#include <stdbool.h>

bool search(int nums[], int size, int value) {
    bool found = false;
    int i;
    for (i = 0; i < size; i++) {
        if (nums[i] == value) {
            found = true;
        }
    }
    return found;
}

Binary search:

bool search(int nums[], int size, int value) {
    int left = 0;
    int right = size;
    int mid;
    while (right > left + 1) {
        mid = (right - left) / 2 + left;
        if (nums[mid] > value) {
            right = mid;
        } else if (nums[mid] < value) {
            left = mid + 1;
        } else {
            left = mid;
            right = mid + 1;
        }
    }
    return (left < size && nums[left] == value);
}
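
A quick sanity check for either version might look like this (our harness, not from the slides; it assumes exactly one of the two search definitions above is compiled in, and the binary search additionally requires nums to be sorted):

#include <assert.h>
#include <stdbool.h>

bool search(int nums[], int size, int value);  /* whichever version is linked */

int main(void) {
    int nums[] = { 2, 3, 5, 7, 11, 13 };   /* sorted, as binary search requires */
    assert(search(nums, 6, 7) == true);    /* present */
    assert(search(nums, 6, 4) == false);   /* absent */
    return 0;
}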

Case Study. Which is better? Linear search without an early exit:

#include <stdbool.h>

bool search(int nums[], int size, int value) {
    bool found = false;
    int i;
    for (i = 0; i < size; i++) {
        if (nums[i] == value) {
            found = true;
        }
    }
    return found;
}

Best: n comparisons. Worst: n comparisons. Average: n comparisons.

Linear search with an early exit:

bool search(int nums[], int size, int value) {
    bool found = false;
    int i;
    for (i = 0; i < size; i++) {
        if (nums[i] == value) {
            found = true;
            break;
        }
    }
    return found;
}

Best: 1 comparison. Worst: n comparisons. Average: n/2 comparisons.

Lessons Learned. Running times can be deceiving, and CPU time isn't the only metric of interest (memory usage, I/O time, power usage, etc.); focus on primitive operations, for simplicity. Code length has little bearing on performance: we have to normalize by input size, and more complicated code can be faster. Best, worst, and average cases can all be different; focus on the worst case, for guarantees.

Analyzing algorithms. We want efficient algorithms. What metric should we use? Worst-case primitive operations. How should we normalize? By input size. How should we compare? Asymptotic analysis.

Functions. First, a brief foray into mathematics... (don't worry, it will be brief!)

Functions. Constant function: f(n) = c, i.e. O(1). Input size doesn't matter. As long as c is relatively small, constant time is as good as it gets!

Functions. Logarithm function: f(n) = log_b n, i.e. O(log n). Grows logarithmically with input size. Usually the base b is 2. Usually encountered with divide-and-conquer methods.
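
To see where the log comes from, consider how many times an input can be halved before only one element remains (a sketch of ours, not from the slides):

/* Counts how many halvings it takes to shrink n to 1; the answer is
 * about log2(n), which is why halving loops run O(log n) times. */
int halvings(int n) {
    int count = 0;
    while (n > 1) {
        n /= 2;       /* discard half of the remaining input */
        count++;
    }
    return count;     /* e.g., halvings(1000000) returns 19 */
}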

Functions. Linear function: f(n) = n, i.e. O(n). Grows linearly with input size. Often, this is the best we can hope for: reading the input into memory is already O(n).

Functions. Linearithmic ("quasi-linear") function: f(n) = n log_b n, i.e. O(n log n). Grows slightly faster than linear. In computer science, b = 2 (usually). Many important algorithms are O(n log n), including most of the "good" sorting algorithms.

Functions. Quadratic function: f(n) = n^2, i.e. O(n^2). Scales quadratically with input size. Usually arises from nested loops or naïve all-to-all comparisons.
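
For example, a naïve duplicate check compares all pairs of elements; the nested loops perform about n^2/2 comparisons in the worst case (our sketch, not from the slides):

#include <stdbool.h>

/* O(n^2): for each element, scan everything after it. */
bool has_duplicate(const int nums[], int size) {
    for (int i = 0; i < size; i++) {           /* n iterations */
        for (int j = i + 1; j < size; j++) {   /* up to n-1 iterations each */
            if (nums[i] == nums[j])
                return true;
        }
    }
    return false;
}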

Functions. Cubic function: f(n) = n^3, i.e. O(n^3). Scales cubically with input size. Usually arises from triply-nested loops.

Functions. Polynomial function: f(n) = n^x, i.e. O(n^x). Generalization of the quadratic and cubic functions. We want x to be as small as possible; usually, x > 4 is impractical.

Functions. Exponential function: f(n) = b^n, i.e. O(b^n). Usually the base is 2. Currently infeasible when n > ~100. Avoid this!

Functions. There are even worse functions. Factorial: f(n) = n!. Double exponential: f(n) = b^(b^n). We won't be using these much in this class, but you should know the other eight!

Comparing Functions. Plotting all of these functions on one graph is difficult; use log-log axes.

Comparing Functions. We now have an ordering of functions:
1. Constant: f(n) = 1 (slowest-growing)
2. Logarithmic: f(n) = log n
3. Linear: f(n) = n
4. Linearithmic: f(n) = n log n
5. Quadratic: f(n) = n^2
6. Cubic: f(n) = n^3
7. Polynomial: f(n) = n^b
8. Exponential: f(n) = b^n (fastest-growing)

Functions. We have actually described eight function families. There are an infinite number of functions in each family, with different constant scalar factors. Example: n, 3n, and 42n are all linear functions; n^2, 3n^2, and 42n^2 are all quadratic functions. Within a family, smaller constants are better. How do we compare between families? Use our function ordering!

Comparing Functions. So we won't talk about the running time of an algorithm directly; rather, we'll talk about how fast the running time grows as the problem size increases, and compare the growth rates of various algorithms.

Comparing Functions. This type of analysis is called "asymptotic analysis" because it deals with the behavior of functions in the asymptotic sense, as n (the input size) increases to infinity.

Asymptotic Analysis. Big-O notation: a method for mathematically comparing functions. It provides us with a robust way of saying "this function grows faster than that one," and we will use that statement as a proxy for "this algorithm is more efficient than that one."

Big-O Notation. Formal definition: let f(n) and g(n) be functions mapping input sizes to running times. We say "f(n) is O(g(n))" if there is a constant c > 0 and an integer n0 ≥ 1 such that f(n) ≤ c·g(n) for all n ≥ n0.
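
As a worked example (ours, not from the slides): to show that f(n) = 3n + 10 is O(n), note that

    3n + 10 ≤ 3n + n = 4n   whenever n ≥ 10,

so the definition is satisfied with c = 4 and n0 = 10.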

Big-O Notation. Informally, we say "f(n) is O(g(n))" if f(n) grows as slow or slower than g(n), according to our ordering of function growth. Or: "Algorithm X is O(f(n))" if the growth rate of the running time of Algorithm X is O(f(n)). Examples: "linear search is O(n)"; "binary search is O(log n)"; "matrix multiplication is O(n^3)".

Big-O Notation. Instead of "f(n) is O(g(n))", some people write "f(n) ∈ O(g(n))". This is set notation, describing sets or families of functions (called complexity classes); the former is shorthand for the latter.

Big-O Notation.
Big-O: f(n) is O(g(n)) iff f(n) ≤ c·g(n) for n ≥ n0 (upper bound).
Big-Omega: f(n) is Ω(g(n)) iff f(n) ≥ c·g(n) for n ≥ n0 (lower bound).
Big-Theta: f(n) is Θ(g(n)) iff c'·g(n) ≤ f(n) ≤ c''·g(n) for n ≥ n0 (strict bounds: upper and lower).

Big-O Notation. Limit-based definitions:
f(n) is O(g(n)) if lim_{n→∞} f(n)/g(n) = c, where c is a constant and c < ∞.
f(n) is Ω(g(n)) if lim_{n→∞} f(n)/g(n) = c, where c is a constant and c > 0.
f(n) is Θ(g(n)) if lim_{n→∞} f(n)/g(n) = c, where c is a constant and 0 < c < ∞.
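
For instance (our example, not from the slides): with f(n) = 3n^2 + n and g(n) = n^2,

    lim_{n→∞} (3n^2 + n)/n^2 = lim_{n→∞} (3 + 1/n) = 3,

and since 0 < 3 < ∞, 3n^2 + n is Θ(n^2).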

Big-O Notation. We now have a strict ordering of complexity classes:
Constant: Θ(1) (slowest-growing)
Logarithmic: Θ(log n)
Linear: Θ(n)
Linearithmic ("n log n"): Θ(n log n)
Quadratic: Θ(n^2)
Cubic: Θ(n^3)
Higher polynomial: Θ(n^b), b > 3
Exponential: Θ(b^n) (fastest-growing)

Big-O Notation. Find the slowest-growing function family for which the Big-O definition is true. Example: don't say Algorithm X is O(n^3) if it is O(n^2), even though the former is technically true as well (the walking traveler example). Drop slower-growing ("lower-order") terms. Example: don't say Algorithm X is O(n + log n); drop the slower-growing function and say it is O(n) (the goldfish and German shepherd example).
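
To see why dropping lower-order terms is safe (a worked example of ours): log n ≤ n for all n ≥ 1, so n + log n ≤ 2n, and anything that is O(n + log n) is therefore O(n) with the constant c doubled; the lower-order term never changes the complexity class.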

A Word of Caution. Sometimes Big-O notation can hide large constant factors. The fact that Algorithm X is O(n) doesn't matter if the constant is 10^100! Something to keep in mind.

So what is efficient? "Efficient" vs. "feasible": anything O(n log n) or better is generally considered efficient for all reasonable input sizes, and for small n, any algorithm can be feasible. Obviously, the slower-growing, the better. Generally, small polynomials are the limit of feasibility; sometimes approximation algorithms can help. Exponentials are right out.

Ceiling and Floor Functions. log n is rarely an integer value, and often we want to coerce values to integers for the sake of analysis. We can use the floor and ceiling functions to round real numbers to nearby integers: ⌊x⌋ = floor(x) = the largest integer ≤ x, and ⌈x⌉ = ceil(x) = the smallest integer ≥ x.

Little-O Notation.
Big-O: f(n) is O(g(n)) iff f(n) ≤ c·g(n) for some c > 0, for n ≥ n0.
Little-o: f(n) is o(g(n)) iff f(n) ≤ c·g(n) for every c > 0, for n ≥ n0.
Little-o basically means "f(n) grows much slower than g(n)"; alternately, "f(n) is dominated by g(n)". Little-omega (ω) is defined similarly.

L'Hôpital's Rule. If lim_{n→∞} f(n) = lim_{n→∞} g(n) = ∞ and the first derivatives f'(n) and g'(n) exist, then

    lim_{n→∞} f(n)/g(n) = lim_{n→∞} f'(n)/g'(n).

This is useful for proving Big-O assertions, and it also demonstrates why lower-order terms aren't important.
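
As a quick example (ours, not from the slides): (log n)/n has the indeterminate form ∞/∞. Using natural logs,

    lim_{n→∞} (ln n)/n = lim_{n→∞} (1/n)/1 = 0,

so ln n is o(n): the linear term dominates, which is exactly why O(n + log n) collapses to O(n).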

Key Masteries. You should be able to: explain why we need asymptotic analysis; compare functions and complexity classes, especially the members of the eight function families we talked about; define and explain Big-O notation (O, Ω, Θ), and use it to prove relations between complexity classes; and describe growth rates for concrete algorithms, using operation counts and Big-O notation.