Principles of Algorithm Analysis. Biostatistics 615/815


Principles of Algorithm Analysis Biostatistics 615/815 Lecture 3

Snapshot of Incoming Class [Bar chart: Programming Languages (R, C/C++, MatLab, SAS, Java, Other)] [Charts: Can you describe the QuickSort Algorithm? Can you describe Simulated Annealing? (No, Yes, Maybe, Blank)]

Homework Notes Provide a hard copy including Your Answers Your Code Write specific answer to each question Supported by table or graph if appropriate Source code Indented and commented, if appropriate

Office Hours Wednesdays 1:30-4:00 pm SPH Tower, M4614 Alternatively, e-mail me at: goncalo@umich.edu

Last Week An Introduction to C Strongly typed language Variable and function types set explicitly Functional language Programs are a collection of functions Compiling and debugging C programs Set up a basic project Review compile errors and warnings Step through code line by line Set breakpoints

Today Strategies for comparing algorithms Common relationships between algorithm complexity and input data Compare two simple search algorithms

Objectives Framework for Empirical Testing Theoretical Analysis Highlight performance characteristics of algorithms

Specific Questions Compare two algorithms for one task Predict performance in a new environment If we had a computer that was 10x faster and could store 10x more data, how would our approach perform? Set values of algorithm parameters

Two Common Mistakes Ignore performance of algorithm Shun faster algorithms to avoid complexity in program Waiting for simple but inefficient algorithms to run, when efficient alternatives of modest complexity exist Too much weight on performance of algorithm Improving program that is already very fast not worth it Time spent tinkering with code is useful

Empirical analysis Given two algorithms which is better? Run both Say, algorithm A takes 3 seconds Say, algorithm B takes 30 seconds Empirical studies may not always be practical Some algorithms may take too long to run! Other algorithms may take too long to code
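A minimal sketch of the "run both" idea in C, using the standard `clock()` function. The helper `time_algorithm` and the two toy workloads are hypothetical stand-ins for "algorithm A" and "algorithm B"; they are not from the lecture.

```c
#include <time.h>

/* Written to by the toy workloads so the compiler cannot drop their loops. */
volatile long sink;

/* Hypothetical workload A: a single loop, about n steps. */
void linear_work(int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += i;
    sink = s;
}

/* Hypothetical workload B: a nested loop, about n * n steps. */
void quadratic_work(int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            s += 1;
    sink = s;
}

/* CPU seconds used by `repeats` calls of work(n). */
double time_algorithm(void (*work)(int), int n, int repeats)
{
    clock_t begin = clock();
    for (int r = 0; r < repeats; r++)
        work(n);
    clock_t end = clock();
    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

Calling `time_algorithm(linear_work, n, repeats)` and `time_algorithm(quadratic_work, n, repeats)` with the same arguments gives two timings to compare. The numbers depend on the compiler, machine and system load.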

Choices of Input Data Actual data Measures performance in use Random data Generic approach, may not be representative Perverse data Attempt worst case analysis

Limitations of Empirical Analysis Quality of implementation Is our favored implementation coded more carefully than another? Extraneous factors Compiler Machine Computer system

Limitations of Empirical Analysis Requires a working program Theoretical analysis is an alternative Estimate potential gains Predict effectiveness relative to new algorithms or computers (that may not yet exist)

Theoretical Analysis Predict performance of algorithm based on theoretical properties Independent of actual implementation Several constructs occur frequently in algorithm analysis

Limitations of Theoretical Analysis Efficiency can depend on compiler Efficiency may fluctuate with input data Some algorithms are not well understood

The idea Given a code fragment

    // Find parent of node i
    i = a[i];

Consider how many times it is executed, but not how long each execution takes
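As a sketch of counting executions rather than timing them, the function below counts how often `i = a[i]` runs while following parent links up to a root. The function name and the convention that a root `r` satisfies `a[r] == r` are assumptions made for this illustration.

```c
/* Count how many times the statement i = a[i] executes while walking
   from node i up to its root. Assumes (for this sketch) that a root r
   is marked by a[r] == r. */
int count_parent_steps(const int a[], int i)
{
    int steps = 0;
    while (a[i] != i) {
        i = a[i];      /* the fragment being counted */
        steps++;
    }
    return steps;
}
```

For the chain `a[] = {0, 0, 1, 2}`, starting at node 3 executes the fragment three times, while starting at the root executes it zero times.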

Two typical analyses Average-case for random input Worst-case Are these representative of real world problems? Check with empirical predictions

The Primary Parameter N Examples Number of parameters to likelihood function Number of items in dataset to be processed Number of characters in a string Size of file to be sorted Some other abstract measure of problem size With multiple inputs, focus on one at a time, while holding the others constant

Running time as a function of N

f(N)       Description    Running time when N doubles
1          constant       -
log N      logarithmic    constant increase
N          linear         doubles
N log N    log-linear     more than doubles
N^2        quadratic      increases fourfold
N^3        cubic          increases eightfold
2^N        exponential    running time squares
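The doubling behaviour can be checked numerically. The sketch below computes f(2N) / f(N) for a few of these growth functions; all function names are hypothetical, and the logarithm is a simple integer base-2 logarithm so that no math library is needed.

```c
/* Integer base-2 logarithm (floor), used to avoid needing the math library. */
static int ilog2(long n)
{
    int k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}

/* Hypothetical growth functions for four rows of the table. */
double f_linear(long n)    { return (double)n; }
double f_nlogn(long n)     { return (double)n * ilog2(n); }
double f_quadratic(long n) { return (double)n * (double)n; }
double f_cubic(long n)     { return (double)n * (double)n * (double)n; }

/* Factor by which f grows when the problem size doubles from n to 2n. */
double doubling_ratio(double (*f)(long), long n)
{
    return f(2 * n) / f(n);
}
```

At N = 1024 the ratios are 2 for linear, 4 for quadratic and 8 for cubic, and slightly more than 2 for N log N.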

Running time as a function of N Multiple terms may be involved e.g. N + N log N Typically, we ignore Smaller terms Constant coefficient Focus on inner loop In rare cases, smaller terms and constant coefficient will be important

Time to Solve Large Problem
Problem Size N = 1,000,000

Operations per second    N          N log N    N^2
10^6                     seconds    minutes    months
10^9                     instant    instant    hours
10^12                    instant    instant    seconds

Time to Solve Huge Problem
Problem Size N = 1,000,000,000

Operations per second    N          N log N    N^2
10^6                     hours      days       never
10^9                     seconds    minutes    centuries
10^12                    instant    instant    months

Application Analysis of two search algorithms Each algorithm: Considers a set of items stored in an array Searches through items to decide whether a particular value occurs

Sequential Search

    int search(int a[], int value, int start, int stop)
    {
        // Variable declarations
        int i;

        // Search through each item
        for (i = start; i <= stop; i++)
            if (value == a[i])
                return i;

        // Search failed
        return -1;
    }

Sequential Search Properties Algorithm: Look through array sequentially, until we find a match Average cost If match found: N/2 If match not found: N Actual cost depends on fraction of successful searches
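The average cost can be checked by counting how many items each search examines. The sketch below is a counting variant of sequential search; `search_counted` and `average_hit_cost` are hypothetical names, and the array is assumed to hold distinct values.

```c
/* Sequential search that also adds the number of items examined
   to *examined. */
int search_counted(const int a[], int value, int start, int stop,
                   long *examined)
{
    for (int i = start; i <= stop; i++) {
        (*examined)++;
        if (value == a[i])
            return i;
    }
    return -1;
}

/* Average number of items examined when each of the n (distinct)
   stored values is searched for once. */
double average_hit_cost(const int a[], int n)
{
    long examined = 0;
    for (int k = 0; k < n; k++)
        search_counted(a, a[k], 0, n - 1, &examined);
    return (double)examined / n;
}
```

For a 4-element array the successful searches examine 1, 2, 3 and 4 items, an average of 2.5, which is (N + 1) / 2 and so about N/2 for large N. An unsuccessful search examines all N items.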

Better Sequential Search If items are sorted Stop unsuccessful search early, when we reach item with higher value Cost for unsuccessful searches is now N/2 Overall, algorithm is still O(N)
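A minimal sketch of the early-stopping variant, assuming the array is sorted in ascending order. The name `search_sorted` is hypothetical.

```c
/* Sequential search in an array sorted in ascending order.
   Stops as soon as an item larger than value is seen, because
   value cannot occur after that point. */
int search_sorted(const int a[], int value, int start, int stop)
{
    for (int i = start; i <= stop; i++) {
        if (a[i] == value)
            return i;
        if (a[i] > value)
            break;          /* passed the place where value would be */
    }
    return -1;              /* search failed */
}
```

In the array `{1, 3, 5, 7}`, a search for 4 stops at the third item instead of scanning all four.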

Binary Search

    int search(int a[], int value, int start, int stop)
    {
        while (stop >= start) {
            // Find midpoint (written this way to avoid overflow in start + stop)
            int mid = start + (stop - start) / 2;

            // Compare midpoint to value
            if (value == a[mid])
                return mid;

            // Reduce input in half!...
            if (value > a[mid]) {
                start = mid + 1;
            } else {
                stop = mid - 1;
            }
        }

        // Search failed
        return -1;
    }

Binary Search Properties Algorithm: Halve number of items to consider with each comparison Worst-case cost Maximum cost is about $\log_2 N$ (never more than $\lfloor \log_2 N \rfloor + 1$ midpoints examined) Much better than sequential search, but even better methods exist!
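The logarithmic bound can be observed by counting midpoints. The sketch below is a counting variant of binary search for a sorted array; `binary_search_probes` is a hypothetical name.

```c
/* Binary search in a sorted array that also reports, in *probes,
   how many midpoints were examined. */
int binary_search_probes(const int a[], int value, int start, int stop,
                         int *probes)
{
    *probes = 0;
    while (stop >= start) {
        int mid = start + (stop - start) / 2;
        (*probes)++;
        if (value == a[mid])
            return mid;
        if (value > a[mid])
            start = mid + 1;    /* continue in upper half */
        else
            stop = mid - 1;     /* continue in lower half */
    }
    return -1;                  /* search failed */
}
```

For a table of N = 1000 elements, no search examines more than 10 midpoints, compared with up to 1000 items for sequential search.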

Sequential vs. Binary Search

            M = 1,000     M = 10,000    M = 100,000
N           S      B      S      B      S       B
125         1      1      13     2      130     20
250         3      0      25     2      251     22
500         5      0      49     3      492     23
1250        13     0      128    3      1276    25
2500        26     1      267    3      *       28

Timings in seconds, for M searches in table of N elements (S = sequential search, B = binary search)

Big-Oh Notation Algorithm is O(N) or O(N log N) Common statement What does it mean? Summarizes performance for large N Focuses on leading terms of expression describing running time

Big-Oh Notation Consider function $g(N)$ It is said to be $O(f(N))$ if there exist $c_0$ and $N_0$ such that: $N > N_0$ implies $g(N) < c_0 f(N)$
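A short worked example of the definition. The function $g(N) = 3N + 5$ and the constants chosen are illustrative and not from the lecture.

```latex
\begin{align*}
g(N) &= 3N + 5, \qquad f(N) = N, \qquad c_0 = 4, \qquad N_0 = 5 \\
c_0 f(N) - g(N) &= 4N - (3N + 5) = N - 5 > 0 \quad \text{whenever } N > N_0 \\
\Rightarrow\; g(N) &< c_0 f(N) \text{ for all } N > N_0, \text{ so } g(N) \text{ is } O(N)
\end{align*}
```

The constant coefficient 3 and the smaller term 5 are both absorbed by the choice of $c_0$ and $N_0$.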

From N to Running Time Common relationships: $N^2$, $\log N$, $N \log N$, $N$ Describe examples of how these arise Cost of running program is $C_N$

O(N^2) Loop through input successively, eliminate one item at a time

$$C_N = C_{N-1} + N, \quad \text{for } N \ge 2, \quad C_1 = 1$$

$$C_N = C_{N-2} + (N-1) + N = \dots = 1 + 2 + \dots + (N-1) + N = \frac{N(N+1)}{2}$$
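A sketch of a loop with this cost, assuming the recurrence $C_N = C_{N-1} + N$ with $C_1 = 1$: each pass looks at every remaining item and then eliminates one. The function name is hypothetical.

```c
/* Count the steps of a loop that scans all remaining items and then
   eliminates one of them, until nothing is left. */
long count_eliminate_one(int n)
{
    long cost = 0;
    for (int remaining = n; remaining >= 1; remaining--)
        for (int i = 0; i < remaining; i++)
            cost++;             /* one unit of work per item examined */
    return cost;
}
```

For n = 100 the count is 5050, which matches N(N + 1) / 2.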

O(log N) Recursive program, halves input in one step

$$C_N = C_{N/2} + 1, \quad \text{for } N \ge 2, \quad C_1 = 1$$

With $N = 2^n$:

$$C_{2^n} = C_{2^{n-1}} + 1 = C_{2^{n-2}} + 1 + 1 = C_{2^{n-3}} + 3 = \dots = C_{2^0} + n = n + 1$$
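The recurrence can be evaluated directly. This sketch assumes $C_N = C_{N/2} + 1$ with $C_1 = 1$ and integer division for $N/2$; the function name is hypothetical.

```c
/* Evaluate C_N = C_{N/2} + 1 with C_1 = 1 (N/2 is integer division). */
int cost_halving(long n)
{
    if (n <= 1)
        return 1;
    return cost_halving(n / 2) + 1;
}
```

For N = 1024 = 2^10 the result is 11, that is n + 1 with n = 10.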

O(N log N) Recursive program, processes each item, splits input into two halves, examines each one

$$C_N = 2C_{N/2} + N, \quad \text{for } N \ge 2, \quad C_1 = 0$$

With $N = 2^n$:

$$C_{2^n} = 2C_{2^{n-1}} + 2^n$$

$$\frac{C_{2^n}}{2^n} = \frac{C_{2^{n-1}}}{2^{n-1}} + 1 = \frac{C_{2^{n-2}}}{2^{n-2}} + 1 + 1 = \dots = n$$

so $C_N = N \log_2 N$.
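A direct evaluation of this divide-and-conquer cost, assuming $C_N = 2C_{N/2} + N$ with $C_1 = 0$. The function name is hypothetical, and the closed form is exact only when N is a power of two.

```c
/* Evaluate C_N = 2 * C_{N/2} + N with C_1 = 0 (N/2 is integer division). */
long cost_split(long n)
{
    if (n <= 1)
        return 0;
    return 2 * cost_split(n / 2) + n;
}
```

For N = 1024 the result is 10240, which equals N log2 N = 1024 * 10.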

O(2N) Halves input, must examine each item

$$C_N = C_{N/2} + N, \quad \text{for } N \ge 2, \quad C_1 = 1$$

$$C_N = N + \frac{N}{2} + \frac{N}{4} + \frac{N}{8} + \dots \approx 2N$$
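The geometric sum can be checked the same way, assuming $C_N = C_{N/2} + N$ with $C_1 = 1$. The function name is hypothetical.

```c
/* Evaluate C_N = C_{N/2} + N with C_1 = 1 (N/2 is integer division). */
long cost_halve_examine(long n)
{
    if (n <= 1)
        return 1;
    return cost_halve_examine(n / 2) + n;
}
```

For N = 1024 the result is 2047, just under 2N, so the cost remains linear in N even though every item is examined at each level.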

Summary Outlined principles for analysis of algorithms Introduced some common relationships between N and running time Described two simple search algorithms

Further Reading Read chapter 2 of Sedgewick

Tip of the Day: Defensive Programming Document code and programs Indicate intended purpose Specify required inputs Always indicate author Check for error conditions