CITS3001. Algorithms, Agents and Artificial Intelligence. Semester 2, 2016


CITS3001 Algorithms, Agents and Artificial Intelligence
Semester 2, 2016
Tim French
School of Computer Science & Software Engineering
The University of Western Australia

2. Review of algorithmic concepts
CLRS, Chs. 2-3

Summary

We will review some basic algorithmic concepts from CITS2200 and CITS2211:
o Algorithm design
o Complexity
o Recurrence relations
o Numerical stability
o Proof techniques
o Finite-state automata
We will present examples of these concepts in action:
o Fibonacci numbers
o Sorting algorithms
o String algorithms (next week)
CITS3001: Algorithms, Agents and AI (2. Review)

Problems, instances, algorithms, and programs

A computational problem is a general (usually parameterised) description of a question to be answered.
A problem instance is a specific question, usually obtained by providing concrete values for the parameters.
An algorithm is a well-defined finite set of rules that specifies a series of elementary operations applied to some input, producing after a finite amount of time some output.
o An algorithm solves any problem instance presented to it
o Algorithms can be presented in many forms: code, pseudo-code, equations, tables, flowcharts, graphs, pictures, etc.
A program is (largely) the implementation of some algorithm(s) in some programming language.
Algorithms and data structures are the fundamental building blocks from which programs are constructed:
o Data structures represent the state of a program
o Algorithms transform (parts of) this state as the program executes

An example

Calculating Fibonacci numbers is a problem.
o The k-th Fibonacci number is the sum of the previous two Fibonacci numbers
Calculating the eighth Fibonacci number is a problem instance.
The following two rules are a simple algorithm for solving this problem:
o fib(k) = fib(k-1) + fib(k-2), k > 1
o fib(0) = fib(1) = 1
A (basic, inefficient) program for fib might work by implementing these rules directly.
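As a minimal sketch, the two rules above translate directly into a recursive function (Python used here for illustration):

```python
def fib(k):
    # Direct transcription of the two rules. This is exponential time,
    # because fib(k - 1) and fib(k - 2) recompute the same values.
    if k <= 1:
        return 1
    return fib(k - 1) + fib(k - 2)
```

Calling fib(7) answers the problem instance above: with fib(0) counted as the first Fibonacci number, the eighth is 21.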

Design and analysis

An algorithmic solution to a computational problem involves designing an algorithm for the problem, and analysing the algorithm, usually with respect to its correctness and performance.
Design requires both background knowledge of algorithmic techniques and the creativity to apply them to new problems.
o Art as much as science
Analysis is more mathematical/logical in nature, and usually more systematic.
o Science/engineering more than art
In this unit (and in CITS2200) you will learn about both.

Contd.

The design and analysis of an algorithm involves the following questions:
1. What problem are we trying to solve?
2. Does a solution exist already?
3. Otherwise, can we find a solution, and are there multiple solutions?
4. Is the algorithm correct?
5. Is the algorithm efficient? Is it efficient enough?
6. Does the implementation of the algorithm present any problems?

An example: Fibonacci numbers

The most obvious algorithm for calculating Fibonacci numbers is the one we saw earlier:
fib(k) = fib(k-1) + fib(k-2), if k > 1
       = 1,                   if k <= 1
Consider the runtime of this algorithm:

k    runtime
25   25ms
26   40ms
27   65ms
28   108ms
29   173ms
30   283ms
31   455ms
32   734ms
33   1,194ms
34   1,932ms
35   3,116ms
36   5,069ms
37   8,187ms
38   13,237ms

What do you notice about these times?

Contd.

We can improve this algorithm easily, just by calculating from the bottom up:
fib(k) = fib'(k, 1, 1)
fib'(k, x, y) = x,                  if k == 0
              = fib'(k-1, y, x+y),  if k > 0
This is a linear algorithm: the times are a tad better:

k      runtime
38     0.013ms
10^3   0.208ms
10^4   4.59ms
10^5   358ms

This is a (very) basic instance of dynamic programming (or memoisation), which we will meet later.
We can also derive a closed form for the Fibonacci sequence, by solving the recurrence relation.
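A sketch of the bottom-up version, with fib' rendered as a loop over the accumulator pair (x, y):

```python
def fib_fast(k):
    # fib(k) = fib'(k, 1, 1); each step replaces (x, y) with (y, x + y),
    # so only k additions are performed: linear time.
    x, y = 1, 1
    while k > 0:
        k, x, y = k - 1, y, x + y
    return x
```

The tail-recursive fib' becomes a plain loop here, which is the usual way to express this accumulator pattern in an imperative language.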

Comparing sorting algorithms

Sorting a sequence of items means generating a permutation of the sequence where each adjacent pair of items obeys some comparison relation.
o Sorting is one of the most studied problems in CS
o In pre-internet days, it was often said that at any given moment, 90% of the computers in the world would be sorting
One common sorting algorithm is insertsort:

procedure insertsort(L):
    for j = 2 to length(L)
        key = L[j]
        i = j - 1
        while i > 0 and L[i] > key
            L[i + 1] = L[i]
            i = i - 1
        L[i + 1] = key

This is an example of pseudo-code.
[Figure: the list L, with a sorted prefix to the left of position j and an unsorted suffix to the right]
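A runnable rendering of the pseudo-code (Python, 0-indexed, sorting in place):

```python
def insertsort(L):
    # j walks along the unsorted suffix; the prefix L[0..j-1] is always sorted.
    for j in range(1, len(L)):
        key = L[j]
        i = j - 1
        # Shift bigger items one place right until the slot for key is found.
        while i >= 0 and L[i] > key:
            L[i + 1] = L[i]
            i = i - 1
        L[i + 1] = key
    return L
```

The only changes from the pseudo-code are the index base (Python lists start at 0) and the explicit return for convenience.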

Correctness of insertsort

We can establish the correctness of insertsort by considering an invariant INV.
The invariant in this case is that after n iterations of the main loop, the first n+1 items of L are sorted.
There are three parts to verification using invariants.
Initialisation: INV must be true at the start.
o After zero iterations, the first item is sorted
Maintenance: we must prove that if INV is true after n iterations, then it is also true after n+1 iterations.
o Usually done using mathematical induction
o Informally, the inner loop doesn't change the order of the already-sorted items, and it places the new item into the list in front of all bigger items, and behind all smaller items
Termination: if INV is true at the end of the procedure, that must imply that the procedure has sorted L.
o The procedure performs length(L) - 1 iterations, after which length(L) items are sorted
Hence INV starts off true, it stays true to the end, and its truth establishes the correctness of the procedure.
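The invariant can also be checked mechanically. A sketch that asserts INV after every iteration of the main loop (the assertion is added for illustration; it is not part of the algorithm):

```python
def insertsort_invariant(L):
    for j in range(1, len(L)):
        key = L[j]
        i = j - 1
        while i >= 0 and L[i] > key:
            L[i + 1] = L[i]
            i = i - 1
        L[i + 1] = key
        # INV: after processing index j, the first j+1 items are sorted.
        assert all(L[a] <= L[a + 1] for a in range(j))
    return L
```

Running this on test data does not prove the invariant (only the inductive argument does), but a failing assertion would immediately falsify it.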

An aside: proof by contradiction

Another proof technique you need to be familiar with is proof by contradiction.
The idea is that to prove that a proposition P is true, we assume that P is false, and show that that assumption leads to a contradiction.
o If not-P leads to a falsehood, P must be true
e.g. prove that it is impossible to list (enumerate) all of the real numbers between 0 and 1.
We will use Cantor's diagonal argument (en.wikipedia.org/wiki/Cantor%27s_diagonal_argument).
We assume that we can construct such a list: call it S, and write out its elements as shown (we can pad any number with 0s):
0.x_11 x_12 x_13 x_14 x_15 x_16 ...
0.x_21 x_22 x_23 x_24 x_25 x_26 ...
0.x_31 x_32 x_33 x_34 x_35 x_36 ...
...
Now construct a number y = 0.y_1 y_2 y_3 y_4 y_5 y_6 ... where
y_k = 5, if x_kk != 5
y_k = 6, if x_kk == 5
Clearly 0 < y < 1, and clearly y differs from the k-th listed number in its k-th digit, for all k.
So y does not appear in S: contradiction!

An aside: numerical stability

Another aspect of the correctness of an algorithm is its numerical stability, if it performs real arithmetic.
o This is really an implementation issue
Computers represent real numbers only to a certain precision.
o Thus most real numbers must be approximated, implying small errors
Computers use a lot of digits in their precision, so usually these errors don't matter.
But for algorithms with many arithmetic operations, the errors can accumulate and grow big enough to matter.
Two common operations to watch out for are:
o Subtracting nearly-equal numbers: if x - y = z, then (x+e) - (y-e) = z + 2e, which might be substantially different if z is of similar size to e
o Dividing by very small numbers: x / y and x / (y±e) might be substantially different if y is of similar size to e
We won't explore this issue explicitly any further.
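A small demonstration of the first hazard, with values chosen purely for illustration: the operands carry relative error around 10^-16, but subtracting them magnifies that error by many orders of magnitude.

```python
# Subtracting nearly-equal numbers: the true difference is 10^-7,
# but the tiny storage errors in the operands dominate the result.
a = 1.0000001          # stored only approximately as a double
b = 1.0
exact = 1e-7           # the difference computed by hand
computed = a - b
rel_err = abs(computed - exact) / exact
# rel_err comes out many orders of magnitude larger than the ~1e-16
# relative error with which a and b themselves are stored.
```

The subtraction itself is exact here; the damage was done when 1.0000001 was rounded to the nearest double, and the subtraction simply exposes that rounding at a much larger relative scale.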

Complexity of insertsort

For simple procedures, it is easy to figure out the complexity from the pseudo-code (or from the code):

procedure insertsort(L):
    for j = 2 to length(L)
        key = L[j]
        i = j - 1
        while i > 0 and L[i] > key
            L[i + 1] = L[i]
            i = i - 1
        L[i + 1] = key

Given n = length(L), the main loop does n-1 iterations, each of which does 3 assignments.
In the worst case, the inner loop does j-1 iterations, each of which does 2 assignments.
Thus insertsort does 3(n-1) + 2(1 + 2 + ... + (n-1)) assignments, i.e. n^2 + 2n - 3.
How many assignments does it perform in the best case?
How many assignments does it perform if we use binary search to find the appropriate place for L[j]?
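The count can be checked experimentally. A sketch that instruments exactly the assignments the analysis counts (key, i, and the final placement in the outer loop; the two assignments in each inner-loop iteration):

```python
def insertsort_assignments(L):
    # Sorts L in place and returns the number of counted assignments.
    count = 0
    for j in range(1, len(L)):
        key = L[j]
        i = j - 1
        count += 2              # key and i
        while i >= 0 and L[i] > key:
            L[i + 1] = L[i]
            i = i - 1
            count += 2          # the shift and the decrement
        L[i + 1] = key
        count += 1              # the final placement
    return count

# Worst case is reverse-sorted input: for n = 10 the formula
# n^2 + 2n - 3 predicts 117 assignments.
```

Loop-control bookkeeping (j, the comparisons) is deliberately not counted, matching the slide's tally.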

Big-O notation

The complexity of an algorithm gives us a device-independent measure of how much time it consumes.
o It does not depend on the details of any particular machine, or language, or programmer, etc.
What we normally care about is the growth-rate of this measure, i.e. how it changes with the size of a problem instance.
o For insertsort, the size of an instance is the length of its argument
Big-O notation abstracts away from the details of the calculation, focusing on the largest term.
o Thus insertsort is O(n^2)
Growth-wise, this term is more important than the coefficients of the terms in the expression.
o Over time, we need to solve bigger instances
o Over time, our capability tends to grow, leading us to look at bigger instances!
Big-O notation specifies an upper bound on growth.
o Technically, an algorithm that is O(n^2) is also O(n^3)
Big-Θ notation specifies both lower and upper bounds.
o So an algorithm that is Θ(n^2) is not also Θ(n^3)
You may also come across Ω-notation, little-o notation, ω-notation, etc.

Some graphs

[Graph: growth of O(1), O(log n), O(sqrt n), and O(n) for n up to 100]
[Graph: growth of O(log n), O(sqrt n), O(n), and O(n^2) for n up to 100]

More graphs

[Graph: growth of O(log n), O(sqrt n), O(n), O(n^2), and O(n^3) for n up to 100]
[Graph, log-scale y-axis: the same curves plus O(2^n), for n up to 100]

More graphs

[Graph, log-scale y-axis: the same curves plus O(n!), for n up to 100]

Notice that each new line (complexity) completely dominates all previous ones.
Mathematicians and theoretical computer scientists distinguish only between:
o Polynomial/tractable algorithms
o Exponential/intractable algorithms
Engineers (and users!) also distinguish between other categories of algorithms.
But people care about intractable problems too.

Back to sorting

mergesort works by separately sorting the front and back halves of a list, then merging them together whilst maintaining the ordering:

procedure mergesort(L, p, r):   // sorts indices p..r of L
    if p < r
        q = floor((p + r) / 2)
        mergesort(L, p, q)
        mergesort(L, q+1, r)
        merge(L, p, q, r)

procedure merge(L, p, q, r)
    // pre: items p..q are sorted, items q+1..r are sorted
    // post: items p..r are sorted
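A runnable rendering of the pair of procedures (Python, 0-indexed, inclusive bounds). The slide gives only merge's specification, so its body below is one standard way to meet that pre/post condition:

```python
def mergesort(L, p, r):
    # Sorts L[p..r] (inclusive) in place.
    if p < r:
        q = (p + r) // 2        # floor((p + r) / 2)
        mergesort(L, p, q)
        mergesort(L, q + 1, r)
        merge(L, p, q, r)

def merge(L, p, q, r):
    # pre: L[p..q] sorted and L[q+1..r] sorted; post: L[p..r] sorted.
    left, right = L[p:q + 1], L[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from right only when left is exhausted or right's head is smaller
        # (ties go to left, which keeps the sort stable).
        if j < len(right) and (i >= len(left) or right[j] < left[i]):
            L[k] = right[j]
            j += 1
        else:
            L[k] = left[i]
            i += 1
```

Copying the two runs out before merging is what makes merge O(r - p) in time but also O(r - p) in extra space, a trade-off insertsort does not make.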

Complexity of mergesort

In each call to mergesort:
o The call to merge takes O(r - p) time
o The argument-length is halved for the recursion
Letting n be the length of the original list, the running time therefore satisfies the recurrence T(n) = 2T(n/2) + O(n):
o Because of the halving, there are log(n) levels of recursion
o At the first level, there is 1 call to merge with size n
o At the second level, 2 calls, each with size n/2
o At the third level, 4 calls, each with size n/4
At each level, the total time to run merge is O(n), so the total time taken is O(n log(n)).
o Far better than insertsort!

Worst-case, best-case, average-case analysis

Different problem instances of the same size may take different amounts of time to process.
The analyses so far have been worst-case:
o This assumes the data is in its least favourable form
o It is important because it gives a guaranteed upper bound on the algorithm's performance
Best-case analysis assumes the data is in its most favourable form:
o This usually leads to less runtime (e.g. for insertsort, the best case is O(n))
o But not always (e.g. it makes no difference for mergesort)
Average-case analysis attempts to analyse how much time the algorithm needs on typical data:
o Much harder to do!
Note though that worst-case analysis can be difficult too:
o Sometimes we have to imagine data that is worse than any possible real data

Finite-state automata

We won't be constructing FSAs, but you will need to understand how they work.
Make sure you understand this example, which recognises binary strings (read most-significant bit first) that represent multiples of 5.
o State 0 is both the start state and the accept state
The transition diagram, written as a table:

state | on 0 | on 1
  0   |  0   |  1
  1   |  2   |  3
  2   |  4   |  0
  3   |  1   |  2
  4   |  3   |  4
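The machine is easy to simulate. A sketch using the fact that from state s on bit b the machine moves to state (2s + b) mod 5, which matches the diagram's transitions:

```python
def recognises(bits):
    # Simulate the FSA on a binary string, most-significant bit first.
    # State s holds the value of the bits read so far, mod 5: reading a
    # bit b doubles that value and adds b, hence (2*s + b) % 5.
    state = 0                           # start state
    for b in bits:
        state = (2 * state + int(b)) % 5
    return state == 0                   # accept state is also state 0
```

For example, recognises('11001') checks 25, and scanning 25..30 picks out exactly the multiples of 5.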

Contd.

25 in binary is 11001, so the machine operates as follows:

        initial  after 1  after 1  after 0  after 0  after 1
State   0        1        3        1        2        0

The machine finishes in an accept state, therefore the string satisfies the spec.
o We say that the machine recognises the string
o 25 is a multiple of 5!
Try it for 26, 27, 28, 29, 30.
o Can you see what information each state represents?
The CITS2211 notes go much further: they are at teaching.csse.uwa.edu.au/units/cits2211/resources.php