Analysis of Algorithms - Introduction

Andreas Ermedahl
MRTC (Mälardalen Real-Time Research Centre)
andreas.ermedahl@mdh.se
Autumn 2004

Administrative stuff
Course leader: Andreas Ermedahl
Email: andreas.ermedahl@mdh.se
Office: room U3, IDt (3rd floor)
Telephone: 0 57334
Course homepage: http://www.idt.mdh.se/kurser/cd5370/04

Administrative stuff (cont.)
Schedule: Period (week 45-5), lectures
Tuesdays 10:15- room: R-4, T-05
Thursdays 10:15- room: R-4, T-05, R-8
Material is presented using PowerPoint slides. Printed handouts will be provided at the beginning of each lecture, and the handouts will be put on the course homepage after each lecture.

Administrative stuff (cont.)
Literature: Cormen, Leiserson, Rivest and Stein. Introduction to Algorithms, 2nd Edition. ISBN: 0-262-03293-7.
Books have been ordered to Akademibokhandeln in Kårhuset. Cost: 60 kr.
Some more material will be handed out.

Administrative stuff (cont.)
Assignments: three (not mandatory), including both programming and theory. They will give bonus points on the exam.
Examination: written exam Friday 2005-01-14, 8:30-13:30 (place not decided yet).

Course contents
In general, methods for:
Analysing algorithms: running time formulas, order of growth, recurrence formulas, comparing algorithms
Designing algorithms: sorting algorithms, graph traversal algorithms, special techniques for designing algorithms
(Slide figures: the array-summing pseudocode, a MERGESORT recursion diagram, the MERGESORT recurrence, and a weighted example graph - all previews of topics covered in this lecture.)
Lecture schedule (the lectures follow the chapters in the book):
L1: Today: Course welcome and overview, Ch. 1: Introduction
L2: Ch. 2: Getting Started
L3: Ch. 3: Growth of Functions / Asymptotic Bounds
L4: Ch. 4: Recurrences
L5: Ch. 6: Heapsort
L6: Ch. 7: Quicksort
L7: Ch. 8: Sorting in Linear Time
L8: Ch. 9: Medians and Order Statistics
L9: Ch. 15: Dynamic Programming
L10: Ch. 16: Greedy Algorithms, Ch. 17: Amortized Analysis
L11: Ch. 22: Elementary Graph Algorithms
L12: Ch. 23: Minimum Spanning Trees
L13: Ch. 24: Single-Source Shortest Paths
L14: Examples from the exams

Introduction to running time analysis
What is the execution time for this code?
1. for j ← 1 to length[A]
2.    do sum ← sum + A[j]
3. return sum

What does it mean that f(n) is O(g(n))? What does it mean that the execution time of an algorithm is O(f(n))?

The execution time T(n) of MERGESORT can be formulated as a recurrence formula:
T(n) = c              if n = 1
T(n) = 2T(n/2) + cn   if n > 1
How do we derive such a recurrence formula?
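To make the recurrence concrete, here is a minimal MERGESORT sketch in Python (the course uses pseudocode; this rendering is my own): each call does a constant amount of work in the base case, and otherwise cn work merging plus two recursive calls on halves - exactly the two cases of the recurrence above.

```python
def merge_sort(a):
    """Sort list a. Cost model: T(n) = 2T(n/2) + cn, since the merge is linear."""
    if len(a) <= 1:               # base case: constant time c
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # T(n/2)
    right = merge_sort(a[mid:])   # T(n/2)
    # merge step: cn work in total over both halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The input [5, 2, 4, 7, 1, 3, 2, 6] is the standard example from the course book.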
The solution to this recurrence formula is: T(n) is O(n log n). How do we derive the solution?

Detailed analyses of sorting algorithms
Example: RADIXSORT - the execution time T(n) is linear.
(Figure: four columns showing the original input and the array after sorting on the last digit, the next digit, and the first digit.)

Special techniques for designing algorithms
Fibonacci formula:
fib(0) = fib(1) = 1
fib(n) = fib(n-1) + fib(n-2)
What is fib(537)?
Exponential execution time if implemented straightforwardly; linear execution time if implemented using dynamic programming.
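The exponential-versus-linear contrast can be sketched in Python (my own rendering, using the slide's convention fib(0) = fib(1) = 1): the straightforward recursion recomputes the same subproblems over and over, while the bottom-up dynamic-programming version does one addition per step.

```python
def fib_naive(n):
    """Straightforward recursion: exponential number of calls."""
    if n <= 1:
        return 1                      # fib(0) = fib(1) = 1, as on the slide
    return fib_naive(n - 1) + fib_naive(n - 2)

def fib_dp(n):
    """Dynamic programming (bottom-up): linear number of additions."""
    a, b = 1, 1                       # fib(0), fib(1)
    for _ in range(n - 1):
        a, b = b, a + b
    return b if n > 0 else 1

print(fib_dp(30))  # 1346269; fib_naive(30) gives the same answer, far slower
```

For fib(537) the naive version is hopeless, while fib_dp returns instantly.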
Useful graph algorithms
Minimal Spanning Tree: connect all nodes by a tree of minimal cost. We will investigate clever algorithms for doing this.
(Figure: a weighted example graph with nodes a-i and edge weights.)

Introduction
What is an algorithm? Informally, a procedure that:
Takes some value (or a set of values) as input
Produces some value (or set of values) as output
It is a sequence of computational steps to transform the input into the output, typically to solve some well-specified computational problem. An algorithm describes a specific computational procedure for achieving the input/output relationship.

Algorithmic usage
Algorithms are used everywhere...
Controlling internet traffic
Identifying and organizing data about human DNA sequences
Safe transactions for commerce and banking
Allocating limited resources: drilling for oil, code compilation, air traffic landing reservation schemes...
Algorithms often work on data, so it is important how data is stored and organized. No data structure is suitable for all types of problems.

Example of algorithmic usage: hardware construction
Modern hardware is very complex. Objectives: minimize size and energy, maximize parallelism and speed. This requires smart algorithms for circuit layout, and smart algorithms to test and guarantee correctness.

Another example
A modern car contains more than 80 computers connected through one or more networks: many different (maybe communicating) CPUs, many different types of algorithms running on each CPU, often with very limited resources.
(Figure: overview of the Volvo S80 network layout.)

Another example: the Google search engine
It finds web pages related to key words; programs search the web to catalogue it. Google was soon to be introduced on the stock market, valued at 15-25 billion dollars. Why do people prefer Google over other search engines? What other types of services can be provided?
Another example: constructing school schedules
A number of lecture rooms, each with a certain cost. A number of classes, each with a different number of people and different resource demands. Different lectures. It must be possible to alter reservations already made.

What is algorithm analysis?
Given a problem and an algorithm for solving it, what do we know about the algorithm:
Does it solve the problem correctly?
Does it solve the problem efficiently?
We will mostly focus on the second question. Analysis of algorithms is the use of mathematical techniques to analyze the efficiency of algorithms. We will also look into how to design efficient algorithms.

Algorithmic efficiency
What do we mean by efficiency? It is usually given with respect to some cost measure. Cost measures are defined in terms of resource usage: execution time, memory usage, communication bandwidth, computer hardware chip area, energy consumption, etc. We will mainly look at cost in terms of execution time.

Example: Efficiency
Algorithms often differ in their efficiency. The differences can be much more significant than differences due to the hardware (and software) used.
(Figure: diagram of the yearly increase in computer speed - clock frequency (MHz) per year.)

Example: Efficiency (cont.)
We select the two extremes from the diagram:
Fastest: Pentium 4 from 2003: 3.2 GHz = 3.2*10^9 instr/sec
Slowest: 386 from 1986: 16 MHz = 16*10^6 instr/sec
=> Fastest is about 200 times faster than Slowest

Problem: sort one million (10^6) numbers
Insertionsort: takes ~ 2n^2 instructions to sort n numbers
Quicksort: takes ~ 50 n lg n instructions to sort n numbers
Assume Fastest uses Insertionsort and Slowest uses Quicksort. Time to sort:
Fastest: 2*(10^6)^2 instr / 3.2*10^9 instr/sec = 625 seconds
Slowest: 50*10^6 * lg 10^6 instr / 16*10^6 instr/sec ≈ 62.5 seconds
=> Slowest runs about 625 / 62.5 = 10 times faster than Fastest

Input size and running time
The time taken by an algorithm normally grows with the input size, so we want to describe running time as a function of input size. The best notion of input size depends on the problem being studied:
Number of items in the input (e.g. sorting)
Total number of bits (e.g. multiplication of two numbers)
Number of nodes and edges (e.g. graph algorithms)
The running time of an algorithm is defined in terms of the number of primitive operations or steps executed. It is convenient to be as machine-independent as possible.
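The Fastest/Slowest back-of-the-envelope comparison is easy to reproduce; a small sketch (machine speeds and the instruction-count models 2n^2 and 50 n lg n are taken from the slides, with lg read as log base 2):

```python
import math

def insertionsort_instr(n):
    """Instruction-count model from the slides: ~2n^2."""
    return 2 * n**2

def quicksort_instr(n):
    """Instruction-count model from the slides: ~50 n lg n."""
    return 50 * n * math.log2(n)

n = 10**6
fastest_ips = 3.2e9   # Pentium 4: 3.2 GHz, instructions per second
slowest_ips = 16e6    # 386: 16 MHz, instructions per second

t_fast = insertionsort_instr(n) / fastest_ips   # Fastest runs Insertionsort
t_slow = quicksort_instr(n) / slowest_ips       # Slowest runs Quicksort

print(f"Fastest (Insertionsort): {t_fast:.1f} s")   # 625.0 s
print(f"Slowest (Quicksort):     {t_slow:.1f} s")   # about 62 s
print(f"ratio: {t_fast / t_slow:.1f}")              # about 10
```

Despite running on a machine roughly 200 times slower, the O(n log n) algorithm wins by an order of magnitude - the point of the example.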
Model of computation
Calculating execution time (or memory requirements, or power consumption, or ...) requires a machine model, and there are many possible machine models! We will use a generic one-processor, random-access machine (RAM) model:
Instructions are executed in sequence (no concurrent operations)
Flat memory model
This is not really realistic considering modern hardware! We will only consider sequential algorithms, described in a sequential pseudo-language. Other machine models require other languages.

Running time
On a particular input, the running time is the number of primitive operations (steps) executed. We want to define steps to be machine-independent, so execution time for instructions is given on the source-code level. Assume that each line of pseudo-code has a constant cost: different lines i, j may take different amounts of time, but each execution of line i takes the same constant time c_i, assuming that the line consists only of primitive operations. If the line is a subroutine call, then the call takes constant time, but the subroutine execution might not.

Example 1: Sequential code
Two sequential statements, one using addition and one using multiplication. Different statements have different (constant) execution times:
1. g ← e + f        c1
2. h ← e * f        c2
Multiplication is for many processors more costly than addition. The statements differ, however, only in the size of the constants: c2 > c1.

Example 2: Single loop
Code for summing all elements in an array A[]. The length of array A[] is the size of the problem: n. Each execution of a statement takes the same constant time. Statement 2 is executed within the context of the loop body and will therefore be executed n times:
1. for j ← 1 to length[A]     c1    n+1
2.    do sum ← sum + A[j]     c2    n
3. return sum                 c3    1
Running time calculation
Assumption: each statement has a constant execution time every time it is executed. Each statement s contributes cost(s) * times(s), where cost(s) is the (constant) execution-time cost of statement s and times(s) is the number of times statement s is executed. The total execution time T(n) is the sum of the contributions of all statements:
T(n) = sum over all executed statements s of cost(s) * times(s)

Example 2 (cont.)
Algorithm to sum all elements in an array A[]; the length of array A[] is the size of the problem: n.
1. for j ← 1 to length[A]     c1    n+1
2.    do sum ← sum + A[j]     c2    n
3. return sum                 c3    1
Total execution time T(n):
T(n) = c1*(n+1) + c2*n + c3
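The times(s) column can be checked by instrumenting the loop with counters (a sketch of my own, counting the loop test as executed n+1 times, as on the slide):

```python
def summing_loop_counts(A):
    """Sum A while counting how often each pseudo-code line executes."""
    times = {1: 0, 2: 0, 3: 0}
    total = 0
    j = 0
    while True:
        times[1] += 1            # line 1: loop test, runs n+1 times
        if j >= len(A):
            break
        times[2] += 1            # line 2: loop body, runs n times
        total += A[j]
        j += 1
    times[3] += 1                # line 3: return, runs once
    return total, times

total, times = summing_loop_counts([3, 1, 4, 1, 5])
print(total, times)   # 14 {1: 6, 2: 5, 3: 1}
```

With n = 5 the counts are n+1 = 6, n = 5 and 1, matching the cost table.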
Example 3: Nested loops
Algorithm to sum all elements in a two-dimensional n * n array A[]:
1. for i ← 1 to n                 c1    n+1
2.    for j ← 1 to n              c2    n*(n+1)
3.       do sum ← sum + A[i,j]    c3    n*n
4. return sum                     c4    1
Total execution time T(n):
T(n) = c1*(n+1) + c2*(n*(n+1)) + c3*(n*n) + c4

Example 4: Triangular loop
Two nested loops: an algorithm summing selected elements of a two-dimensional n * n array A. The number of executions of the inner loop depends on the current iteration number of the outer loop:
1. for i ← 1 to n                 c1    n+1
2.    for j ← 1 to i              c2    ?
3.       do sum ← sum + A[i,j]    c3    ?
4. return sum                     c4    1
How many times will statements 2 and 3 execute?

Example 4 (cont.)
We investigate the executions of statement 2 closer:
When i = 1 the statement will execute 2 times
When i = 2 the statement will execute 3 times
...
When i = n the statement will execute n+1 times
The total number of executions of statement 2 is therefore:
2 + 3 + ... + n + (n+1) = (1 + 2 + ... + n) + n
The first part is the arithmetic series, which can be formulated as a sum (see Eq. A.1 in the course book):
sum_{k=1}^{n} k = 1 + 2 + ... + n = n(n+1)/2
The total number of executions of statement 2 is therefore n(n+1)/2 + n.

Example 4 (cont.)
We can examine statement 3 in a similar way:
When i = 1 the statement will execute 1 time
When i = 2 the statement will execute 2 times
...
When i = n the statement will execute n times
The total number of executions of statement 3 is therefore:
1 + 2 + 3 + ... + n = n(n+1)/2

Example 4: Triangular loop (cont.)
We can now give the cost for the statements:
1. for i ← 1 to n                 c1    n+1
2.    for j ← 1 to i              c2    ½ n(n+1) + n
3.       do sum ← sum + A[i,j]    c3    ½ n(n+1)
4. return sum                     c4    1

Example 5: Function call
If a statement is a subroutine call, then the call takes constant time, but the subroutine execution might not:
1. n ← length[A]      c1
2. ret ← FOO(n)       c2
3. return ret         c3
FOO(i)
1. for j ← 1 to i               c4    n+1
2.    do sum ← sum + A[j]       c5    n
3. return sum                   c6    1
There is a constant cost for calling FOO, but the execution time of FOO depends on the size of its input n, so the total execution time depends on the size of the input n.
Later we will look into more examples of frequently occurring series and sums. Nothing needs to be learned by heart; use the provided collection of formulas (see e.g. page 3).
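The closed forms derived in Example 4 can be verified empirically by simulating the triangular loop and counting line executions (a small check of my own):

```python
def triangular_counts(n):
    """Simulate the triangular loop, counting executions of lines 2 and 3."""
    times2 = times3 = 0
    total = 0
    for i in range(1, n + 1):        # line 1: outer loop
        j = 1
        while True:
            times2 += 1              # line 2: inner loop test, i+1 times per i
            if j > i:
                break
            times3 += 1              # line 3: inner loop body, i times per i
            total += 1               # stands in for sum <- sum + A[i,j]
            j += 1
    return times2, times3

n = 10
t2, t3 = triangular_counts(n)
print(t2, t3)                                      # 65 55
print(n * (n + 1) // 2 + n, n * (n + 1) // 2)      # closed forms: 65 55
```

The simulated counts agree with n(n+1)/2 + n for line 2 and n(n+1)/2 for line 3.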
Example 6: Function call context
Executions of function statements must be considered in a calling context. Example (again): sum all elements in a two-dimensional n * n array A[], calling a function SUMROW to sum all elements in one array row:
1. res ← 0                          c1    1
2. for row ← 1 to n                 c2    n+1
3.    do res ← res + SUMROW(row)    c3    n
4. return res                       c4    1
SUMROW(i)
1. sum ← 0                        c5    n
2. for j ← 1 to n                 c6    n*(n+1)
3.    do sum ← sum + A[i,j]       c7    n*n
4. return sum                     c8    n

Example 7: Recursive formulas
Some problems might be solved using recursion. Recursion appears when a function calls itself, either directly (FOO calls FOO) or indirectly (FOO calls BAR, BAR calls FOO). Example: recursively sum all elements in an array A[]. Called initially with RECSUM(1); makes n+1 function calls in total:
RECSUM(i)
1. if i > length[A]                   c1    n+1
2.    return 0                        c2    1
3. else return A[i] + RECSUM(i+1)     c3    n
The number of executions depends on the number of calls made: statement 2 is only executed once, and n recursive calls are required.

Run-time analysis: summary
Problems with exact run-time analysis:
It is often messy and rather complicated to derive exact execution-time formulas.
It is hard to see how the execution time grows with the input size.
It is hard to directly compare the efficiency of two or more algorithms.
Also, all presented algorithms had only one possible execution, whereas most algorithms execute differently depending on the input; it is then interesting to consider best-, worst- and average-case performance.
The next lecture will present methods for handling these problems!

The End!