F2 Reading reference: chapter 2 + slides.
Algorithm complexity
Big O and big Ω
To calculate running time
Analysis of recursive algorithms

Next time (literature: slides mostly): the first algorithm design methods: heuristics, approximation algorithms (short), backtracking.

How do we evaluate algorithms?

What does it mean to evaluate an algorithm? Is it: does it work? Is it: to be able to compare two algorithms? Is it: is it optimal (in speed, memory etc.)? Is it: how to make it better?

3 levels:
1. It must be correct - it must work.
2. It must be easy to understand, code, maintain etc.
3. It should use computer resources well (speed, memory, ...), i.e. it should be efficient.

The result of the evaluation of computer resources is called the algorithm's complexity.

Computing science folklore:
- The fastest algorithm can frequently be replaced by one that is almost as fast and much easier to understand.
- Make it work before you make it fast.
- If a program doesn't work, it doesn't matter how fast it runs.

Algorithm complexity

We will now look at how to mathematically analyse algorithms to determine their resource demands.

Definition: The complexity of an algorithm is the cost of using the algorithm to solve a problem. The cost can be measured in terms of executed instructions (the amount of work the algorithm does), running time, memory consumption or something similar.

So complexity is a measure of the algorithm's resource demands; it is not about how hard the algorithm is to understand.

The most interesting resource seems to be running time, but is that a good measure? The running time depends on many things:
1. The programmer's skill and the programming language.
2. The computer's machine language + hardware.
3. The code that the compiler generates.
4. The size of the problem (e.g. the number of inputs).
5. The nature of the input (sorted/unsorted).
6. The time complexity of the algorithm itself, i.e. of the method chosen.

Items 1-6 are the basis for empirical complexity: we can measure the time the algorithm takes on some input - the running time.
Drawbacks of empirical complexity:
- experiments can't cover all inputs
- you must implement the algorithm
- different algorithms have to be run on the same software/hardware and by the same programmer
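As an illustration of what an empirical measurement looks like, here is a minimal Python sketch (the helper `running_time` and the algorithm `sum_all` are made-up names for this example). Note that the numbers it prints depend entirely on the machine, the load and the language - which is exactly the drawback listed above.

```python
import time

def running_time(algorithm, data):
    """Measure the wall-clock time of one call -- an empirical estimate only."""
    start = time.perf_counter()
    algorithm(data)
    return time.perf_counter() - start

def sum_all(xs):
    """The algorithm under test: sum a list with an explicit loop."""
    total = 0
    for x in xs:
        total += x
    return total

# Measure on a few input sizes; the absolute values are machine-dependent.
for n in (1_000, 10_000, 100_000):
    t = running_time(sum_all, list(range(n)))
    print(f"n = {n:7d}: {t:.6f} s")
```

Running the same experiment on another computer, or after a different compiler/interpreter optimisation, gives different numbers - hence the need for the theoretical framework below.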
Items 1-3 are (usually) not affected by the algorithm used and will only affect the running time by a constant factor that may vary with different languages, computers etc. We would like to ignore these constant factors and create a theoretical, analytical framework. Items 4-6 represent the theoretical complexity, where we evaluate the algorithm itself.

Elementary operations

We will measure the amount of work the algorithm does in terms of elementary operations. An elementary operation is assumed to cost one unit, i.e. it is an operation whose complexity can be bounded by a constant. Complexity is a relative concept, only interesting together with the corresponding elementary operation.

Problem                          Problem size                 Elementary op.
Find x in list                   Number of elements in list   Comparing x to a list element
Multiplication of two matrices   Dimension of the matrices    Multiplication of two numbers
Sort an array                    Number of elements           Comparing or moving array elements
Traverse a tree/graph            Number of nodes              Following a pointer

Drawbacks of a theoretical framework (+ see the drawbacks of empirical complexity!):
- constants can matter
- software/hardware/programmers can give you subtle problems, for instance Strings in Java, that are not visible here

Problem size influences complexity - logarithmic cost criterion

Since complexity depends on the size of the problem, we define complexity to be a function of problem size.

Definition: Let T(n) denote the complexity of an algorithm that is applied to a problem of size n. The size n (in T(n)) of a problem instance I is the number of (binary) bits used to represent the instance.

So problem size is the length of the binary description of the instance. This is called the logarithmic cost criterion.

Ex: input x_1, x_2, x_3, ..., x_k gives size = log(x_1) + log(x_2) + ... + log(x_k)

Uniform cost criterion: If you assume that every computer instruction
takes one time unit, every register is one storage unit and a number always fits in a register, then you can use the number of inputs as the problem size, since the length of the input (in bits) will be a constant times the number of inputs.

Example: bigproblemtosolve(x_1, x_2, x_3, ..., x_n) {...}

The inputs are x_1, x_2, x_3, ..., x_n. The size of the problem is log(x_1) + log(x_2) + log(x_3) + ... + log(x_n); this is at most n·log(max(x_i)), and log(max(x_i)) is a constant (at most log(Int.MAX)), so the size is at most n·k. This is called the uniform cost criterion.
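The two cost criteria can be illustrated with a small Python sketch (the function names `logarithmic_size` and `uniform_size` are my own for this example):

```python
def logarithmic_size(inputs):
    """Logarithmic cost criterion: total number of bits to write down the inputs."""
    return sum(max(x, 1).bit_length() for x in inputs)

def uniform_size(inputs):
    """Uniform cost criterion: number of inputs (each assumed to fit in a register)."""
    return len(inputs)

xs = [7, 255, 1000]
print(uniform_size(xs))      # 3 inputs
print(logarithmic_size(xs))  # 3 + 8 + 10 = 21 bits
```

As long as every x_i fits in a fixed-width register, the logarithmic size is bounded by a constant times the uniform size, which is why the simpler uniform criterion is usually used.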
Asymptotic order of growth rate and Big O

Small n is not interesting! The growth rate of the complexity - i.e. what happens when n is big, or when we double or triple the size of the input - that is what is interesting. We call this the asymptotic growth rate, and we need a way to express it mathematically: Big O ("ordo of f", "big O of f", "order of f").

Let, for a given function f, O(f) be the set of all functions t that are asymptotically bounded above by a constant multiple of f. Formally:

O(f) = { t | ∃ c > 0, ∃ n_0 ≥ 0 : ∀ n ≥ n_0 : t(n) ≤ c·f(n) }, where f, t : N → R_≥0, c ∈ R_+, n_0 ∈ N

or in words: T(n) is O(f(n)) if there exist constants c > 0 and n_0 ≥ 0 such that for all n ≥ n_0 we have T(n) ≤ c·f(n). We write T(n) ∈ O(f(n)).

Rules of ordo notation:
O(f) + O(g) = O(f + g) = O(max(f, g))
c·O(f) = O(c·f) = O(f)
f·O(g) = O(f)·O(g) = O(f·g)
O(O(f)) = O(f)

Big Ω (Omega)

Ordo notation gives us a way of talking about upper bounds: "it takes no more time than ...". We would also like to be able to talk about lower bounds, to be able to say "it takes at least this much time to ...". Let, for a given function f, Ω(f) be the set of all functions t that are asymptotically bounded below by a constant multiple of f:

Ω(f) = { t | ∃ c > 0, ∃ n_0 ≥ 0 : ∀ n ≥ n_0 : t(n) ≥ c·f(n) }

Nature of input influences complexity

Complexity also depends on the nature of the input (for instance: sorting n numbers can be much quicker if they are (almost) sorted already). Therefore we use the worst case as the measure of complexity.

Definition: Let T(n) be the maximum cost, over all instances of size n, of applying the algorithm to a problem of size n.

This gives us an upper bound on the amount of work the algorithm performs; it can never take more time.

When we say that an algorithm takes time O(n²) we mean that it takes no more than a constant times n², for large enough n, to solve the problem in the worst case. When we say that an algorithm takes time Ω(n²) we mean that it takes at least a constant times n², for large enough n, to solve the problem in the worst case. In both cases this does not mean that it always will take that much time, only that there is at least one instance for which it does.
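The definition of O(f) can be spot-checked numerically. The Python sketch below (`witnesses_big_o` is a made-up helper, and a finite check is of course not a proof, only a sanity test of candidate constants) verifies that given witnesses c and n_0 satisfy T(n) ≤ c·f(n) over a range:

```python
def witnesses_big_o(T, f, c, n0, upto=10_000):
    """Check T(n) <= c * f(n) for all n0 <= n <= upto.

    A finite spot-check of the Big-O witnesses (c, n0) -- not a proof."""
    return all(T(n) <= c * f(n) for n in range(n0, upto + 1))

# Example: T(n) = 3n + 5 is O(n), witnessed by c = 4, n0 = 5,
# since 3n + 5 <= 4n exactly when n >= 5.
print(witnesses_big_o(lambda n: 3 * n + 5, lambda n: n, c=4, n0=5))   # True

# No constant works for n^2 vs n: the check fails once n > c.
print(witnesses_big_o(lambda n: n * n, lambda n: n, c=100, n0=1))     # False
```

Trying a few (c, n0) pairs this way is a quick guard against algebra mistakes before writing down the formal argument.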
Example: Looking for x in an unsorted list with n elements takes O(n) and Ω(n) time in the worst case, but if x is first in the list it takes O(1) time.
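A minimal Python sketch of this search (the comparison counter is added purely for illustration; comparing x to a list element is the elementary operation here):

```python
def find(xs, x):
    """Linear search. Returns (index or -1, number of comparisons made)."""
    comparisons = 0
    for i, v in enumerate(xs):
        comparisons += 1          # one elementary operation per element looked at
        if v == x:
            return i, comparisons
    return -1, comparisons        # worst case: every element was compared

xs = [4, 8, 1, 9, 2]
print(find(xs, 4))   # (0, 1)  -- best case, x is first: O(1)
print(find(xs, 7))   # (-1, 5) -- worst case, x absent: n comparisons, O(n) and Ω(n)
```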
Functions of more than one variable

What if we have several parameters, T(n, m)? (For instance matrix indices as parameters.) We can generalize the definitions, but more often we try to express the complexity as a one-parameter function, for instance by adding or multiplying the parameters: f(n') = f(n·m) or f(n') = f(n+m).

Conditional notation: O(f(n) | P(n)), where P(n) is a predicate.
O(n² | n odd): valid only for odd n.
O(log n | n = 2^k): n must be a power of 2.

Calculating complexity

1. Only executable statements cost.
2. Simple statements like assignment and read/write of simple variables take O(k), i.e. constant time.
3. For a sequence of statements the cost is the sum of the individual costs.
4. For decisions like if and case statements, add the cost of the decision (usually constant) plus the cost of the most expensive alternative.
   if n < 0 then print n; else print an array 1..n
5. Loops with constant step: the sum, over all turns, of the time to evaluate the condition plus the body (i.e. at most the number of turns times the largest time of the body over all turns).
6. Multiplicative loops: see below.
7. Subprograms are analysed according to the rules above. Recursive algorithms are a bit more difficult: we must express T(n) as a recurrence equation, and solving these can be a bit tricky.

Examples: loops

Sum up the numbers 1 to n. Elementary operation = arithmetic operations; reading a variable is free.

Pedantic analysis:
1 int sum = 0;       (1 op)
2 for ( int i = 1;   (1 op)
3       i <= n;      (1 op every turn)
4       i++ ) {      (2 op every turn)
5   sum = sum + i;   (2 op every turn)
  }

This gives us the complexity function:
Row  T(n) =
1    1
2    + 1
3    + (n+1)·1   (the condition is evaluated n+1 times)
4    + n·2
5    + n·2
which sums to 3 + 5n, an exact solution, O(n).

Mathematically correct estimate: 2 + Σ_{i=1}^{n} 5 = 2 + 5n ∈ O(n)

Rough estimate: number of turns × cost of body = n × c ∈ O(n)

Example:
for i = 0..n loop
  for j = 1..n loop
    something O(1)
  end loop
end loop

Mathematically correct (the sum over all turns of the time to evaluate the condition plus the body):

T(n) = Σ_{i=0}^{n} Σ_{j=1}^{n} O(1) = (n+1)·n·O(1) ∈ O(n²)

(known sums: Σ_{i=1}^{n} 1 = n and Σ_{i=1}^{n} i = n(n+1)/2 ∈ O(n²))

Rough estimate: i.e.
the number of turns times the largest time of the body over all turns: T(n) = n·n·1 or T(n) = (n+1)·n·1, i.e. O(n²).
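The rough estimate can be checked by literally counting the turns of the double loop (a Python sketch; `double_loop_turns` is a made-up name):

```python
def double_loop_turns(n):
    """Count body executions of: for i = 0..n loop / for j = 1..n loop / O(1) body."""
    turns = 0
    for i in range(0, n + 1):       # n+1 outer turns (i = 0 .. n inclusive)
        for j in range(1, n + 1):   # n inner turns  (j = 1 .. n inclusive)
            turns += 1              # the O(1) body
    return turns

n = 10
print(double_loop_turns(n))          # 110
assert double_loop_turns(n) == (n + 1) * n   # exactly (n+1)*n turns, hence O(n^2)
```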
Multiplicative loops

Loops with multiplication/division by a constant, like:

control = 1;
while control <= n loop
  something O(1)
  control = 2 * control;
end loop;

After k iterations we have control = 2^k, i.e. k = log2(control). Since k is the number of iterations and control = n at the end of the loop, k = log2(n). Observe that the base of the logarithm doesn't matter, since the logarithm of a number in one base is a constant times the logarithm in another base, so O(log_a n) = O(log_b n).

Usual ordo functions

Polynomial complexity (or better); polynomial complexity is anything smaller than or equal to a polynomial:
O(1), O(k): constant complexity
O(log n): logarithmic complexity
O(n): linear complexity
O(n log n): lin-log complexity
O(n²): quadratic complexity
O(n³): cubic complexity
O(n⁴), O(n⁵), ...

Worse:
O(2^n): exponential complexity
O(n!): factorial complexity

And combinations of these, like O(n² log n), O(n³ 2^n), ...

Recursive functions

The factorial function is defined as

n! = 1           if n = 1
n! = n·(n-1)!    otherwise

int fac(int n)
  if n <= 1 then
    return 1
  else
    return n * fac(n-1)
  end if
end fac

If the call to fac(n) takes T(n), then the call to fac(n-1) should take T(n-1). So we get

T(n) = c1             if n = 1
T(n) = T(n-1) + c2    if n > 1

whose solution is O(n).

Mergesort is a famous sorting algorithm:

array mergesort (array v; int n)
  // v is an array of length n
  if n = 1 then
    return v
  else
    split v into two halves v1 and v2, each of length n/2
    return merge(mergesort(v1, n/2), mergesort(v2, n/2))
  end if
end mergesort

Merge takes two sorted arrays and merges them with one another into one sorted array. If T(n) is the worst-case time and we assume n is a power of 2, we get:

T(n) = c1               if n = 1
T(n) = 2T(n/2) + c2·n   if n > 1

The first 2 is the number of subsolutions and n/2 is the size of the subsolutions; c2·n covers the test to discover that n ≠ 1 (O(1)), breaking the list into two parts (O(1)) and merging them (O(n)). The solution is O(n log n).
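A runnable version of the mergesort pseudocode above, sketched in Python (lists stand in for arrays, and slicing does the "split into two halves" step):

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list in O(len(a) + len(b))."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]   # append whichever half still has elements

def mergesort(v):
    """Recurrence: T(n) = 2 T(n/2) + c*n, which solves to O(n log n)."""
    if len(v) <= 1:              # base case: cost c1
        return v
    mid = len(v) // 2            # split: O(1) here (slicing copies, still O(n) overall)
    return merge(mergesort(v[:mid]), mergesort(v[mid:]))

print(mergesort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```

Note that the base case uses n ≤ 1 rather than n = 1 so that odd-length halves are handled; the recurrence analysis assumes n is a power of 2 only to keep the arithmetic clean.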
Solving recurrence equations means that we try to express them in closed form, i.e. without a T(...) term on the right-hand side. Methods:
- Repeated substitution: expand the identity until all terms on the right-hand side only contain T(c), which is a constant.
- Guess a solution f(n) and prove it correct, i.e. that T(n) ≤ f(n), by induction. You need experience with guessing!
- Simple transformations of the range or domain (for example the Z-transform).
- Generating functions (a special case of transforming the domain).
- Special methods like direct summation or cancellation of terms.
- Methods for solving difference and differential equations.
- Tables and books.
- Give up...

Strategy for complexity calculations

1) Set up the requirements, principally which elementary operations (EO) you use, i.e. what costs you are counting (and possibly what you are NOT counting).

2) Set up the formula for the complexity. Do that rather exactly (mathematically correctly) and motivate carefully what you do. Example: in the sum 3 + Σ_{i=1}^{n} 5 you must motivate where 3, n, and 5 came from and why a sum is appropriate, referring to the pseudocode/algorithm. This can be done by having row numbers in the pseudocode and writing things like "rows 1-3 do one EO each" or "row 4 is a loop that starts with ... since ...".

3) Solve the formula. Think about what the result is going to be used for:
- If you only want an O(...): simplify as needed if it makes solving the formula easier (but don't do it in a routine-like fashion). And think about what you do; for example, a single loop with an O(n) body costs as much as a double loop with a constant body.
- But if you need to compare different algorithms with the same O(...), you need to solve more precisely.

Always motivate what you do. Math can be motivated like this (... is the formula that you are working with):
... = {divide everywhere by 4}
... = {use the formula for a geometric sum}
... = and so on.

Trivial things need not be motivated, but be over-explicit rather than the opposite.
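As a worked example of the first method above, repeated substitution applied to the mergesort recurrence T(n) = 2T(n/2) + c_2·n with T(1) = c_1 (assuming n is a power of 2) can be written out like this:

```latex
% Repeated substitution on T(n) = 2 T(n/2) + c_2 n,  T(1) = c_1,  n = 2^m
\begin{align*}
T(n) &= 2\,T(n/2) + c_2 n \\
     &= 2\bigl(2\,T(n/4) + c_2 n/2\bigr) + c_2 n = 4\,T(n/4) + 2c_2 n \\
     &= 8\,T(n/8) + 3c_2 n \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,c_2 n
        \qquad \text{after $k$ substitutions} \\
     &= n\,T(1) + c_2\, n \log_2 n
        \qquad \text{stop when } n/2^k = 1,\ \text{i.e. } k = \log_2 n \\
     &= c_1 n + c_2\, n \log_2 n \;\in\; O(n \log n)
\end{align*}
```

Every term on the right-hand side is now a constant or a known function of n, so the recurrence is in closed form.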
With long formulas with many subexpressions (double sums, for instance) you can solve the subexpressions separately. So in

Σ_{i=0}^{n-1} (3 + Σ_{j=1}^{n} 5)

you can solve the inner sum before or after this calculation and just insert the result here:
= {solution of the inner sum, see below}