Speeding-up dynamic programming in sequence alignment


Department of Computer Science
Aarhus University, Denmark

Speeding-up dynamic programming in sequence alignment

Master's Thesis
Dung My Hoa
December,
Supervisor: Christian Nørgaard Storm Pedersen
Implementation code :

Abstract

Computing the optimal cost and alignment of two sequences is one of the most fundamental problems in bioinformatics. The standard dynamic programming based algorithm computes the optimal cost and alignment in O(n^2) time and space. Hirschberg gave an algorithm to compute the optimal alignment in O(n) space, but the time remained O(n^2). The first major improvement of the asymptotic running time was the Four-Russians speedup. It reduced the time to O(n^2 / log n) for computing the optimal cost, but it did not address how to compute the optimal alignment within that time. That was given by Kundeti and Rajasekaran, who combined Hirschberg's algorithm with the Four-Russians speedup. However, the Four-Russians speedup is not commonly used in practice, perhaps because of its overhead.

The motivation of this thesis is to investigate whether there is a practical speedup using Four-Russians on dynamic programming in sequence alignment. This thesis also discusses issues which arise when implementing the Four-Russians speedup for edit distance and edit script in quadratic as well as linear space. In addition, implementations using the Four-Russians speedup with a theoretical speedup of O(t) rather than O(log n) are presented, together with experiments showing their performance and an evaluation of the applicability of the Four-Russians speedup to dynamic programming in sequence alignment.

Dung My Hoa - 443

Contents

1 Introduction
    Motivation
    Objectives
    Thesis outline
2 Background
    Sequence alignment
    Interpretation
    Optimal cost and alignment
    Time and space
    Dynamic programming
    Time and space
    Four-Russians speedup
    t-block
    Block function
    Offset trick
    Applying the offset block function
    Time and space
3 Implementation and experimental setup
    Implementation
    Data handling
    Reading data from a representation
    Managing sequences which are not a multiple of t
    Preprocessing
    Offset block function
    Experiments
    Computer specification
    Test data and parameters
    Expectations
4 Edit distance in quadratic space
    Four-Russians speedup
    Time and space
    Experimental results
    Ratio
    Multiple pair-wise alignments
5 Edit script in quadratic space
    Four-Russians speedup
    Handling padded sequences
    Time and space
    Experimental results
6 Edit distance in linear space
    Four-Russians speedup
    Time and space
    Experimental results
    Ratio
    Multiple pair-wise alignments
7 Edit script in linear space
    Hirschberg's algorithm
    Finding the sub-path
    Splitting into sub-problems
    Time and space
    Four-Russians speedup
    Backtracking using offset block function
    Handling padded sequences
    Time and space
    Experimental results
    Ratio
8 Summary and discussion
    Quadratic vs. linear space
    Edit distance
    Edit script
9 Future work
    Local alignment and general cost function
    Parallel programming
10 Conclusion

1 Introduction

One of the most common methods used for inferring the biological function of genes is sequence similarity search in protein and DNA sequence databases. With the development of rapid methods for sequence alignment, results based solely on sequence homology have become routine. Although dynamic programming based sequence alignment does provide optimal solutions, it is computationally expensive. Therefore, the most commonly used methods are currently based on heuristics such as BLAST (Basic Local Alignment Search Tool), which are much faster but do not guarantee optimal solutions. Speed is important given the size and growth of the sequence databases currently available, so the continuing development of fast and accurate algorithms is appealing.

The first algorithm using dynamic programming in global alignment was introduced by Needleman and Wunsch [3]. Later, algorithms for variations of global alignment, such as local alignment and affine gap cost, were developed in [6], [6] and []. The first major improvement in the asymptotic running time was achieved in [], also known as the Four-Russians speedup or Four Russians algorithm (although only one of them was Russian). The algorithm improves the running time for computing the optimal cost by a factor of O(log n), but it did not address how to compute the optimal alignment within the same running time. Hirschberg [9] gave an algorithm for computing the optimal alignment in O(n^2) time and O(n) space. The space saving idea in Hirschberg's algorithm was applied in [7] and []. However, the asymptotic running time of computing the optimal alignment remained the same, O(n^2). Parallel algorithms for the optimal cost were studied in [3] and [5]. In [4] linear space parallel algorithms were given, but the asymptotic running time was still assumed to be O(n^2). A survey of all these algorithms can be found in [8]. In [] Hirschberg's algorithm was combined with the Four-Russians speedup, giving an O(n^2 / log n) algorithm for computing both the optimal cost and alignment in O(n) space.

1.1 Motivation

Sequence alignment is a fundamental application in bioinformatics, where it is used to infer functional, structural and evolutionary relationships between sequences. This makes it an interesting area in which to explore speedup possibilities. Even though the Four-Russians paradigm has been applied to a number of dynamic programming algorithms, actual implementations are rare [4], [5]. The motivation of this thesis is to explore practical speedup possibilities of dynamic programming in sequence alignment. I decided to focus on the Four-Russians speedup, as its theoretical speedup is a major improvement in the computation of sequence alignment. I will investigate the applicability of the Four-Russians speedup to dynamic programming in global sequence alignment, and then compare the Four-Russians speedup with a standard dynamic programming based algorithm [3] and Hirschberg's algorithm [9].

1.2 Objectives

The main objective of this thesis is to evaluate the applicability of the Four-Russians speedup to dynamic programming in sequence alignment. I also aim to show that the Four-Russians speedup does not just give a theoretical but also a practical speedup in the computation of sequence alignment. The theoretical running time cannot always be achieved in practice, as the practical running time depends on the implementation choices. To fully understand the practical running time, I discuss the implementation issues and details that come with an implementation of the Four-Russians speedup, and how they may affect the practical running time. There are several elements that need to be taken into account: how to handle data, how to speed up the preprocessing, and how to efficiently apply the Four-Russians speedup to a standard implementation of dynamic programming in sequence alignment. The implementations are based on edit distance and edit script for global alignment in quadratic as well as linear space.

1.3 Thesis outline

Section 2 gives an introduction to sequence alignment and the algorithms used in the different implementations. Section 3 presents implementation issues and details on procedures and elements which are common to both the quadratic and linear space implementations of the Four-Russians speedup. The experiments used to test the implementations, and the expectations of these, are also presented in section 3. In sections 4, 5, 6 and 7, the implementation issues and details for the different implementations are presented, followed by the results of the experiments. To summarize the experimental results for the different implementations, an overview and a discussion comparing the results are presented in section 8. Other results which are not presented in sections 4, 5, 6 and 7 are also presented there, together with a discussion of these. A discussion of other speedup possibilities and some ideas to further improve the practical running time of the Four-Russians speedup in sequence alignment is presented in section 9. Finally, in section 10 the applicability of the Four-Russians speedup to dynamic programming in sequence alignment is evaluated.

2 Background

In this section I give an overview of the algorithms used for the different implementations: a detailed introduction to sequence alignment, dynamic programming and the Four-Russians speedup. I also cover the terminology, notation and ideas used for the different implementations. Hirschberg's algorithm is only used to compute an optimal alignment in linear space, so it is not covered in this section but in section 7, along with the implementation issues and details. In this section the sequences are assumed to be of equal length. How to handle sequences of different lengths is discussed in the implementation details for the different implementations.

2.1 Sequence alignment

In bioinformatics, a sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity. Highly similar regions may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns.

2.1.1 Interpretation

Two aligned characters correspond to either a match in both sequences or a substitution from sequence A to sequence B, i.e. a point mutation. A gap introduced in sequence A corresponds to an insertion in sequence B, and a gap introduced in sequence B corresponds to a deletion in sequence A, i.e. indels.

2.1.2 Optimal cost and alignment

The similarity of two sequences is measured as a cost for transforming sequence A into sequence B. There are different types of cost measures. A commonly used measure, the edit distance, counts the number of operations required for the transformation. In a generalization of the edit distance, the cost of each operation is summed instead of just counting the number of operations needed to transform one sequence into another. Sequence alignment is about optimizing the cost of the alignment. To emphasize similarity, the objective is to maximize the number of matches. To explain differences, the objective is to minimize the number of indels and point mutations. Hence this is an optimization problem.

There are different types of edit distances: the Hamming distance, which only measures substitutions between sequences; the Levenshtein distance, which measures all the operations mentioned in section 2.1.1 and is normally referred to as edit distance; and the Damerau-Levenshtein distance, which includes an operation not mentioned in section 2.1.1, a transposition of two adjacent characters. Computing the edit distance corresponds to setting the cost of each operation to one. In the general case each operation can have a different cost, which means that substitutions, insertions and deletions can be weighted differently. To represent the cost of different operations, a cost table and a gap function are typically used. An optimal alignment is an alignment with an optimal cost. Regions of mutations can be identified from an optimal alignment, so often the optimal alignment is more important than the optimal cost. Henceforth edit distance will refer to the Levenshtein distance, and edit script will refer to an optimal alignment with the optimal edit distance.

2.1.3 Time and space

A naive algorithm to compute the edit distance is shown in figure 1(a). Implementing this naively, without storing the edit distances of sub-sequences, is very time consuming, as the edit distance for sub-sequences will be computed multiple times. The space only depends on the size of the sequences, i.e. O(n). Computing an edit script is almost the same as computing the edit distance. The only difference is that after the optimal edit distance has been found for an entry, we need to backtrack, i.e. compute the alignment which gave rise to the edit distance. An algorithm for computing the edit script is shown in figure 1(b). Finding the edit script is even more time consuming than computing the optimal edit distance because of the additional calls to the cost function. Space consumption remains the same as for computing the edit distance.

    optcost(i, j):
        d, v, h, s = undef
        if i > 0 and j > 0:
            d = optcost(i-1, j-1) + ed(A[i], B[j])
        if i > 0 and j >= 0:
            v = optcost(i-1, j) + 1
        if i >= 0 and j > 0:
            h = optcost(i, j-1) + 1
        if i = 0 and j = 0:
            s = 0
        return min(d, v, h, s)

    (a) Computing the edit distance naively.

    optalign(i, j):
        same as in optcost except the last line
        o = min(d, v, h, s)
        if o = d: optalign(i-1, j-1), align A[i] with B[j]
        if o = v: optalign(i-1, j), align A[i] with a gap
        if o = h: optalign(i, j-1), align B[j] with a gap
        if o = s: return

    (b) Computing the edit script naively.

Figure 1: Simple implementation.

2.2 Dynamic programming

Dynamic programming is a method of breaking complex optimization problems into smaller optimization sub-problems, whose solutions can be combined to solve the entire optimization problem. To be able to use dynamic programming, the problem has to consist of slightly smaller overlapping sub-problems, so that solving the problem means repeatedly solving the same sub-problems. The idea is to store solutions to sub-problems, which can be retrieved later to solve bigger sub-problems and thereby solve the entire optimization problem.

Sequence alignment is an optimization problem where the optimal solution can be computed by combining the optimal solutions of slightly smaller sub-problems. The solutions to the sub-problems are used multiple times to compute other sub-problems, i.e. the sub-problems overlap. So applying dynamic programming to sequence alignment will improve the computation time of finding the edit distance and script. The edit distance for each pair of sub-sequences is stored in a distance table, so it can be retrieved for later use.

2.2.1 Time and space

An algorithm for computing the edit distance with dynamic programming is almost the same as the naive version. Iterate over i = 0, ..., n and j = 0, ..., n and replace each recursive function call with a lookup in the distance table. This way each value that is required to compute the edit distance in an entry is already computed and can be retrieved from the distance table. This is also known as forward dynamic programming. The time complexity is O(n^2), as each entry in the distance table is computed once in constant time. There are (n + 1)^2 pairs of sub-sequences, as the empty sequence is also treated as a sub-sequence, hence the space consumption is O(n^2).

When the distance table has been filled out, computing the optimal edit script can be done by backtracking in the distance table. The algorithm is almost the same as the naive version, but with each recursive function call replaced with a lookup in the distance table. Making three lookups in the distance table for each entry takes constant time, which means the time consumption only depends on the number of entries we need to visit in the distance table. The worst case is when sequences A and B are only aligned with gaps, hence O(n) time.

2.3 Four-Russians speedup

The Four-Russians speedup is a method to speed up dynamic programming. The general idea is to partition the distance table into t-blocks and compute the essential values in the table one block at a time. The essential values in the distance table are the values required to compute a block. The goal is to use only O(t) time on each block, instead of the normal Θ(t^2) [8].

2.3.1 t-block

Now consider the standard dynamic programming procedure of computing the edit distance in a block in the distance table (see figure 2). The block D is computed from sub-seq A and B, start value S, row R and column C. It is clear that the block D is a function of these, hence the last row and column of the block are a function of sub-seq A and B, start value S, row R and column C.

Figure 2: A block in the distance table (sub-seq B along the top, sub-seq A along the left side, start value S in the upper left corner, first row R, first column C, block D).

Let a t-block be a block of size t × t in the distance table, where the last row in the block is shared with the first row in the block below it (if any), and the last column in the block is shared with the first column in the block to its right (if any).

2.3.2 Block function

Given sub-seq A and B, start value S, row R and column C, the block function computes the last row and column of the block. The computation time is Θ(t^2) when done naively. The goal is to use only O(t) on each block. One way is to precompute the output for all possible inputs of the block function, so that the last row and column can be looked up in O(t) time. By definition each entry can hold a distance value from zero to n, so there are n + 1 possible values for each entry of a t-length row or column. Hence the number of possible input combinations to the block function is (n + 1)^{2t} σ^{2t}, where σ is the size of the alphabet. For each input, the block function takes Θ(t^2) time to compute the last row and column of a block. So the overall time to precompute the function output is Θ((n + 1)^{2t} σ^{2t} t^2). But as t is at least one, this gives an Ω(n^2) precomputation time. So there is no speedup with this solution.
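As a minimal sketch of the naive block function from section 2.3.2, the Python below fills a single t-block with the standard recurrence and returns its last row and column. The names `edit_distance_table` and `block_function`, the unit costs, and the convention that `sub_a` and `sub_b` hold the t - 1 characters labelling the non-shared rows and columns are assumptions made for illustration; this is not the thesis implementation.

```python
def edit_distance_table(a, b):
    """Standard dynamic programming: fill the whole unit-cost distance table."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # A[1..i] against the empty sequence
    for j in range(m + 1):
        d[0][j] = j  # empty sequence against B[1..j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # match or substitution
                d[i - 1][j] + 1,                           # A[i] against a gap
                d[i][j - 1] + 1,                           # B[j] against a gap
            )
    return d


def block_function(sub_a, sub_b, row, col):
    """Naive Theta(t^2) block function (hypothetical helper).

    row, col: the t distances along the top row R and left column C of the
              block; both start with the start value S.
    sub_a, sub_b: the t - 1 characters labelling the remaining rows/columns.
    Returns (last_row, last_col) of the block, each of length t.
    """
    t = len(row)
    if len(col) != t or row[0] != col[0]:
        raise ValueError("row and col must have length t and share S")
    if len(sub_a) != t - 1 or len(sub_b) != t - 1:
        raise ValueError("sub-sequences must have length t - 1")
    prev = list(row)
    last_col = [prev[-1]]
    for i in range(1, t):
        cur = [col[i]]
        for j in range(1, t):
            cur.append(min(
                prev[j - 1] + (sub_a[i - 1] != sub_b[j - 1]),
                prev[j] + 1,
                cur[j - 1] + 1,
            ))
        last_col.append(cur[-1])
        prev = cur
    return prev, last_col
```

Feeding the function a row and column cut out of a full distance table reproduces the corresponding entries of that table, which is what makes it possible to tile the table with overlapping blocks.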

2.3.3 Offset trick

The dominant term in the precomputation time is (n + 1)^{2t}, as the size of σ is assumed fixed. Now consider the values in the distance table: each D[i, j], with i, j > 0, is computed from the values in D[i - 1, j - 1], D[i - 1, j] and D[i, j - 1]. If A[i] ≠ B[j] then D[i, j] is equal to the minimum of the three entries plus one; if A[i] = B[j] then D[i, j] is equal to D[i - 1, j - 1]. Hence D[i, j] is less than or equal to each of these entries plus one.

Conversely, for adjacent row entries, the optimal edit distance of A[1 ... i] and B[1 ... j] is located in D[i, j]; by omitting B[j] from the alignment, an alignment of A[1 ... i] and B[1 ... j - 1] is obtained, and the optimal edit distance for these is located in D[i, j - 1]. Now if the alignment matches B[j] with some character in A[1 ... i], then by omitting B[j] from the alignment the distance is increased by at most one. If B[j] is not matched, then its omission will decrease the distance by at most one. Hence D[i, j - 1] ≤ D[i, j] + 1. The same goes for adjacent column entries. For adjacent diagonal entries, if A[i] is aligned with B[j] then it is clear that D[i - 1, j - 1] ≤ D[i, j] + 1. If A[i] is not aligned with B[j], then either A[i] or B[j] is aligned with a gap, and D[i - 1, j - 1] ≤ D[i, j]. Hence two adjacent entries can differ by at most one.

With this knowledge it is easy to see that a row or column can be represented as a start value and the difference (offset) of each subsequent entry in the row or column. An offset vector is then a t-length vector of values from {-1, 0, 1}, where the first entry must be zero. The key to making the Four-Russians speedup efficient is to compute the edit distance using only the offset vectors. Because the number of possible offsets is much smaller than the number of possible distances, the precomputation time becomes only O(3^{2t} σ^{2t} t^2).

Computing the offset vectors of the last row and column of a t-block can be done without any actual edit distance. Consider a t-block in the distance table where the upper left corner is D[i, j] = S, where S is an unknown edit distance. Then for a column k in the block, the value in D[i, k] is S plus the total of the offsets in row i from column j + 1 to k. So even though the value of S is unknown, the value of the entry can be expressed as S plus a value which is computed from the row offset vector in row i. Each D[k, j] can be expressed the same way. Now let D[i, j + 1] = S + I and D[i + 1, j] = S + J, where I and J are known (from the offset vectors of row i and column j). If A[i + 1] = B[j + 1] then D[i + 1, j + 1] = D[i, j]; if A[i + 1] ≠ B[j + 1] then D[i + 1, j + 1] is the minimum of D[i, j] + 1, D[i, j + 1] + 1 and D[i + 1, j] + 1. The comparison can be done knowing only the values of I and J, hence D[i + 1, j + 1] can be expressed as S, S + 1, S + I + 1 or S + J + 1. This way every entry in a block can be expressed as the unknown S plus a value that can be determined. Since every entry involves the same variable S, the offset vectors of the last row and column of a block can be determined with an arbitrary value of S.

2.3.4 Applying the offset block function

To use the offset block function, cover the (n + 1) × (n + 1) distance table with t-blocks with overlapping rows and columns. Initialize the first row and column of the distance table and find the offset values for them. Then, row-wise, determine the last row and column of each block. Because the blocks overlap, the last row in a block provides the first row in the block below it (if any), and the last column in a block provides the first column in the block to its right (if any). If Q is the total of the offset values computed for the entries in row n, then D[n, n] = D[n, 0] + Q = n + Q.
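The offset trick and its application to the whole table can be sketched in Python as follows. This is a sketch under stated assumptions, not the thesis code. The names `offset_block` and `four_russians_edit_distance` are hypothetical. Offset vectors are stored without their leading zero, so they have t - 1 entries. The sequence lengths must be multiples of t - 1. `functools.lru_cache` stands in for the precomputed table: each distinct block input is computed once on first use rather than precomputed up front, so the sketch shows the data flow, not the asymptotic bound.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def offset_block(sub_a, sub_b, row_off, col_off):
    """Offset block function: offsets in, offsets out.

    row_off, col_off: tuples of t - 1 offsets in {-1, 0, 1} for the top row
    and left column of a t-block (leading zero left implicit).
    Returns the offsets of the last row and last column, same encoding.
    """
    k = len(sub_a)  # k = t - 1
    # Rebuild the top row with the arbitrary start value S = 0.
    prev = [0]
    for off in row_off:
        prev.append(prev[-1] + off)
    last_col = [prev[-1]]
    left = 0
    for i in range(k):
        left += col_off[i]
        cur = [left]
        for j in range(k):
            cur.append(min(
                prev[j] + (sub_a[i] != sub_b[j]),  # diagonal
                prev[j + 1] + 1,                   # vertical
                cur[j] + 1,                        # horizontal
            ))
        last_col.append(cur[-1])
        prev = cur
    out_row = tuple(prev[j + 1] - prev[j] for j in range(k))
    out_col = tuple(last_col[i + 1] - last_col[i] for i in range(k))
    return out_row, out_col


def four_russians_edit_distance(a, b, t):
    """Edit distance computed from offset vectors only."""
    if t < 2:
        raise ValueError("t must be at least 2")
    k = t - 1
    if len(a) % k or len(b) % k:
        raise ValueError("sequence lengths must be multiples of t - 1")
    # First row D[0, 0..m] rises by one per column.
    row_offs = [(1,) * k for _ in range(len(b) // k)]
    for bi in range(0, len(a), k):
        sub_a = a[bi:bi + k]
        col_off = (1,) * k  # first column D[0..n, 0] rises by one per row
        for bj in range(len(b) // k):
            sub_b = b[bj * k:(bj + 1) * k]
            # Last row feeds the block below, last column the block to the right.
            row_offs[bj], col_off = offset_block(sub_a, sub_b, row_offs[bj], col_off)
    # D[n, m] = D[n, 0] + Q, where Q is the total of the offsets in the last row.
    return len(a) + sum(sum(r) for r in row_offs)
```

For example, `four_russians_edit_distance("kitten", "sitting", 2)` evaluates to 3, the same value standard dynamic programming gives.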

2.3.5 Time and space

The offset block function computes the last row and column, so each block uses only O(t) time. There are Θ(n^2 / t^2) blocks, so the total time used when applying the Four-Russians speedup is O(n^2 / t). Setting t = log n gives a running time of O(n^2 / log n). If the distance table occupies quadratic space, the space usage is O(n^2 + 3^{2t} σ^{2t} t), and it is O(n + 3^{2t} σ^{2t} t) for a distance table in linear space.

3 Implementation and experimental setup

This section describes the implementation and the experimental setup. It exists because some parts of the implementations are the same for both the quadratic and the linear space Four-Russians speedup. Data handling, i.e. the representation of the data used, is described in section 3.1.1. Managing sequences for which the offset block function cannot be applied to the whole distance table is described in section 3.1.3, and the preprocessing of t-blocks in section 3.1.4. The experiments are very similar in quadratic and linear space; the only major difference is the limitation on input size for quadratic space. See section 3.2.1 for the specification of the computer used in the experiments, section 3.2.2 for the test data and parameters (t-block size and input size) and the reasons for these choices, and section 3.2.3 for a discussion of what to expect from them.

In the rest of the report, sequence A refers to the sequence represented across the rows, with size n, and variable i corresponds to a row in the distance table. Sequence B refers to the sequence represented across the columns, with size m, and variable j corresponds to a column in the distance table.

3.1 Implementation

Four different types of implementations were made for sequence alignment: a standard dynamic programming implementation in quadratic space and one where the Four-Russians speedup has been applied (see sections 4 and 5 for the implementation of edit distance and edit script respectively), and a standard dynamic programming implementation in linear space and one where the Four-Russians speedup has been applied (see sections 6 and 7 for the implementation of edit distance and edit script respectively). The implementations without the Four-Russians speedup will be referred to as standard implementations.

This section describes parts of the implementations with the Four-Russians speedup in more detail. First is the data handling, i.e. the representation of sub-sequences and offset vectors: why the representation is appropriate, how much space it consumes and how it might affect the running time of the computation. Second is the management of sequences when the distance table cannot be partitioned into t-blocks, i.e. when some rows and columns are missing, so that the offset block function cannot be applied to the whole distance table. I present a solution and discuss how it might affect the running time. Finally I describe the preprocessing: how to compute the offset block function, the implementation choices and how they affect the running time.

3.1.1 Data handling

The size of the sub-sequences and offset vectors is t - 1, which differs from what has been described in section 2.3. The first offset in every offset vector is always 0, so there is no need to have an offset vector of size t. The first character in each sub-sequence can be omitted from a block, since the offset vectors are given before preprocessing or computing the edit distances. The distance values in the first row and column can be computed from these offset vectors with an arbitrary start value (see section 2.3.3), so the first character in each sub-sequence does not participate in the computation and is therefore not needed for lookups either.

The Four-Russians speedup is about precomputing and storing information about all possible instances of a sub-problem. There are (3 · 4)^{2(t-1)} possible combinations of sub-sequences and offset vectors, as the sub-sequences and offset vectors are of size t - 1, and there are two offset vectors associated with each instance. If implemented naively, (t - 1) · sizeof(int) space is used per offset vector. But as t grows, the data structure used to store the precomputed data will explode in size. As an example, when t = 5 the structure would have (3 · 4)^{2·4} entries, each holding two offset vectors of 4 · sizeof(int) bytes, which is over 12 GB of space. The need to pack the data arises, and one idea is to use only two bits to represent each base and offset. For this there are the bitset and the vector<bool> containers from the Standard Template Library in C++. vector<bool> allows dynamic resizing whereas bitset is of fixed size; Boost also provides a variation of the bitset which allows dynamic resizing. I decided not to use any of these, because with t = 6 the structure occupies (3 · 4)^{2·5} B, assuming that each entry only uses one byte, which is roughly 60 GB (section 2.3.5). I aim to test with t = 5 and therefore the sub-sequences and offset vectors are stored using unsigned integers instead. I could have used signed integers instead, which would not have made any difference. This way I can use bit-wise operations to pack and unpack the data. Also, when using the offset block function, the offset vector representations it returns can be stored for future lookups without having to unpack the data, see below for more details. Two bits are used to represent a base (A, C, G, T) and two bits to represent an offset (-1, 0, 1). For t = 5 the structure uses (3 · 4)^{2·4} · 4 B, about 1.6 GB, assuming that each entry only uses four bytes, which is still runnable on a machine with 4 GB of RAM.

    character   2-bit   index
    A           00      0
    C           01      1
    G           10      2
    T           11      3

    offset      2-bit   index
    -1          00      0
    0           01      1
    1           10      2

Table 1: Table of character and offset conversion.

Each base and offset is hardcoded, as the implementation is only intended for DNA sequences with edit distance. These are kept in a char array and an int array, sigma and offset, used for constructing an actual sub-sequence and offset vector from a representation. Furthermore there is a table converting each of the bases into its index value, needed for constructing an unsigned integer representation of a given sub-sequence. For offsets, the index value is the offset value + 1; in this way the offset vector values can be obtained directly from an unsigned integer representation using only bit shift operations. Likewise, an offset vector representation can be constructed from the offset vector values by using the offset value plus one.

Figure 3: Example of a sub-sequence and an offset vector represented as an unsigned integer (for example ..00011011 = ACGT).

There are only three offsets, so there is one combination of bits which will not be used. For that reason a table for converting an offset vector representation to an index value, offbitsint, is needed, as well as one from an index value to an offset vector representation, intoffbits. This is because only (3 · 4)^{2(t-1)} entries are needed when allocating the structure that stores the preprocessed data. Indexing the entries directly by the unsigned integer representation of an offset vector would increase the size, as if there were four offsets instead of three, giving a size of (4^2)^{2(t-1)} B, assuming that each entry only uses one byte. But that would not be the case when packing the data using only the size needed to represent an offset vector. As t = 5, the number of bits needed to represent an offset vector is 8 and there are two of them, so the size is (4^2)^{2·4} · 2 B, which is 8 GB. With a structure of this size, even on a machine capable of running the implementation, a large part of the structure will not be cached, which leads to poor performance because of RAM latency.

3.1.2 Reading data from a representation

Reading a character from a sub-sequence representation can be done using an index into the sub-sequence. The representation of the sub-sequence is stored in the 2(t - 1) least significant bits (assumed to be at the right side of the unsigned integer). So for a number subbits in {0, 1, ..., 4^{t-1} - 1}, bit shift subbits to the left by wordsize - 2(t - 1). The sub-sequence representation is now in the 2(t - 1) most significant bits. Depending on which index i needs to be read, starting at index 0, bit shift subbits further to the left by 2i, then bit shift subbits to the right by wordsize - 2, so that only the two bits representing the character are left. Then, by using the unsigned integer value of subbits as an index into the array sigma, which contains the characters, the character at index i of subbits can be retrieved.

Figure 4: An example of reading a character from a representation (reading index 1 of ACGT gives C).

Reading an offset from an offset vector representation works the same way as reading a character from a sub-sequence. The only differences are that the bit combination 11 will not appear in an offset vector representation, as there are only three offsets, and that the array offset is used instead to look up the actual offset value.
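The two-bit packing and the reads described above can be sketched in Python as follows. `SIGMA` and `OFFSET` mirror the sigma and offset arrays named in the text. The function names and the explicit `length` argument are assumptions for illustration. Python integers have no fixed word size, so the sketch uses a right shift and a mask in place of the left shift followed by a right shift by wordsize - 2; the extracted two-bit field is the same.

```python
SIGMA = "ACGT"        # index -> base, two bits per base
OFFSET = (-1, 0, 1)   # index -> offset (index = offset + 1); 0b11 is unused
BASE_INDEX = {ch: i for i, ch in enumerate(SIGMA)}  # base -> index


def pack_subsequence(sub):
    """Pack a DNA sub-sequence into an integer, first base in the highest bits."""
    bits = 0
    for ch in sub:
        bits = (bits << 2) | BASE_INDEX[ch]
    return bits


def pack_offsets(offsets):
    """Pack offsets from {-1, 0, 1} into an integer, storing offset + 1."""
    bits = 0
    for off in offsets:
        bits = (bits << 2) | (off + 1)
    return bits


def read_field(bits, i, length):
    """Return the two-bit field at index i (0 = first) of a packed vector."""
    return (bits >> (2 * (length - 1 - i))) & 0b11


def read_base(bits, i, length):
    return SIGMA[read_field(bits, i, length)]


def read_offset(bits, i, length):
    return OFFSET[read_field(bits, i, length)]
```

With this encoding `pack_subsequence("ACGT")` is `0b00011011`, and `read_base(0b00011011, 1, 4)` gives `"C"`, matching the example in figure 4.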

3.1.3 Managing sequences which are not a multiple of t

The idea behind the Four-Russians speedup is to partition the distance table into t-blocks. A t-block has its last row shared with the first row of the t-block below it (if any) and its last column shared with the first column of the t-block to its right (if any). This means that the sequence lengths have to be a multiple of t (with the overlapping blocks and the representation from section 3.1.1, the step used in the formulas below is t - 1); if not, some rows are missing at the bottom of the table and/or some columns to the right of the table. Consequently the offset block function cannot be applied to the last rows and/or columns of the table.

One way to solve this is to use standard dynamic programming on the rows and columns where the offset block function cannot be applied. In the worst case that means up to (t - 1)n + (t - 1)m entries that have to be filled in using standard dynamic programming. This is however not very good in practice, as the time used with standard dynamic programming will be n(t - 1) + m(t - 1) - (t - 1)^2 in the worst case, though it will not change the asymptotic running time of the program.

A better solution is to make sure the offset block function can be used on the whole table. This is done by padding the sequences without changing the optimal edit distance. Furthermore one has to make sure that the entry containing the optimal edit distance is in either a row or a column which is a multiple of t, so that the optimal edit distance is attainable once the offset block function has been applied to the whole table. Also, one needs to make sure that the optimal edit script can still be obtained with these paddings. By letting sequence A always be the longest of the two, there are two cases of padding. If n is not a multiple of t, then both sequences are padded with As in front, so that the padded sequence A has a size which is a multiple of t. Then, if m plus the size padded in front is not a multiple of t, As are padded at the back of sequence B. This way the offset block function can be applied to the whole table. The size to pad in front of each sequence, padfront, is given by (t - 1) - (n mod (t - 1)), and the size to pad at the back of sequence B, padmend, is given by (t - 1) - ((m + padfront) mod (t - 1)).

Figure 5: Padding sequences. (a) When n and m are not a multiple of t. (b) Padded As in front. (c) When m is not a multiple of t.

The paddings in front do not change the optimal edit distance of the two sequences, as the optimal path will go diagonally down to where the actual sequences start, taking the edit distance value 0 with it. The paddings at the end of sequence B are ignored, as they are only there to make sure that the offset block function can be used on the whole table. So by knowing how much has been padded in front of both sequences and at the end of sequence B, the optimal edit distance can be read from the last row of the table in entry (n + padfront, m - padmend), where m here denotes the padded length of B, i.e. in the column where the original sequence B ends.
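The padding rules can be sketched in Python as follows. `pad_sequences` is a hypothetical helper name, and `k = t - 1` is the block step used in the padfront and padmend formulas. The extra `% k` makes the padding 0 rather than t - 1 when a length is already a multiple, which matches the text's rule of padding only when needed.

```python
def pad_sequences(a, b, t, pad_char="A"):
    """Pad two sequences so both lengths become multiples of t - 1.

    Returns (padded_a, padded_b, pad_front, pad_end). Sequence A is taken
    to be the longer one; the inputs are swapped if necessary.
    """
    k = t - 1
    if k < 1:
        raise ValueError("t must be at least 2")
    if len(a) < len(b):
        a, b = b, a
    # padfront = (t - 1) - (n mod (t - 1)), or 0 if n is already a multiple
    pad_front = (k - len(a) % k) % k
    # padmend = (t - 1) - ((m + padfront) mod (t - 1)), or 0 likewise
    pad_end = (k - (len(b) + pad_front) % k) % k
    padded_a = pad_char * pad_front + a
    padded_b = pad_char * pad_front + b + pad_char * pad_end
    return padded_a, padded_b, pad_front, pad_end
```

The front padding is a common prefix of both sequences, so it leaves the edit distance unchanged. The distance of the original pair is then found in the last row of the padded table, `pad_end` columns before the last column.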

15 for the optimal aligmet, it ca still be obtaied with these paddigs as follows. The aligmet with paddigs i frot has extra As aliged for both sequeces, which are igored whe computig the edit script. Backtrackig starts i the etry where the optimal edit distace is located, so the paddigs i the ed of sequeces B will ot participate i the backtrackig. A A A A A A Figure 6: Edit distace ad script ca still be computed with padded sequeces. So what do these paddig do to the ruig time of the algorithm? Worst case is that it is oly, the size of sequece A, which is ot a multiple of t. So by paddig As i frot of both sequeces, meas that you will have to pad As i the back of sequece B, as the padded sequece B o loger is a multiple of t. This results i two extra look-ups for each row of t-blocks, which is a costat umber of extra lookups. Cosequetly the asymptotic ruig time of the algorithm remais the same Preprocessig The preprocessig is about makig aligmets for all possible sub-sequeces ad offset vectors. The save the offset vectors of the aligmets i a table for fast retrieval. First, all possible subsequeces ad offset vectors are costructed. These are eeded for computig the edit distace for all the sub-problems. They are ot eeded for other the the preproceesig step, so they are oly stored temporarily. Costructig all possible sub-sequeces is easy as they ca geerated from their idex value. So by goig through the umbers,..., 4 t, the sub-sequeces ca be geerated by readig idex,,..., t from their idex value ad cocateate the characters read. For details o how to read a character i a sub-sequeces represetatio see sectio 3... The offset vectors requires a bit more work as there are oly three offsets. So it is ot all the combiatios of two bits that is used. Costructig a offset vector is doe by keepig a local usiged iteger, offbits, which is modified, for every umber of offset vectors there is, to reflect a offset vector represetatio. 
So by going through the numbers $0, \ldots, 3^{t-1} - 1$, one is added to offbits as long as the two least significant bits are not equal to two. If the two least significant bits are equal to two, they are reset to zero by bit shifting offbits to the right by two. If the two new least significant bits are also equal to two, they are reset again by bit shifting offbits to the right by two. This continues until the two least significant bits are not equal to two, and then one is added to offbits. Finally, the bits that were affected by adding one are shifted back to their original positions. This way offbits now represents one of the possible offset vectors.

It is the same as alternating the last offset in an offset vector through all possible offset values. When the last possible offset value is reached (10 in bit representation), the last offset value is reset and a carry is added to the next offset in the offset vector. If this offset also holds the last of the possible offset values, it is reset and the carry is moved to the next offset value, and so on. For each iteration offbits acts as a counter for offset vectors in their unsigned integer representations. From the constructed offset vector representation the offset vectors can be determined by reading from the representation. For details on how to read an offset value in an offset vector representation see section 3.1.1.

Figure 7: An example of constructing an unsigned integer representation of an offset vector (steps shown: add one; add one, which gives a carry, so reset two bits; reset two bits; add the carry; move the bits back to their original position).

Now all possible sub-sequences and offset vectors have been constructed and can be retrieved by their index value from temporary storage. Finding the offset vectors for all possible combinations of sub-sequences and offset vectors can therefore be done by going through all possible combinations of their index values, fetching each sub-sequence and offset vector, and then performing a standard dynamic programming computation on each combination using a $(t-1) \times (t-1)$ distance table D. By using the start value t and then applying the offset vectors to this value, the distance table will not contain any negative distances. Not that it matters, as only the offsets are needed.

A distance table of size $(t-1) \times (t-1)$ is all that is needed in the preprocessing step, as the first row and column are given by the start value and offset vectors. The only information needed for each entry is the diagonal, vertical and horizontal values from the entry. These are kept in the temporary variables d, v and h along with three extra variables: a to remember the first value in the last column, b to remember the first value in the last row, and bh for storing the first horizontal value of a row, which is used when going from a row to the row below it.

Figure 8: Preprocessing using standard dynamic programming. (a) Start of a standard dynamic programming computation. (b) d, v and h moving along a row.

When v reaches the last column, but is still not within the distance table, then a = v. The same goes for b when h reaches the last row.
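The offbits counter described above can be sketched as follows. This is a minimal illustration, assuming each offset is stored as the offset plus one in a two-bit field and that the first offset of a vector sits in the most significant field; the function names are hypothetical.

```python
def next_offbits(offbits):
    """Advance a packed offset vector (two bits per offset, digits 0..2) by one."""
    shifted = 0
    # Trailing fields equal to two wrap around to zero and pass a carry on.
    while offbits & 0b11 == 2:
        offbits >>= 2
        shifted += 1
    offbits += 1                       # add one (or the carry) to the first field that is not two
    return offbits << (2 * shifted)    # move the bits back; the wrapped fields are now zero


def unpack_offsets(offbits, length):
    """Read `length` offsets from a packed representation, first offset first."""
    return [((offbits >> (2 * k)) & 0b11) - 1 for k in reversed(range(length))]


def all_offset_vectors(length):
    """Enumerate the packed representations of all 3**length offset vectors."""
    offbits = 0                        # every offset is -1
    reps = [offbits]
    for _ in range(3 ** length - 1):
        offbits = next_offbits(offbits)
        reps.append(offbits)
    return reps


for rep in all_offset_vectors(2):
    print(format(rep, "04b"), unpack_offsets(rep, 2))
```

The bit pattern 11 never appears in any field, which is why only $3^{t-1}$ of the $4^{t-1}$ possible two-bit combinations are visited.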
bh is set to h each time h starts on a new row. D is filled out the same way as in section 4. The first offset for the last column is $D[1, t-1] - a$ and the remaining offsets are $D[i, t-1] - D[i-1, t-1]$ for $i = 2, \ldots, t-1$. The first offset for the last row is $D[t-1, 1] - b$ and the remaining offsets are $D[t-1, j] - D[t-1, j-1]$ for $j = 2, \ldots, t-1$. Constructing the offset vector representation from the offset values is done by using a temporary unsigned integer, offbits, initialized to 0. The first offset value plus one is added to offbits; then offbits is bit shifted left by two and the next offset value plus one is added, and so on until all the offset values have been added to offbits.

Figure 9: Mapping of offset vectors (offset block table, pair ptr, intoffbits).

The offset block function is basically a four dimensional array. For each possible combination of sub-sequences and offset vectors, the offset block function returns a pair of offset vector representations which correspond to the last row and column of a t-block. Instead of storing two offset vectors for each entry in the offset block table, a pointer to a pair of offset vector pointers is used, see figure 9. In this way each entry of the offset block table only uses one machine word. The pair of pointers points to the offset vector representations in intoffbits, which is the table used to get an offset vector representation from its index value. After constructing the offset vector representation in the preprocessing phase, the index values are also needed to set a pair of pointers saved in an array, pair ptr. The index values are retrieved from offbitsint, which is the mapping of an offset vector representation to its index value.

The preprocessing time is then the time to construct the sub-sequences and offset vectors, plus the time to find the alignments for all combinations of these, plus the time it takes to read the offsets and construct the offset vector representations. The time to construct the sub-sequences and offset vectors is $O((4^{t-1} + 3^{t-1})(t-1))$. The time for aligning the combinations and making the offset vector representations is $O(4^{2(t-1)} 3^{2(t-1)} ((t-1)^2 + (t-1)))$. But since $(t-1) \le (t-1)^2$ and $(4^{t-1} + 3^{t-1})(t-1) < 4^{2(t-1)} 3^{2(t-1)} (t-1)^2$, the overall time is $O(4^{2(t-1)} 3^{2(t-1)} (t-1)^2)$. The space consumption is $O(4^{2(t-1)} 3^{2(t-1)})$, as each entry in the offset block table only uses four bytes, regardless of the size of t.

Offset block function

To use the offset block function the index values of the sub-sequences and offset vectors are needed.
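The preprocessing of a single combination can be sketched as follows. This is a minimal illustration rather than the thesis implementation: for readability it keeps a full $t \times t$ table that includes the first row and column, instead of the $(t-1) \times (t-1)$ table with the temporary variables a, b and bh, and the function names are hypothetical.

```python
def block_offsets(x, y, row_in, col_in):
    """Return (last_row_offsets, last_col_offsets) of one t-block.

    x, y    : sub-sequences of length t - 1 labelling the rows and columns
    row_in  : t - 1 horizontal offsets along the first row
    col_in  : t - 1 vertical offsets down the first column
    """
    t = len(x) + 1
    D = [[0] * t for _ in range(t)]
    D[0][0] = t  # start value t, so first-row and first-column entries stay positive
    for j in range(1, t):
        D[0][j] = D[0][j - 1] + row_in[j - 1]
    for i in range(1, t):
        D[i][0] = D[i - 1][0] + col_in[i - 1]
    for i in range(1, t):
        for j in range(1, t):
            diag = D[i - 1][j - 1] + (x[i - 1] != y[j - 1])
            D[i][j] = min(diag, D[i - 1][j] + 1, D[i][j - 1] + 1)
    last_row = [D[t - 1][j] - D[t - 1][j - 1] for j in range(1, t)]
    last_col = [D[i][t - 1] - D[i - 1][t - 1] for i in range(1, t)]
    return last_row, last_col


def pack_offsets(offsets):
    """Pack offsets as offset + 1, two bits each, shifting left between offsets."""
    offbits = 0
    for off in offsets:
        offbits = (offbits << 2) | (off + 1)
    return offbits


row_out, col_out = block_offsets("AC", "AC", [1, 1], [1, 1])
print(row_out, col_out, pack_offsets(row_out), pack_offsets(col_out))
```

Only the two returned offset vectors need to be stored per combination; the table D itself can be discarded after each call.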
To get these, their representations need to be constructed. After constructing the offset vector representation the index value can be retrieved from offbitsint. For the sub-sequences the index values correspond to their representations.

3.2 Experiments

The objective of my experiments is to investigate whether the Four-Russians speedup for dynamic programming in sequence alignment gives a speedup in practice. So I need to construct experiments that show whether this occurs with and without preprocessing, as the advantage of the Four-Russians speedup is that the preprocessed data can be used on multiple pair-wise alignments. Also, for finding an edit script the Four-Russians speedup should be able to give a speedup. In quadratic space, the distance table has to be filled in before an edit script can be computed. With the Four-Russians speedup the distance table is only partially filled in. Backtracking on a partially filled distance table then means going through the blocks that contain the optimal path. When the time used to compute the sub-paths in these blocks is less than the speedup gained by using the Four-Russians speedup, it should yield a better running time. See section 5 for more details. For linear space, the Four-Russians speedup can be directly applied to Hirschberg's algorithm for computing optimal alignments in linear space. When the preprocessing takes less time than the time used to compute the edit script, there is a good possibility that the overall time for computing an edit script is faster than when only using Hirschberg's algorithm, see section 7 for more details.

The experiments will be run with and without optimization. The reason is that the standard implementations will benefit a lot more from optimization than the ones with Four-Russians speedup. Each entry in the standard implementations is computed by comparing three other entries and two characters, so constant time is used for each entry. Each block in the implementations with Four-Russians speedup consists of computing the index values of the sub-problem, applying the offset block function and filling in the last row (and column) of the block, so the time depends on the size of the block. Comparing the workload of each loop iteration, it is clear that the standard implementations have less work per iteration. Now consider the number of iterations. There are $nm$ entries to be filled in for the standard implementations and $\frac{n}{t-1} \cdot \frac{m}{t-1}$ blocks with the Four-Russians speedup. Although there are fewer iterations with Four-Russians speedup, the workload of each is heavier. Hence any loop optimization (loop unrolling) will benefit the standard implementations a lot more than the ones with Four-Russians speedup, because the overhead of using the offset block function and the nonlinear writes to memory (filling in the last column of a block) in the quadratic space version will dominate the running time with Four-Russians speedup. So it is interesting to see how the results without optimization (-O0) compare to the results with optimization (-O3).
3.2.1 Computer specification

The experiments are run on a 2.66 GHz Intel Core 2 Quad (Q9450) CPU machine with 4GB RAM and 6MB cache, running Ubuntu.

Test data and parameters

Random sequences are used for testing in general. The reason is that the contents of the sequences will not affect the standard implementations, as they do not use a lookup table like the Four-Russians speedup. So to be able to compare the implementations random sequences are used. Padded sequences might have an effect on the running time for the implementations with Four-Russians speedup, as there will be more lookups with the offset block function. So the length of the sequences will always be a multiple of $t-1$. The block sizes to be tested are 2, 3, 4 and 5. Block size 1 would not make any sense since $t-1$ would be zero. For block sizes larger than 5 the offset block table is too big to be tested on the machines available. With the block sizes 2, 3, 4 and 5, sequences whose lengths are a common multiple of 1, 2, 3 and 4 (the values of $t-1$) are used.

The first test is to show when we achieve a speedup using Four-Russians with and without preprocessing compared to the standard implementation. For this purpose the input sizes have small intervals between them, up to a fixed maximum size. If the Four-Russians speedup has not achieved a speedup for sequences of this size, the implementation has failed. Then follows a test to see the advantage of using the preprocessed data on multiple pair-wise alignments (only done for edit distance). The input sizes will be determined from a single run of sequence alignment where the Four-Russians speedup outperforms the standard implementation. These tests will be done for computing both the edit distance and the edit script. Finally there is a test where the sequences consist of As only, to see how an alignment that makes many identical lookups affects the running time of the implementation.

Block size 5 will not be tested for the quadratic space, because as the size of the distance table grows there will not be enough space for the offset block table. The test on sequences with only As will only be run with block sizes 4 and 5, as the test is just to see how much faster it is compared to random sequences. All tests are run multiple times to even out the running times, and the tests on the block sizes are done with the same two sequences.

Expectations

Tests with block size 2 are expected to be slower than the standard implementation, because with block size 2 the whole distance table will be filled out and the overhead of using Four-Russians will dominate the running time. To use the offset block function for an entry the index values of the combination are needed, whereas for the standard implementations filling in an entry means finding the minimum of three other entries, which is basically comparing three values. So when constructing or loading four index values and making a lookup using the offset block function is set against comparing three values, I would guess the latter to be the faster of the two.

For block size 3 there should be an advantage when using Four-Russians. There will be at least one entry in each block that is not filled in (intermediate results are not filled in when using the Four-Russians speedup in linear space, see section 6.1 for more details). Without preprocessing the Four-Russians speedup should outperform the standard implementation later than for larger block sizes. With preprocessing the Four-Russians speedup should outperform the standard implementation earlier than for larger block sizes, as the preprocessing time is lower. For block size 4 the preprocessing time is increased.
So even though the Four-Russians speedup without preprocessing might outperform the standard implementation earlier than for block size 3, it will occur later with preprocessing. The question would then be when it is more beneficial to use block size 4 over block size 3. For large sequences there would be an advantage, while for smaller sequences it might turn out to be more appropriate to use block size 3. For block size 5 the preprocessing time will be very long and it might turn out to be useless for the sequence sizes that I am capable of testing with the machines I have available. Nevertheless, the speedup without the preprocessing should be greater than for block size 4. The interesting part would be how much faster it is compared to the other block sizes.

The interesting part of the test with sequences consisting of only As would be to see how much faster it is compared to random sequences, as the lookup data is going to be cached. So it is expected to be the fastest of all the tests, as it is tested without any intermediate results being filled in.

For edit script tests in quadratic space, the backtracking part is slower for the Four-Russians implementation than for the standard implementation, because besides backtracking on the distance table, some of the entries in the table still need to be filled in to make the backtracking possible. But the overall running time should still be faster, as the distance table needs to be filled in before backtracking is possible, and that is where the Four-Russians speedup applies. The backtracking part, even though it takes longer than in the standard implementation, will still be cheap, and combining it with the running time to compute the edit distance will probably not result in a significant increase. For linear space, the running time should be faster than the standard implementation as it uses the offset block function while backtracking.

A quick summary: without preprocessing, the larger the block size is, the better the speedup is. Only block size 2 might turn out to be slower than the standard implementation. With preprocessing, block size 3 might be best for small sequences and block size 4 for larger sequences, while block size 5 might turn out to be useless. In general the tests on quadratic space are expected to be slower than the tests on linear space, because the distance table in linear space will always fit within the cache for the sequence sizes used, which is not the case in quadratic space.

4 Edit distance in quadratic space

The standard implementation is very straightforward. Here, forward dynamic programming is used to fill in the distance table. So for a sequence A of size n and a sequence B of size m, the distance table D is of size $(n+1) \times (m+1)$, with $n+1$ rows and $m+1$ columns. Start by filling out the first row with $0, \ldots, m$ and then the first column with $0, \ldots, n$. Going through the entries row by row starting in $D[1, 1]$, the entries are filled in accordingly: the entry $D[i, j]$, where $i = 1, \ldots, n$ and $j = 1, \ldots, m$, is equal to $D[i-1, j-1]$ if $A[i-1]$ equals $B[j-1]$. If $A[i-1]$ is not equal to $B[j-1]$ then $D[i, j]$ is equal to the minimum of $D[i-1, j-1] + 1$, $D[i-1, j] + 1$ and $D[i, j-1] + 1$. When the whole table is filled in, the edit distance can be found in $D[n, m]$. It is easy to see that the time and space are both $O(nm)$.

4.1 Four-Russians speedup

Applying the offset block function on the distance table corresponds to looping over the indexes of the table which are multiples of $t-1$. Then for each sub-problem the sub-sequence representations are constructed. Constructing a sub-sequence representation is very straightforward. For the first character in a sub-sequence, look up the index value in the charint table and add it to a local unsigned integer, subbits, which is initialized to 0. For each subsequent character in the sub-sequence, bit shift subbits left by two and add the index value of the character.
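The standard table fill and the subbits construction described above can be sketched as follows. This is a minimal illustration: the character order in `CHAR_INT` is an assumed stand-in for the charint table, and the function names are hypothetical.

```python
# Assumed stand-in for the charint table: two bits per character.
CHAR_INT = {"A": 0, "C": 1, "G": 2, "T": 3}


def edit_distance_table(a, b):
    """Fill the (n + 1) x (m + 1) distance table row by row."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        D[0][j] = j                      # first row: 0, ..., m
    for i in range(n + 1):
        D[i][0] = i                      # first column: 0, ..., n
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                D[i][j] = D[i - 1][j - 1]
            else:
                D[i][j] = 1 + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D


def pack_subsequence(sub):
    """Build the unsigned integer representation (subbits) of a sub-sequence."""
    subbits = 0
    for ch in sub:
        subbits = (subbits << 2) | CHAR_INT[ch]
    return subbits


table = edit_distance_table("ACGT", "AGT")
print(table[-1][-1])                     # edit distance is in D[n][m]
print(format(pack_subsequence("ACGT"), "08b"))
```

Because every character occupies exactly two bits, the packed value of a sub-sequence doubles as its index value, which matches the remark that for sub-sequences the index values correspond to their representations.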
There is no need to construct the offset vector representations, because when either the first row or the first column is part of the sub-problem the offset vectors consist of only 1s. From how the offset vector representations are saved in intoffbits, that offset vector representation can be found in the last index of intoffbits (see the section on preprocessing for details). Furthermore, the offset block function returns two offset vector representations. The offset vectors of a row can be saved in their unsigned integer representations, so they can be retrieved later for lookups in the row of blocks below. By using this knowledge, there is no need to construct the offset vector representations.

The pair of offset vector representations that the offset block function returns is used to fill in the last row and column of the block. For details on how to read from an offset vector representation, see section 3.1.1. Subtracting one from an offset index value gives the offset value, so there is no need to look it up in offset, the table which stores them, because the index value is just the offset value plus one. When filling in the distance table, the last offset in the column offset vector is not needed, because the last entry of the block will be filled out using the last offset of the row offset vector.
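How the row and column offset vectors are passed from block to block can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: a memoized function stands in for the precomputed offset block table, offset vectors are kept as tuples rather than packed unsigned integers, and the sequence lengths are assumed to already be multiples of $t-1$. All names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def block(x, y, row_in, col_in):
    """Stand-in for the offset block function: offsets in, offsets out."""
    t = len(x) + 1
    D = [[0] * t for _ in range(t)]
    D[0][0] = t
    for j in range(1, t):
        D[0][j] = D[0][j - 1] + row_in[j - 1]
    for i in range(1, t):
        D[i][0] = D[i - 1][0] + col_in[i - 1]
    for i in range(1, t):
        for j in range(1, t):
            diag = D[i - 1][j - 1] + (x[i - 1] != y[j - 1])
            D[i][j] = min(diag, D[i - 1][j] + 1, D[i][j - 1] + 1)
    row_out = tuple(D[t - 1][j] - D[t - 1][j - 1] for j in range(1, t))
    col_out = tuple(D[i][t - 1] - D[i - 1][t - 1] for i in range(1, t))
    return row_out, col_out


def four_russians_distance(a, b, t):
    """Edit distance computed block by block using only offset vectors."""
    step = t - 1
    if len(a) % step or len(b) % step:
        raise ValueError("sequence lengths must be multiples of t - 1")
    ones = (1,) * step                       # first row and column consist of only 1s
    row_offs = [ones] * (len(b) // step)     # saved per block column for the row below
    for i in range(0, len(a), step):
        col = ones
        for jb, j in enumerate(range(0, len(b), step)):
            row_offs[jb], col = block(a[i:i + step], b[j:j + step], row_offs[jb], col)
    # D[n][0] is n; adding the offsets along the last row gives D[n][m].
    return len(a) + sum(sum(r) for r in row_offs)


print(four_russians_distance("ACGT", "AGGT", 3))
```

Saving one row offset vector per block column mirrors the remark that the offset vectors of a row are kept for the lookups in the row of blocks below, so no offset vector ever has to be rebuilt from table entries.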


Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

Examples and Applications of Binary Search

Examples and Applications of Binary Search Toy Gog ITEE Uiersity of Queeslad I the secod lecture last week we studied the biary search algorithm that soles the problem of determiig if a particular alue appears i a sorted list of iteger or ot. We

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers * Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

CSE 417: Algorithms and Computational Complexity

CSE 417: Algorithms and Computational Complexity Time CSE 47: Algorithms ad Computatioal Readig assigmet Read Chapter of The ALGORITHM Desig Maual Aalysis & Sortig Autum 00 Paul Beame aalysis Problem size Worst-case complexity: max # steps algorithm

More information

Lecture 18. Optimization in n dimensions

Lecture 18. Optimization in n dimensions Lecture 8 Optimizatio i dimesios Itroductio We ow cosider the problem of miimizig a sigle scalar fuctio of variables, f x, where x=[ x, x,, x ]T. The D case ca be visualized as fidig the lowest poit of

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Homework 1 Solutions MA 522 Fall 2017

Homework 1 Solutions MA 522 Fall 2017 Homework 1 Solutios MA 5 Fall 017 1. Cosider the searchig problem: Iput A sequece of umbers A = [a 1,..., a ] ad a value v. Output A idex i such that v = A[i] or the special value NIL if v does ot appear

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 6 Defiig Fuctios Pytho Programmig, 2/e 1 Objectives To uderstad why programmers divide programs up ito sets of cooperatig fuctios. To be able to

More information

Alpha Individual Solutions MAΘ National Convention 2013

Alpha Individual Solutions MAΘ National Convention 2013 Alpha Idividual Solutios MAΘ Natioal Covetio 0 Aswers:. D. A. C 4. D 5. C 6. B 7. A 8. C 9. D 0. B. B. A. D 4. C 5. A 6. C 7. B 8. A 9. A 0. C. E. B. D 4. C 5. A 6. D 7. B 8. C 9. D 0. B TB. 570 TB. 5

More information

Σ P(i) ( depth T (K i ) + 1),

Σ P(i) ( depth T (K i ) + 1), EECS 3101 York Uiversity Istructor: Ady Mirzaia DYNAMIC PROGRAMMING: OPIMAL SAIC BINARY SEARCH REES his lecture ote describes a applicatio of the dyamic programmig paradigm o computig the optimal static

More information

LU Decomposition Method

LU Decomposition Method SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS LU Decompositio Method Jamie Traha, Autar Kaw, Kevi Marti Uiversity of South Florida Uited States of America kaw@eg.usf.edu http://umericalmethods.eg.usf.edu Itroductio

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

6.851: Advanced Data Structures Spring Lecture 17 April 24

6.851: Advanced Data Structures Spring Lecture 17 April 24 6.851: Advaced Data Structures Sprig 2012 Prof. Erik Demaie Lecture 17 April 24 Scribes: David Bejami(2012), Li Fei(2012), Yuzhi Zheg(2012),Morteza Zadimoghaddam(2010), Aaro Berstei(2007) 1 Overview Up

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000. 5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator

More information

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work

More information

New HSL Distance Based Colour Clustering Algorithm

New HSL Distance Based Colour Clustering Algorithm The 4th Midwest Artificial Itelligece ad Cogitive Scieces Coferece (MAICS 03 pp 85-9 New Albay Idiaa USA April 3-4 03 New HSL Distace Based Colour Clusterig Algorithm Vasile Patrascu Departemet of Iformatics

More information

Computational Geometry

Computational Geometry Computatioal Geometry Chapter 4 Liear programmig Duality Smallest eclosig disk O the Ageda Liear Programmig Slides courtesy of Craig Gotsma 4. 4. Liear Programmig - Example Defie: (amout amout cosumed

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

On (K t e)-saturated Graphs

On (K t e)-saturated Graphs Noame mauscript No. (will be iserted by the editor O (K t e-saturated Graphs Jessica Fuller Roald J. Gould the date of receipt ad acceptace should be iserted later Abstract Give a graph H, we say a graph

More information

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 10 Defiig Classes Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 10.1 Structures 10.2 Classes 10.3 Abstract Data Types 10.4 Itroductio to Iheritace Copyright 2015 Pearso Educatio,

More information

GPUMP: a Multiple-Precision Integer Library for GPUs

GPUMP: a Multiple-Precision Integer Library for GPUs GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract

More information

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting)

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting) MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fittig) I this chapter, we will eamie some methods of aalysis ad data processig; data obtaied as a result of a give

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute

More information

Evaluation scheme for Tracking in AMI

Evaluation scheme for Tracking in AMI A M I C o m m u i c a t i o A U G M E N T E D M U L T I - P A R T Y I N T E R A C T I O N http://www.amiproject.org/ Evaluatio scheme for Trackig i AMI S. Schreiber a D. Gatica-Perez b AMI WP4 Trackig:

More information

Minimum Spanning Trees

Minimum Spanning Trees Miimum Spaig Trees Miimum Spaig Trees Spaig subgraph Subgraph of a graph G cotaiig all the vertices of G Spaig tree Spaig subgraph that is itself a (free) tree Miimum spaig tree (MST) Spaig tree of a weighted

More information

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets

More information

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015 15-859E: Advaced Algorithms CMU, Sprig 2015 Lecture #2: Radomized MST ad MST Verificatio Jauary 14, 2015 Lecturer: Aupam Gupta Scribe: Yu Zhao 1 Prelimiaries I this lecture we are talkig about two cotets:

More information

2. ALGORITHM ANALYSIS

2. ALGORITHM ANALYSIS 2. ALGORITHM ANALYSIS computatioal tractability survey of commo ruig times 2. ALGORITHM ANALYSIS computatioal tractability survey of commo ruig times Lecture slides by Kevi Waye Copyright 2005 Pearso-Addiso

More information

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0 Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity

More information

A Parallel DFA Minimization Algorithm

A Parallel DFA Minimization Algorithm A Parallel DFA Miimizatio Algorithm Ambuj Tewari, Utkarsh Srivastava, ad P. Gupta Departmet of Computer Sciece & Egieerig Idia Istitute of Techology Kapur Kapur 208 016,INDIA pg@iitk.ac.i Abstract. I this

More information

Analysis of Documents Clustering Using Sampled Agglomerative Technique

Analysis of Documents Clustering Using Sampled Agglomerative Technique Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

One advantage that SONAR has over any other music-sequencing product I ve worked

One advantage that SONAR has over any other music-sequencing product I ve worked *gajedra* D:/Thomso_Learig_Projects/Garrigus_163132/z_productio/z_3B2_3D_files/Garrigus_163132_ch17.3d, 14/11/08/16:26:39, 16:26, page: 647 17 CAL 101 Oe advatage that SONAR has over ay other music-sequecig

More information

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 ) EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders

More information

EVALUATION OF TRIGONOMETRIC FUNCTIONS

EVALUATION OF TRIGONOMETRIC FUNCTIONS EVALUATION OF TRIGONOMETRIC FUNCTIONS Whe first exposed to trigoometric fuctios i high school studets are expected to memorize the values of the trigoometric fuctios of sie cosie taget for the special

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe

More information

Ch 9.3 Geometric Sequences and Series Lessons

Ch 9.3 Geometric Sequences and Series Lessons Ch 9.3 Geometric Sequeces ad Series Lessos SKILLS OBJECTIVES Recogize a geometric sequece. Fid the geeral, th term of a geometric sequece. Evaluate a fiite geometric series. Evaluate a ifiite geometric

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 4 Procedural Abstractio ad Fuctios That Retur a Value Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 4.1 Top-Dow Desig 4.2 Predefied Fuctios 4.3 Programmer-Defied Fuctios 4.4

More information

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP Nature-Ispired Computig Hadlig Costraits Dr. Şima Uyar September 2006 Itroductio may practical problems are costraied ot all combiatios of variable values represet valid solutios feasible solutios ifeasible

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information