Efficient Bulk Loading of Large High-Dimensional Indexes

Int. Conf. on Data Warehousing and Knowledge Discovery DaWaK 99

Christian Böhm and Hans-Peter Kriegel
University of Munich, Oettingenstr. 67, D Munich, Germany
{boehm,kriegel}@informatik.uni-muenchen.de

Abstract. Efficient index construction in multidimensional data spaces is important for many knowledge discovery algorithms, because construction times typically must be amortized by performance gains in query processing. In this paper, we propose a generic bulk loading method which allows the application of user-defined split strategies in the index construction. This approach allows the adaptation of the index properties to the requirements of a specific knowledge discovery algorithm. As our algorithm takes into account that large data sets do not fit in main memory, it is based on external sorting. Decisions of the split strategy can be made according to a sample of the data set which is selected automatically. The sort algorithm is a variant of the well-known Quicksort algorithm, enhanced to work on secondary storage. The index construction has a runtime complexity of O(n log n). We show both analytically and experimentally that the algorithm outperforms traditional index construction methods by large factors.

1. Introduction

Efficient index construction in multidimensional data spaces is important for many knowledge discovery tasks. Many algorithms for knowledge discovery [JD 88, KR 90, NH 94, EKSX 96, BBBK 99], especially clustering algorithms, rely on efficient processing of similarity queries. In such a setting, multidimensional indexes are often created in a preprocessing step to knowledge discovery. If the index is not needed for general purpose query processing, it is not permanently maintained, but discarded after the KDD algorithm is completed. Therefore, the time spent in the index construction must be amortized by runtime improvements during knowledge discovery. Usually, indexes are constructed using repeated insert operations. This dynamic index construction, however, causes a serious performance degeneration.
We show later in this paper that in a typical setting, every insert operation leads to at least one access to a data page of the index. Therefore, there is an increasing interest in fast bulk-loading operations for multidimensional index structures which cause substantially fewer page accesses for the index construction.

A second problem is that indexes must be carefully optimized in order to achieve a satisfactory performance (cf. [Böh 98, BK 99, BBJ+ 99]). The optimization objectives [BBKK 97] depend on the properties of the data set (dimension, distribution, number of objects, etc.) and on the types of queries which are performed by the KDD algorithm (range queries [EKSX 96], nearest neighbor queries [KR 90, NH 94], similarity joins [BBBK 99], etc.). On the other hand, we may draw some advantage from the fact that we do not only know a single data item at each point of time (as in the dynamic index construction) but a large amount of data items. It is common knowledge that a higher fanout and storage utilization of the index pages can be achieved by applying bulk-load operations. A higher fanout yields a better search performance. Knowing all data a priori allows us to choose an alternative data space partitioning. As we have shown in [BBK 98a], a strategy of splitting the data space into two equally-sized portions causes, under certain circumstances, a poor search performance in contrast to an unbalanced split. Therefore, it is an important property of a bulk-loading algorithm that it allows the splitting strategy to be exchanged according to the requirements specific to the application.

The currently proposed bulk-loading methods either suffer from poor performance in the index construction or in the query evaluation, or are not suitable for indexes which do not fit into main memory. In contrast to previous bulk-loading methods, we present in this paper an algorithm for fast index construction on secondary storage which provides efficient query processing and is generic in the sense that the split strategy can be easily exchanged. It is based on an extension of the Quicksort algorithm which facilitates sorting on secondary storage (cf. sections 3.3 and 3.4). The split strategy (section 3.2) is a user-defined function. For the split decisions, a sample of the data set is exploited which is automatically generated by the bulk-loading algorithm.

2. Related Work

Several methods for bulk-loading multidimensional index structures have been proposed. Space-filling curves provide a means to order the points such that spatial neighborhoods are maintained. In the Hilbert R-tree construction method [KF 94], the points are sorted according to their Hilbert value. The obtained sequence of points is decomposed into contiguous subsequences which are stored in the data pages. The page region, however, is not described by the interval of Hilbert values but by the minimum bounding rectangle of the points. The directory is built bottom up. The disadvantage of Hilbert R-trees is the high overlap among page regions.

VAM-Split trees [JW 96], in contrast, use a concept of hierarchical space partitioning for bulk-loading R-trees or KDB-trees. Sort algorithms are used for this purpose. This approach does not exploit a priori knowledge of the data set and is not adaptable.

Buffer trees [BSW 97] are a generalized technique to improve the construction performance for dynamic insert algorithms. The general idea is to collect insert operations to certain branches of the tree in buffers.
These operations are propagated to the next deeper level whenever such a buffer overflows. This technique preserves the properties of the underlying index structure.

3. Our New Technique

During the bulk-load operation, the complete data set is held on secondary storage. Although only a small cache in the main memory is required, cost intensive disk operations such as random seeks are minimized. In our algorithms, we strictly separate the split strategy from the core of the construction algorithm. Therefore, we can easily replace the split strategy and thus create an arbitrary overlap-free partition for the given storage utilization. Various criteria for the choice of direction and position of split hyperplanes can be applied. The index construction is a recursive algorithm consisting of the following subtasks:
- determining the tree topology (height, fanout of the directory nodes, etc.)
- choice of the split strategy
- external bisection of the data set according to tree topology and split strategy
- construction of the index directory.

3.1 Determination of the Tree Topology

The first step of our algorithm is to determine the topology of the tree resulting from our bulk-load operation. The height of the tree can be determined as follows [Böh 98]:

$h = \left\lceil \log_{C_{\mathrm{eff,dir}}} \left( \frac{n}{C_{\mathrm{eff,data}}} \right) \right\rceil + 1$
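The height computation can be illustrated with a small Python sketch. It assumes that the logarithm is rounded up to the next integer and that n is at least 1; the function name `tree_height` and the capacities used in the example are hypothetical and not taken from the paper. The loop avoids floating-point logarithms by counting how many directory levels are needed until the capacity covers n objects.

```python
def tree_height(n: int, c_eff_data: int, c_eff_dir: int) -> int:
    """Sketch of h = ceil(log_{c_eff_dir}(n / c_eff_data)) + 1.

    Assumption: the logarithm is rounded up, and data sets smaller than
    one data page still yield a height of 1.
    """
    if n < 1 or c_eff_data < 1 or c_eff_dir < 2:
        raise ValueError("need n >= 1, c_eff_data >= 1, c_eff_dir >= 2")
    levels = 0               # smallest k with c_eff_data * c_eff_dir**k >= n
    capacity = c_eff_data    # objects covered by a subtree with k directory levels
    while capacity < n:
        capacity *= c_eff_dir
        levels += 1
    return levels + 1


# Hypothetical capacities: 20 objects per data page, 10 entries per directory page.
height = tree_height(1_000_000, c_eff_data=20, c_eff_dir=10)
```

Integer arithmetic is used on purpose: evaluating `math.log(n) / math.log(base)` can land slightly above or below an exact integer and change the rounded result.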

Figure 1: The Split Tree.

The fanout is given by the following formula:

$\mathrm{fanout}(h, n) = \min\left( \left\lceil \frac{n}{C_{\mathrm{eff,data}} \cdot C_{\mathrm{eff,dir}}^{\,h-2}} \right\rceil,\; C_{\mathrm{max,dir}} \right)$

3.2 The Split Strategy

In order to determine the split dimension, we have to consider two cases: If the data subset fits into main memory, the split dimension and the subset size can be obtained by computing selectivities or variances from the complete data subset. Otherwise, decisions are based on a sample of the subset which fits into main memory and can be loaded without causing too many random seek operations. We use a simple heuristic to sample the data subset which loads subsequent blocks from three different places in the data set.

3.3 Recursive Top-Down Partitioning

Now, we are able to define a recursive algorithm for partitioning the data set. The algorithm consists of two procedures which are nested recursively (both procedures call each other). The first procedure, partition(), which is called once for each directory page, has the following duties:
- call the topology module to determine the fanout of the current directory page
- call the split-strategy module to determine a split tree for the current directory page
- call the second procedure, partition_acc_to_splittree()

The second procedure partitions the data set according to the split dimensions and the proportions given in the split tree. However, the proportions are not regarded as fixed values. Instead, we determine lower and upper bounds for the number of objects on each side of the split hyperplane. This will help us to improve the performance of the next step, the external bipartitioning. Let us assume that the ratio of the number of leaf nodes on each side of the current node in the split tree is l : r, and that we are currently dealing with N data objects. An exact split hyperplane would exploit the proportions:

$N_{\mathrm{left}} = N \cdot \frac{l}{l + r}$ and $N_{\mathrm{right}} = N \cdot \frac{r}{l + r} = N - N_{\mathrm{left}}$.

Instead of using the exact values, we compute an upper bound for N_left such that N_left is not too large to be placed in l subtrees with height h − 1, and a lower bound for N_left such that N_right is not too large for r subtrees:

$N_{\mathrm{max,left}} = l \cdot C_{\mathrm{max,tree}}(h - 1)$

$N_{\mathrm{min,left}} = N - r \cdot C_{\mathrm{max,tree}}(h - 1)$

An overview of the algorithm is depicted in C-like pseudocode in figure 2. For the presentation of the algorithm, we assume that the data vectors are stored in an array on secondary storage and that the current data subset is referred to by the parameters start and n, where n is the number of data objects and start represents the address of the first object.

    index_construction (int n)
    {
        int h = (int)(log (n / Ceffdata) / log (Ceffdir) + 1) ;
        partition (0, n, h) ;
    }

    partition (int start, int n, int height)
    {
        if (height == 0) {
            ... // write data page, propagate info to parent
            return ;
        }
        int f = fanout (height, n) ;
        SplitTree st = split_strategy (start, n, f) ;
        partition_acc_to_splittree (start, n, height, st) ;
        ... // write directory page, propagate info to parent
    }

    partition_acc_to_splittree (int start, int n, int height, SplitTree st)
    {
        if (is_leaf (st)) {
            // one leaf of the split tree corresponds to one child page
            partition (start, n, height - 1) ;
            return ;
        }
        int mtc = max_tree_capacity (height - 1) ;
        // bounds for the number of objects left of the split hyperplane
        int n_maxleft = st->l_leaves * mtc ;
        int n_minleft = n - st->r_leaves * mtc ;
        int n_real = external_bipartition (start, n, st->splitdim, n_minleft, n_maxleft) ;
        partition_acc_to_splittree (start, n_real, height, st->leftchild) ;
        partition_acc_to_splittree (start + n_real, n - n_real, height, st->rightchild) ;
    }

Figure 2: Recursive Top-Down Data Set Partitioning.

The procedure index_construction() determines the height of the tree and calls partition(), which is responsible for the generation of a complete data or directory page. The function partition() first determines the fanout of the current page and calls split_strategy() to construct an adequate split tree. Then, partition_acc_to_splittree() is called to partition the data set according to the split tree. After partitioning the data, partition_acc_to_splittree() calls partition() in order to create the next deeper index level. The height of the current subtree is decremented in this indirect recursive call. Therefore, the data set is partitioned in a top-down manner, i.e. the data set is first partitioned with respect to the highest directory level below the root node.

3.4 External Bipartitioning of the Data Set

Our bipartitioning algorithm is comparable to the well-known Quicksort algorithm [Hoa 62, Sed 78].
Bipartitioning means to split the data set or a subset into two portions according to the value of one specific dimension, the split dimension. After the bipartitioning step, the lower part of the data set contains values in the split dimension which are lower than a threshold value, the split value. The values in the higher part will be higher than the split value. The split value is initially unknown and is determined during the run of the bipartitioning algorithm.

Bipartitioning is closely related to sorting the data set according to the split dimension. In fact, if the data is sorted, bipartitioning of any proportion can easily be achieved by cutting the sorted data set into two subsets. However, sorting has a complexity of O(n log n), and a complete sort order is not required for our purpose. Instead, we will present a bipartitioning algorithm with an average-case complexity of O(n). The basic idea of our algorithm is to adapt Quicksort as follows: Quicksort makes a bisection of the data according to a heuristically chosen pivot value and then recursively calls Quicksort for both subsets. Our first modification is to make only one recursive call for the subset which contains the split interval. We are able to do that because the objects in the other subset are on the correct side of the split interval anyway and need no further sorting. The second modification is to stop the recursion if the position of the pivot value is inside the split interval. The third modification is to choose the pivot values according to the proportion rather than to reach the middle.

Our bipartitioning algorithm works on secondary storage. It is well-known that the Mergesort algorithm is better suited for external sorting than Quicksort. However, Mergesort does not facilitate our modifications leading to an O(n) complexity and was not further investigated for this reason. In our implementation, we use a sophisticated scheme reducing disk I/O and especially reducing random seek operations much more than a normal caching algorithm would be able to.

The algorithm can run in two modes, internal or external, depending on whether the processed data set fits into main memory or not. The internal mode is quite similar to Quicksort: The middle of three split attribute values in the database is taken as pivot value.
The first object on the left side having a split attribute value larger than the pivot value is exchanged with the last element on the right side smaller than the pivot value until left and right object pointers meet at the bisection point. The algorithm stops if the bisection point is inside the goal interval. Otherwise, the algorithm continues recursively with the data subset containing the goal interval.

The external mode is more sophisticated: First, the pivot value is determined from the sample which is taken in the same way as described in section 3.2 and can often be reused. A complete internal bipartition runs on the sample data set to determine a suitable pivot value. In the following external bisection (cf. figure 3), transfers from and to the cache are always processed with a blocksize of half the cache size. Figure 3a shows the initialization of the cache from the first and last block in the disk file. Then, the data in the cache is processed by internal bisection with respect to the pivot value. If the bisection point is in the lower part of the cache (figure 3c), the right side contains more objects than fit into one block. One block, starting from the bisection point, is written back to the file and the next block is read and internally bisected again. Usually, objects remain in the lower and higher ends of the cache. These objects are used later to fill up transfer blocks completely. All remaining data is written back in the very last step into the middle of the file where additionally a fraction of a block has to be processed. Finally, we test if the bisection point of the external bisection is in the split interval. If the point is outside, another recursion is required.
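The internal mode can be sketched in Python as follows. This is a minimal in-memory illustration under stated assumptions, not the authors' implementation: the name `internal_bipartition` and its parameters are hypothetical, the pivot is the median of the first, middle and last value of the current range, and a three-way partition replaces the pointer-exchange scheme described above so that duplicate values are handled simply. Only the subset containing the goal interval is processed further, and the loop stops as soon as a bisection point falls inside the goal interval.

```python
def internal_bipartition(data, dim, n_min_left, n_max_left):
    """Reorder `data` (a list of tuples) in place and return n_real with
    n_min_left <= n_real <= n_max_left such that every data[i][dim] with
    i < n_real is <= every data[j][dim] with j >= n_real."""
    if not 0 <= n_min_left <= n_max_left <= len(data):
        raise ValueError("need 0 <= n_min_left <= n_max_left <= len(data)")
    # The two ends of the array are trivially valid bisection points.
    if n_min_left == 0:
        return 0
    if n_max_left == len(data):
        return len(data)

    lo, hi = 0, len(data)  # invariant: lo < n_min_left <= n_max_left < hi
    while True:
        # Median of three split attribute values as pivot.
        candidates = [data[lo][dim], data[(lo + hi) // 2][dim], data[hi - 1][dim]]
        pivot = sorted(candidates)[1]

        # Three-way partition of data[lo:hi] around the pivot.
        lt, i, gt = lo, lo, hi
        while i < gt:
            value = data[i][dim]
            if value < pivot:
                data[lt], data[i] = data[i], data[lt]
                lt += 1
                i += 1
            elif value > pivot:
                gt -= 1
                data[gt], data[i] = data[i], data[gt]
            else:
                i += 1
        # Now data[lo:lt] < pivot, data[lt:gt] == pivot, data[gt:hi] > pivot,
        # so every position in [lt, gt] is a valid bisection point.

        if n_max_left < lt:
            hi = lt        # goal interval lies entirely in the lower part
        elif n_min_left > gt:
            lo = gt        # goal interval lies entirely in the higher part
        else:
            return max(lt, n_min_left)


# Usage: split eight 2-d points on dimension 0 with 3 to 5 objects on the left.
points = [(5, 0), (3, 1), (9, 2), (1, 3), (7, 4), (3, 5), (8, 6), (2, 7)]
n_real = internal_bipartition(points, dim=0, n_min_left=3, n_max_left=5)
```

Because any bisection point inside the goal interval is accepted, the loop usually ends after fewer partitioning rounds than a selection of one exact rank would need.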

Figure 3: External Bisection. (a) Initializing the cache from file. (b) Internal bisection of the cache. (c) Writing the larger half partially back to disk. (d) Loading one further block to cache. (e) Writing the larger half partially back to disk.

3.5 Constructing the Index Directory

As data partitioning is done by a recursive algorithm, the structure of the index is represented by the recursion tree. Therefore, we are able to create a directory node after the completion of the recursive calls for the child nodes. These recursive calls return the bounding boxes and the corresponding secondary storage addresses to the caller, where the information is collected. There, the directory node is written, the bounding boxes are combined into a single bounding box comprising all boxes of the child nodes, and the result is again propagated to the next higher level. A depth-first post-order sequentialization of the index is written to the disk.

3.6 Analytical Evaluation of the Construction Algorithm

In this section, we will show that our bottom-up construction algorithm has an average case time complexity of O(n log n). Moreover, we will consider disk accesses in a more exact way, and thus provide an analytically derived improvement factor over the dynamic index construction. For the file I/O, we determine two parameters: the number of random seek operations and the amount of data read or written from or to the disk. Provided that no further caching is performed (which is true for our application, but cannot be guaranteed for the operating system) and that seeks are uniformly distributed variables, the I/O processing time can be determined as

$t_{\mathrm{i/o}} = t_{\mathrm{seek}} \cdot \mathrm{seek\_ops} + t_{\mathrm{transfer}} \cdot \mathrm{amount}$.

In the following, we denote by the cache capacity the number of objects fitting into the cache:

cache capacity = cachesize / sizeof(object)

Lemma 1. Complexity of bisection
The bisection algorithm has the complexity O(n).

Proof (Lemma 1)
We assume that the pivot element is randomly chosen from the data set. After the first run of the algorithm, the pivot element is located with uniform probability at one of the n positions in the file. Therefore, the next run of the algorithm will have the length k with a probability 1/n for each 1 ≤ k ≤ n. Thus, the cost function C(n) encompasses the cost for the algorithm, n + 1 comparison operations, plus a probability weighted sum of the cost for processing the algorithm with length k − 1, C(k − 1). We obtain the following recursive equation:

$C(n) = n + 1 + \frac{1}{n} \sum_{k=1}^{n} C(k - 1)$

which can be solved by multiplying with n and subtracting the same equation for n − 1. This can be simplified to C(n) = 2 + C(n − 1), and C(n) = 2n = O(n).

Lemma 2. Cost Bounds of Recursion
(1) The amount of data read or written during one recursion of our technique does not exceed four times the file size.
(2) The number of seek operations required is bounded by

$\mathrm{seek\_ops}(n) \le 8 \cdot \ldots \cdot \log_2(n)$

Proof (Lemma 2)
(1) follows directly from Lemma 1 because every compared element has to be transferred at most once from disk to main memory and at most once back to disk.
(2) In each run of the external bisection algorithm, file I/O is processed with a blocksize of cachesize/2. The number of blocks read in each run is therefore

$\mathrm{blocks\_read}_{\mathrm{bisection}}(n) = \ldots$

because one extra read is required in the final step. The number of write operations is the same and thus

$\mathrm{seek\_ops}(n) = 2 \cdot \sum_{i=0}^{r_{\mathrm{interval}}} \mathrm{blocks\_read}_{\mathrm{run}}(i) \le 8 \cdot \ldots \cdot \log_2(n)$.

Lemma 3. Average Case Complexity of Our Technique
Our technique has an average case complexity of O(n log n) unless the split strategy has a complexity worse than O(n).

Proof (Lemma 3)
For each level of the tree, the complete data set has to be bisectioned as often as the height of the split tree indicates. As the height of the split tree is determined by the directory page capacity, there are at most $h(n) \cdot C_{\mathrm{max,dir}} = O(\log n)$ bisection runs necessary. Therefore, our technique has the complexity O(n log n).
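The solution step in the proof of Lemma 1 can be written out term by term. The derivation below reads the recurrence as $C(n) = n + 1 + \frac{1}{n}\sum_{k=1}^{n} C(k-1)$ and assumes the base case C(0) = 0, which the text does not state explicitly.

```latex
\begin{align*}
C(n) &= n + 1 + \frac{1}{n}\sum_{k=1}^{n} C(k-1) \\
n\,C(n) &= n(n+1) + \sum_{k=1}^{n} C(k-1)
  && \text{(multiply by } n\text{)} \\
(n-1)\,C(n-1) &= (n-1)\,n + \sum_{k=1}^{n-1} C(k-1)
  && \text{(same equation for } n-1\text{)} \\
n\,C(n) - (n-1)\,C(n-1) &= 2n + C(n-1)
  && \text{(subtract)} \\
n\,C(n) &= 2n + n\,C(n-1) \\
C(n) &= 2 + C(n-1) \\
C(n) &= 2n + C(0) = 2n = O(n)
  && \text{(assuming } C(0) = 0\text{)}
\end{align*}
```

The difference of the two quadratic terms is $n(n+1) - (n-1)n = 2n$, and the two sums differ only in their last term $C(n-1)$, which is why the recurrence collapses to a constant increment per object.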

Figure 4: Improvement Factor for the Index Construction According to Lemmata 1-5 (improvement factor plotted against the cache capacity for n = 1,000,000, n = 10,000,000 and n = 100,000,000).

Lemma 4. Cost of Symmetric Partitioning
For symmetric splitting, the procedure partition() handles an amount of file I/O data of

$\left( \log_2(\ldots) + \log_{C_{\mathrm{max,dir}}}(\ldots) \right) \cdot 4 \cdot \mathrm{filesize}$

and requires

$\ldots \cdot \left( \log_2(\ldots) + \log_{C_{\mathrm{max,dir}}}(\ldots) \right) \cdot \log_2(\ldots)$

random seek operations.

Proof (Lemma 4)
Left out due to space limitations, cf. [Böh 98].

Lemma 5. Cost of Dynamic Index Construction
Dynamic X-tree construction requires 2n seek operations. The transferred amount of data is 2n · pagesize.

Proof (Lemma 5)
For the X-tree, it is generally assumed that the directory is completely held in main memory. Data pages are not cached at all. For each insert, the corresponding data page has to be loaded and written back after completing the operation. Moreover, no better caching strategy for data pages can be applied, since without preprocessing of the input data set, no locality can be exploited to establish a working set of pages.

From the results of Lemmata 4 and 5 we can derive an estimate for the improvement factor of the bottom-up construction over dynamic index construction. The improvement factor for the number of seek operations is approximately:

$\mathrm{Improvement} \approx \frac{\ldots}{\log_2(\ldots) + \log_{C_{\mathrm{max,dir}}}(\ldots)}$

It is almost (up to the logarithmic factor in the denominator) linear in the cache capacity. Figure 4 depicts the improvement factor (number of random seek operations) for varying cache sizes and varying database sizes.
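To give a feeling for the magnitude of Lemma 5, the following Python sketch combines its counts (2n seek operations, 2n · pagesize transferred data) with the I/O time model of section 3.6. The function names and all device parameters (10 ms per seek, 10^-7 s per byte, 4096-byte pages) are hypothetical values chosen for illustration only; they are not measurements from the paper.

```python
def dynamic_construction_cost(n: int, pagesize: int) -> tuple[int, int]:
    """Seek operations and transferred bytes for n inserts according to Lemma 5:
    every insert loads one data page and writes it back."""
    seek_ops = 2 * n
    amount = 2 * n * pagesize
    return seek_ops, amount


def io_time(seek_ops: int, amount: int, t_seek: float, t_transfer: float) -> float:
    """I/O time model: t_io = t_seek * seek_ops + t_transfer * amount."""
    return t_seek * seek_ops + t_transfer * amount


# Hypothetical device parameters, for illustration only.
T_SEEK = 0.01        # seconds per random seek
T_TRANSFER = 1e-7    # seconds per transferred byte (about 10 MB/s)
PAGESIZE = 4096      # bytes per data page

seeks, transferred = dynamic_construction_cost(1_000_000, PAGESIZE)
seconds = io_time(seeks, transferred, T_SEEK, T_TRANSFER)
```

With these assumed parameters the seek term dominates the transfer term by more than an order of magnitude, which is consistent with the analysis concentrating on the number of random seek operations.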

4. Experimental Evaluation

To show the practical relevance of our bottom-up construction algorithm, we have performed an extensive experimental evaluation by comparing the following index construction techniques: dynamic index construction (repeated insert operations), Hilbert R-tree construction and our new method. All experiments have been computed on HP9000/780 workstations with several GBytes of secondary storage. Although our technique is applicable to most R-tree-like index structures, we decided to use the X-tree as the underlying index structure because according to [BKK 96], the X-tree outperforms other high-dimensional index structures. All programs have been implemented in C++.

In our experiments, we compare the construction times for various indexes. The external sorting procedure of our construction method was allowed to use only a relatively small cache (32 kBytes). Note that, although our implementation does not provide any further disk I/O caching, this cannot be guaranteed for the operating system. In contrast, the Hilbert construction method was implemented with internal sorting for simplicity. The construction time of the Hilbert method is therefore underestimated by far and would worsen in combination with external sorting when the cache size is strictly limited. All Hilbert-constructed indexes have a storage utilization near 100%.

Figure 5 shows the construction time of dynamic index construction and of the bottom-up methods. In the left diagram, we fix the dimension to 16 and vary the database size from 100,000 to 2,000,000 objects of synthetic data. The resulting speed-up of the bulk-loading techniques over the dynamic construction was so enormous that a logarithmic scale must be used in figure 5. In contrast, the bottom-up methods differ only slightly in their performance. The Hilbert technique was the best method, having a construction time between 17 and 429 sec. The construction time of symmetric splitting ranges from 26 to 668 sec., whereas unbalanced splitting required between 21 and 744 sec. in the moderate case and between 23 and 858 sec. for the 9:1 split.
In contrast, the dynamic construction time ranged from 965 to 393,310 sec. (4 days, 13 hours). The improvement factor of our methods constantly increases with growing index size, starting from 37 to 45 for 100,000 objects and reaching 458 to 588 for 2,000,000 objects. The Hilbert construction is up to 915 times faster than the dynamic index construction. This enormous factor is not only due to internal sorting but also due to reduced overhead in changing the ordering attribute. In contrast to Hilbert construction, our technique changes the sorting criterion during the sort process according to the split tree. The sorting criterion is changed more often the more unbalanced the split becomes, because the height of the split tree increases. Therefore, the 9:1 split has the worst improvement factor.

Figure 5: Performance of Index Construction Against Database Size and Dimension.

The right diagram in figure 5 shows the construction time for varying index dimensions. Here, the database size was fixed to 1,000,000 objects. It can be seen that the improvement factors of the construction methods (between 240 and 320) are rather independent of the dimension of the data space.

Our further experiments, which are not presented due to space limitations [Böh 98], show that the Hilbert construction method yields a bad performance in query processing. The reason is the high overlap among the page regions. Due to improved space partitioning resulting from knowing the data set a priori, the indexes constructed by our new method outperform even the dynamically constructed indexes by factors up to …

5. Conclusion

In this paper, we have proposed a fast algorithm for constructing indexes for high-dimensional data spaces on secondary storage. A user-defined split strategy allows the adaptation of the index properties to the requirements of a specific knowledge discovery algorithm. We have shown both analytically and experimentally that our construction method outperforms the dynamic index construction by large factors. Our experiments further show that these indexes are also superior with respect to the search performance. Future work includes the investigation of various split strategies and their impact on different query types and access patterns.

6. References

[BBBK 99] Böhm, Braunmüller, Breunig, Kriegel: Fast Clustering Using High-Dimensional Similarity Joins, submitted for publication.
[BBJ+ 99] Berchtold, Böhm, Jagadish, Kriegel, Sander: Independent Quantization: An Index Compression Technique for High-Dimensional Data Spaces, submitted for publication.
[BBK 98a] Berchtold, Böhm, Kriegel: Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations, Int. Conf. on Extending Database Tech., EDBT.
[BBKK 97] Berchtold, Böhm, Keim, Kriegel: A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space, ACM PODS Symp. Principles of Database Systems.
[BK 99] Böhm, Kriegel: Dynamically Optimizing High-Dimensional Index Structures, subm.
[BKK 96] Berchtold, Keim, Kriegel: The X-Tree: An Index Structure for High-Dimensional Data, Int. Conf. on Very Large Data Bases, VLDB.
[BSW 97] van den Bercken, Seeger, Widmayer: A General Approach to Bulk Loading Multidimensional Index Structures, Int. Conf. on Very Large Databases, VLDB.
[Böh 98] Böhm: Efficiently Indexing High-Dimensional Data Spaces, PhD Thesis, University of Munich, Herbert Utz Verlag.
[EKSX 96] Ester, Kriegel, Sander, Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, Int. Conf. Knowl. Disc. and Data Mining, KDD.
[Hoa 62] Hoare: Quicksort, Computer Journal, Vol. 5, No. 1.
[JD 88] Jain, Dubes: Algorithms for Clustering Data, Prentice-Hall, Inc.
[JW 96] Jain, White: Similarity Indexing: Algorithms and Performance, SPIE Storage and Retrieval for Image and Video Databases IV, Vol. 2670.
[KF 94] Kamel, Faloutsos: Hilbert R-tree: An Improved R-tree using Fractals, Int. Conf. on Very Large Data Bases, VLDB.
[KR 90] Kaufman, Rousseeuw: Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley & Sons.
[NH 94] Ng, Han: Efficient and Effective Clustering Methods for Spatial Data Mining, Int. Conf. on Very Large Data Bases, VLDB.
[Sed 78] Sedgewick: Quicksort, Garland, New York.
[WSB 98] Weber, Schek, Blott: A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces, Int. Conf. on Very Large Databases, VLDB, 1998.

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 201 Heaps 201 Goodrich ad Tamassia xkcd. http://xkcd.com/83/. Tree. Used with permissio uder

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

CSE 2320 Notes 8: Sorting. (Last updated 10/3/18 7:16 PM) Idea: Take an unsorted (sub)array and partition into two subarrays such that.
