Accelerating Multi Dimensional Queries in Data Warehouses

Russel Pears and Bryan Houliston
School of Computing and Mathematical Sciences, Auckland University of Technology, New Zealand
rpears@aut.ac.nz

Data Warehouses are widely used for supporting decision making. On Line Analytical Processing (OLAP) is the main vehicle for querying data warehouses. OLAP operations commonly involve the computation of multidimensional aggregates. The major bottleneck in computing these aggregates is the large volume of data that needs to be processed, which in turn leads to prohibitively expensive query execution times. On the other hand, Data Analysts are primarily concerned with discerning trends in the data, and thus a system that provides approximate answers in a timely fashion would suit their requirements better. In this chapter we present the Prime Factor scheme, a novel method for compressing data in a warehouse. Our data compression method is based on aggregating data on each dimension of the data warehouse. Extensive experimentation on both real-world and synthetic data has shown that it outperforms the Haar Wavelet scheme with respect to both decoding time and error rate, while maintaining comparable compression ratios (Pears and Houliston, 2007). One encouraging feature is the stability of the error rate when compared to the Haar Wavelet. Although Wavelets have been shown to be effective at compressing data, the approximate answers they provide vary widely, even for identical types of queries on nearly identical values in distinct parts of the data. This problem has been attributed to the thresholding technique that is used to reduce the size of the encoded data and that is an integral part of the Wavelet compression scheme. In contrast, the Prime Factor scheme does not rely on thresholding but keeps a smaller version of every data element from the original data, and is thus able to achieve a much higher degree of error stability, which is important from a Data Analyst's point of view.
Keywords: Data Warehousing, OLAP, Multi Dimensional Aggregates, Approximate Query Processing, Wavelets, Prime Numbers, Data Compression.

INTRODUCTION

Data Warehouses are increasingly being used by decision makers to analyze trends in data (Cunningham, Song and Chen, 2006; Elmasri and Navathe, 2003). Thus a marketing analyst is able to track variation in sales income across dimensions such as time period, location, and product, on their own or in combination with each other. This analysis requires the processing of multi-dimensional aggregates and group by operations against the underlying data warehouse. Due to the large volumes of data that need to be scanned from secondary storage, such queries, referred to as On Line Analytical Processing (OLAP) queries, can take from minutes to hours in large scale data warehouses (Elmasri, 2003; Oracle 9i). The standard technique for improving query performance is to build aggregate tables that are targeted at known queries (Triantafillakis, Kanellis and Martakos, 2004; Elmasri, 2003). For example, the identification of the top ten selling products can be sped up by building a

summary table that contains the total sales value (in dollar terms) for each of the products, sorted in decreasing order of sales value. It would then be a simple matter of querying the summary table and retrieving the first ten rows. The main problem with this approach is the lack of flexibility. If the analyst now chooses to identify the bottom ten products, an expensive re-sort would have to be performed to answer this new query. Worse still, if the information is to be tracked by sales location then the summary table would be of no value at all. This problem is symptomatic of a more general one, where Database Systems that have been tuned for a particular access pattern perform poorly as changes to such patterns occur over a period of time. In their study, Zhen and Darmont (2005) showed that database systems which have been optimized through clustering to suit particular query patterns rapidly degrade in performance when such query patterns change in nature. The limitations of the above approach can be addressed by a data compression scheme that preserves the original structure of the data.

The chapter is organized as follows. In the next section we review related work. The section after that introduces the Prime Factor Compression (PFC) approach. We then present the algorithms required for encoding and decoding with the PFC approach. The On Line reconstruction of Queries is discussed thereafter. Implementation related issues are then discussed, followed by a performance evaluation of PFC and a comparison with the Haar Wavelet algorithm. We then discuss future trends in optimizing multi-dimensional queries in the light of the results of this research. We conclude with a summary of the main achievements of the research.

BACKGROUND

Previous research has tended to concentrate on computing exact answers to OLAP queries (Ho and Agrawal, 1997; Wang 00). Ho describes a method that pre-processes a data cube to give a prefix sum cube.
The prefix sum cube is computed by applying the transformation $P[A_i] = C[A_i] + P[A_{i-1}]$ along each dimension of the data cube, where $P$ denotes the prefix sum cube, $C$ the original data cube, $A_i$ denotes an element in the cube, and $i$ is an index in the range $1..D_i$ (here $D_i$ also denotes the size of the dimension $D_i$). This means that the prefix cube requires the same storage space as the original data cube. The above approach is efficient for low dimensional data cubes. For high dimensional environments, two major problems exist. Firstly, the number of accesses required is $2^d$ (Ho et al, 1997), which can be prohibitive for large values of $d$ (where $d$ denotes the number of dimensions). Secondly, the storage required to store the prefix sum cube can be excessive. In a typical OLAP environment the data tends to be massive and yet sparse at the same time. The degree of sparsity increases with the number of dimensions (OLAP), and thus the number of non zero cells may be a very small fraction of the prefix sum cube, which by its nature has to be dense for its query processing algorithms to work correctly.

Another exact technique is the Dwarf cube method (Sismanis and Deligiannakis, 2002), which seeks to eliminate structural redundancies and factor them out by coalescing their store. Prefix redundancy arises when the fact table contains a group of tuples having a prefix of redundant values for its dimension attributes. On the other hand, suffix redundancy occurs when groups of tuples contain a suffix of redundant values for their dimension attributes. Elimination of these redundancies has been shown to be effective for dense cubes. Unfortunately, in the case of sparse cubes with a large number of dimensions, the fact table can actually increase in size (Sismanis et al, 2002), thus providing no gains in storage efficiency.
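As an illustration of the prefix sum idea, the following is a minimal sketch for a 2-dimensional cube, not the implementation of Ho et al. The function names `prefix_sum_cube` and `range_sum` are hypothetical, indices are 0-based for convenience, and the sample cube is made up. The running sum is applied along each dimension in turn, and a range sum is then answered with $2^d = 4$ lookups.

```python
def prefix_sum_cube(cube):
    """Return the prefix sum cube P of a 2-d data cube C (a list of rows)."""
    p = [row[:] for row in cube]
    # Apply P[A_i] = C[A_i] + P[A_(i-1)] along the second dimension ...
    for row in p:
        for j in range(1, len(row)):
            row[j] += row[j - 1]
    # ... and then along the first dimension.
    for i in range(1, len(p)):
        for j in range(len(p[i])):
            p[i][j] += p[i - 1][j]
    return p


def range_sum(p, lo, hi):
    """Sum of the original cells in the inclusive box lo..hi, using 2^d = 4 lookups."""
    (r0, c0), (r1, c1) = lo, hi

    def at(i, j):
        # Positions just outside the lower boundary contribute nothing.
        return p[i][j] if i >= 0 and j >= 0 else 0

    return at(r1, c1) - at(r0 - 1, c1) - at(r1, c0 - 1) + at(r0 - 1, c0 - 1)


cube = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
P = prefix_sum_cube(cube)
total = range_sum(P, (1, 1), (2, 2))  # covers the cells 5, 6, 8 and 9
```

Note that `P` has exactly as many cells as `cube`, and every one of them is populated even when most cells of `cube` are zero, which is the storage concern raised above.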

In contrast to exact methods, approximate methods attempt to strike a good balance between the degree of compression and accuracy. Query performance is enhanced by storing a compact version of the data cube. A histogram based approach was used by (Matias and Vitter, 1998), (Poosala and Ganti, 1999) and (Vitter et al, 1998) to summarize the data cube. However, histograms too suffer from the curse of high dimensionality, with both space and time complexity increasing exponentially with the number of dimensions (Matias et al, 1998).

The Progressive Approximate Aggregate approach (Lazaridis and Mehrotra, 2001) uses a tree structure to speed up the computation of aggregates. Aggregates are computed by identifying tree nodes that intersect the query region and then accumulating their values incrementally. All tree nodes that are fully contained in the query region provide an exact contribution to the query result, whereas nodes that have a partial intersection provide an estimate. This estimation is based on an assumption of spatial uniformity of attribute values across the underlying data cube. In practice this assumption may be invalid, as was the case with the real-world data (US Census) that we experimented with in our study. In contrast, our method makes no such assumptions on the shape of the source data distribution. Another issue with the above scheme is that it has a worst case run time performance that is linear in the number of data elements covered by the query, as observed by (Chen and Chen, 2003). This has negative implications for queries that span a large fraction of the underlying data cube, particularly since compression is not utilized to store source data.

Sampling is another approach that has been used to speed up aggregate processing. Essentially, a small random sample of the database is selected and the aggregate is computed on this sample. The sampling operation can be done off-line, whereby the sample is extracted from the database and all queries are run on this single extracted sample.
On the other hand, in on-line sampling, data is read randomly from the database each time a query is executed and the aggregate is computed on the dynamically generated sample (Hellerstein et al, 1996). The very nature of sampling makes it very efficient in terms of run time, but its accuracy has been shown to be a limiting factor in its widespread adoption (Vitter and Wang, 1999).

Vitter et al use the wavelet technique to transform the data cube into a compact form. It is essentially a data compression technique that transforms the original data cube into a Wavelet Transform Array (WTA), which is a fraction of the size of the original data cube. In their research Matias et al show that wavelets are superior to the histogram based methods, both in terms of accuracy and storage efficiency. Wavelets have also been shown to provide good compression with sparse data cubes, unlike the Dwarf compression method. Given the wavelet's superior performance over its rivals and the fact that it is an approximate technique, it was an ideal choice for comparison against our Prime Factor scheme, which is also approximate in nature. The next section provides a brief overview of the encoding and decoding procedures used in wavelet data compression.

In principle, any data compression scheme can be applied on a data warehouse. For example, a 3-dimensional warehouse that tracks sales by time period, location and products can be compressed along all three dimensions and then stored in the form of chunks (Sarawagi and Stonebraker, 1994). Chunking is a technique that is used to partition a d-dimensional array into smaller d-dimensional units. However, a high compression ratio is needed to offset the potentially huge secondary storage access times. This effectively ruled out standard compression techniques such as Huffman Coding (Cormack, 1985), LZW and its variants (Lempel and Ziv, 1977; Hunt 1998) as well as

Arithmetic Coding (Langdon, 1984). These schemes enable decoding to the original data with 100% accuracy, but suffer from modest compression ratios (Ng and Ravishankar, 1997). On the other hand, the trends analysis nature of decision making means that query results do not need to reflect 100% accuracy. For example, during a drill-down query sequence in ad-hoc data mining, initial queries in the sequence usually determine the truly interesting queries and regions of the database. Providing approximate, yet reasonably accurate answers to these initial queries gives users the ability to focus their explorations quickly and effectively, without consuming inordinate amounts of valuable system resources (Hellerstein, Haas and Wang, 1997). This means that lossy schemes which exhibit relatively high compression and near 100% accuracy would be the ideal solution to achieving acceptable performance for OLAP queries.

This chapter investigates and presents the performance of a novel scheme, called Prime Factor Compression (PFC), and compares it against the well known Wavelet approach (Vitter and Wang, 1998; Vitter and Wang, 1999; Chakrabarti and Garofalakis, 2000). Recent results have indicated that the Prime Factor Compression scheme outperforms the Wavelet scheme in terms of error stability, maintaining a very low and virtually constant error rate irrespective of the size of the query (Pears and Houliston, 2007). This is in marked contrast to the Wavelet scheme, which exhibits large swings in accuracy with varying query size. This problem has been attributed to the thresholding technique that is used to reduce the size of the encoded data (Garofalakis and Gibbons, 2004) and that is an integral part of the Wavelet compression scheme. The Prime Factor scheme, on the other hand, does not rely on thresholding but keeps a smaller version of every data element from the original data, and is thus able to achieve a much higher degree of error stability, which is important from a Data Analyst's point of view.
In this chapter we provide a fuller exploration of the PFC scheme, including detailed results on various different types of datasets and a formal evaluation of its performance on sparse data.

Wavelet Data Compression Scheme

The Wavelet scheme compresses by transforming the source data into a representation (the Wavelet Transform Array or WTA) that contains a large number of zero or near-zero values. The transformation uses a series of averaging operations that operate on each pair of neighboring source data elements. In this manner the original data is transformed into a data set (the Level 1 transform set) that contains the averages of pairs of elements from the original data set. The pairwise averaging process is then applied recursively on the Level 1 transform set. The process continues in this manner until the size of the transformed set is equal to 1. In order to be able to reconstruct the original data, the pair-wise differences between neighbors are kept in addition to the averages. The WTA array then consists of all pair-wise averages and differences accumulated across all levels. A thresholding function is then applied on the WTA to remove a large fraction of array elements which are small in value. The thresholding function applies a weighting scheme on the WTA elements, as those elements at the higher levels play a more significant role in reconstruction than their counterparts at the lower levels. For details of the wavelet encoding and decoding techniques the reader is referred to (Vitter et al, 1999).

Wavelet Decoding

The decoding process reconstructs the original data by using the truncated version of the WTA. The decoding process is best illustrated with the following example. Figure 1 shows how the

coefficients of the original array are reconstructed from the WTA (the C coefficients hold the original array while the S coefficients hold the WTA).

[Figure 1: The Wavelet Reconstruction Process. A tree of WTA coefficients S0 to S7 above the reconstructed values C0 to C7.]

Any coefficient C(i) is reconstructed by using its ancestor S coefficients in the path from the root node to itself. Thus for example, C(0) = S(0) + S(1) + S(2) + S(4) and C(1) = S(0) + S(1) + S(2) - S(4). Consider a scenario where S(4), although relatively large in comparison to S(0), S(1) and S(2), is thresholded out due to its lower weighting, thus leading to inaccuracies in the estimation of C(0) and C(1). This is symptomatic of the general case where the lower level coefficients are significant contributors to the reconstruction process but are thresholded out to make way for their more highly weighted ancestor coefficients. The problem is especially acute in the case of data that is both skewed and has a high degree of variability.

THE PRIME FACTOR SCHEME

In response to the problems associated with wavelets, we present the Prime Factor Scheme. The scheme works broadly on the same lines as the Wavelet technique in the sense that data compression is used to reduce the size of the data cube prior to processing of OLAP queries. OLAP queries are run on the compressed data, which is decoded to yield an approximate answer to the query. Our encoding scheme uses pre-processing to reduce the degree of variation in the source data, which results in better compression. An overview of the encoding process is given in the following section.

Overview of Prime Factor Encoding Scheme

The data is first scaled using the standard min-max scaling technique. With this technique, any value $V$ in the original cube is transformed into $V'$, where $V' = (V - min) \cdot (max' - min') / (max - min) + min'$, where $min$, $max$ represent the minimum and maximum values respectively in the original data cube; $min'$ and $max'$ are the corresponding minimum and maximum values of the

scaled set of values. Each scaled value $V'$ is then approximated by its nearest prime number in the scale range $[min', max']$. The choice of $min'$ and $max'$ influences both the degree of compression and the error rate, as we shall show later in the Experimental Setup section. The rationale behind the scaling is to induce a higher degree of homogeneity between values by compressing the original scale $[min, max]$ into a smaller scale $[min', max']$, with $min' \ge min$ and $max' < max$. In doing so, this preprocessing step improves the degree of compression.

The scaled data cube is then partitioned into equal sized chunks. The size of the chunk, c, represents the number of cells that are enclosed by the chunk. The size of the chunk affects the decoding (query processing) time. Higher values mean that fewer chunks need to be decoded, thus reducing the decoding time (see Theorem 3 in the section on On Line Reconstruction of Queries). Each chunk is encoded by the prime factor encoding algorithm, which yields an array containing 2c cells. Although the encoded version has twice the number of cells, it is much smaller in size since each cell is very small in numerical value. In fact, our experimental results reveal that the vast majority of values are very small integers (see Figures 3a and 3b in the Experimental Setup section). Figure 2 below summarizes the Prime Factor encoding process.

In addition to transforming values to very small integers, the other major benefit of the algorithm is that the integers are highly skewed in value towards zero. For example, on the Census data set (US Census) with a chunk size of 64, over 75% of the values turned out to be zero. These results were also borne out with the synthetic data sets that we tested our scheme on. The source (original) data in some of these data sets was not skewed in nature, showing that the skew was induced, rather than being an inherent feature of the original data.
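The pre-processing described above (min-max scaling, approximation by the nearest prime, and chunking) can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names are hypothetical, the raw values and the ranges [0, 10000] and [0, 100] are made up, the prime table is extended with the pseudo prime 0 that the chapter uses for zero cells, and breaking ties towards the smaller prime is an assumption.

```python
from bisect import bisect_left


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def min_max_scale(v, lo, hi, new_lo, new_hi):
    """Map v from the original range [lo, hi] onto the tighter range [new_lo, new_hi]."""
    return (v - lo) * (new_hi - new_lo) / (hi - lo) + new_lo


def nearest_prime(x, table):
    """Closest entry of the sorted table to x; ties go to the smaller entry (assumption)."""
    pos = bisect_left(table, x)
    if pos == 0:
        return table[0]
    if pos == len(table):
        return table[-1]
    below, above = table[pos - 1], table[pos]
    return below if x - below <= above - x else above


def chunks(values, c):
    """Split a flat list of values into consecutive chunks of c cells."""
    return [values[i:i + c] for i in range(0, len(values), c)]


TABLE = [0] + primes_up_to(100)          # pseudo prime 0 plus the primes in [0, 100]
raw = [3650, 200, 7100, 9700, 0, 0, 5000, 10000]
scaled = [nearest_prime(min_max_scale(v, 0, 10000, 0, 100), TABLE) for v in raw]
chunked = chunks(scaled, 4)
```

With these inputs the zero cells stay at 0 and every other cell becomes a prime no larger than 97, so each chunk handed to the encoder holds only values from a small, fixed table.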
[Figure 2: Overview of the Prime Factor Scheme. Min-Max Scaling (scaled values in tighter range), then Chunking of scaled values (chunked input stream), then Prime Factor Encoding (Prime Factor array), then Elias Encoding of Prime Factor Array (encoded stream).]

We exploited the skewed nature of the encoded data by using the Elias variable length coding scheme (Elias, 1975). Elias encoding works by encoding smaller integers in a smaller number of bits, thus further improving the degree of compression. The next section will describe the details involved in step 3 of the above process, the PFC encoding algorithm.

The Prime Factor Encoding Algorithm

The algorithm takes as its input the scaled set of values produced by the min-max scaling technique. For each chunk, every scaled value is converted into the prime number that is closest to it. We refer to each such prime number as a prime factor. The algorithm makes use of two

operators which we define below. Both operators α and β take as their input a pair of prime factors $V_k$ and $V_{k+1}$.

Definition 1: The operator α is defined by $\alpha(V_k, V_{k+1}) = \text{nearest prime}(V_k + V_{k+1} + I(V_{k+1}) - I(V_k))$, where $I(V_{k+1})$, $I(V_k)$ denote the ordinal (index) positions of $V_{k+1}$ and $V_k$ on the prime number scale. The operator takes two primes $(V_k, V_{k+1})$, adds them together with the difference in index positions between the 2nd prime and the 1st prime, and converts the sum obtained to the nearest prime number.

Definition 2: The operator β is defined by $\beta(V_k, V_{k+1}) = \text{nearest prime}(V_k + V_{k+1})$.

The α operator is applied pair-wise across all values (a total of c prime factors) in the chunk. This yields a stream of c/2 prime factors. The α operator is then applied recursively on the processed stream until a single prime factor is obtained. This recursive procedure gives rise to a tree of height $\log_2 c$, where c is the chunk size. We refer to this tree as the Prime Index Tree. In parallel with the construction of the prime index tree we construct another tree, called the Prime Tree. The prime tree is constructed in the same manner as its prime index counterpart except that we apply the β operator instead of the α operator.

We illustrate the construction of the trees with the help of the following example. For the sake of simplicity we take the cube to be of size 4, the chunk size c to be 4 (i.e. we have only one chunk) and the scale range $[min', max']$ to be [0, 100].

[Figure 3a: The construction of the Prime Index Tree]
[Figure 3b: The construction of the Prime Tree]

For the prime index tree (Figure 3a), the prime values 37 and 2 at the leaf level are summed as 37 + 2 + I(2) - I(37), which is transformed to its nearest prime number, 29. Similarly 71 and 97 when processed yield the prime number 173. The node values 29 and 173 in turn yield a root value of 233. As shown in Figure 3b, the leaf values 37 and 2, when summed and converted to the nearest prime, yield a value of 37. Following the same process, we obtain values of 167 and 199 for the remaining nodes.
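The construction of the two trees can be traced with a short sketch. It takes the four leaf values of the example to be 37, 2, 71 and 97; the helper names are hypothetical, the index table is assumed to start 0, 1, 2, 3, 5, ... (only index differences matter at this stage), and ties in the nearest-prime rounding are assumed to go to the smaller prime.

```python
from bisect import bisect_left


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


TABLE = [0, 1] + primes_up_to(1000)            # assumed layout of the prime table
INDEX = {p: i for i, p in enumerate(TABLE)}    # I(p): ordinal position of p


def nearest_prime(x):
    """Closest table entry to x; ties go to the smaller entry (assumption)."""
    pos = bisect_left(TABLE, x)
    if pos == 0:
        return TABLE[0]
    if pos == len(TABLE):
        return TABLE[-1]
    below, above = TABLE[pos - 1], TABLE[pos]
    return below if x - below <= above - x else above


def alpha(a, b):
    """Definition 1: sum plus the index difference, rounded to the nearest prime."""
    return nearest_prime(a + b + INDEX[b] - INDEX[a])


def beta(a, b):
    """Definition 2: plain sum rounded to the nearest prime."""
    return nearest_prime(a + b)


def build_tree(leaves, op):
    """Apply op pair-wise, level by level, until a single root remains."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([op(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


leaves = [37, 2, 71, 97]
prime_index_tree = build_tree(leaves, alpha)
prime_tree = build_tree(leaves, beta)
```

Each tree is returned as a list of levels, leaves first, so the last level holds the root.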
We annotate each internal node by its corresponding index value. The encoded array E now consists of the differences in index positions across corresponding positions in the two trees. These differences are referred to as differentials. For the example above the array turns out to be E = {5, -2, 1}. We also store the index value of each prime tree root separately in another array R. Each chunk gives rise to its own prime tree root, and for the simple example above, since we have only one chunk, this results in a single value {47} for R.
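The differentials and the root index can be reproduced from the internal node values of the worked example (233, 29, 173 in the Prime Index Tree and 199, 37, 167 in the Prime Tree). In this sketch the index table is assumed to start 0, 1, 2, 3, 5, ...; including 1 is an inference made here because it is the layout under which the root 199 receives the index 47 quoted in the text. The variable names are hypothetical.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


# Assumed index table: pseudo prime 0 at index 0, then 1, then the primes.
TABLE = [0, 1] + primes_up_to(300)
INDEX = {p: i for i, p in enumerate(TABLE)}

# Internal nodes of the two trees from the worked example, root first.
prime_index_nodes = [233, 29, 173]
prime_nodes = [199, 37, 167]

# Differentials: index of the Prime Index Tree node minus index of the Prime Tree node.
E = [INDEX[a] - INDEX[b] for a, b in zip(prime_index_nodes, prime_nodes)]
# One root index per chunk; there is a single chunk here.
R = [INDEX[prime_nodes[0]]]
```

The node values themselves run into the hundreds, while the differentials stay in single digits, which is what makes the later variable length coding effective.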

Decoding requires the use of the two arrays E and R. We start with the first value in array R, which is 47, and then add it to the first value in array E, which is 5, yielding the value 52. We use 52 as an index into a table of primes (this table has the pseudo prime 0 added to it with index value 0) and extract 233 as the corresponding prime value. We now search for pairs of primes $(P_1, P_2)$ which satisfy the condition: $\alpha(P_1, P_2) = 233$, $\beta(P_3, P_4) = 199$, $I(P_3) = I(P_1) + 2$ and $I(P_4) = I(P_2) - 1$ (1), where $P_3$, $P_4$ are the nodes in the prime tree that correspond to $P_1$, $P_2$. This search in general would yield a set S of candidate pairs for $(P_1, P_2)$ rather than a single pair. In order to extract the correct pair, we associate an integer with each internal node of the Prime Index tree which records the ordinal position within the set S of the pair that enables us to descend to the next level of the tree. In this case this integer turns out to be 0, since there is only one pair that satisfies condition (1). These integers are collectively referred to as offsets. The complete set of offsets for the tree above is {0, 0, 0}. The complete version of E contains the sequence of differentials followed by the sequence of offsets. Thus for our example we have E = {5, -2, 1, 0, 0, 0}. We are now able to decode by descending both trees in parallel and recover the original set of leaf values 37, 2, 71 and 97.

For ease of understanding, the encoding process above has been described for the 1-dimensional case. The extension to d dimensions follows naturally by encoding along each dimension in sequence. For example, if we have a 2-dimensional cube $<D_1, D_2>$ we would first construct 2-dimensional chunks. With a chunk size of 16 and the use of equal width across each dimension, each chunk would consist of a 4 by 4 2-d array. We first run the encoding process across dimension $D_1$. This would yield a 1-dimensional array consisting of 4 prime root values for each chunk. The differentials and offsets that result from this encoding are stored in a temporary cache.
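The first decoding step and condition (1) can be checked with a small sketch. It recovers the two roots from R and E and then tests whether a given candidate pair satisfies condition (1); it deliberately does not enumerate the candidate set S or assign offsets. The helper names are hypothetical, the index table is assumed to start 0, 1, 2, 3, 5, ..., and ties in the rounding are assumed to go to the smaller prime.

```python
from bisect import bisect_left


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


TABLE = [0, 1] + primes_up_to(1000)            # assumed layout of the prime table
INDEX = {p: i for i, p in enumerate(TABLE)}


def nearest_prime(x):
    """Closest table entry to x; ties go to the smaller entry (assumption)."""
    pos = bisect_left(TABLE, x)
    if pos == 0:
        return TABLE[0]
    if pos == len(TABLE):
        return TABLE[-1]
    below, above = TABLE[pos - 1], TABLE[pos]
    return below if x - below <= above - x else above


def alpha(a, b):
    return nearest_prime(a + b + INDEX[b] - INDEX[a])


def beta(a, b):
    return nearest_prime(a + b)


E = [5, -2, 1]   # differentials from the worked example, root first
R = [47]         # index of the Prime Tree root

prime_root = TABLE[R[0]]                # root of the Prime Tree
prime_index_root = TABLE[R[0] + E[0]]   # root of the Prime Index Tree


def satisfies_condition(p1, p2):
    """Test condition (1) for a candidate pair (p1, p2) of Prime Index Tree children."""
    i3 = INDEX[p1] - E[1]   # I(P3) = I(P1) + 2
    i4 = INDEX[p2] - E[2]   # I(P4) = I(P2) - 1
    if not (0 <= i3 < len(TABLE) and 0 <= i4 < len(TABLE)):
        return False
    p3, p4 = TABLE[i3], TABLE[i4]
    return alpha(p1, p2) == prime_index_root and beta(p3, p4) == prime_root


ok = satisfies_condition(29, 173)        # the pair from the worked example
rejected = satisfies_condition(31, 173)  # matches alpha but fails the beta test
```

The second call shows why both trees are consulted: the pair (31, 173) reproduces the Prime Index Tree root on its own but is ruled out by the Prime Tree.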
The 4 prime root values from encoding on $D_1$ are then subjected to the encoding process across dimension $D_2$ to yield the final encoding. The differentials and offsets that result from encoding along $D_2$ are merged with those from encoding along dimension $D_1$ to yield the final set of encoded values.

Encoding Performance

The encoding takes place in four steps as given in Figure 2. Steps 1 and 2, involving scaling and chunking, can be done together. As data is read it can be scaled on the fly with the chunking process. If the original data is stored in dimension order $(D_1, D_2, \ldots, D_d)$ with the rightmost indices changing more rapidly, then it can be shown that the I/O complexity involved in chunking is $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$, where M is the available memory size and B is the block size of the underlying database system. The reader is referred to [Vitter 1999] for a proof. The I/O complexity of steps 3 and 4 is $O\left(\frac{N}{B}\right)$. Thus the I/O complexity is bounded by the $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$ term required for the chunking step. However, it should be noted that this only reflects a one time cost in reorganizing the data from dimension order to chunked format. Once this is done, no further chunking is required, as updates to values do not affect the chunk structure. Thus the time complexity on a regular basis would be bounded above by $O\left(\frac{N}{B}\right)$. The only exception occurs when the dimensions are reorganized and either grow or shrink as a result. This would require repetition of the chunking step.

The Rationale behind the Prime Factor Scheme

Prime numbers provide us with a natural way of constraining the search space since they are much less dense than ordinary integers. The first 100 positive integers are distributed across only 25 primes. At the same time the primes themselves become less dense as we move up the integer scale (Andrews, 1994). The next 900 positive integers after 100 only contain 143 primes. This means that the prime number encoding technique has good scalability with respect to data value size. From the error point of view the use of primes introduces only small errors, as it is possible to find a prime in close neighborhood to any given integer (Andrews, 1994). Theorem 1 below gives the distribution of primes in the general case.

Theorem 1: The number of prime numbers less than or equal to a given number N approaches $\frac{N}{\log_e(N)}$ for large N.

Proof: The reader is referred to (Andrews, 1994) for a proof.

Theorem 1 reinforces the observations made above. Firstly, the division by the logarithm term ensures that the primes are less dense than ordinary integers. Secondly, the average gap between a given prime N and its successor is approximately $N / \frac{N}{\log_e N} = \log_e N$. This means that encoding a number N using its nearest prime will result in an absolute error of the order of $\log_e N$ on the average. These properties make prime numbers an attractive building block for encoding numerical data.

The basic idea that we utilize is to convert a stream of primes into a single prime, the Prime Tree root value, by a series of pair wise add operations. We then augment the root value with a set of coefficients to enable us to decode. The use of prime numbers enables us to drastically reduce the search space involved in decoding and as a consequence reduce the space required to store the encoded data. As an example, consider the prime root value of 29. In order to decode to the next level (assuming that we simply use the Prime Tree) we have to consider just eight combinations: (5,23), (23,5), (7,23), (23,7), (11,19), (19,11), (13,17) and (17,13).
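The prime counts quoted above, and the estimate of Theorem 1, are easy to check numerically. The following sketch uses a plain sieve; the variable names are hypothetical.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


primes = primes_up_to(1000)
first_100 = sum(1 for p in primes if p <= 100)        # primes among 1..100
next_900 = sum(1 for p in primes if 100 < p <= 1000)  # primes among 101..1000

# Theorem 1 estimate of the number of primes up to 1000, and the implied average gap.
estimate_1000 = 1000 / math.log(1000)
average_gap = math.log(1000)
```

The estimate is asymptotic, so at N = 1000 it is only a rough guide to the true count of 168, but it already shows the average gap between neighboring primes growing slowly, as the logarithm of N.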
On the other hand, if prime numbers were not used as the basis, then we would have a total of 30 combinations to consider; in general, if N were the prime root, then N+1 combinations would have to be tested. The use of the Prime Index tree in conjunction with the Prime Tree enables us to constrain the search space even further. With the introduction of the former we are able to distinguish between pairs such as (5,23) and (23,5). The pair (5,23) encodes as 31 in the Prime Index tree and 29 in the Prime tree, whereas (23,5) encodes as 23 in the Prime Index tree and 29 in the Prime tree. Apart from this, the other major benefit of growing two trees instead of one is that we can encode by taking the differentials between corresponding nodes across the two trees rather than the node values themselves. Since the two trees evolve from a common set of leaf values, the α and β operators evaluate to approximately the same values, which in turn causes the differentials to be much smaller than the node values themselves (see Figure 3a in the Experimental Setup section).
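The order sensitivity of the α operator can be seen directly on the pairs (5,23) and (23,5). This is a minimal sketch with hypothetical helper names; the index table is assumed to start 0, 1, 2, 3, 5, ... (only the index difference matters here), and ties in the rounding are assumed to go to the smaller prime.

```python
from bisect import bisect_left

# Assumed table: pseudo prime 0, then 1, then the primes below 100.
TABLE = [0, 1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
         53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
INDEX = {p: i for i, p in enumerate(TABLE)}


def nearest_prime(x):
    """Closest table entry to x; ties go to the smaller entry (assumption)."""
    pos = bisect_left(TABLE, x)
    if pos == 0:
        return TABLE[0]
    if pos == len(TABLE):
        return TABLE[-1]
    below, above = TABLE[pos - 1], TABLE[pos]
    return below if x - below <= above - x else above


def alpha(a, b):
    """Sum plus index difference I(b) - I(a), rounded to the nearest prime."""
    return nearest_prime(a + b + INDEX[b] - INDEX[a])


def beta(a, b):
    """Plain sum rounded to the nearest prime."""
    return nearest_prime(a + b)


forward = (alpha(5, 23), beta(5, 23))    # pair taken as (5, 23)
reverse = (alpha(23, 5), beta(23, 5))    # same primes in the opposite order
```

The β values agree because addition is symmetric, while the signed index difference pushes the α values apart, which is the information used to tell the two orderings apart during decoding.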

Comparison of the Wavelet and Prime Factor Schemes

The wavelet and prime factor schemes both use the concept of reducing the original data to differentials between progressively increasing sub sets. However, one major difference is that the Wavelet scheme uses thresholding to drop some of the differentials. As noted before, this makes it unstable with respect to the error rate (Garofalakis and Gibbons, 2004), and this is confirmed by our results which we present later. In contrast, the Prime Factor scheme does not use thresholding but exploits the skew induced by the prime factor transformation to encode the resulting coefficients using an Elias code. This means that every value in the original data set is represented in the encoded version, which results in much greater stability over the Wavelet scheme.

Another major difference is that the encoded data in the Wavelet scheme is represented as relational tuples of the form $<i_1, i_2, \ldots, i_d, c>$, where each $i_k$ (k in the range 1 to d) identifies the index position within dimension k at which the surviving coefficient with value c is located. For large values of d, the degree of compression obtained can degrade quite severely as each index takes up additional storage. In contrast, the Prime Factor scheme does not require any such indexes as thresholding is not used, and thus we would expect it to have relatively better compression for high dimensionality data.

The Prime Factor scheme also copes well in the case of sparse data. All zero values in the original data will scale to 0 with the scaling scheme that we use. Let us consider an example where we have a run of six zeros followed by the primes 101 and 61 in the scaled data set, and suppose that we use a chunk size of 8. Since we have extended the prime number set to include the pseudo prime 0, the 0 values encode to 0 and this results in a left prefix tree where we have non-zero values preceded by a run of zero values.
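The sparse case can be traced with the same tree construction. The sketch below takes the two non-zero leaves to be 101 and 61, builds both trees for a chunk of size 8 and counts the internal nodes that are non-zero. The helper names are hypothetical, the index table is assumed to start 0, 1, 2, 3, 5, ..., and ties in the rounding are assumed to go to the smaller prime, so the individual node values should be read as illustrative.

```python
from bisect import bisect_left


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


TABLE = [0, 1] + primes_up_to(1000)            # assumed layout of the prime table
INDEX = {p: i for i, p in enumerate(TABLE)}


def nearest_prime(x):
    """Closest table entry to x; ties go to the smaller entry (assumption)."""
    pos = bisect_left(TABLE, x)
    if pos == 0:
        return TABLE[0]
    if pos == len(TABLE):
        return TABLE[-1]
    below, above = TABLE[pos - 1], TABLE[pos]
    return below if x - below <= above - x else above


def alpha(a, b):
    return nearest_prime(a + b + INDEX[b] - INDEX[a])


def beta(a, b):
    return nearest_prime(a + b)


def build_tree(leaves, op):
    """Apply op pair-wise, level by level, until a single root remains."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([op(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


leaves = [0, 0, 0, 0, 0, 0, 101, 61]
index_levels = build_tree(leaves, alpha)[1:]   # internal levels only
prime_levels = build_tree(leaves, beta)[1:]

internal_total = sum(len(level) for level in prime_levels)
nonzero_internal = sum(1 for level in prime_levels for node in level if node != 0)
nonzero_index_internal = sum(1 for level in index_levels for node in level if node != 0)
```

Only the three internal nodes on the path above the two non-zero leaves are non-zero in either tree; the other four are zero and head all-zero subtrees.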
Figures 4a and 4b below show the resulting trees.

[Figure 4a: Left Prefix Prime Index Tree, with nodes N1 and N2 labelled]
[Figure 4b: Left Prefix Prime Tree]

Similarly, when the original data stream consists of a sequence of non zero values followed by a run of zero values, we obtain a right prefix tree after encoding, where a non zero internal node has a zero valued node as its sibling in each of the Prime Index and Prime trees. The left and right prefix trees can be collapsed and stored compactly as indicated by the following Lemma and Theorem.

Lemma 1: If $\beta(P_1, P_2) = 0$, then we must have $P_1 + P_2 = 0$.

Proof: From the definition of β, we have $\beta(P_1, P_2) = P_1 + P_2 + s = 0$ (1), where s is an integer that rounds the sum $P_1 + P_2$ to the nearest prime factor. Suppose that $s \ne 0$. We now consider two cases, $s < 0$ and $s > 0$.

Case 1: $s < 0$. Since $s < 0$, it follows that $P_1 + P_2 < 1$ (2), otherwise it would be impossible to round the sum $P_1 + P_2$ to 0. We also have $P_1 + P_2 \ge 0$ (3), since $P_1 \ge 0$, $P_2 \ge 0$. From (2) and (3) it follows that $P_1 + P_2 = 0$.

Case 2: $s > 0$. The proof is similar to Case 1 above. Since $s > 0$, the only way that we can satisfy (1) is for $P_1 + P_2 \le 0$ (4), in order to have any chance of rounding to 0. From (3) and (4) it follows that $P_1 + P_2 = 0$.

Theorem 2: If in a left prefix or right prefix tree a given node encodes to zero, then it follows that the entire sub tree under that node and its associated leaf values must be zero valued.

Proof: We use proof by contradiction as a proof sketch. We start with a pair of leaf nodes having values $P_1$ and $P_2$. Suppose that $P_1 \ne 0$, $P_2 \ne 0$. For the prime index tree, we have $\alpha(P_1, P_2) = P_1 + P_2 + I(P_2) - I(P_1) + r = 0$ (5), where r is an integer that rounds the value of the sum of the preceding terms to the nearest prime factor. From Lemma 1 we have $P_1 + P_2 = 0$. Substituting this in (5), we have $\alpha(P_1, P_2) = I(P_2) - I(P_1) + r = 0$. Thus $I(P_1) = I(P_2) + r$. Thus, if $r > 0$, we have $I(P_1) > I(P_2)$, which in turn means that $P_1 > P_2$. With $P_1 > P_2$ and $P_1, P_2 \ge 0$, we have $P_1 + P_2 > 0$, which leads to a contradiction. Similarly, with $r < 0$, we have $P_2 > P_1$, which again leads to $P_1 + P_2 > 0$. Thus, in either case, our original assumption of $P_1 \ne 0$, $P_2 \ne 0$ must be in error and we have $P_1 = 0$ and $P_2 = 0$. A similar proof holds for the prime tree. This means that any node with a value of 0 must give rise to a pair of zero valued child nodes (this is true of both the Prime Index and Prime trees). Each of these zero valued child nodes in turn will yield children that are zero valued. This completes the proof sketch.

We now return to the example in Figures 4a and 4b. In Figure 4a we can collapse the tree by pruning all branches for the sub trees rooted at nodes N1 and N2.
This means that we only need to store a total of six coefficients, made up of 3 differentials and 3 offsets (as opposed to a total of 14 for the non-sparse case). Thus it can be seen that the Prime Factor scheme adapts to a sparse data set without the need to keep explicit indexes to keep track of the zero values in the data set.

Encoding Error Rate for the Prime Factor Algorithm

In this section we present a formal analysis of the average relative error involved in decoding with the Prime Factor scheme. Theorem 3 below quantifies this error rate.

Theorem 3: The average relative error is approximately $\frac{\log_e(CV)}{2V}$, where C is the chunk size and V is the average value of an element within a chunk.
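To get a feel for how the estimate in Theorem 3 behaves, it can be tabulated for a few chunk sizes and average values. The sketch below evaluates $\log_e(CV) / (2V)$; the function name is hypothetical, the sample values of C and V are made up, and the constant factor of one half follows the nearest-prime argument of the proof sketch and should be treated as an assumption. The trends it shows do not depend on that constant.

```python
import math


def estimated_relative_error(c, v):
    """Estimate log_e(C * V) / (2 * V); the factor 1/2 is an assumption."""
    return math.log(c * v) / (2 * v)


# Growing the chunk size at a fixed average value V = 50.
by_chunk = [estimated_relative_error(c, 50) for c in (8, 64, 512)]

# Growing the average value (a wider scale range) at a fixed chunk size C = 64.
by_value = [estimated_relative_error(64, v) for v in (50, 500, 5000)]
```

Under this estimate the error rises only logarithmically as the chunk size grows, and falls as the scale range, and with it the average value V, is widened.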

Proof Sketch: Consider the Prime tree given in Figure 5 below with chunk size 8.

Figure 5: Error Tree for Chunk Size 8 (root $P_{15}$; second level $P_{13}$, $P_{14}$; third level $P_9$ to $P_{12}$; leaves $P_1$ to $P_8$)

The value of an internal node, say $P_9$, is given by $P_9 = P_1 + P_2 + \varepsilon_{P_1,P_2}$, where $\varepsilon$ is an error term that denotes the approximation to the nearest prime number. Similarly, $P_{13} = P_9 + P_{10} + \varepsilon_{P_9,P_{10}}$ and $P_{10} = P_3 + P_4 + \varepsilon_{P_3,P_4}$. Substituting for $P_9$ and $P_{10}$ in the expression for $P_{13}$ we obtain

$$P_{13} = P_1 + P_2 + P_3 + P_4 + \varepsilon_{P_1,P_2} + \varepsilon_{P_3,P_4} + \varepsilon_{P_9,P_{10}}.$$

Thus it can be seen that a parent node's value is given by the sum of the leaf node values that can be reached from the parent node plus the error terms at the intermediate node levels. In general, we have the root node value given by

$$P_{2C-1} = \sum_{i=1}^{C} P_i + \sum_{i=1}^{C-1} \varepsilon_{P_{i,1},P_{i,2}},$$

where $P_{i,1}$ and $P_{i,2}$ denote the two children of the $i$-th internal node. The sum of the values $T$ in the chunk is given by $T = \sum_{i=1}^{C} P_i$. Thus we have

$$d = P_{2C-1} - T = \sum_{i=1}^{C-1} \varepsilon_{P_{i,1},P_{i,2}} \quad ---(1)$$

From Theorem 1, we have the average gap between a prime number $P$ and its successor as $\log_e(P)$. This result can be used to estimate the average value of the error coefficient as $\frac{\log_e(P_{LC} + P_{RC})}{2}$, where $P_{LC}$ and $P_{RC}$ represent the left and right children of a given parent node; the division by 2 is necessary because the Prime Factor algorithm encodes each value with its nearest prime number.
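The identity in (1), that the root value minus the true chunk sum equals the sum of the rounding errors at the internal nodes, can be checked on a small example. The sketch below is a minimal illustration, assuming that a parent value is the prime nearest to the sum of its two children and that ties go to the smaller prime; the helper names (`primes_up_to`, `nearest_prime`, `build_prime_tree`) and the leaf values are hypothetical.

```python
from bisect import bisect_left


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit in ascending order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def nearest_prime(x, primes):
    """Prime closest to x; ties go to the smaller prime (an assumption)."""
    i = bisect_left(primes, x)
    if i == 0:
        return primes[0]
    if i == len(primes):
        return primes[-1]
    below, above = primes[i - 1], primes[i]
    return below if x - below <= above - x else above


def build_prime_tree(leaves, primes):
    """Return (root_value, error_terms) for a chunk whose size is a power of two."""
    level, errors = list(leaves), []
    while len(level) > 1:
        parents = []
        for left, right in zip(level[0::2], level[1::2]):
            parent = nearest_prime(left + right, primes)
            errors.append(parent - (left + right))  # signed rounding error
            parents.append(parent)
        level = parents
    return level[0], errors


primes = primes_up_to(2000)
leaves = [11, 23, 5, 47, 31, 2, 59, 13]  # hypothetical chunk of size 8
root, errors = build_prime_tree(leaves, primes)
d = root - sum(leaves)  # equals the sum of the C - 1 error terms
```

A chunk of size $C$ always produces $C - 1$ error terms, one per internal node, which is the count used in the summation limits above.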

Put $P_{i,1} + P_{i,2} = K_i T$, where $K_i \leq 1$ for $i = 1, 2, \ldots, C-1$. Each error term can now be written as $\varepsilon_{P_{i,1},P_{i,2}} = \frac{\log_e(K_i T)}{2}$. Thus from (1) above we have

$$d \approx \frac{1}{2}\sum_{i=1}^{C-1}\log_e(K_i T) = \frac{1}{2}\left[\sum_{i=1}^{C-1}\log_e(K_i) + (C-1)\log_e(T)\right] \approx \frac{1}{2}\left[\sum_{i=1}^{C-1}(K_i - 1) + (C-1)\log_e(T)\right],$$

using the Taylor series expansion for the log function and taking a first order approximation for $0 < K_i \leq 1$. From the definition of $K_i$, we have $\sum_{i=1}^{C-1} K_i \approx \log_2(C)$. Since $\log_2(C) \ll (C-1) \ll (C-1)\log_e(T)$, we have

$$d \approx \frac{C-1}{2}\log_e(T).$$

The relative error is $e = \frac{d}{CV} \approx \frac{(C-1)\log_e(CV)}{2CV}$, using $T = CV$. Note: the scaling terms that scale the difference $d$ and the sum $CV$ to the original data scale do not need to be considered, as they cancel each other in the numerator and denominator. Thus

$$e \approx \frac{\log_e(CV)}{2V} \quad \text{for large } C.$$

The above result captures the relationship between the error rate, chunk size and scale range. As the chunk size increases, the error rate grows as a log function. This is shown visually in Figure 11b, whereby the average error rate increases sub linearly with respect to increasing chunk size. The above expression also shows that for a given chunk size $C$, the error rate decreases with an increase in scale range size. As the scale range size increases, the $V$ term increases at a faster rate than the $\log_e(CV)$ term, thus causing a drop in the error rate. Figure 10b displays this trend.

ON LINE RECONSTRUCTION OF QUERIES

In this section we examine how multi-dimensional range sum queries are answered with the Prime Factor scheme. These queries are of the form $\{\, l_i \leq D_i \leq h_i \mid l_i, h_i \in \{0, 1, \ldots, D_i - 1\},\ h_i = l_i + \Delta_i \,\}$, where $\Delta_i$ is a positive integer. We will first derive an expression for the time complexity involved in answering a 1-dimensional query, as some of the concepts involved are shared with the general n-dimensional case.

Answering One Dimensional Range Sum Queries

In this case we need to consider three regions $R_1$, $R_2$ and $R_3$ as shown in Figure 6 below. The query requires the sum of all elements in the 1-dimensional cube which are in the range $[l_1, h_1]$, with $l_1 =$ and $h_1 = 690$. The bulk of the query resides in region $R_2$ (the shaded region), which is aligned with the chunk structure. The sum across this region can be answered very efficiently by taking the sum of the root values for the chunks that span this region, thus avoiding the need for decoding across a large portion of the query. Since each of these values corresponds to the root associated with the Prime tree, they are a close approximation to the sum contained within a chunk.

Figure 6: Decoding and answering a 1-dimensional query (regions $R_1$, $R_2$, $R_3$)

Furthermore, with a large enough chunk size we would expect that the root values comprise a very small fraction of the size of the original cube and can thus be either held in memory or stored on a small number of blocks on secondary storage (with a chunk size of 256, for example, these root values comprise just 1/256 of the number of elements within the original cube). Note that this is true irrespective of the dimensionality of the query or that of the cube. The only decoding that is necessary is for the two chunks at the ends of the query, i.e. regions $R_1$ and $R_3$. Thus it can be seen that the chunking has helped to minimize the effort involved in decoding.

It is also important to note that the decoding effort is dependent on the dimensionality of the query involved and not on the dimensionality of the cube. For example, for a query such as $[l_3, h_3]$ (say with $l_3 =$ and $h_3 = 690$) on dimension 3 of a 5-dimensional cube, the number of chunks to be decoded is still 2, since we only need to consider the one dimensional slice along dimension 3 which is aggregated across dimensions 1, 2, 4 and 5. Again, this observation holds for the general case as well. However, the amount of effort (in terms of the number of chunks to be decoded) involved will depend on the dimensionality of the query, as we shall see in the next section.
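The one-dimensional strategy described above can be sketched as follows. This is an illustrative sketch rather than the PFC implementation: exact chunk sums stand in for the (approximate) prime tree root values, `decode_chunk` is a hypothetical callback standing in for the real decoding step, and the lower bound of 100 is an arbitrary choice (only the upper bound 690 comes from the text).

```python
def range_sum_1d(chunk_roots, decode_chunk, chunk_size, low, high):
    """Sum over [low, high] inclusive; returns (total, number_of_chunks_decoded)."""
    first, last = low // chunk_size, high // chunk_size
    total, decoded = 0, 0
    for k in range(first, last + 1):
        start = k * chunk_size
        end = start + chunk_size - 1
        if low <= start and end <= high:
            # Chunk lies wholly inside the query: its stored root value suffices.
            total += chunk_roots[k]
        else:
            # Boundary chunk: decode its leaves and keep only the overlapping part.
            values = decode_chunk(k)
            decoded += 1
            lo = max(low, start) - start
            hi = min(high, end) - start
            total += sum(values[lo:hi + 1])
    return total, decoded


data = list(range(1, 1025))  # hypothetical 1-dimensional cube
chunk_size = 64
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
chunk_roots = [sum(c) for c in chunks]  # stand-in for prime tree root values

total, decoded = range_sum_1d(chunk_roots, lambda k: chunks[k], chunk_size, 100, 690)
```

However long the query range is, at most the two chunks at its ends are decoded; every other chunk contributes through its root value alone.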
In the next section we will present an expression for the general case, involving queries on n dimensions.

The n-Dimensional Case

Suppose that the query spans n dimensions. On each of these dimensions we can define three regions: the two regions at the ends of the query and the middle region which is aligned with the chunk structure. This leads to a total of $3^n$ regions. Out of these, the majority of the query is aligned with the chunk structure in a single contiguous region in n-dimensional space, thus requiring decoding across $3^n - 1$ regions. Each of these regions is on a boundary of the query in n-dimensional space and thus tends to be small in size. Theorem 4 below quantifies the amount of decoding effort involved in the general case.

Theorem 4: The total number of chunks to be decoded for an n-dimensional query is $2^n + \frac{3^n - 2^n - 1}{n}\sum_{i=1}^{n}(d_i - 2)$, where $d_i = \frac{\Delta_i}{c}$.
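The region bookkeeping used in this section (three regions per dimension, one aligned interior region, and corner regions at the extremes) can be checked by brute-force enumeration. The sketch below is only a counting aid; the function name `classify_regions` and the region labels are hypothetical.

```python
from itertools import product


def classify_regions(n):
    """Count the 3^n regions of an n-dimensional query by type."""
    counts = {"interior": 0, "corner": 0, "other_boundary": 0}
    for region in product(("low_end", "middle", "high_end"), repeat=n):
        middles = region.count("middle")
        if middles == n:
            counts["interior"] += 1        # aligned with the chunk structure
        elif middles == 0:
            counts["corner"] += 1          # an end region on every dimension
        else:
            counts["other_boundary"] += 1  # end region on some dimensions only
    return counts


counts_2d = classify_regions(2)
counts_3d = classify_regions(3)
```

For two dimensions this gives one interior region, four corners and four remaining boundary regions, nine in total.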

Proof Sketch: We will first consider the regions on the corners of the cube defined by the query. There are $2^n$ such corners. Each of these corner regions is contained in exactly one chunk. This leaves a total of $(3^n - 2^n - 1)$ regions, occurring along n dimensions. Each of these regions along dimension $i$ will span $(d_i - 2)$ chunks, since 2 of the total number comprise the corner chunks. Thus the total number of these chunks is $\frac{3^n - 2^n - 1}{n}\sum_{i=1}^{n}(d_i - 2)$. To this number we must add the $2^n$ corner regions. This completes the proof.

Figure 7a: The region aligned with the chunk structure
Figure 7b: Regions bounded by a 2-dimensional query (regions $R_1$ to $R_9$)

Figures 7a and 7b provide a geometrical explanation for the 2D case. Figure 7a illustrates the region that requires no decoding, and it can be seen that it covers the bulk of the query space. Figure 7b gives a complete breakdown of the regions bounded by the query. The region $R_7$ corresponds to the shaded region in Figure 7a; $R_1$, $R_2$, $R_3$ and $R_4$ represent the four corner regions that each require decoding a single chunk; the regions $R_5$, $R_6$, $R_8$ and $R_9$ in general span larger regions containing multiple chunks that require decoding.

Theorem 5: For a given query dimensionality, the chunking scheme improves decoding efficiency by a minimum factor of $\frac{\prod_{i=1}^{n} d_i}{3^{n}\sum_{i=1}^{n} d_i}$.

Proof Sketch: Follows from Theorem 4 above and the fact that an n-dimensional query requires decoding $\prod_{i=1}^{n} d_i$ chunks if aggregate sums (prime tree root values) are not stored at chunk level.

IMPLEMENTATION CONSIDERATIONS

In this section we will provide a brief overview of the implementation of some of the major components of PFC. The system was implemented entirely in Java. We made use of a table of prime numbers (Alfeld) readily available from the Web. All other functionality was custom built.

Encoding

The three main functions that were used extensively with respect to encoding were Find_Nearest_Prime(N), Evaluate_Alpha(N, M) and Evaluate_Beta(N, M).
The Find_Nearest_Prime(N) function was fairly straightforward to implement, as the use of the prime table meant that the complexity of testing for primes was avoided. The other two functions, Evaluate_Alpha(N, M) and Evaluate_Beta(N, M), formed the core of the PFC scheme and were used to build the Prime Index and Prime Trees respectively. These functions essentially took two prime numbers as their arguments and converted their sum to the nearest prime number (the Evaluate_Alpha function also added a component that involved the difference in index positions between the two given primes). Thus these two functions were quite straightforward to implement as well, as they only involved simple computations.

Decoding

With respect to decoding there were two cases to consider. The first case involved chunks that required no recovery of the individual leaf nodes within the chunk. This case covers a large fraction of the chunks involved in a query, as quantified by Theorem 4. For these chunks all that was required was extraction of the prime tree root value and subsequent re-scaling to transform the value back to the original source data scale range.

The second case involved chunks that required actual decoding, and this was accomplished through a function, Descend_Prime_Tree(N), that was used to descend one level down the Prime Tree. Basically, this function took a prime number denoting the ancestor node in the corresponding tree as its argument, and then returned a set containing pairs of primes that were candidates for the child nodes. We used a heuristic to speed up the generation of these candidate sets. For a given ancestor node with prime value N, we extracted the prime number M that is nearest to N/2. The pair (M, M) is guaranteed to be a candidate for the Prime Tree child nodes. This pair was used as a starting point for the generation of the rest of the pairs. Figure 8 below illustrates the algorithm used for generating the Prime Tree candidate set.
    Descend_Prime_Tree (N)
    {
        C = {};
        M = nearest_prime(N/2);
        C = C ∪ {(M, M)};
        Idx = I(M);                            // the index value of prime M
        U = M;
        L = nearest_prime(P[Idx-1]);           // the largest prime less than M
        while (L ≥ min and U ≤ max)            // [min..max] represents the scale range
        {
            V = β(U, L);
            if (V == N) then
            {
                C = C ∪ {(L, U), (U, L)};
                U = nearest_prime(P[I[U]+1]);  // increase upper value in prime pair
            }
            else if (V < N) then
                U = nearest_prime(P[I[U]+1]);  // increase upper value in prime pair
            else
                L = nearest_prime(P[I[L]-1]);  // decrease lower value in prime pair
        }
        return C;
    }

Figure 8: Algorithm for generating Candidate Prime Pairs

In order to further speed up the process of decoding we cached the prime pairs for a given integer N. The decoding process benefits from such a cache, as prime tree nodes with a given value N are likely to recur many times across different chunks. In such cases, the process of generating prime pairs is reduced to a fast in-memory table lookup. Our experiments show that this cache occupied less than % of the storage of the original source data.
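The candidate generation of Figure 8, together with the cache, can be sketched in Python as below. This is an illustrative sketch rather than the authors' Java code. It assumes that β returns the prime nearest to the sum of its arguments, that ties are broken toward the smaller prime, and that the scale range is [2..307]; the names `PRIMES`, `nearest_prime`, `beta` and `descend_prime_tree` are hypothetical. Because the tie-breaking rule is an assumption, the pair (M, M) is added only when it actually encodes to N.

```python
from bisect import bisect_left
from functools import lru_cache


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit in ascending order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


PRIMES = tuple(primes_up_to(5000))  # stands in for the prime table


def nearest_prime(x):
    """Prime closest to x; ties go to the smaller prime (an assumption)."""
    i = bisect_left(PRIMES, x)
    if i == 0:
        return PRIMES[0]
    if i == len(PRIMES):
        return PRIMES[-1]
    below, above = PRIMES[i - 1], PRIMES[i]
    return below if x - below <= above - x else above


def beta(p1, p2):
    """Assumed encoding function: the prime nearest to p1 + p2."""
    return nearest_prime(p1 + p2)


@lru_cache(maxsize=None)  # plays the role of the prime pair cache
def descend_prime_tree(n, lo=2, hi=307):
    """Candidate child pairs for a parent value n, following the scan of Figure 8."""
    m = nearest_prime(n / 2)
    candidates = set()
    if beta(m, m) == n:
        candidates.add((m, m))
    upper = bisect_left(PRIMES, m)  # index of U, which starts at M
    lower = upper - 1               # index of L, the largest prime below M
    while (lower >= 0 and PRIMES[lower] >= lo
           and upper < len(PRIMES) and PRIMES[upper] <= hi):
        low_p, up_p = PRIMES[lower], PRIMES[upper]
        v = beta(up_p, low_p)
        if v == n:
            candidates.update({(low_p, up_p), (up_p, low_p)})
            upper += 1              # increase upper value in prime pair
        elif v < n:
            upper += 1              # increase upper value in prime pair
        else:
            lower -= 1              # decrease lower value in prime pair
    return frozenset(candidates)


pairs = descend_prime_tree(23)
```

The scan moves only one pointer per step, so it mirrors the figure and is not claimed to enumerate every pair that encodes to N. Repeated calls with the same N are answered from the `lru_cache` table without re-running the scan.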

With the compilation of the candidate prime pairs, the decoding process reduces to applying the differential and offset coefficients to descend both trees in parallel to extract the leaf level values, as described earlier in the section on The Prime Factor Encoding Algorithm.

EXPERIMENTAL SETUP

In this section we describe the experiments we carried out with the Prime Factor and Wavelet schemes. The experimentation could be broadly divided into two main categories: a performance comparison between the two schemes, and an investigation into the performance of the PFC scheme with respect to key parameters.

We use two metrics, Compression Ratio and Relative Error, to quantify the degree of compression and error respectively. The compression ratio (cr) is defined by: cr = encoded data size in bytes / size of original data set in bytes, while the relative error (re) is given by: re = |S - R| / S, where S represents the sum of the query on the original (un-encoded) data set, and R is the reconstructed sum after decoding has been carried out with the Prime Factor or Wavelet schemes.

Comparison of the Prime Factor and Wavelet Schemes

In this section we focus on a performance comparison between the Prime Factor and Wavelet schemes on a range of different data sets. We used both real-world data from the (US Census) that was used in (Vitter et al, 1998; 1999, Chakrabarti et al, 2000) as well as synthetic data sets for this purpose. The synthetic data sets were modeled on the criteria used in (Ng and Ravishankar, 1997). Two parameters, degree of skew and degree of variation, were used in their generation. The degree of skew was taken to be high when 70 percent of the data elements were drawn from 30% of the domain value range, with the other 30% of the data drawn from a uniform data distribution in the domain value range. On the other hand, the degree of variation was taken to be high when the difference in data element values is more than 00% of the average data element value.
When the differences in data element values were no more than 0% of the average data element value, the data variation was taken to be low. Variation in the value of these two parameters gave rise to four different data sets: (No Skew, Low Variance), (High Skew, Low Variance), (No Skew, High Variance) and (High Skew, High Variance).

Decoding Time

The first experiment was run on the US Census data and was designed to gain an insight into how the decoding times varied with the size of the query posed (i.e. the portion of the data that the query retrieved, expressed as a percentage). We used a chunk size of 64 and a scale range of [0..307] for the PFC and a threshold of 7% for the Wavelet. These parameter settings ensured that the two schemes produced roughly the same degree of compression, in order to isolate the effect of encoded file size on the timings. At each value of query size we ran 50 tests, consisting of a batch of 10 for each value of the query dimensionality parameter, which ranged from 1 to 5, and the average time was recorded across the 50 runs.

Figure 9 shows that the Prime Factor scheme outperformed the Wavelet scheme in the entire range tested. The two curves start at roughly the same point at the lower end of the size scale, but diverge quite rapidly as the query size increases. The good performance of the Prime Factor scheme can be attributed to two factors. Firstly, its lower average (taken across the entire range of query size) decoding time per data element was 0.00 ms versus ms for the Wavelet. In the case of the Prime Factor scheme the information (i.e. the decoding coefficients) necessary for decoding a data element is localized within the chunk that encapsulates it. There is no concept of a single global tree, since each chunk is encoded separately, thus giving rise to a collection of independent trees. This means that chunks can be decoded independently from each other, and only those coefficients belonging to chunks that require decoding need to be examined. On the other hand, the Wavelet scheme encodes using a single tree, and thus query processing involves searching through the entire set of wavelet coefficients to determine the contributions made by each individual coefficient (Vitter, 1999). Secondly, PFC decodes a much smaller number of data elements, as only chunks along the boundaries of a query need to be decoded (as proved by Theorem 5).

Figure 9: Effect of Query Size on Decoding Time (Prime Factor vs Wavelet; y-axis: Decode Time (ms); x-axis: Query Size (%))

Error Rates

In the next set of experiments we compared the two schemes on error rate. As before we varied the query size in steps of 5 in the range 5% to 90% and measured the relative error at each step. For each size value we ran 0 tests, with each test retrieving data from different regions in the data set. For each batch of 0 runs we measured the minimum, maximum and average relative error values (all expressed as percentages) for the PFC (with scale range of [0..307]) and the Wavelet (with thresholding set at 7%). As can be seen from Tables 1 and 2, for roughly the same compression ratio the two schemes have significantly different error throughout the range tested.
The PFC scheme exhibits a small average error rate of around 0.3% right up to the 90% mark (this was also true for higher percentages of the query size parameter, up to 99%, which we have omitted for reasons of brevity). As Theorem 3 demonstrates, the average error rate for the PFC is essentially dependent on the scale range and the chunk size used and is not a function of query size. The differential between the average error rates for the two schemes is much higher at the lower end of the query size range (e.g. from 5% to 55%) than at the upper end. At the upper end of the query size range the Wavelet's performance improves, as the top level wavelet coefficient representing the overall mean of the data set assumes more importance in reconstruction.
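The minimum, average and maximum relative error reported for each batch of runs follow directly from the relative error metric re = |S - R| / S. The sketch below shows that bookkeeping; the (S, R) pairs are made-up values used purely for illustration and are not results from the experiments, and the function names are hypothetical.

```python
def relative_error(exact_sum, reconstructed_sum):
    """re = |S - R| / S for one query."""
    return abs(exact_sum - reconstructed_sum) / exact_sum


def summarize(errors):
    """Return (minimum, average, maximum) of a batch of error values."""
    return min(errors), sum(errors) / len(errors), max(errors)


# Hypothetical (S, R) pairs for one batch of runs at a single query size.
runs = [(1000.0, 997.0), (2500.0, 2510.0), (400.0, 399.0)]
errors_pct = [100.0 * relative_error(s, r) for s, r in runs]
lowest, average, highest = summarize(errors_pct)
```

A small spread between the minimum and maximum of a batch is what the text refers to as error stability.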

The relative error for the PFC scheme is remarkably stable throughout the query size range. The minimum, average and maximum values are much closer together than their Wavelet counterparts. With the Wavelet scheme we have significant deviations between the minimum and maximum error values in the entire size range. This is in line with previous research (Vitter et al, 1999), (Garofalakis et al, 2004). In practice error stability is important, as this would inspire more confidence in users of the accuracy of results to ad-hoc OLAP queries. We cross checked these results with the synthetic data sets that we generated and observed the same trends.

Table 1: Comparative Compression Rates (Compression Ratio: PF; Wavelet (7%) 7.37)

Table 2: Comparative error rates (Query Size (%) against Relative Error (%): Min, Avg and Max for PFC 307 and for the Wavelet)

Sensitivity of Prime Factor Performance on Key Parameters

We investigated the effects of scale range size and chunk size on performance. We also looked at the distributions of the encoded coefficients in order to gain an insight into whether the Elias code would be an effective representation mechanism. The experiments were conducted on the real world US Census data set.

Effect of Scale Range Size

In this set of experiments we tested the effect of scale range on both compression ratio and accuracy. We kept the lower bound of the scale at 0, and varied the upper bound from 0 to 009 in approximate steps of 00. Figures 10a and 10b track the effects of this variation on compression ratio and relative error. We measured the relative error on queries that randomly picked a 0% sample of the data (in actual fact the relative error was obtained by averaging across 0 runs for each value of the scale range parameter).

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0 Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Combination Labelings Of Graphs

Combination Labelings Of Graphs Applied Mathematics E-Notes, (0), - c ISSN 0-0 Available free at mirror sites of http://wwwmaththuedutw/ame/ Combiatio Labeligs Of Graphs Pak Chig Li y Received February 0 Abstract Suppose G = (V; E) is

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual)

Wavelet Transform. CSE 490 G Introduction to Data Compression Winter Wavelet Transformed Barbara (Enhanced) Wavelet Transformed Barbara (Actual) Wavelet Trasform CSE 49 G Itroductio to Data Compressio Witer 6 Wavelet Trasform Codig PACW Wavelet Trasform A family of atios that filters the data ito low resolutio data plus detail data high pass filter

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

Creating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA

Creating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA Creatig Exact Bezier Represetatios of CST Shapes David D. Marshall Califoria Polytechic State Uiversity, Sa Luis Obispo, CA 93407-035, USA The paper presets a method of expressig CST shapes pioeered by

More information

South Slave Divisional Education Council. Math 10C

South Slave Divisional Education Council. Math 10C South Slave Divisioal Educatio Coucil Math 10C Curriculum Package February 2012 12 Strad: Measuremet Geeral Outcome: Develop spatial sese ad proportioal reasoig It is expected that studets will: 1. Solve

More information

Math 10C Long Range Plans

Math 10C Long Range Plans Math 10C Log Rage Plas Uits: Evaluatio: Homework, projects ad assigmets 10% Uit Tests. 70% Fial Examiatio.. 20% Ay Uit Test may be rewritte for a higher mark. If the retest mark is higher, that mark will

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

Counting Regions in the Plane and More 1

Counting Regions in the Plane and More 1 Coutig Regios i the Plae ad More 1 by Zvezdelia Stakova Berkeley Math Circle Itermediate I Group September 016 1. Overarchig Problem Problem 1 Regios i a Circle. The vertices of a polygos are arraged o

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015.

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015. Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Hash Tables xkcd. http://xkcd.com/221/. Radom Number. Used with permissio uder Creative

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance
