Storing Matrices on Disk: Theory and Practice Revisited


Yi Zhang, Duke University, yzhang@cs.duke.edu
Kamesh Munagala, Duke University, kamesh@cs.duke.edu
Jun Yang, Duke University, junyang@cs.duke.edu

ABSTRACT

We consider the problem of storing arrays on disk to support scalable data analysis involving linear algebra. We propose Linearized Array B-tree, or LAB-tree, which supports flexible array layouts and automatically adapts to varying sparsity across parts of an array and over time. We reexamine the B-tree splitting strategy for handling insertions and the flushing policy for batching updates, and show that common practices may in fact be suboptimal. Through theoretical and empirical studies, we propose alternatives with good theoretical guarantees and/or practical performance.

Introduction

Arrays are one of the fundamental data types. Vectors and matrices, in particular, are the most natural representation of data for many statistical analysis and machine learning tasks. As we apply increasingly sophisticated analysis to bigger and bigger datasets, efficient handling of large arrays is rapidly gaining importance. In the RIOT project [9], we are building a system to support scalable statistical analysis of massive data in a transparent fashion, which allows users to enjoy the convenience of languages like R and MATLAB with built-in support for vectors/matrices and linear algebra, without rewriting code to use systems like databases that scale better over massive data.

Scalability requires efficient handling of disk-resident arrays. Our target applications make prevalent use of high-level, whole-array operators such as matrix multiply, inverse, and factorization, but low-level, element-wise reads and writes are also possible. We have identified the following requirements for an array storage engine:

1. We must support different array access patterns (including those that appear random).
Our storage engine should allow a user or optimizer to select from a variety of storage layouts, because many whole-array operators have access patterns that prefer specific storage layouts: e.g., I/O-efficient matrix multiply prefers row, column, or blocked layouts, while FFT prefers the bit-reversal order. Moreover, a single array may be used in operators with different access patterns; instead of converting the storage layout for every use, sometimes it is cheaper to allow access patterns that do not match the storage layout (see the remark in the appendix for a concrete example), even though it makes accesses more random. Finally, some operators inherently contain some degree of randomness in their access patterns that cannot be removed by storage layouts, e.g., LU factorization with partial pivoting.

[Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 7th International Conference on Very Large Data Bases, August 9th - September rd, Seattle, Washington. Proceedings of the VLDB Endowment, Vol., No. Copyright VLDB Endowment 5-97//... $..]

2. We must handle updates. One common update pattern is populating an array one element at a time in some order, which may or may not be the same as the storage layout order. Handling updates goes beyond bulk loading: some operators, such as LU factorization, iteratively update an array and read previously updated values, which means that we cannot simply log all updates without efficiently supporting interleaving (and sometimes random) reads to updated values.

3. We want the storage format to automatically adapt to array sparsity.
For a sparse array, we want to avoid wasting space for elements that are zero (or some other default value), which can be done by storing array indices and values only for non-zero elements. On the other hand, for a dense array, we want to avoid the overhead of storing array indices by densely packing the values and inferring their indices from storage positions. In practice, there is no obvious delineation between "sparse" and "dense"; sparsity often varies across parts of an array and over time, and is difficult to predict in advance. For example, consider an application program that updates an initially empty (all-zero) matrix one element at a time in random order according to some ongoing computation. The matrix may turn out dense, sparse, or partly dense (e.g., mostly upper-triangular); regardless of its final content, our storage engine should store the matrix in a way that provides good performance throughout the update sequence, without user intervention.

There has been a myriad of approaches to storing arrays on disk, but many fail to meet all requirements above. Targeting in-memory computation, popular platforms for statistical computing such as R and MATLAB offer separate dense and sparse storage formats, but these formats do not adapt to varying sparsity across parts of an array and over time, and users must choose one format in advance. Compressed sparse column, used by MATLAB and representative of popular sparse formats, does not support updates or random accesses for disk-resident arrays. (For memory-resident arrays, this format is easier to search but still inefficient to update.) Alternatively, a database system can store an array as a table with columns representing array index and value, but the overhead is high for dense arrays. It is generally believed that special support for arrays is needed in database systems, either through user-defined extensions or by completely new designs [,,, 7]. The Related Work section surveys additional related work.

A promising approach is to leverage B-tree []. To handle multidimensional arrays, we use linearization, which maps a multi-dimensional coordinate to a 1-d array index according to a linearization function that offers control over data layout. To adapt to varying sparsity, we apply the idea of compression, allowing each B-tree leaf to switch dynamically between sparse and dense formats according to the array density within the leaf. Simply outfitting B-tree with these features, however, falls short of offering optimal performance for arrays, as illustrated below.

Example 1. Consider sequentially inserting elements of an array into an empty B-tree, which is a very common update pattern. Suppose the array has a fixed size and each B-tree leaf can hold a bounded number of records. When a leaf overflows, the standard strategy is split-in-middle, which divides the leaf into two with equal numbers of records (or as closely as possible). [Illustration: the leaf level of the B-tree after the insertion sequence, showing only record keys.] About half of the space is empty, which is particularly wasteful as no future insertions can possibly fill it. The suboptimal space utilization also hurts access performance; e.g., array scans become twice as costly. While one can handle sequential insertions as a special case, other patterns that lead to waste are difficult to detect. Are there alternative splitting strategies that are provably resilient against such waste, without knowing the insertion sequence in advance?

Example 2. A popular trick to speed up B-tree updates is to batch them by keeping individual record updates in a memory buffer. When the buffer fills up, we flush the buffered updates by applying them in key order. This approach reduces I/Os by applying multiple updates to the same B-tree leaf with a single leaf access. A large buffer also helps make the leaf accesses more sequential. However, for the following update sequence, the conventional policy of flushing all buffered updates when the buffer is full is not optimal. Here, $K$ denotes the number of updates that the buffer can hold, and each $P_i$ represents an update of some record on leaf $P_i$.

$$\underbrace{P_1, \ldots, P_1}_{K/2},\ \underbrace{P_2, \ldots, P_2}_{K},\ \underbrace{P_3, \ldots, P_3}_{K},\ \underbrace{P_4, \ldots, P_4}_{K},\ \ldots$$

The flush-all policy incurs two leaf writes (of $P_i$ and $P_{i+1}$) for every $K$ updates. However, the optimal policy would flush all $P_1$ updates after the $(K/2)$-th update; subsequently, only one leaf write would be incurred for every $K$ updates. For this simple insertion sequence, flush-all is only a factor of 2 worse than the optimal. As we will see later, however, there exist sequences for which flush-all is a factor of Ω( K) worse. Are there flushing policies that offer better competitive ratio in theory or perform better in practice?

In this paper, we present LAB-tree (Linearized Array B-tree), the backbone of the RIOT array storage engine, which meets all requirements identified earlier. LAB-tree offers flexible layouts via linearization; it inherits from B-tree efficient support for accesses and updates; and it adapts to varying sparsity by switching between dense and sparse storage formats automatically on a per-leaf basis.

LAB-tree reexamines the leaf splitting strategies and batched update flushing policies, for which common practices have been rarely questioned. We present theoretical and empirical results that contribute to the fundamental understanding of these problems. These results challenge the common practices.

For leaf splitting, exploiting the fact that the domain of array indices is bounded and discrete, we devise a strategy that naturally produces trees with no dead space, often twice as efficient as those produced by split-in-middle. This advantage does incur a fundamental trade-off: in the worst case, split-in-middle has a better competitive ratio than this strategy, whose ratio is nonetheless the best possible for any no-dead-space strategy. On common workloads, however, this strategy consistently and significantly outperforms split-in-middle.

For update batching, we give a flushing policy with competitive ratio O(log K) in the worst case, beating flush-all's Ω( K). For common workloads, however, flush-all actually performs better in practice.
On the other hand, starting from a simple policy with a poor competitive ratio of Ω(K), we devise a randomized variant that incurs fewer I/Os than flush-all for some workloads (and comparable numbers for others). Our approach can be seen as bringing to the update batching problem the same level of rigor as in the study of caching (though results do not carry over because of fundamental differences in their problem definitions).

Finally, we note that our techniques are easy to implement, as they do not require intrusive modifications to the conventional B-tree. Also, many of our results generalize to other settings: the idea of no-dead-space splitting makes sense for other discrete, ordered key domains; theoretical analysis of update batching generalizes to other block-oriented or distributed data structures.

Related Work

Database systems have been extended with support for arrays, and more specifically, linear algebra. Besides storing arrays as tables whose rows correspond to individual array elements, UDTs and UDFs are popular implementation options (e.g., [, ]). In general, these approaches can be seen as dividing an array into chunks and storing each chunk in a database row as a unit of access. SQL can express many linear algebra operations by calling UDFs that operate on chunks or pairs of chunks. Database indexing is used for accessing chunks. While this paper does not store arrays in databases, many ideas, such as linearization, dynamic storage format, and update batching, are readily applicable by regarding a table of chunks as a block-oriented storage structure.

There has also been work building database systems specializing in arrays (e.g., RasDaMan [], ArrayDB [], and SciDB [7]). These approaches divide arrays into rectangular chunks, and often rely on spatial indexing to retrieve chunks in high-dimensional arrays. Our approach of linearization supports more layouts (e.g., bit-reversed) and avoids the difficulty of high-dimensional indexing.
One reason for this different approach is that we focus less on ad hoc region-based retrieval, and more on whole-matrix operations with more predictable but specific access patterns. Nonetheless, it would be interesting to see how our ideas can be applied in their settings (e.g., linearization, alternative index reorganization and update buffering methods) and vice versa (e.g., allowing replication of boundary elements between neighboring chunks as in SciDB).

Linearization is frequently used for multi-dimensional indexing. UB-tree [] is the most related to our work in this regard. While UB-tree linearizes arrays using Z-order, LAB-tree provides more linearization options to match different application needs (with a similar goal as RodentStore [7], but at a different level). More importantly, we reexamine index reorganization and update buffering practices, which UB-tree does not address.

There is no shortage of B-tree tricks [, ] aimed at improving its efficiency. Prefix B-tree compression, for example, is a more general form of compression than our dynamic leaf format, though its generality also carries some overhead. There is also work on alternative splitting strategies, such as avoiding splits by scanning adjacent nodes for free space [9]. Most of these techniques are orthogonal to ours and may further improve LAB-tree in some cases. We are not aware of any previous work on alternative splitting strategies for bounded, discrete key domains and how they interact

with compression.

Work on update batching dates back to Lohman et al. []. Like us, instead of a complete reorganization, Lang et al. [] propose accumulating insertions in a batch, sorting them by key, and applying them to a B-tree by traversing from left to right and backtracking along root-to-leaf paths when necessary. Our contribution to the update batching problem lies in analyzing and questioning the standard practice of flushing all buffered updates.

Overview of LAB-Tree

Based on B-tree, LAB-tree introduces modifications and extensions designed for arrays: linearization (this section), new leaf splitting strategies (see "Splitting Strategy Revisited"), dynamic leaf storage format (see "Dynamic Leaf Storage Format"), and alternative flushing policies for update batching (Section 5).

Each LAB-tree has a linearization function that specifies the storage layout of the array. For an array of dimension $d$ and size $N_1 \times \cdots \times N_d$, a linearization function $f : [0, N_1) \times \cdots \times [0, N_d) \to [0, N_1 \cdots N_d)$, where all intervals are over $\mathbb{N}$, is a bijection that maps each d-d array index to a 1-d array index. When $d = 1$, $f$ is a permutation. Conceptually, LAB-tree indexes the values of array elements by their linearized array indices; i.e., the element of array $A$ with index $\vec{\imath} = (i_1, \ldots, i_d)$ is indexed as the key-value pair $(f(\vec{\imath}), A[\vec{\imath}])$. Popular layouts, such as row-major, column-major, blocked, Z-order, and bit-reversal, can be easily and succinctly defined as linearization functions (a remark in the appendix gives concrete examples). LAB-tree supports arbitrary user-defined linearization functions; for convenience and efficiency, however, the frequently used ones have support built into LAB-tree.

Each LAB-tree also has a default value (often 0) for array elements. Conceptually, LAB-tree only indexes elements whose values differ from the default. A new, empty array is filled with the default value. Setting a default-valued element to a non-default value amounts to an insertion; the inverse operation amounts to a deletion. For convenience and without loss of generality, we will assume the default value to be 0 for the remainder of the paper.
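To make the notion of a linearization function concrete, here is a minimal Python sketch of four of the layouts named above. The function names and signatures are illustrative assumptions, not the paper's API; the Z-order variant is restricted to 2-d arrays whose sides are powers of two, and the exact definitions used by LAB-tree (given in the paper's appendix) may differ.

```python
def row_major(index, shape):
    """Map a d-dimensional index to a 1-d key; the last dimension varies fastest."""
    key = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        key = key * n + i
    return key


def column_major(index, shape):
    """Map a d-dimensional index to a 1-d key; the first dimension varies fastest."""
    key = 0
    for i, n in zip(reversed(index), reversed(shape)):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        key = key * n + i
    return key


def z_order(row, col, bits):
    """Interleave the bits of row and col (assumes a 2**bits x 2**bits matrix)."""
    key = 0
    for b in range(bits):
        key |= ((col >> b) & 1) << (2 * b)      # column bits go to even positions
        key |= ((row >> b) & 1) << (2 * b + 1)  # row bits go to odd positions
    return key


def bit_reversal(i, bits):
    """Reverse the lowest `bits` bits of i (a permutation of [0, 2**bits))."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out


# Every linearization function must be a bijection onto [0, N_1 * ... * N_d).
shape = (3, 4)
keys = sorted(row_major((i, j), shape) for i in range(3) for j in range(4))
print(keys == list(range(12)))
```

Under these assumptions, an element at index (i, j) of a row-major array would be stored under the key `row_major((i, j), shape)`, and switching layouts only means switching the function.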
With LAB-tree, we support three types of array accesses:

- Accessing an element by its array index $\vec{\imath}$, which amounts to accessing the LAB-tree with key $f(\vec{\imath})$.
- Accessing elements of an array via an iterator with linearization function $g$, which specifies the access order and may differ from the linearization function $f$ used for controlling the storage order. The $i$-th element in the access order has LAB-tree key $f(g^{-1}(i))$. We implement various optimizations to speed up key calculation, including incremental computation of $f \circ g^{-1}$ and detecting the special (but common) case of $f = g$; further details are given in a separate remark. We also support an option to iterate over only non-zero elements.
- Reading/writing elements in a specified hyper-rectangle in the array index space. This type of access is common in I/O-efficient matrix algorithms (such as multiply) that process matrices a chunk at a time, where the chunk size depends on the amount of available memory. Supporting such accesses as batch operations allows us to avoid the overhead of iterator calls and provide more efficient implementations for built-in linearization functions.

Efficiency Through Better Space Utilization

This section tackles the B-tree's efficiency problem from two angles: splitting strategy and leaf storage format. Both aim at improving space utilization, which, as pointed out in [5] and validated by our empirical study (see "Experimental Evaluation"), is largely in line with the goal of improving time efficiency as well. We show that by exploiting the special characteristics of arrays, LAB-trees can achieve much better performance than conventional B-trees.

Splitting Strategy Revisited

As motivated in the introduction, the standard B-tree splitting strategy can lead to lots of wasted space within leaves that will never get used. In the following, we formalize the desirable properties of a splitting strategy, propose several alternatives, and discuss their properties.

We begin with some terminology. Let κ denote the leaf capacity, or the maximum number of records that can be stored in a leaf of the index. Each leaf has a (key) range, which contains all keys of records stored in this leaf.
The set of all leaf ranges forms a disjoint partitioning of the key domain. Since our index stores a 1-d array, a leaf range is an interval [l, u), where l and u are the lower bound (inclusive) and upper bound (exclusive) of the 0-based array indices stored in the leaf. We define the density of a leaf ℓ, denoted ρ(ℓ), as the number of records in ℓ divided by its capacity. Density can be similarly defined for a set of leaves or the entire index.

When a record needs to be inserted into a leaf with range [l, u) that already holds κ records (thereby causing it to overflow), a splitting strategy chooses a splitting point x, such that the original leaf is split into two leaves with ranges [l, x) and [x, u). A splitting strategy operates in an online fashion; i.e., it processes the current insertion without knowledge of future insertions. To ensure low runtime overhead, we consider only local splitting strategies, i.e., ones that do not read or modify leaves other than the one being inserted into. Also, we focus on leaf splitting strategies; splitting at upper levels of the index has little impact on the overall space and efficiency, and we simply follow the standard B-tree strategy.

The standard B-tree leaf splitting strategy is as follows:

Split-in-Middle. Given an overflowing leaf with κ + 1 records with keys $i_0, i_1, \ldots, i_\kappa$, this strategy chooses the splitting point to be $x = i_j$, where $j = (\kappa + 1)/2$, rounded to an integer.

There are two desirable properties that a good splitting strategy should have: bounded space consumption and no dead space. The space consumption of a splitting strategy can be measured by its competitive ratio with respect to an optimal offline algorithm. Formally, a splitting strategy Σ is α-competitive if, for any insertion sequence S, the number of leaves produced by Σ at the end of S is less than α times that produced by an optimal offline algorithm, within an additive constant. Knowing the entire S, the optimal offline algorithm basically stores all non-zero array elements compactly, so an array with range [0, N) and n ≤ N non-zero elements can be stored in ⌈n/κ⌉ leaves. Split-in-middle is clearly 2-competitive, because it always generates leaves that are at least half full.
It turns out that this competitive ratio is the best we can hope for: we show that no deterministic local splitting strategy can have a competitive ratio of less than 2 (see the corresponding theorem in the appendix).

A second desirable property of splitting strategies is no-dead-space. By dead space we mean empty slots in leaves that can never be filled by future insertions. For example, every leaf except the last one in Example 1 has two slots of dead space. Note that the notion of dead space is special to unique indexes with discrete key domains such as our setting. General B-tree leaves do not have dead space; it is always possible to insert a record with a duplicate key, or a record between two adjacent existing keys (up to limits such as the precision of floating-point keys or the maximum length of string keys).

Formally, we define the no-dead-space property as follows. We assume the standard B-tree leaf format for now; optimizations for dense array regions are discussed later (see "Dynamic Leaf Storage Format"). Without loss of generality, assume that the array size is a multiple of κ. (Otherwise, for an array with range [0, N), the last leaf can have $\kappa \lceil N/\kappa \rceil - N$ slots of dead space.)
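The dead space caused by split-in-middle under sequential insertion can be reproduced with a small simulation. The sketch below is only an illustration under stated assumptions: the array size (16), the leaf capacity (4), the floor rounding of the split index, and all function names are choices made here, not values taken from the paper.

```python
import bisect


def split_in_middle(keys, lo, hi, capacity):
    """Pick x = keys[j] with j = (capacity + 1) // 2 (floor rounding is assumed)."""
    return keys[(capacity + 1) // 2]


def insert_all(sequence, n, capacity, choose_split):
    """Insert distinct keys from [0, n) one at a time and return the leaves.

    Each leaf is a tuple (lo, hi, keys) covering the key range [lo, hi).
    """
    leaves = [(0, n, [])]
    for key in sequence:
        # Find the leaf whose range contains the key.
        pos = bisect.bisect_right([lo for lo, _, _ in leaves], key) - 1
        lo, hi, keys = leaves[pos]
        bisect.insort(keys, key)
        if len(keys) > capacity:  # overflow: the leaf now holds capacity + 1 keys
            x = choose_split(keys, lo, hi, capacity)
            left = [k for k in keys if k < x]
            right = [k for k in keys if k >= x]
            leaves[pos:pos + 1] = [(lo, x, left), (x, hi, right)]
    return leaves


def dead_slots(leaves, capacity):
    """Count slots no future insertion can fill (leaf range shorter than capacity)."""
    return sum(max(0, capacity - (hi - lo)) for lo, hi, _ in leaves)


leaves = insert_all(range(16), n=16, capacity=4, choose_split=split_in_middle)
print(len(leaves), dead_slots(leaves, 4))  # prints "7 12"
```

With these assumed parameters the simulation ends with 7 leaves where 4 would suffice, and every leaf except the last has a range of only two keys, so two of its four slots can never be filled.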

Definition (No-Dead-Space). A splitting strategy Σ is no-dead-space if for any index state Σ may result in, there exists a future insertion sequence that causes all leaves to be full under Σ.

As we have seen in Example 1, split-in-middle does not have this property. But how important is no-dead-space, given that split-in-middle already has the best possible competitive ratio? Consider any array (or a region within an array) with density ϱ. A strategy that is no-dead-space would be guaranteed to have a competitive ratio of no more than 1/ϱ for storing the array (or the dense region). In contrast, regardless of density, split-in-middle may well take twice the minimum space required, as illustrated in Example 1. Thus, split-in-middle is less attractive than a no-dead-space strategy when ϱ > 1/2, which is a rather common case in our setting. For example, all dense matrices fall into this case, unless they are at the early stage of being populated in non-sequential order. Hence, no-dead-space is an important property that focuses less on the worst case and more on the common case of dense matrices or dense regions in matrices.

We propose a novel strategy that is naturally no-dead-space:

Split-Aligned. Given an overflowing leaf ℓ with range [l, u), this strategy chooses the splitting point x to be a multiple of κ that minimizes the difference between the number of records in [l, x) and that in [x, u). If multiple values of x satisfy the condition, the one that minimizes $|x - (l + u)/2|$ is chosen.

In other words, split-aligned favors a split that is most balanced, like split-in-middle, but under the condition that the splitting point aligns with 0, κ, 2κ, ..., i.e., endpoints of the leaf ranges had we laid out all array elements (zero or non-zero) compactly. [Illustration: the split chosen by split-aligned for an example leaf with κ = 5.]

It is easy to see that, starting with a single leaf with range [0, N), split-aligned is no-dead-space. An obvious question is how split-aligned does on competitive ratio.
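The split-aligned rule defined above can be sketched in a few lines of Python. The function name and argument layout are assumptions made for illustration; the sketch also assumes that the leaf bounds `lo` and `hi` are multiples of the capacity (true when starting from a single leaf over [0, N) with N a multiple of κ), and it resolves any remaining tie by taking the smaller splitting point, which the text does not specify.

```python
import bisect


def split_aligned(keys, lo, hi, capacity):
    """Choose an aligned splitting point for an overflowing leaf.

    keys     -- sorted keys in the leaf (capacity + 1 of them)
    lo, hi   -- the leaf's key range [lo, hi)
    capacity -- leaf capacity (kappa)
    """
    # Candidate points: multiples of capacity strictly between lo and hi.
    first = (lo // capacity + 1) * capacity
    candidates = range(first, hi, capacity)
    mid = (lo + hi) / 2

    def cost(x):
        left = bisect.bisect_left(keys, x)  # number of records in [lo, x)
        right = len(keys) - left            # number of records in [x, hi)
        # Primary criterion: balance; secondary: closeness to the range midpoint.
        return (abs(left - right), abs(x - mid))

    return min(candidates, key=cost)


# Sequential keys overflow the leaf [0, 16) with capacity 4: the split lands on 4,
# so the left leaf [0, 4) is completely full and has no dead space.
print(split_aligned([0, 1, 2, 3, 4], 0, 16, 4))
# Spread-out keys: points 8 and 12 are equally balanced; 8 is closer to the midpoint.
print(split_aligned([0, 5, 9, 13, 15], 0, 16, 4))
```

Because every candidate is a multiple of the capacity, each resulting leaf range can always be filled completely, which is the intuition behind the no-dead-space property.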
Unfortunately, there is a fundamental trade-off between no-dead-space and bounded space consumption: we show (in a theorem in the appendix) that the competitive ratio of any no-dead-space splitting strategy is bounded below by a constant that is worse than split-in-middle's ratio in the worst case. We also show that split-aligned indeed attains this lower bound; i.e., it is the best no-dead-space strategy possible (also a theorem in the appendix). This bound is non-trivial, considering that split-aligned may generate near-empty leaves.

Besides split-in-middle and split-aligned, we also consider:

Split-off-Dense. Given a leaf to split with range [l, u), this strategy first considers two candidate splitting points l + κ and u − κ, which would result in a leaf with range [l, l + κ) or one with range [u − κ, u), respectively. Note these leaves will never be split further. If either leaf has density greater than .5, we choose the splitting point that would result in the leaf with the higher density. Otherwise, we fall back to split-in-middle. Intuitively, this strategy can be seen as a tweak to split-in-middle that first tries to split off a dense leaf that will not split again in the future. It is not hard to see that split-off-dense is no worse than split-in-middle in terms of competitive ratio, and it may sometimes do better, e.g., for the sequential insertion sequence in Example 1.

Split-Defer-Next. This strategy tries to choose a splitting point that delays the split of either result leaf as much as possible. Suppose we split a leaf ℓ with range [l, u) and keys $i_0, \ldots, i_\kappa$ into leaves $\ell_1$ and $\ell_2$ with splitting point x. Assuming that each future insertion hits each missing key with equal probability, we can calculate τ(x), the expected number of future insertions into [l, u) that will cause the first split of either $\ell_1$ or $\ell_2$, using a formula involving l, u, and $i_0, \ldots, i_\kappa$ (see the remark in the appendix for the formula and its derivation). Split-defer-next chooses the splitting point to be $\arg\max_x \tau(x)$. Unfortunately, the formula for τ(x) is quite involved, and we have no closed-form solution for this maximization problem; therefore, we resort to trying every $x \in \{i_1, \ldots, i_\kappa\}$ in a brute-force fashion.

Split-Balanced-Ratio.
This strategy shares the same goal as split-defer-next, but uses a simpler optimization objective that is computationally easier. Given a leaf ℓ, consider the ratio χ(ℓ) between the number of free storage slots in ℓ and the number of keys missing from (and hence insertable later into) ℓ's range. Intuitively, a bigger χ(ℓ) means ℓ is less likely to split in the future. Split-balanced-ratio picks the splitting point that maximizes the minimum of the two resulting leaves' ratios. Specifically, given an overflowing leaf with range [l, u) and keys $i_0, i_1, \ldots, i_\kappa$, this strategy sets $x = i_k$, where

$$k = \arg\max_j \min\left\{ \frac{\kappa - j}{(i_j - l) - j},\ \frac{\kappa - (\kappa + 1 - j)}{(u - i_j) - (\kappa + 1 - j)} \right\}.$$

The experimental evaluation compares these strategies with split-in-middle and split-aligned using various metrics, and evaluates their performance in practice with common workloads for matrices.

We have only discussed insertions so far. Deletions can be handled using standard B-tree techniques (see the corresponding remark). They are not the focus of this paper because we find deletions to be rare in our workloads and hence less important to overall performance.

Dynamic Leaf Storage Format

As discussed in the introduction, plain B-trees are not efficient for dense arrays. We want LAB-tree to be efficient for dense arrays as well as arrays whose sparsity varies over time and across different regions inside them. To this end, LAB-tree supports two leaf storage formats, sparse and dense. Different leaves can have different storage formats, and each leaf can switch between the two formats dynamically.

A sparse-format leaf stores each non-zero array element in its range as a key-value pair; zeros are not stored. Let $\kappa_s$ denote the sparse leaf capacity, i.e., the maximum number of records that can be stored by a sparse-format leaf. A dense-format leaf, on the other hand, stores all values (zero or non-zero) of array elements from a contiguous subrange of its key range. The key that starts the subrange is also stored, but the other keys in the subrange are not, because they can be simply inferred from the starting key and the entry positions.
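The two leaf formats just described can be pictured with the following Python sketch. The class and function names are hypothetical, the default value is assumed to be 0, and capacity limits and on-disk encoding are deliberately left out, so this shows only how keys are stored in one format and inferred from positions in the other.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SparseLeaf:
    """Stores only non-zero elements, each as an explicit key-value pair."""
    lo: int
    hi: int
    records: Dict[int, float] = field(default_factory=dict)

    def get(self, key: int) -> float:
        return self.records.get(key, 0.0)  # absent keys hold the default value


@dataclass
class DenseLeaf:
    """Stores every value of a contiguous subrange; only the first key is kept."""
    lo: int
    hi: int
    start: int
    values: List[float] = field(default_factory=list)

    def get(self, key: int) -> float:
        offset = key - self.start  # the key is inferred from the entry position
        if 0 <= offset < len(self.values):
            return self.values[offset]
        return 0.0


def to_dense(leaf: SparseLeaf) -> DenseLeaf:
    """Convert a non-empty sparse leaf to the dense format, padding gaps with zeros."""
    if not leaf.records:
        raise ValueError("cannot convert an empty leaf")
    start, end = min(leaf.records), max(leaf.records)
    values = [leaf.records.get(k, 0.0) for k in range(start, end + 1)]
    return DenseLeaf(leaf.lo, leaf.hi, start, values)


sparse = SparseLeaf(0, 100, {10: 1.5, 12: 2.5})
dense = to_dense(sparse)
print(dense.start, dense.values)
```

The dense form spends one slot per position in the subrange, including zeros, while the sparse form spends a key and a value per non-zero element; which is cheaper therefore depends on how tightly the non-zero keys cluster.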
Let $\kappa_d$ denote the dense leaf capacity, i.e., the maximum length of the subrange, or the maximum number of records that can be stored by a dense-format leaf. Clearly, $\kappa_d > \kappa_s$, because a dense-format leaf does not spend space on per-record keys; how much larger $\kappa_d$ is depends on the widths of the integer keys and the double-precision values. This two-format approach can be regarded as a simple compression method, which we feel provides a good trade-off between storage space and access time. More sophisticated compression methods are certainly possible, but they will likely add non-trivial decompression overhead to data accesses.

LAB-tree automatically switches between the two formats when a leaf is written. We define the effective range of a leaf ℓ as the tightest interval containing all keys stored in ℓ. The effective range of ℓ is always contained in the range of ℓ. If an insertion overflows a sparse-format leaf ℓ, and the length of ℓ's effective range (containing all $\kappa_s + 1$ keys) is no greater than $\kappa_d$, then we switch ℓ to the dense format without splitting ℓ. Conversely, if an insertion into a dense-format leaf ℓ expands the length of its effective range to greater than $\kappa_d$ but the total number of records is still below $\kappa_s$, then we switch ℓ to the sparse format without splitting ℓ.

The splitting strategies described under "Splitting Strategy Revisited" need to be modified to

work with the dynamic leaf format. For split-aligned, we require the splitting point to be a multiple of $\kappa_d$. Other necessary modifications are not difficult to devise, but care is needed to cover all possible cases. Because of limited space, we illustrate just one intricacy with an example. Consider a dense leaf that overflows upon the insertion of key 97. [Illustration: the overflowing dense leaf, with the example values of $\kappa_s$ and $\kappa_d$.] Without modification, split-aligned would choose a multiple of $\kappa_d$ as the splitting point. However, the resulting right leaf cannot store all of its keys, including 97, in either the dense or the sparse format. Hence, it is necessary to further modify split-aligned to rule out infeasible splitting points. In this case, the infeasible point is ruled out and a feasible one is chosen instead.

[Figure: Splitting strategies, with all leaves using the sparse format. The first three graphs (for seq, str, and int) plot the number of pages allocated against the percentage of elements inserted so far, with data points and ticks at regular insertion intervals. The last graph shows the break-down of running time, I/O + CPU = total time (s), with CPU on top. Strategies: in-middle, aligned, off-dense, balanced-ratio.]

Experimental Evaluation

Splitting Strategies on Common Insertion Patterns

We first compare the performance of various splitting strategies, for now assuming sparse formats across all leaves. We consider the following patterns for populating an initially empty matrix with row-major layout: seq(uential) inserts elements in row-major order; str(ided) inserts elements in column-major order; int(erleaved) inserts elements in row- and column-major orders in an interleaving fashion (as in LU factorization); and ran(dom) inserts elements in random order. The figure above summarizes the results for one matrix size and buffer pool size; see the corresponding remark for the detailed experimental setup. Results on other scales are similar. For this experiment, ran is too expensive to run to completion; it takes an hour just to process a small percentage of the insertions.
As its performance is clearly unacceptable regardless of the choice of splitting strategy, we do not discuss ran further here. We will, however, revisit ran in Section 5, because update batching helps improve its performance.

From the first three graphs in the figure for sparse-format leaves, we see that standard split-in-middle uses about twice as much space as the others throughout the course of each workload. From the last graph, we see that split-in-middle's simpler splitting logic is not enough to make up for its loss in I/O efficiency. On the other hand, split-aligned maintains a noticeable lead over split-in-middle in running time, and is the best strategy overall in both space and time efficiency. As for the other strategies, split-off-dense has a curiously high running time for one insertion pattern despite its low number of I/Os (whose plots are not shown here but are consistent with the first three graphs); a closer examination of the traces reveals that split-off-dense's tendency to generate far more unbalanced leaves than others leads to very scattered I/Os. Split-balanced-ratio carries higher CPU overhead without offering better space utilization in return. We omit split-defer-next here and subsequently, because it has prohibitive CPU overhead but offers no significant space savings.

(Note that our CPU time accounting includes time spent outside system calls on behalf of I/Os. In particular, time spent on I/Os served from our buffer pool without hitting the disk is counted towards the CPU time instead of the I/O time. In this figure, the CPU time's significant proportion is in part explained by the effectiveness of our buffer pool for these workloads.)

[Figure: Splitting strategies, with dynamic leaf storage format. Panels: number of pages allocated; I/O + CPU = total time (s). Strategies: in-middle, aligned, off-dense, balanced-ratio.]

Next, we repeat the experiments with the dynamic leaf storage format, to study how this feature further affects performance. The figure above summarizes the results. All strategies benefit from this feature, but split-aligned benefits more, thanks to its ability to produce leaves that are better aligned (and hence better "prepared") for the dense format.
For the more interesting patterns of str and int, its advantage over split-in-middle widens to a factor of more than .5 in terms of space, and more than .7 in terms of time; its advantage over other strategies is also more pronounced than in Figure . Moreover, the relative performance differences stay the same over the course of the workloads (plots are omitted here, but exhibit the same linear trends as the first three graphs in Figure ). In conclusion, split- is a clear winner. Finally, note that these experiments only report the running time of populating the matrix. Split-, with its highest space efficiency, becomes even more appealing if we consider the cost of accessing the matrix subsequently. For other strategies, one could bulk load (and compact) the array at the end of the insertion sequence to make subsequent scans more efficient, but doing so would further add to the running time and, for a dense matrix, result in a final tree no better than split-.

Scalability Test. The experiments above are all performed on a matrix (with million elements). We also vary the matrix size and plot the normalized total running time (obtained by dividing the total running time by that of split-in-middle) in Figure . The results show a consistent relative gap between split-in-middle and split-, with or without the dynamic leaf storage format. In terms of absolute running time (not plotted here), both strategies scale linearly with the matrix size. It is clear that the space efficiency advantage of split- extends to different data scales.

Buffer Pool Settings. We next replicate the experiments in Figure with different buffer pool sizes: a smaller M and a bigger M. The I/O and CPU time breakdown for the four splitting strategies with dynamic leaf page format is shown in Figure . Split- and split- are generally able to better exploit a larger buffer pool to reduce their I/O time, although a larger-than-enough buffer pool does not bring further benefit, and in some cases may even cause extra CPU overhead (namely split- with M pool under ). Split-in-middle and split-balanced-ratio are relatively insensitive to the size of the buffer pool. In this sense, their performance is more predictable. However, even if the memory resource is scarce, split- still has a considerable advantage over them.

[Figure : LAB-tree, B-tree, DAF; dense matrix. Panels: # pages allocated; I/O + CPU = total time.]

[Figure 5: LAB-tree, B-tree, DAF: sparse matrix. Panels: # pages allocated; I/O + CPU = total time.]

LAB-Tree, B-Tree, and Directly Addressable File. We now step up a level and compare the performance of LAB-tree (with split- and dynamic leaf storage format), standard B-tree (with split-in-middle and sparse leaf format), and directly addressable file (DAF). DAF stores all array values compactly in a file, enabling direct lookups and eliminating the need to store array indices or to use extra indirections for indexing. File system optimizations allow us to allocate disk pages for DAF lazily: if a page has never been written (because it contains all zeros), it is never allocated.

First, we repeat the same experiments for a matrix in Figure , and summarize the results in Figure . In terms of space utilization, LAB-tree is on par with DAF, the best possible in this case; B-tree is four times worse, because it lacks the dense format and its leaves are mostly half-full. As for running time, the breakdown into CPU and I/O offers interesting insights. In terms of CPU time, DAF is the fastest, and B-tree is the slowest; the reasons are that DAF's direct address calculation is simpler than tree lookups, and that searching with the sparse leaf format (which B-tree uses exclusively) is more expensive than with the dense format. In terms of I/O time, B-tree suffers from a larger number of I/Os. Surprisingly, DAF has the worst I/O time for str and int, even though it incurs a similar number of I/Os (not plotted here) as LAB-tree.
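The direct address calculation of a directly addressable file can be illustrated with a minimal sketch. It assumes a row-major layout, 8-byte values, and 8 KB pages; none of these parameters come from the text, and the function names are hypothetical. The sketch also counts how often consecutive insertions switch pages, which hints at why an insertion order that does not match the layout produces scattered I/Os.

```python
ELEMENT_SIZE = 8   # bytes per value (assumption: 64-bit floats)
PAGE_SIZE = 8192   # bytes per disk page (assumption)


def daf_offset(i, j, n_cols, element_size=ELEMENT_SIZE):
    """Byte offset of element (i, j) in a row-major file of fixed-size values."""
    return (i * n_cols + j) * element_size


def pages_touched(order, n_cols, page_size=PAGE_SIZE):
    """Page numbers hit by a sequence of (i, j) insertions.

    Consecutive hits on the same page are collapsed into one entry, so the
    length of the result counts page switches, not distinct pages.
    """
    pages = []
    for i, j in order:
        page = daf_offset(i, j, n_cols) // page_size
        if not pages or pages[-1] != page:
            pages.append(page)
    return pages


n = 2048  # an n x n matrix; the size is chosen only for illustration
row_major = [(i, j) for i in range(2) for j in range(n)]  # fill two rows
col_major = [(i, j) for j in range(2) for i in range(n)]  # fill two columns

print(len(pages_touched(row_major, n)))  # few page switches
print(len(pages_touched(col_major, n)))  # one page switch per insertion
```

Both orders insert the same number of elements, but the column-major order moves to a different page on every insertion under these assumptions.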
A closer look shows that DAF generates very scattered I/Os because column-major insertions hit faraway portions of the file. In this regard, LAB- and B-trees are better at placing and moving array elements during the course of these workloads. This observation offers the insight that it can be suboptimal to simply place each element where it should be at the end of the insertion sequence, as the intermediate states of the data structure also affect performance.

[Figure : Impact of buffer pool size. Axis: I/O + CPU = total time.]

In the second set of experiments, we populate a sparse matrix with % randomly distributed non-zero elements. Figure 5 summarizes the results. As expected, DAF really suffers while B-tree shines, as there are not even locally dense regions in this matrix. Despite being unable to exploit any density, LAB-tree maintains comparable performance to B-tree, except that LAB-tree has slightly higher I/O time due to slightly more random I/Os. From the above two sets of experiments, which straddle the opposite ends of the dense-sparse spectrum, we see that LAB-tree is able to automatically achieve optimal (or close to optimal) performance without manual tuning.

Scalability Test. We also scaled the experiments above with different matrix sizes (Figure 7). While LAB-tree and B-tree scale linearly under all tests, DAF's scalability is not linear. For dense matrices under non-sequential insertion patterns, DAF's performance degrades quickly and becomes inferior to LAB-tree as the matrix size increases. For sparse matrices, DAF is always substantially slower than LAB-tree and B-tree. Also note that across all scales LAB-tree is able to maintain a factor of performance advantage over B-tree for dense matrices, while having comparable performance for sparse matrices.

More Interesting Insertion Patterns. We have only considered three common yet fundamental insertion patterns so far, namely seq, str, and int.
Note that these patterns are independent of the storage layout or access pattern; instead, an insertion pattern is generated by a combination of two linearizations: access and storage. For instance, str can happen if a row-major layout matrix is populated in column-major order, or vice versa. Now, we are ready to test two other patterns obtained by inserting into matrices with a block-based layout. Given a matrix, we choose a block-based linearization as its layout. We set the block size to be (the biggest size that can still fit in a disk page). Within every block, elements are laid out in row-major order, and so are the blocks themselves. On top of this fixed block storage layout, we consider two ways of populating a matrix: row-major order and row-wise bit-reversal order. We call the two resulting patterns row/block and bit-reversal/block, respectively. Note that the second access pattern is an essential part of the D FFT algorithm. Combining the block storage layout with these two access patterns, the resulting patterns hitting the linear storage medium become more complicated and interesting.

We test the two patterns on a dense matrix. Figure plots the results. Again, in terms of space utilization, LAB-tree is the same as DAF, the best possible in both cases. B-tree is more than three times worse due to its lack of dense format. In terms of both I/O time and total time, B-tree is also the worst, not surprisingly. For row/block, LAB-tree's I/O time is on par with DAF's, but it has more CPU overhead; so the result is similar to in Figure . For bit-reversal/block, LAB-tree's I/O time is only % of DAF's, which is enough to compensate for its higher CPU time. Overall, the results from these two new insertion patterns agree with previous results in Figure and do not change our conclusion.

[Figure : Splitting strategies: scalability test with sparse (top row) and dynamic (bottom row) leaf formats. X-axes show the scale of matrix ( elements), while y-axes show the normalized total running time (in-middle as baseline).]

[Figure 7: LAB-tree, B-tree, DAF: scalability on dense and sparse matrices. X-axes show the scale of matrix ( elements, including zeros in case of sparse matrix), while y-axes show the normalized running time (B-tree as baseline).]

[Figure : LAB-tree, B-tree, DAF: more insertion patterns on blocked dense matrix. Panels: # pages allocated; I/O + CPU = total time, for row/block and bit-reversal/block.]

[Figure 9: LAB-tree, B-tree, DAF: UFSparse matrices. Panels: # pages allocated; I/O + CPU = total time, for human_gene and TSOPF_RS_b.]

[Table : LAB-tree, B-tree, DAF: Total running time of dgemm on UFSparse and dense matrices. Columns: Name (ID), size, #nonzeros, LAB-tree (s), B-tree (s), DAF (s). Rows: opt (7), ramage (7), shp (77), std Jac (), GaAsH (5), net75 (9), human gene (), TSOPF RS b (9), Dense.]

BLAS on UFSparse. Stepping up yet another level, we examine how LAB-tree compares with B-tree and DAF for linear algebra operations involving real-world matrices. For the operation, we test matrix multiply, an essential and often performance-critical building block of more sophisticated analysis. We use an I/O-efficient version of the block matrix multiply algorithm, which computes the result matrix one block (submatrix) at a time by reading and multiplying pairs of blocks from the input matrices and accumulating the multiplication results in memory. For multiplying submatrices in memory, we use the BLAS routine dgemm if both submatrices have density greater than .5, or the CHOLMOD [5] routines cholmod_ssmult or cholmod_sdmult otherwise. For input, we use matrices from UFSparse, the University of Florida Sparse Matrix Collection []. To test each storage method, we prepare the input matrices with this method using a blocked layout that matches the pattern of blocks accessed by the I/O-efficient matrix multiply. We multiply each input matrix with itself, and save the result using the same storage method as the input.

Here, we discuss results for two matrices, human_gene and TSOPF_RS_b (Figure 9). We report the total running time, which excludes input preparation but includes writing the result. For human_gene ( and density .79%), we use 5 5 blocks, and the total running time is sec for LAB-tree, sec for B-tree, and 7sec for DAF. DAF suffers from a bloated input file. LAB- and B-trees both perform well, with LAB-tree leading by about %. Their input trees are comparable in size, because human_gene looks uniformly sparse. The result matrix turns out fairly dense, so the LAB-tree result is more compact. For TSOPF_RS_b ( and density .%), we use blocks, and the total running time is sec for LAB-tree, 5sec for B-tree, and sec for DAF.
Unlike human_gene, this matrix has a dense region despite its overall sparsity. LAB-tree is able to exploit this local density to widen its lead over B-tree to a factor of .. Its lead over DAF narrows slightly, but is still more than a factor of .. Results on more matrices are presented in Table . The conclusion is consistent: for sparse matrices, LAB-tree performs much better than DAF, and as well as or better than B-tree (depending on the uniformity of sparsity); for the full matrix, LAB-tree has comparable performance to DAF, which is the best, while B-tree really suffers from its space inefficiency.

5 Update Batching

We now turn to the problem of batching index updates in a memory buffer (see footnote 5) to consolidate writes to disk. To support index access while updates are being buffered, we organize this buffer as an index over the buffered updates; a record lookup would be first checked against this in-memory index. Whenever the buffer is full, we need to flush updates, i.e., apply them in a batch to the underlying disk-resident indexes. As discussed in Section , we question the common practice of flushing all buffered updates whenever the buffer is full. Section 5.1 presents alternative policies and a theoretical analysis of their performance. Section 5.2 discusses implementation issues and Section 5.3 presents an empirical evaluation.

5.1 Flushing Policies and Analysis

To simplify theoretical analysis, we make some assumptions. First, we view each update to a record r as a request for the disk page (leaf) that contains r or will contain r, and we assume that we know the identities of all requested pages before each flushing action (see Section 5.2 for implementation details). Second, we assume that each flush incurs a fixed cost per update plus a fixed cost per page; multiple updates requesting the same page incur the per-page cost only once for the flush, reflecting the benefit of batching. Because the sum of per-update costs in the end remains the same no matter how we flush, we focus on minimizing the sum of per-page costs over time. Note that this analytical model is an imperfect simplification of reality.
For example, it ignores the cost of obtaining page identities (Section 5.2) and that of splitting (which depends on factors such as the splitting strategy). Nonetheless, it provides a reasonable estimate of the true cost, and makes our analysis more generalizable to other batch processing settings. With these assumptions, we now formally define the problem.

Definition . There is a set of pages P on disk, and a buffer of capacity K in memory for buffering requests. Every request refers to a page and takes unit space in the buffer. A flushing policy selects subsets of requests to flush as needed to keep the buffer size capped at K at all times. Flushing requests for the same page incurs unit cost. We are interested in an online flushing policy that minimizes the total cost over a request sequence.

Footnote 5: The buffer in this context should not be confused with the system buffer pool. This buffer batches updates while the buffer pool caches disk pages.

Footnote: Updates to currently buffered records are simply applied to the buffer, and are not counted as new requests. Therefore, n requests, even if they are for the same page, would take n units of space.

For brevity, by buffered requests we mean all requests eligible for flushing, which include the incoming request. Without loss of generality, we assume a policy only flushes when the buffer is full (any policy can be modified to do so without affecting the cost). We can also assume that if a policy flushes any request for P, it flushes all buffered requests for P; in this case, we simply say it flushes P.

As it may have occurred to the reader, this problem looks similar to cache replacement []. Unfortunately, known results on caching do not carry over. Although caching has been generalized to cases where pages can have varying sizes and eviction cost can be a function of the page size, an underlying assumption remains that the cache space devoted to a page P does not change as the number of requests to P increases. On the contrary, with our problem, n requests to the same page take n units of buffer space. This difference turns out to be fundamental. While we can develop flushing policies analogous to well-studied cache replacement policies, we will see that their performance differs both analytically and experimentally; new policies specialized for flushing are needed.

We now present our flushing policies. Here we summarize our theoretical results; see Appendix A for formal statements and proofs. We measure the performance of a flushing policy by its competitive ratio against OPT, the optimal offline policy, which knows the entire request sequence in advance. OPT can be implemented by an exponential-time search; the algorithmic details are irrelevant here. (As a side note, the optimal offline cache replacement policy, furthest-in-future [], is not optimal for flushing; see Remark .5.) We show that any policy is O(K)-competitive (Lemma ). (Had we been dealing with caching instead, this competitive ratio would have been the best that any deterministic policy can offer.) The most commonly used flushing policy actually does better:

Flush-All (ALL). This policy simply flushes the entire buffer whenever the buffer is full. We show that ALL is Ω(√K)- and O(√K log K)-competitive (Theorems and ).
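The cost model and the Flush-All policy above can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: requests are given as a list of page identifiers, the incoming request that overflows the buffer is flushed together with the rest, and whatever remains at the end of the sequence is flushed once more. The function name `flush_all_cost` is hypothetical, and the final flush is an assumption made only so that every request is eventually written.

```python
from collections import Counter


def flush_all_cost(requests, capacity):
    """Total per-page cost of Flush-All over a request sequence.

    Each request occupies one unit of buffer space. Flushing all buffered
    requests for one page costs one unit, however many requests it has.
    """
    buffered = Counter()  # page id -> number of buffered requests
    size = 0              # total number of buffered requests
    cost = 0
    for page in requests:
        buffered[page] += 1
        size += 1
        if size > capacity:         # the incoming request does not fit
            cost += len(buffered)   # one unit per distinct page flushed
            buffered.clear()
            size = 0
    return cost + len(buffered)     # assumed final flush of the remainder


requests = list("AABACA")
print(flush_all_cost(requests, capacity=2))
print(flush_all_cost(requests, capacity=6))
```

With the larger buffer, all requests for the same page are written together, so the cost drops to the number of distinct pages.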
We can generalize the lower bound above to what we call c-recent flushing policies (Definition in appendix), which do not buffer a request for a page if there has been no request for that page during the past cK requests. Clearly, ALL is 1-recent. We show that any c-recent policy is Ω(√(K/c))-competitive (Theorem 5).

The next few flushing policies have analogies in caching:

Least-Recently-Used (LRU). This policy always flushes the page whose most recent request is the oldest (among all pages' most recent requests). It is analogous to the classic cache replacement policy of the same name. We show that LRU is Ω(√K)-competitive (Corollary ) by noting that LRU is 1-recent. (Note that for caching, LRU is optimally competitive, with a competitive ratio of K.)

Smallest-Page (SP). This policy always flushes the smallest page, i.e., one with the smallest number of currently buffered requests. It is analogous to the LFU (least-frequently-used) cache replacement policy. While LFU is widely used for caching, SP does not make much sense for flushing. Intuitively, SP flushes small pages, but flushing larger ones is more profitable as more requests can be processed with one page write. While SP attempts to preserve large pages, pages have little chance to grow large because they may get flushed when still small. We show that SP is Θ(K)-competitive (Lemma and Theorem ). The example constructed in the proof of Theorem makes the above intuition concrete. (Note that for caching, LFU's competitive ratio is unbounded.)

Largest-Page (LP). This policy always flushes the largest page, i.e., one with the largest number of currently buffered requests. It is analogous to the MFU (most-frequently-used) cache replacement policy. LP avoids SP's problem of flushing small pages. On the other hand, LP may flush a page prematurely just because it is currently the largest; however, that page may grow even larger if it is not immediately flushed. We show that, just like SP, LP is Θ(K)-competitive (Lemma and Theorem 7). The proof of Theorem 7 gives a concrete example of the premature flushing problem.
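The three caching-inspired policies differ only in how they pick the page to flush. The sketch below illustrates that choice and plugs it into a small simulator under the unit-cost model. The names `choose_victim` and `simulate` are hypothetical, ties are broken by dictionary order, and the final flush of the remaining pages is an assumption added so that all requests are eventually written.

```python
def choose_victim(policy, counts, last_seen):
    """Pick the page to flush.

    counts:    page -> number of currently buffered requests
    last_seen: page -> index of the most recent request for that page
    """
    if policy == "LRU":   # oldest most-recent request
        return min(counts, key=lambda p: last_seen[p])
    if policy == "SP":    # fewest buffered requests
        return min(counts, key=lambda p: counts[p])
    if policy == "LP":    # most buffered requests
        return max(counts, key=lambda p: counts[p])
    raise ValueError(f"unknown policy: {policy}")


def simulate(policy, requests, capacity):
    """Total per-page flushing cost of a policy over a request sequence."""
    counts, last_seen = {}, {}
    size = cost = 0
    for t, page in enumerate(requests):
        counts[page] = counts.get(page, 0) + 1
        last_seen[page] = t
        size += 1
        while size > capacity:            # flush pages until the buffer fits
            victim = choose_victim(policy, counts, last_seen)
            size -= counts.pop(victim)    # all requests for the page leave
            cost += 1                     # unit cost per flushed page
    return cost + len(counts)             # assumed final flush


for policy in ("LRU", "SP", "LP"):
    print(policy, simulate(policy, list("AAABCA"), capacity=3))
```

Short sequences like this one only exercise the mechanics; the competitive ratios quoted above concern adversarial sequences and are not reflected by a single example.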
Next, we present two new policies: the first is a randomized variant of LP, while the second is a novel policy aimed at achieving a fundamentally better competitive ratio than the policies above.

Largest-Page-Probabilistically (LPP). This policy randomly flushes a page with probability proportional to the number of requests currently buffered for this page. It can be seen as a randomization of LP. Intuitively, LPP is designed to avoid the problems of LP and SP: larger pages have a higher chance of being flushed, but all pages have a chance to survive and grow larger. Another attractive feature of LPP is its efficiency of implementation, as we shall see in Section 5.2.

Largest-Group (LG). This policy partitions buffered requests into groups: Group i, where 0 ≤ i ≤ log K, contains a page P if the number of buffered requests for P is in the range [2^i, 2^(i+1)). We define the size of a group to be the total number of buffered requests for its constituent pages. When the buffer is full, LG flushes the group with the largest size. LG is a novel policy designed specifically for the update batching problem. Intuitively, LG's practice of flushing a group at a time offers better protection against an adversary than flushing a page at a time. With log K + 1 groups, the largest group has at least K/(log K + 1) requests, so LG always flushes a sizable number of requests. Even if LG had chosen a wrong subset of requests to flush, this mistake cannot be repeated until the buffer is full again, which only happens after at least K/(log K + 1) more requests. In contrast, an adversary can more easily penalize policies that may flush a few requests. We show that LG has a competitive ratio of O(log K) (Theorem 9), making it the theoretically best among our policies.

5.2 Implementation

Obtaining Page Identities and Ranges. All policies above except ALL require obtaining the page identity and key range for a buffered request. Such information is readily available by executing a partial lookup for the requested key in the LAB-tree, without visiting the leaf page containing the key.
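The two new policies defined above, Largest-Group and Largest-Page-Probabilistically, can be sketched in a few lines. This is an illustration only: it assumes power-of-two count ranges, so that group i holds pages with between 2^i and 2^(i+1) - 1 buffered requests, and it assumes the buffer can be viewed as a flat list with one page identifier per buffered request. The names `largest_group` and `lpp_victim` are hypothetical.

```python
import random


def largest_group(counts):
    """Pages in the group holding the most buffered requests (LG).

    counts maps page -> number of buffered requests (at least 1).
    A page with c requests falls in group floor(log2(c)).
    """
    groups = {}
    for page, c in counts.items():
        groups.setdefault(c.bit_length() - 1, []).append(page)
    # The size of a group is the total request count of its pages.
    return max(groups.values(), key=lambda pages: sum(counts[p] for p in pages))


def lpp_victim(buffered_requests, rng=random):
    """Page chosen by LPP.

    Picking one buffered request uniformly at random selects its page with
    probability proportional to that page's number of buffered requests.
    """
    return rng.choice(buffered_requests)


counts = {"A": 1, "B": 1, "C": 1, "D": 2}
print(sorted(largest_group(counts)))   # three single-request pages outweigh D
print(lpp_victim(["A", "A", "A", "B"], random.Random(0)))
```

Sampling a request rather than a page means no per-page counters are needed to make the LPP choice, which is one way to read the efficiency claim above.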
Only one partial lookup is needed for requests to the same page, because once we obtain page P's range, we can check whether a request refers to P by comparing the requested key with P's range. Since only non-leaf levels are visited, a generic system buffer pool (not to be confused with the update buffer) is effective in reducing I/Os.

LP, SP, and LRU. At the time of flush, these policies make one pass over the buffered requests in key order. In the process, we find the identity and range of each requested page P, using one partial lookup (as opposed to one per request to P, as explained above). Remaining details are policy-specific and are given in Remark .. To further reduce page identification cost, we maintain a cache that remembers the identity and range for up to a configurable number of pages. At the next flush, we avoid the cost of identifying such pages. Of course, this page information cache consumes space that could otherwise be devoted to buffering requests, which we account for in our empirical evaluation in Section 5.3.

LPP. At first glance, LPP seems to require knowing the counts of buffered requests for all pages. A far more efficient implementation is possible, however. We simply need to pick one buffered request uniformly at random, find the identity and range of its page,


More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Video Proxy System for a Large-scale VOD System (DINA)

Video Proxy System for a Large-scale VOD System (DINA) Vdeo Proxy System for a Large-scale VOD System (DINA) KWUN-CHUNG CHAN #, KWOK-WAI CHEUNG *# #Department of Informaton Engneerng *Centre of Innovaton and Technology The Chnese Unversty of Hong Kong SHATIN,

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Priority queues and heaps Professors Clark F. Olson and Carol Zander

Priority queues and heaps Professors Clark F. Olson and Carol Zander Prorty queues and eaps Professors Clark F. Olson and Carol Zander Prorty queues A common abstract data type (ADT) n computer scence s te prorty queue. As you mgt expect from te name, eac tem n te prorty

More information

Intro. Iterators. 1. Access

Intro. Iterators. 1. Access Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vdyanagar Faculty Name: Am D. Trved Class: SYBCA Subject: US03CBCA03 (Advanced Data & Fle Structure) *UNIT 1 (ARRAYS AND TREES) **INTRODUCTION TO ARRAYS If we want

More information

CHAPTER 10: ALGORITHM DESIGN TECHNIQUES

CHAPTER 10: ALGORITHM DESIGN TECHNIQUES CHAPTER 10: ALGORITHM DESIGN TECHNIQUES So far, we have been concerned wth the effcent mplementaton of algorthms. We have seen that when an algorthm s gven, the actual data structures need not be specfed.

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort Sortng: The Bg Pcture Gven n comparable elements n an array, sort them n an ncreasng (or decreasng) order. Smple algorthms: O(n ) Inserton sort Selecton sort Bubble sort Shell sort Fancer algorthms: O(n

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Bran Curless Sprng 2008 Announcements (5/14/08) Homework due at begnnng of class on Frday. Secton tomorrow: Graded homeworks returned More dscusson

More information

Brave New World Pseudocode Reference

Brave New World Pseudocode Reference Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution Dynamc Voltage Scalng of Supply and Body Bas Explotng Software Runtme Dstrbuton Sungpack Hong EE Department Stanford Unversty Sungjoo Yoo, Byeong Bn, Kyu-Myung Cho, Soo-Kwan Eo Samsung Electroncs Taehwan

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) ,

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION 24 CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION The present chapter proposes an IPSO approach for multprocessor task schedulng problem wth two classfcatons, namely, statc ndependent tasks and

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Sorting. Sorting. Why Sort? Consistent Ordering

Sorting. Sorting. Why Sort? Consistent Ordering Sortng CSE 6 Data Structures Unt 15 Readng: Sectons.1-. Bubble and Insert sort,.5 Heap sort, Secton..6 Radx sort, Secton.6 Mergesort, Secton. Qucksort, Secton.8 Lower bound Sortng Input an array A of data

More information

arxiv: v3 [cs.ds] 7 Feb 2017

arxiv: v3 [cs.ds] 7 Feb 2017 : A Two-stage Sketch for Data Streams Tong Yang 1, Lngtong Lu 2, Ybo Yan 1, Muhammad Shahzad 3, Yulong Shen 2 Xaomng L 1, Bn Cu 1, Gaogang Xe 4 1 Pekng Unversty, Chna. 2 Xdan Unversty, Chna. 3 North Carolna

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,

More information

4/11/17. Agenda. Princeton University Computer Science 217: Introduction to Programming Systems. Goals of this Lecture. Storage Management.

4/11/17. Agenda. Princeton University Computer Science 217: Introduction to Programming Systems. Goals of this Lecture. Storage Management. //7 Prnceton Unversty Computer Scence 7: Introducton to Programmng Systems Goals of ths Lecture Storage Management Help you learn about: Localty and cachng Typcal storage herarchy Vrtual memory How the

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Math Homotopy Theory Additional notes

Math Homotopy Theory Additional notes Math 527 - Homotopy Theory Addtonal notes Martn Frankland February 4, 2013 The category Top s not Cartesan closed. problem. In these notes, we explan how to remedy that 1 Compactly generated spaces Ths

More information

K-means and Hierarchical Clustering

K-means and Hierarchical Clustering Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Report on On-line Graph Coloring

Report on On-line Graph Coloring 2003 Fall Semester Comp 670K Onlne Algorthm Report on LO Yuet Me (00086365) cndylo@ust.hk Abstract Onlne algorthm deals wth data that has no future nformaton. Lots of examples demonstrate that onlne algorthm

More information