arxiv: v1 [cs.na] 11 May 2017

Cache-oblivious Matrix Multiplication for Exact Factorisation

Fatima K. Abu Salem (1)
Computer Science Department, American University of Beirut, P. O. Box , Riad El Solh, Beirut , Lebanon

Mira Al Arab (2)
Computer Science Department, American University of Beirut, P. O. Box , Riad El Solh, Beirut , Lebanon

(1) Corresponding author, fatima.abusalem@aub.edu.lb
(2) maa75@aub.edu.lb

Abstract

We present a cache-oblivious adaptation of matrix multiplication to be incorporated in the parallel TU decomposition for rectangular matrices over finite fields, based on the Morton-hybrid space-filling curve representation. To realise this, we introduce the concepts of alignment and containment of sub-matrices under the Morton-hybrid layout. We redesign the decompositions within the recursive matrix multiplication to force the base case to avoid all jumps in address space, at the expense of extra recursive matrix multiplication (MM) calls. We show that the resulting cache-oblivious adaptation has low span, and our experiments demonstrate that its sequential evaluation order yields orders of magnitude improvement in run-time, despite the recursion overhead.

Keywords: Locality of reference, Cache-oblivious Algorithms, Space-filling Curves, Morton-hybrid Layout, TU Decomposition, Finite Fields

1 Introduction

We present a cache-oblivious adaptation of matrix multiplication to be incorporated in the parallel TU decomposition for rectangular matrices over finite fields, based on the Morton-hybrid space-filling curve representation. Exact triangulisation of matrices is crucial for a large range of problems in Computer Algebra and Algorithmic Number Theory, where a basis of the solution set of the associated linear system is required. Our focal algorithm of reference is the TURBO algorithm of Dumas et al. [7] for exact LU decomposition. This algorithm recurses on rectangular and potentially singular matrices. TURBO significantly reduces the volume of communication on distributed systems, and retains optimal work and linear span. TURBO can also compute the rank in an exact manner.

As benchmarked against some of the most efficient current exact elimination algorithms in the literature, TURBO incurs low synchronisation costs and reduces the communication cost featured in [9, 10] by a factor of one third when used with only one level of recursion on 4 processors. A significant part of TURBO consists of matrix factorisation, and so, adapting this kernel in a cache-oblivious fashion will ultimately contribute to a cache-oblivious factorisation algorithm. That TURBO has low depth makes adapting its sequential version to the cache-oblivious model more telling. In particular, nested parallel algorithms for which the natural sequential execution has low cache complexity will also attain good cache complexity on parallel machines with private or shared caches [4]. At the base case of TURBO the sub-matrices reach a given threshold, and so one can take advantage of cache effects. To the best of our knowledge, no cache-oblivious (or cache-aware) algorithms for exact linear algebra exist in the literature.

We pursue a cache-oblivious adaptation using space-filling curves. TURBO requires index conversion routines between the chosen space-filling curve and the Cartesian order, due to the row and column permutations. In [1], using a detailed analysis of the number of bit operations required for index conversion, and filtering the cost of lookup tables that represent the recursive decomposition of the Hilbert curve, we have shown that the Morton-hybrid order incurs the least cost for index conversion routines as compared to the Hilbert, Peano, or Morton orders. The Morton order is the recursive Z-shaped space-filling curve (Fig. 1). The Morton-hybrid order stops decomposing when the sub-matrices attain a threshold dimension $T \times T$ [2]. At such a level, say when the sub-matrix fits in cache, the overhead for maintaining the curve representation outweighs the reduction in cache complexity. According to the literature cited in this manuscript on the Morton order and its hybrid, this curve representation improves significantly on the temporal locality of various matrix algorithms such as naive multiplication, LU decomposition, and QR factorisation.
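The Morton-hybrid layout described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the index conversion routine analysed in [1]: Cartesian indices are taken to be 0-based, the $T \times T$ blocks are stored row-major, and blocks are ordered by a Z-curve with NW = 0, NE = 1, SW = 2, SE = 3 at every level. The names `interleave`, `encode` and `extract` are chosen here for illustration.

```python
def interleave(row: int, col: int) -> int:
    """Z-order code of a block coordinate (NW=0, NE=1, SW=2, SE=3 at each level)."""
    z, bit = 0, 0
    while (row >> bit) or (col >> bit):
        z |= ((col >> bit) & 1) << (2 * bit)       # column bit goes to an even position
        z |= ((row >> bit) & 1) << (2 * bit + 1)   # row bit goes to an odd position
        bit += 1
    return z


def encode(i: int, j: int, T: int) -> int:
    """Morton-hybrid index of Cartesian entry (i, j): Z-order over T x T blocks,
    row-major inside each block (assumed layout)."""
    block = interleave(i // T, j // T)
    return block * T * T + (i % T) * T + (j % T)


def extract(z: int, T: int) -> tuple:
    """Inverse of encode: Cartesian (i, j) of Morton-hybrid index z."""
    block, offset = divmod(z, T * T)
    row = col = bit = 0
    while block >> (2 * bit):
        col |= ((block >> (2 * bit)) & 1) << bit
        row |= ((block >> (2 * bit + 1)) & 1) << bit
        bit += 1
    return row * T + offset // T, col * T + offset % T


T = 4
print(encode(5, 6, T))   # 54
print(extract(54, T))    # (5, 6)
```

The loop-based bit interleaving is chosen for readability; a constant-time variant with masks and dilated integers, in the spirit of [2] and [13], would be the natural choice in a performance-sensitive setting.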
In this work, we introduce the concepts of alignment and containment of sub-matrices under the Morton-hybrid layout, and develop the full details of an MM algorithm that preserves the alignment and containment of sub-matrices across the recursive steps of the matrix factorisation. We do this by redesigning the decompositions within the recursive MM to force the base case to avoid all jumps in address space, at the expense of extra recursive MM calls. We show that the resulting cache-oblivious adaptation retains the optimal work and critical path length of the default MM and is thus highly parallel. Our experiments confirm that the recursion overhead in the Morton-hybrid MM is negligible and that the algorithm achieves a significant reduction in run-time thanks to its improved temporal locality.

Before proceeding, we begin with a brief description of the TU algorithm. Consider a rectangular matrix $A$ over a field $F$, where $A$ may be singular. $A$ is triangulated into the product of two matrices $T$ and $U$, such that $A = TU$, where $U$ is an upper triangular matrix, and $T$ is with some "T" patterns. This is done in a series of recursive steps on rectangular and potentially singular matrices, relaxing the condition for generating a strictly lower triangular matrix:

(1) Recursive TU decomposition in SE, SW, NE, and NW.
(2) Virtual row and column permutations needed to re-order the blocks to yield the matrix $U$.

For brevity and because of lack of space, we omit the full details of the algorithm and refer the reader to [7] for a full account of TURBO.

2 Non-Aligned Rectangular Sub-matrix Multiplication Within The Recursion

Consider Morton-hybrid matrices $A$, $B$, and $C$ and let $S_A$, $S_B$, and $S_C$ be random sub-matrices of $A$, $B$, and $C$ respectively, for which one has to compute $S_A = S_B S_C$. This is a typical scenario encountered during the TU decomposition. To illustrate further, consider Fig. 3. Each integer appearing in the matrices in that figure represents the Morton-hybrid index of the element occupying that position. The sub-matrices on which the multiplication is performed do not begin at the first entry of a Morton-hybrid sub-matrix, hence the concept of an aligned versus non-aligned Morton-hybrid sub-matrix.

An aligned sub-matrix is a $2^a T \times 2^b T$ sub-matrix of a Morton-hybrid matrix that begins at the first entry of a row-major sub-matrix. A non-aligned sub-matrix is a sub-matrix of a Morton-hybrid matrix that does not satisfy this condition.

Corollary 2.1 The Cartesian index of the first entry of an aligned sub-matrix is of the form $(k_1 T, k_2 T)$, for some positive integers $k_1$ and $k_2$.

Proof: By its definition, an aligned sub-matrix $A_M$ of a Morton-hybrid matrix $M$ starts at the first entry of some row-major sub-matrix $S_M$ of $M$. Since the row-major sub-matrix $S_M$ is of dimensions $T \times T$, the Cartesian index of the first entry of $S_M$ is given by $(k_1 T, k_2 T)$, for some positive integers $k_1$ and $k_2$. By its definition, the aligned sub-matrix $A_M$ therefore begins at an element of Cartesian index $(k_1 T, k_2 T)$.

Corollary 2.2 If an aligned sub-matrix is $T \times T$, then it is row-major.

Proof: Let $A_M$ be a $T \times T$ aligned sub-matrix of a Morton-hybrid matrix $M$. From the definition of an aligned matrix, we know that $A_M$ begins at the first entry of a row-major sub-matrix $S_M$ of $M$. According to the Morton-hybrid layout, all row-major sub-matrices of $M$, including $S_M$, are $T \times T$. Since $A_M$ is also $T \times T$, $A_M$ must be $S_M$ and hence is row-major.

An example of a non-aligned sub-matrix of a Morton-hybrid matrix with $T = 4$ is shown in red in Fig. 2. An aligned sub-matrix is shown in green.
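The definition of alignment above can be turned into a small predicate. The sketch below checks only the two stated conditions (size $2^a T \times 2^b T$ and a start index on a block boundary, as in Cor. 2.1), assuming 0-based Cartesian indices; the function names are illustrative and not from the manuscript.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def is_aligned(i0: int, j0: int, rows: int, cols: int, T: int) -> bool:
    """Check the two conditions in the definition of an aligned sub-matrix.

    (i0, j0) is the Cartesian index of the first entry (0-based, assumed);
    rows x cols are the dimensions of the sub-matrix.
    """
    # Start condition: the first entry is the first entry of a T x T row-major block.
    starts_on_block = i0 % T == 0 and j0 % T == 0
    # Size condition: rows = 2^a * T and cols = 2^b * T.
    size_ok = (rows % T == 0 and cols % T == 0
               and is_power_of_two(rows // T)
               and is_power_of_two(cols // T))
    return starts_on_block and size_ok


T = 4
print(is_aligned(4, 8, 8, 4, T))   # True: starts on a block boundary, size 2T x T
print(is_aligned(1, 1, 4, 4, T))   # False: does not start on a block boundary
```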
Next, we relate the lack of alignment of sub-matrices to the recursive accessing of these sub-matrices and discuss the resulting problems.

2.1 Non-Aligned Sub-Matrices and loss of locality

A sub-matrix $S_M$ of a Morton-hybrid matrix $M$ is said to be contained if $S_M$ lies completely within a sub-matrix of $M$ ordered in a row-major fashion. Otherwise, we say that $S_M$ is scattered.

Proposition 2.3 Let $A_M$ be an aligned sub-matrix of a Morton-hybrid matrix $M$. The sub-matrix at the base case of the recursive division of $A_M$, carried down to $T \times T$ sub-matrices, is a $T \times T$ row-major sub-matrix of $M$.

Proof: First, we claim that the recursive division of $A_M$ gives four aligned sub-matrices. From the definition of aligned sub-matrices, $A_M$ has size $2^a T \times 2^b T$. So, the division of each of the dimensions of $A_M$ by 2 results in four quadrants NW, NE, SW, and SE of $A_M$ of size $2^{a-1} T \times 2^{b-1} T$ each. Thus these quadrants satisfy the size condition from the definition of aligned matrices. Note that once any of the dimensions reaches size $T$ it is no longer divided, and the recursive division proceeds on the other dimension until that too becomes $T$. It is the size condition of this same definition that leads to $T \times T$ sub-matrices at the base case of the recursive division of the aligned sub-matrices decomposed from $A_M$.

Now, recall from Cor. 2.1 that the start index of $A_M$ is of the form $(k_1 T, k_2 T)$. Then, the start indices of the sub-matrices resulting from the subdivision of $A_M$ are $(k_1 T, k_2 T)$, $(k_1 T, (k_2 + 2^{b-1}) T)$, $((k_1 + 2^{a-1}) T, k_2 T)$, and $((k_1 + 2^{a-1}) T, (k_2 + 2^{b-1}) T)$ for the NW, NE, SW, and SE quadrants of $A_M$ respectively. Thus the start indices of these quadrants satisfy the start index condition from the definition of aligned matrices. Combining the two claims, the four quadrants resulting from the sub-division of any aligned matrix $A_M$ are aligned: they satisfy both conditions from the definition of aligned matrices.

Second, we show that the aligned sub-matrices at the base case are row-major sub-matrices of $M$. If the recursive division continues down to $T \times T$ sub-matrices, we get $T \times T$ aligned sub-matrices. From Cor. 2.2, we know that these sub-matrices are row-major sub-matrices of $M$. This concludes the proof.

Corollary 2.4 Any sub-matrix of the $T \times T$ sub-matrix reached at the base case of the recursive division of an aligned sub-matrix is contained.

Proof: According to Prop. 2.3, the $T \times T$ sub-matrix at the base case of the recursive division of an aligned sub-matrix $A_M$ of a Morton-hybrid matrix $M$ is in row-major layout. Hence, any sub-matrix of this $T \times T$ base case sub-matrix lies entirely within a row-major sub-matrix of $M$ and is therefore contained.

In Fig. 4, $C_M$ is one of the sub-matrices at the base case of the recursive division of the aligned sub-matrix $A_M$ and is a row-major sub-matrix. Any sub-matrix of $C_M$ is contained. When non-aligned sub-matrices are recursively divided, the sub-matrix at the base case may not consist entirely of a row-major sub-matrix of the Morton-hybrid matrix. It may be scattered across more than one row-major sub-matrix. For example, in Fig. 4, the sub-matrix $S_M$ is a sub-matrix at the base case of the recursion for the non-aligned sub-matrix $N_M$ in red. $S_M$ spans four row-major ordered sub-matrices; hence, it is scattered.

We know that the elements of the sub-matrices at the base case are to be traversed in a row-major or column-major order, as required for the base case of MM. With such traversal imposed, a scattered sub-matrix suffers from two issues:

1. $P_1$: Elements of a scattered sub-matrix are not sufficiently close in memory to maintain good spatial locality when traversed in a row/column-major fashion. This results in worse memory performance than for contained sub-matrices.
2. $P_2$: Morton-hybrid encoding is required for accessing each element within a scattered sub-matrix (thus incurring extra computation overhead compared to row-major offset calculation).

Proposition 2.5 The loss in locality defined by $P_1$ and $P_2$ applies to scattered sub-matrices but not to contained sub-matrices.

Proof: We first consider $P_1$. Recall that the traversal of entries at the base case of the recursion is done in two orders: row-major and column-major. For contained sub-matrices, when consecutively accessing any two entries in either of these two orders, the minimum jump in address space is 1 and the maximum is $T$, as all entries lie within one row-major sub-matrix of the Morton-hybrid matrix. A scattered sub-matrix spans more than one row-major sub-matrix of the matrix. These row-major sub-matrices are not necessarily consecutive in memory, and traversing the scattered sub-matrix that spans them, in a row-major or column-major fashion, results in jumps in address space. When consecutively accessing any two entries of a scattered sub-matrix, the minimum jump in address space is 1, if the two entries belong to the same row-major sub-matrix, and the maximum is $kT^2 + T - 1$ for some positive integer $k$, if the two entries belong to different row-major sub-matrices.

We now consider $P_2$. Because the base case sub-matrix of an aligned sub-matrix is part of a row-major ordered sub-matrix, offset calculation for the elements at the base case is fast: traditional row-major offset calculation is used. The index $z$ of an element at offset $(i, j)$ from the start index $\sigma$ of the sub-matrix at the base case is given by $z = \sigma + iT + j$, since the sub-matrix satisfies a row-major ordering with row length $T$. This can be seen for the contained sub-matrix $C_M$ shown in Fig. 4, where $\sigma = 112$. As for a non-aligned sub-matrix, accessing any element $(i, j)$ in any of the base case sub-matrices requires that the corresponding Morton-hybrid index be calculated.
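The contrast between contained and scattered sub-matrices in $P_1$ and $P_2$ can be checked numerically. The sketch below is an illustration under assumptions, not the authors' code: it uses 0-based Cartesian indices, a Z-ordered block layout with row-major $T \times T$ blocks, and helper names (`interleave`, `encode`, `blocks_touched`, `max_jump`) introduced here. It measures only the row-major traversal. Under this assumed encoding, index 112 with $T = 4$ falls at Cartesian position (4, 12); the actual coordinates in Fig. 4 are not reproduced here.

```python
def interleave(row: int, col: int) -> int:
    """Z-order code of a block coordinate (NW=0, NE=1, SW=2, SE=3)."""
    z, bit = 0, 0
    while (row >> bit) or (col >> bit):
        z |= ((col >> bit) & 1) << (2 * bit)
        z |= ((row >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return z


def encode(i: int, j: int, T: int) -> int:
    """Morton-hybrid index of Cartesian entry (i, j) (assumed layout)."""
    return interleave(i // T, j // T) * T * T + (i % T) * T + (j % T)


def blocks_touched(i0: int, j0: int, rows: int, cols: int, T: int) -> int:
    """Number of T x T row-major blocks overlapped by the sub-matrix."""
    block_rows = (i0 + rows - 1) // T - i0 // T + 1
    block_cols = (j0 + cols - 1) // T - j0 // T + 1
    return block_rows * block_cols


def is_contained(i0: int, j0: int, rows: int, cols: int, T: int) -> bool:
    """Contained means the sub-matrix lies inside a single row-major block."""
    return blocks_touched(i0, j0, rows, cols, T) == 1


def max_jump(i0: int, j0: int, rows: int, cols: int, T: int) -> int:
    """Largest address jump between consecutive entries in a row-major traversal."""
    addrs = [encode(i0 + di, j0 + dj, T) for di in range(rows) for dj in range(cols)]
    return max(abs(b - a) for a, b in zip(addrs, addrs[1:]))


T = 4
contained = (5, 5, 2, 2)   # 2 x 2 sub-matrix inside one block
scattered = (3, 3, 2, 2)   # 2 x 2 sub-matrix straddling four blocks
print(blocks_touched(*contained, T), max_jump(*contained, T))   # 1 3
print(blocks_touched(*scattered, T), max_jump(*scattered, T))   # 4 13

# P2: inside a contained T x T base case, z = sigma + i*T + j replaces encode().
sigma = encode(4, 12, T)   # 112 under the assumed encoding
print(all(encode(4 + i, 12 + j, T) == sigma + i * T + j
          for i in range(T) for j in range(T)))                 # True
```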
The encoding of the Morton-hybrid index is more costly than calculating an offset within a row-major ordered sub-matrix, so this per-element conversion incurs extra calculation overhead.

2.2 Modified Non-Aligned Sub-Matrix Multiplication

We aim to improve the sub-matrix multiplication procedure by addressing issues $P_1$ and $P_2$. In this section, we describe a recursive sub-matrix multiplication algorithm which ensures that the sub-matrices at the base case of the recursion are contained in a row-major ordered sub-matrix of the original matrix. By doing this, we reduce the range of addresses of the elements within the sub-matrices at the base case as well as the number of jumps in address space done at the base case, and we eliminate the need for Morton-hybrid encoding at the base case. To ensure that the sub-matrix at the base case of MM is contained, by Prop. 2.3, the recursive division within the algorithm must start on aligned matrices.

Recall the random matrices $A$, $B$, and $C$ in Morton-hybrid order and of dimensions $2^m \times 2^m$, and $S_A$, $S_B$, and $S_C$ the random sub-matrices of $A$, $B$, and $C$ respectively (Fig. 3). We wish to perform the multiplication $S_A = S_B S_C$ efficiently. We can recursively divide $S_A$, $S_B$, and $S_C$, as in the default MM algorithm, which may result in scattered sub-matrices at the base case since $S_A$, $S_B$, and $S_C$ may not be aligned. Instead, we will recursively divide $A$, $B$, and $C$ and address only the relevant sub-matrix multiplications that ought to be done to produce $S_A = S_B S_C$. As $A$, $B$, and $C$ are aligned, recursively dividing them will enforce row-major sub-matrices at the base case, from which we extract the relevant parts to produce $S_A$.

Let $k$ be a superscript denoting a recursive step of the proposed MM algorithm. Also, let $t$, $u$, and $v$ denote subscripts in $\{0, 1, 2, 3\}$ (of sub-matrices of $A$, $B$, and $C$ respectively), indicating a specific quadrant following the Morton (Z-) order: NW = 0, NE = 1, SW = 2, and SE = 3. For $k = 0$, $A^0 = A$, $B^0 = B$, and $C^0 = C$. Denote by $S_{A^k_t}$, $S_{B^k_u}$, and $S_{C^k_v}$ the respective sub-matrices of $A^k_t$, $B^k_u$, and $C^k_v$ being multiplied as part of the overall multiplication $S_A = S_B S_C$. As such, the initial problem is to produce $S_{A^0} = S_{B^0} S_{C^0}$.

For this, we first produce the quadrants $A^1_t$ of $A^0$, such that $A^1_t \in \{NW_{A^0}, NE_{A^0}, SW_{A^0}, SE_{A^0}\}$ for $t \in \{0, 1, 2, 3\}$. We do the same for $B^0$ and $C^0$, producing $B^1_u$ and $C^1_v$ respectively for $u, v \in \{0, 1, 2, 3\}$. For each $A^1_t$, we produce the sub-matrix $S_{A^1_t}$, defined as the part of $S_{A^0}$ that lies in $A^1_t$. Similarly, we produce $S_{B^1_u}$ and $S_{C^1_v}$. Note that $S_{A^0}$ is the two-dimensional concatenation of $\{S_{A^1_t}\}$ for $t \in \{0, 1, 2, 3\}$, and hence to calculate $S_{A^0}$ we need to calculate $S_{A^1_t}$ for $t \in \{0, 1, 2, 3\}$. To do this, we need to consider all combinations $\Gamma$ of the form $\Gamma_{t,u,v} = \{A^1_t, B^1_u, C^1_v\}$ necessary to produce $S_{A^1_t}$, as will be justified below.

Now, when considering a combination $\Gamma_{t,u,v} = \{A^1_t, B^1_u, C^1_v\}$, if the sub-matrices $S_{A^1_t}$, $S_{B^1_u}$, and $S_{C^1_v}$ are compatible for multiplication, i.e. the multiplication $S_{A^1_t} \mathrel{+}= S_{B^1_u} S_{C^1_v}$ is part of the overall multiplication $S_{A^0} \mathrel{+}= S_{B^0} S_{C^0}$, then a recursive call is made on $S_{A^1_t}$, $S_{B^1_u}$, and $S_{C^1_v}$. Else, if $S_{A^1_t}$, $S_{B^1_u}$, and $S_{C^1_v}$ are not compatible, we extract compatible parts of these sub-matrices and we label them as $S'_{A^1_t}$, $S'_{B^1_u}$, and $S'_{C^1_v}$, on which the multiplication proceeds recursively. After doing this for all combinations $\Gamma_{t,u,v}$ for $t, u, v \in \{0, 1, 2, 3\}$, we would have calculated $S_{A^0}$.

We now describe the general $k$-th recursive step of Morton-hybrid MM, which consists of a round of four substeps. For simplicity, we drop the subscripts $t$, $u$ and $v$ of $A^k_t$, $B^k_u$, and $C^k_v$, and we use $M$ to denote any of the matrices $A$, $B$, or $C$. Each aligned $M^k$ is identified by two values:

- $\alpha_{M^k}$: the Morton-hybrid index of the first element in the aligned matrix $M^k$
- $\lambda_{M^k}$: the number of elements in the aligned sub-matrix $M^k$

We are also given the sub-matrices $S_{A^k}$ of $A^k$, $S_{B^k}$ of $B^k$, and $S_{C^k}$ of $C^k$ on which we wish to perform the multiplication. Each of the sub-matrices $S_{M^k}$ is identified by the following:

- $\sigma_{S_{M^k}}$: the Morton-hybrid index of the first entry of $S_{M^k}$
- $r_{S_{M^k}}$: the number of rows of $S_{M^k}$
- $c_{S_{M^k}}$: the number of columns of $S_{M^k}$

We do not use the 4-tuple $(M, \sigma, r, c)$ to identify the aligned sub-matrices $M^k$ because the 3-tuple $(M, \alpha, \lambda)$ simplifies the computations for identifying the quadrants of $M^k$ and incorporates the information from the 4-tuple, where $\alpha = \sigma$ and $\lambda = r \cdot c$.

Step 1: In this step, we need to identify all four aligned quadrants $M^{k+1}_t$, $t \in \{0, 1, 2, 3\}$, of the aligned $M^k$, for $k$ not reaching the base case, to proceed with the recursive multiplication algorithm. The index $t$ is dropped from $M^k_t$ for simplicity. To do this, we identify the start index $\alpha_{M^{k+1}_t}$ and size $\lambda_{M^{k+1}_t}$ of each quadrant $M^{k+1}_t$ of $M^k$. Because $M^k$ is divided into four quadrants of equal size, the number of elements in any quadrant is given by $\lambda' = \lambda_{M^{k+1}_t} = \lambda_{M^k}/4$. Recall that in the Morton-hybrid order, the quadrants not reaching the base case are stored according to the Morton layout. For the Morton order, the quadrants of $M^k$ are laid out in the order $NW_{M^k}$, $NE_{M^k}$, $SW_{M^k}$, then $SE_{M^k}$, and hence

$\alpha_{NW_{M^k}} = \alpha_{M^k}$
$\alpha_{NE_{M^k}} = \alpha_{M^k} + \lambda'$
$\alpha_{SW_{M^k}} = \alpha_{M^k} + 2\lambda'$
$\alpha_{SE_{M^k}} = \alpha_{M^k} + 3\lambda'$

The sub-matrix $S_{M^k}$ may not lie entirely within one quadrant of $M^k$, and hence all quadrants $M^{k+1}_t$ of $M^k$ which contain part of $S_{M^k}$ must be considered. This is the case in the example from Fig. 5, as $S_{A^k}$, $S_{B^k}$, and $S_{C^k}$ touch on all four quadrants of $A^k$, $B^k$, and $C^k$ respectively. Given $S_{M^k}$, we must now identify, for each $M^{k+1}_t$, the part of $S_{M^k}$ that lies within $M^{k+1}_t$. We denote this sub-matrix by $S_{M^{k+1}_t}$. The method to identify $S_{M^{k+1}_t}$ now follows.

Step 2: Recall that we are given $M^k$ and $S_{M^k}$ as input into the recursion. As $S_{M^k}$ may not lie entirely within one quadrant of $M^k$, it may be scattered across quadrants. Having identified the quadrants $M^{k+1}_t$, for $t \in \{0, 1, 2, 3\}$, we now identify the part of $S_{M^k}$ that lies within each $M^{k+1}_t$, denoted by $S_{M^{k+1}_t}$. Then $S_{M^k}$ is the two-dimensional concatenation of $\{S_{M^{k+1}_t}\}$ for $t \in \{0, 1, 2, 3\}$. Here we drop the index $t$ for simplicity. To identify $S_{M^{k+1}}$, we need to identify its start index $\sigma_{S_{M^{k+1}}}$ and dimensions $r_{S_{M^{k+1}}} \times c_{S_{M^{k+1}}}$. To do this, the following intermediate values are needed. For simplicity, the indices of the intermediate values denoting dependence on $M^k$ are omitted.

- $r_N$: the number of rows of $S_{M^k}$ in N, the northern half of $M^k$.
- $c_W$: the number of columns of $S_{M^k}$ in W, the western half of $M^k$.
- $r_S$: the number of rows of $S_{M^k}$ in S, the southern half of $M^k$.
- $c_E$: the number of columns of $S_{M^k}$ in E, the eastern half of $M^k$.
- $e_{NW_{M^k}}$: the Morton-hybrid index of the last entry of $NW_{M^k}$. Similarly for $NE_{M^k}$, $SW_{M^k}$, and $SE_{M^k}$.
- encode(i, j): given an entry $e$ of Cartesian index $(i, j)$, encode(i, j) returns the Morton-hybrid index of $e$.
- extract_i(z): given an entry $e$ of Morton-hybrid index $z$, extract_i(z) returns the coordinate $i$ of the Cartesian index $(i, j)$ of $e$.
- extract_j(z): given an entry $e$ of Morton-hybrid index $z$, extract_j(z) returns the coordinate $j$ of the Cartesian index $(i, j)$ of $e$.

The identification of $S_{M^{k+1}}$ is done as follows. For $NE_{M^k}$, calculate $e_{NE_{M^k}}$ as $e_{NE_{M^k}} = \alpha_{NE_{M^k}} + \lambda' - 1 = \alpha_{M^k} + 2\lambda' - 1$ and, for $SW_{M^k}$, use $e_{SW_{M^k}} = \alpha_{SW_{M^k}} + \lambda' - 1 = \alpha_{M^k} + 3\lambda' - 1$. Find $r_N$ using $r_N = \mathrm{extract\_i}(e_{NE_{M^k}}) - \mathrm{extract\_i}(\sigma_{S_{M^k}}) + 1$, i.e. $r_N$ is the difference between the row indices of the last entry of $NE_{M^k}$ and the first entry of $S_{M^k}$, plus one, and represents the number of rows of $S_{M^k}$ in the northern half of $M^k$. Similarly, we find $c_W = \mathrm{extract\_j}(e_{SW_{M^k}}) - \mathrm{extract\_j}(\sigma_{S_{M^k}}) + 1$, the number of columns of $S_{M^k}$ in the western half of $M^k$. Note that if $r_N \le 0$ then no part of $S_{M^k}$ lies in the northern half of $M^k$, and if $c_W \le 0$ then no part of $S_{M^k}$ lies in the western half of $M^k$. After finding $r_N$ and $c_W$, we can find $r_S$ and $c_E$ using $r_S = r_{S_{M^k}} - r_N$ and $c_E = c_{S_{M^k}} - c_W$, which are the remaining rows and columns of $S_{M^k}$ respectively.

So far, we have found the number of rows in the northern and southern halves of $M^k$ and the number of columns in the western and eastern halves of $M^k$, and we want to identify $S_{NW_{M^k}}$, $S_{NE_{M^k}}$, $S_{SW_{M^k}}$ and $S_{SE_{M^k}}$ for each $M^k \in \{A^k, B^k, C^k\}$. Recall that we are able to identify a sub-matrix by a 4-tuple $(M, \sigma, r, c)$, where $\sigma$ is the Morton-hybrid index of the first entry of the sub-matrix and $r$ and $c$ are its row and column dimensions respectively.
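The quantities introduced so far (the quadrant start indices of Step 1 and the counts $r_N$, $c_W$, $r_S$, $c_E$ of Step 2) can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: $M^k$ is taken to be square with quadrants of at least $T \times T$ entries, Cartesian indices are 0-based, the layout is Z-ordered blocks with row-major $T \times T$ interiors, and the single helper `extract` stands in for extract_i and extract_j. The clamping of $r_N$ and $c_W$ to the range $[0, r]$ and $[0, c]$ is an addition made here so that a sub-matrix lying entirely in one half yields non-negative counts.

```python
def extract(z: int, T: int) -> tuple:
    """Cartesian (i, j) of Morton-hybrid index z (assumed layout)."""
    block, offset = divmod(z, T * T)
    row = col = bit = 0
    while block >> (2 * bit):
        col |= ((block >> (2 * bit)) & 1) << bit
        row |= ((block >> (2 * bit + 1)) & 1) << bit
        bit += 1
    return row * T + offset // T, col * T + offset % T


def quadrant_starts(alpha: int, lam: int) -> dict:
    """Step 1: start index of each quadrant of an aligned matrix (alpha, lam)."""
    q = lam // 4   # lambda' = number of elements per quadrant
    return {"NW": alpha, "NE": alpha + q, "SW": alpha + 2 * q, "SE": alpha + 3 * q}


def half_counts(alpha: int, lam: int, sigma: int, r: int, c: int, T: int) -> tuple:
    """Step 2 (first part): rows/columns of S in each half of M.

    Returns (r_N, c_W, r_S, c_E).
    """
    q = lam // 4
    e_ne = alpha + 2 * q - 1          # last entry of the NE quadrant
    e_sw = alpha + 3 * q - 1          # last entry of the SW quadrant
    i_s, j_s = extract(sigma, T)      # Cartesian index of the first entry of S
    r_n = extract(e_ne, T)[0] - i_s + 1
    c_w = extract(e_sw, T)[1] - j_s + 1
    r_n = max(0, min(r, r_n))         # clamping: an assumption added in this sketch
    c_w = max(0, min(c, c_w))
    return r_n, c_w, r - r_n, c - c_w


# 8 x 8 matrix with T = 2; S starts at Cartesian (2, 1), i.e. index 9, and is 4 x 5.
print(quadrant_starts(0, 64))               # {'NW': 0, 'NE': 16, 'SW': 32, 'SE': 48}
print(half_counts(0, 64, 9, 4, 5, 2))       # (2, 3, 2, 2)
```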
Let $(i_{S_{M^k}}, j_{S_{M^k}})$ denote the Cartesian index of the first entry of $S_{M^k}$, found using $i_{S_{M^k}} = \mathrm{extract\_i}(\sigma_{S_{M^k}})$ and $j_{S_{M^k}} = \mathrm{extract\_j}(\sigma_{S_{M^k}})$. We now identify $S_{NW_{M^k}}$, $S_{NE_{M^k}}$, $S_{SW_{M^k}}$ and $S_{SE_{M^k}}$ according to the following cases:

1. For $S_{NW_{M^k}}$: $\sigma_{S_{NW_{M^k}}} = \sigma_{S_{M^k}}$, $r_{S_{NW_{M^k}}} = r_N$, and $c_{S_{NW_{M^k}}} = c_W$.
2. For $S_{NE_{M^k}}$: $\sigma_{S_{NE_{M^k}}} = \mathrm{encode}(i_{S_{M^k}}, j_{S_{M^k}} + c_W)$, $r_{S_{NE_{M^k}}} = r_N$, and $c_{S_{NE_{M^k}}} = c_E$.
3. For $S_{SW_{M^k}}$: $\sigma_{S_{SW_{M^k}}} = \mathrm{encode}(i_{S_{M^k}} + r_N, j_{S_{M^k}})$, $r_{S_{SW_{M^k}}} = r_S$, and $c_{S_{SW_{M^k}}} = c_W$.
4. For $S_{SE_{M^k}}$: $\sigma_{S_{SE_{M^k}}} = \mathrm{encode}(i_{S_{M^k}} + r_N, j_{S_{M^k}} + c_W)$, $r_{S_{SE_{M^k}}} = r_S$, and $c_{S_{SE_{M^k}}} = c_E$.

To justify these cases, we explain how we arrived at case 2, for example, where we identify the start index and dimensions of $S_{NE_{M^k}}$ as shown in Fig. 7. The rest follow similarly. Recall that $\sigma_{S_{M^k}}$ denotes the Morton-hybrid index of the first element of $S_{M^k}$, and that $(i_{S_{M^k}}, j_{S_{M^k}})$ is the corresponding Cartesian index. The index $(i_{S_{M^k}}, j_{S_{M^k}} + c_W)$ is the Cartesian index of the first element in $S_{NE_{M^k}}$. The corresponding Morton-hybrid index $\sigma_{S_{NE_{M^k}}}$ can be found using the function $\mathrm{encode}(i_{S_{M^k}}, j_{S_{M^k}} + c_W)$. The dimensions of $S_{NE_{M^k}}$ are $r_N \times c_E$. Note that for each $S_{M^{k+1}}$, the Cartesian index of the start entry of Morton-hybrid index $\sigma_{S_{M^{k+1}}}$ is given by $(i_{S_{M^k}} + \varphi_{r_M}, j_{S_{M^k}} + \varphi_{c_M})$, for $\varphi_{r_M} \in \{0, r_N\}$ and $\varphi_{c_M} \in \{0, c_W\}$.

Step 3: By now we have decomposed each $M^k$ into quadrants, and we have identified, for each quadrant $M^{k+1}_t$, the part of $S_{M^k}$ within that quadrant, denoted by $S_{M^{k+1}_t}$. The matrix $S_{M^k}$ is the two-dimensional concatenation of $\{S_{M^{k+1}_t}\}$ for $t \in \{0, 1, 2, 3\}$. Next, we identify which quadrants $A^{k+1}_t$, $B^{k+1}_u$, and $C^{k+1}_v$ to consider for recursive multiplication. For each quadrant $A^{k+1}_t \in \{NW_{A^k}, NE_{A^k}, SW_{A^k}, SE_{A^k}\}$ of $A^k$, we have identified $S_{A^{k+1}_t}$ (and likewise $S_{B^{k+1}_u}$ and $S_{C^{k+1}_v}$, for all $u, v \in \{0, 1, 2, 3\}$). We need to perform the multiplications within $S_{B^k}$ and $S_{C^k}$ required to calculate $S_{A^{k+1}_t}$. As an example, examine $NW_{A^k}$ from Fig. 6. The sub-matrix $S_{NW_{A^k}}$ is given by the 4-tuple (A, 5, 3, 3). We identify this tuple using Step 2 above.
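The four cases of Step 2 can be sketched as a single function that returns the $(\sigma, r, c)$ part of each 4-tuple. This is a minimal illustration under assumptions, not the authors' code: 0-based Cartesian indices, Z-ordered blocks with row-major $T \times T$ interiors, and illustrative names (`interleave`, `encode`, `extract`, `split`). The counts $r_N$ and $c_W$ are taken as inputs, and parts with zero rows or columns are dropped, which is a choice made here.

```python
def interleave(row: int, col: int) -> int:
    """Z-order code of a block coordinate (NW=0, NE=1, SW=2, SE=3)."""
    z, bit = 0, 0
    while (row >> bit) or (col >> bit):
        z |= ((col >> bit) & 1) << (2 * bit)
        z |= ((row >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return z


def encode(i: int, j: int, T: int) -> int:
    """Morton-hybrid index of Cartesian entry (i, j) (assumed layout)."""
    return interleave(i // T, j // T) * T * T + (i % T) * T + (j % T)


def extract(z: int, T: int) -> tuple:
    """Cartesian (i, j) of Morton-hybrid index z (inverse of encode)."""
    block, offset = divmod(z, T * T)
    row = col = bit = 0
    while block >> (2 * bit):
        col |= ((block >> (2 * bit)) & 1) << bit
        row |= ((block >> (2 * bit + 1)) & 1) << bit
        bit += 1
    return row * T + offset // T, col * T + offset % T


def split(sigma: int, r: int, c: int, r_n: int, c_w: int, T: int) -> dict:
    """Cases 1-4 of Step 2: the part of S = (sigma, r, c) in each quadrant.

    r_n and c_w are the rows/columns of S in the northern/western half.
    Returns {quadrant: (sigma, rows, cols)}, omitting empty parts.
    """
    i_s, j_s = extract(sigma, T)
    r_s, c_e = r - r_n, c - c_w
    parts = {
        "NW": (sigma, r_n, c_w),
        "NE": (encode(i_s, j_s + c_w, T), r_n, c_e),
        "SW": (encode(i_s + r_n, j_s, T), r_s, c_w),
        "SE": (encode(i_s + r_n, j_s + c_w, T), r_s, c_e),
    }
    return {name: p for name, p in parts.items() if p[1] > 0 and p[2] > 0}


# 8 x 8 matrix with T = 2; S starts at Cartesian (2, 1) (index 9) and is 4 x 5,
# with 2 of its rows in the northern half and 3 of its columns in the western half.
print(split(9, 4, 5, 2, 3, 2))
# {'NW': (9, 2, 3), 'NE': (24, 2, 2), 'SW': (33, 2, 3), 'SE': (48, 2, 2)}
```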
To calculate $S_{NW_{A^k}} = (A, 5, 3, 3)$ of $S_{A^k}$, the sub-matrix from $S_{B^k}$ given by (B, 9, 3, 4) is to be multiplied by the sub-matrix from $S_{C^k}$ given by (C, 11, 4, 3). According to our approach, this will be done in a way that ensures that the sub-matrices being multiplied at the base case are contained, which improves locality and reduces conversion overhead as described earlier. Because the sub-matrices (B, 9, 3, 4) and (C, 11, 4, 3) of $S_{B^k}$ and $S_{C^k}$ touch on all four quadrants of $B^k$ and $C^k$, and we want to calculate $S_{NW_{A^k}}$, all quadrants of $B^k$ are to be considered for multiplication with all quadrants of $C^k$. Those are given by the following sixteen combinations of quadrants from $B^k$ and $C^k$:

$NW_{B^k}$ and $NW_{C^k}$        $SW_{B^k}$ and $NW_{C^k}$
$NW_{B^k}$ and $NE_{C^k}$        $SW_{B^k}$ and $NE_{C^k}$
$NW_{B^k}$ and $SW_{C^k}$        $SW_{B^k}$ and $SW_{C^k}$
$NW_{B^k}$ and $SE_{C^k}$        $SW_{B^k}$ and $SE_{C^k}$
$NE_{B^k}$ and $NW_{C^k}$        $SE_{B^k}$ and $NW_{C^k}$
$NE_{B^k}$ and $NE_{C^k}$        $SE_{B^k}$ and $NE_{C^k}$
$NE_{B^k}$ and $SW_{C^k}$        $SE_{B^k}$ and $SW_{C^k}$
$NE_{B^k}$ and $SE_{C^k}$        $SE_{B^k}$ and $SE_{C^k}$        (1)

All of these are needed to calculate $S_{NW_{A^k}}$. But, to calculate the sub-matrix $S_{A^k}$, we need to find $S_{NE_{A^k}}$, $S_{SW_{A^k}}$, and $S_{SE_{A^k}}$ in addition to $S_{NW_{A^k}}$, because $S_{A^k}$ is the two-dimensional concatenation of $\{S_{A^{k+1}_t}\}$ for $t \in \{0, 1, 2, 3\}$. As above, determining each of these quadrants of $A^k$ requires sixteen combinations of quadrants from $B^k$ and $C^k$. In total, to find $S_{A^k}$, we would need up to sixty-four combinations of quadrants from $A^k$, $B^k$, and $C^k$.

Step 4: For each combination, if $S_{A^1_t}$, $S_{B^1_u}$, and $S_{C^1_v}$ are not compatible, we extract compatible parts of these sub-matrices and we label them as $S'_{A^1_t}$, $S'_{B^1_u}$, and $S'_{C^1_v}$, on which the multiplication proceeds recursively. How to extract compatible parts is beyond the scope of the present manuscript and is left for future work (3). For now, we take it that omitting it does not detract from the general understanding of the overall algorithm, and that the work requirements for this step can be embedded in that required to perform Steps 1-3 above.

Proposition 2.6 If using auxiliary space to perform the matrix additions, and assuming the matrix is of dimensions at most $2^\alpha \times 2^\alpha$, where $\alpha$ is the machine word-size, the cache-oblivious MM using Morton-hybrid order requires asymptotically the same work and critical path length as default MM.

Proof: On work: The cache-oblivious algorithm is a divide and conquer algorithm. The divide phase introduces two new functions over the default MM algorithm, consisting of Steps 1 and 2 above. Each of these steps requires a constant number of arithmetic operations and calls to encoding and extraction procedures. From Sec. 3.5 of [1], we know that each encoding or extraction procedure incurs a constant number of operations assuming the matrix is of dimensions at most $2^\alpha \times 2^\alpha$, where $\alpha$ is the machine word-size. For the typical value $\alpha = 64$, such matrix sizes are sufficiently large for many applications. It follows that the work of the cache-oblivious algorithm is asymptotically the work of the default algorithm, given by $\Theta(n^3)$. The conquer part creates non-overlapping sub-problems in Steps 3 and 4 above whose union yields the original matrix to be multiplied.

(3) We also note that omitting this part of the algorithm does not deflect from its main rationale.
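For reference, the bounds for the default recursive MM that Proposition 2.6 compares against follow from the standard recurrences for the multithreaded divide-and-conquer algorithm (Ch. 27 of [6]). The symbols $W(n)$ and $S(n)$, denoting work and critical path length on $n \times n$ inputs, are notation assumed here rather than taken from the manuscript. The recurrences describe the default algorithm only; they are not a derivation for the Morton-hybrid variant.

```latex
\begin{align*}
W(n) &= 8\,W(n/2) + \Theta(n^2) = \Theta(n^3), \\
S(n) &= S(n/2) + \Theta(\lg n) = \Theta(\lg^2 n).
\end{align*}
```

The first line counts eight half-size products plus $\Theta(n^2)$ work for the additions, and resolves to $\Theta(n^{\log_2 8}) = \Theta(n^3)$. The second line reflects that the eight products run in parallel, so only one of them lies on the critical path, followed by a parallel addition of span $\Theta(\lg n)$ when auxiliary space is used.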

On parallelism: All of the extra 64 recursive calls are independent and thus can be cast in parallel. If auxiliary space is available to perform the matrix additions required for each MM, one can also perform addition in parallel using the standard algorithm (Ch. 27 of [6]). Hence, the critical path length of the cache-oblivious algorithm remains that of the default multithreaded algorithm and is known to be $\Theta(\lg^2 n)$.

Remarks on implications for Parallel Performance: The sub-matrices at the base case of the recursion are contained within a row-major sub-matrix, thanks to enforcing aligned sub-matrices for the recursive division. The Morton-hybrid, cache-oblivious version demonstrates superior performance over the default algorithm, and eliminates the need for Morton-hybrid index conversion when accessing each element in the sub-matrix at the base case, as it can proceed instead with row-major encoding. The implications for parallel performance can be captured using the results from [4], which reveal that nested parallel algorithms for which the natural sequential execution has low cache complexity will also attain good cache complexity on parallel machines with private or shared caches. In this framework, our adaptation combines improved temporal locality using the Morton-hybrid order for the serial algorithm as well as optimal work and critical path length for the multithreaded version.

Performance Analysis

We now verify that the cost of increased recursive MM calls for the cache-oblivious sub-matrix multiplication is significantly compensated for by the improvement in temporal locality thanks to the Morton-hybrid order. We use a Pentium IV of 2.8 GHz processor speed, with an 8 KB L1 cache and a 512 KB L2 cache. It runs Linux version and gcc compiler version 4... We generate random Morton-hybrid matrices and multiply random sub-matrices of these matrices using both the default and cache-oblivious algorithms.
To neutralise the effect of modular arithmetic over finite fields and to be able to exclusively account for the gains induced by the Morton-hybrid order, the random matrices we generate are taken over the binary field. According to [3], $T = 32$ is the typical value for the truncation size for block recursive matrix algorithms of floating point entries that shows improvements in cache misses and cycles for Morton-hybrid, default MM. Recall the multiplication of rectangular sub-matrices $S_A = S_B S_C$, where $A$, $B$ and $C$ are square and in Morton-hybrid order. The dimensions of the square matrices are of no significance, since the multiplication kernel is operating on the rectangular sub-matrices. We thus partition Morton-hybrid matrices of dimensions $N = 2048$ and multiply sub-matrices of these Morton-hybrid matrices of varying sizes. Each experiment is distinguished using varying indices $\sigma_{S_M}$ of the starting entries of each $S_M$ and varying dimensions $r_{S_M}$ and $c_{S_M}$.

Because of the variation in sizes across experiments, we do not report the run-time of each one but rather report the percentage of increase, or decrease, in the number of base case calls made by the cache-oblivious over the default algorithm, and the associated percentage of improvement. We record the number of recursive MM calls made to the base case of each of the two algorithms and the total time taken by the overall multiplication to finish. The results are presented in Table 1. We interpret it using the fifth row, say, as an arbitrary example. Of all 468 experiments run in total, about 9% of them exhibited about 34% increase in recursive calls made by the cache-oblivious over the default algorithm. The average, minimum, and maximum percentages of improvement in run-time across this batch of experiments are shown in the subsequent columns, and are all staggeringly high. Examining all rows, one can see that no matter what the increase in MM recursive calls has been, it hardly affects the high percentages of improvement. The reductions in cache misses as a result of the cache-oblivious algorithm overwhelm the cost of handling extra recursive calls.

Table 1: Percentage Improvement in Runtime of the cache-oblivious algorithm
% Inc. in Calls | % of Exp. | Avg. Imp. | Min. Imp. | Max. Imp.

Figure 1: Generation of Morton order
Figure 2: Aligned and non-aligned sub-matrices
Figure 3: Sub-matrix multiplication example I
Figure 4: Base case sub-matrix of non-aligned sub-matrix
Figure 5: $S_{NE_{M^k}}$
Figure 6: Sub-matrix multiplication example II
Figure 7: Example of Modified Recursive Sub-matrix Multiplication

References

[1] F. K. Abu Salem and M. Al Arab. Comparative study of space filling curves for cache oblivious TU Decomposition, extended report, http://arxiv.org/abs/
[2] M. D. Adams and D. S. Wise. Fast additions on masked integers, in SIGPLAN Not., 41(5):39-45, 2006.
[3] M. D. Adams and D. S. Wise. Seven at one stroke: results from a cache-oblivious paradigm for scalable matrix algorithms, in MSPC 06, 41-50, ACM Press, 2006.
[4] G. Blelloch, P. B. Gibbons, and H.-V. Simhadri. Low depth cache-oblivious algorithms, in SPAA 2010, pp , ACM Press, 2010.
[5] N. Chen, N. Wang, and B. Shi. A new algorithm for encoding and decoding the Hilbert order, in Softw. Pract. Exper., 37(8):897-908, 2007.
[6] T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein. Introduction to Algorithms, 3rd edition, MIT Press.
[7] J.G. Dumas and J.L. Roche. A parallel block algorithm for the exact triangulization of rectangular matrices, in SPAA 2001, pp , ACM Press, 2001.
[8] Jeremy D. Frens and David S. Wise. QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism, in PPoPP 03, pp , ACM Press, 2003.
[9] O. H. Ibarra, S. Moran, and L. E. Rosier. A note on the parallel complexity of computing the rank of order n matrices, in Inf. Proc. Lett., 11(4-5): , 1980.
[10] O. H. Ibarra, S. Moran, and R. Hui. A generalization of the fast LUP matrix decomposition algorithm and applications, in J. of Algs., 3(1):45-56,
[11] X. Liu and G. Schrack. Encoding and decoding the Hilbert order, in Softw. Pract. Exper., 26(12): ,
[12] P. Merkey. Z-ordering and UPC, online, Michigan Tech. Univ., June 2003.
[13] R. Raman and D. Stephen Wise. Converting to and from dilated integers, in IEEE Trans. Comp., 57(4): ,


More information

The Data Locality of Work Stealing

The Data Locality of Work Stealing The Daa Localiy of Work Sealing Umu A. Acar School of Compuer Science Carnegie Mellon Universiy umu@cs.cmu.edu Guy E. Blelloch School of Compuer Science Carnegie Mellon Universiy guyb@cs.cmu.edu Rober

More information

Computer representations of piecewise

Computer representations of piecewise Edior: Gabriel Taubin Inroducion o Geomeric Processing hrough Opimizaion Gabriel Taubin Brown Universiy Compuer represenaions o piecewise smooh suraces have become vial echnologies in areas ranging rom

More information

Relevance Ranking using Kernels

Relevance Ranking using Kernels Relevance Ranking using Kernels Jun Xu 1, Hang Li 1, and Chaoliang Zhong 2 1 Microsof Research Asia, 4F Sigma Cener, No. 49 Zhichun Road, Beijing, China 100190 2 Beijing Universiy of Poss and Telecommunicaions,

More information

LAMP: 3D Layered, Adaptive-resolution and Multiperspective Panorama - a New Scene Representation

LAMP: 3D Layered, Adaptive-resolution and Multiperspective Panorama - a New Scene Representation Submission o Special Issue of CVIU on Model-based and Image-based 3D Scene Represenaion for Ineracive Visualizaion LAMP: 3D Layered, Adapive-resoluion and Muliperspecive Panorama - a New Scene Represenaion

More information

NRMI: Natural and Efficient Middleware

NRMI: Natural and Efficient Middleware NRMI: Naural and Efficien Middleware Eli Tilevich and Yannis Smaragdakis Cener for Experimenal Research in Compuer Sysems (CERCS), College of Compuing, Georgia Tech {ilevich, yannis}@cc.gaech.edu Absrac

More information

Numerical Solution of ODE

Numerical Solution of ODE Numerical Soluion of ODE Euler and Implici Euler resar; wih(deools): wih(plos): The package ploools conains more funcions for ploing, especially a funcion o draw a single line: wih(ploools): wih(linearalgebra):

More information

In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magnetic Field Maps

In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magnetic Field Maps In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magneic Field Maps A. D. Hahn 1, A. S. Nencka 1 and D. B. Rowe 2,1 1 Medical College of Wisconsin, Milwaukee, WI, Unied

More information