An Efficient Parallel Algorithm of Modified Jacobi Approach for Sparse Linear System

Size: px
Start display at page:

Download "An Efficient Parallel Algorithm of Modified Jacobi Approach for Sparse Linear System"

Transcription

1 991 An Effcent Parallel Algorthm of Modfed Jacob Approach for Sparse Lnear System Bkash Kant Sarkar 1, Shb Sankat Sana 2, G. Sahoo 3 1,3 Department of Informaton Technology, BIT, Mesra, Ranch, Inda E-mal: bk_sarkarbt@hotmal.com 2 Department of Mathematcs, Bhangar Mahavdyalaya, CU, West Bengal, Inda E-mal: shb_sankar@yahoo.com ABSTRACT Several parallel approaches have been developed to solve sparse lnear system based on well-known memory savng schemes. To solve such a lnear system, ths artcle proposes a very smple parallel verson of the modfed Jacob teratve method on Dstrbuted Memory Archtecture, usng the well-known Compressed Sparse Row(CSR) storage format and Recursve Graph Bsecton(RGB). The prme contrbuton of the present nvestgaton s that the ndvdual processors wll not update ts assgned varables any more, provded the prevous teraton acheves smaller than the prescrbed accuracy. Consequently, such processors wll stop computaton as well as communcaton wth other processors to reduce both the computaton and the communcaton tme to a great extent. In fact, the use of such local stoppng crtera ensures to acheve such overall system performance. The expected beneft of ths algorthm s explaned through the analytcal results. Keywords : Parallel, Effcent, Communcaton, Optmzaton, Dstrbuted Memory, Jacob Date of Submsson: February 04, 2011 Date of Acceptance: Aprl 27, INTRODUCTION Large number of physcal problems lke Ar flow over an arcraft wng, Blood crculaton n human body, Water crculaton n an ocean, Weather Forecastng, etc, are descrbed by Partal Dfferental Equatons(PDE). These equatons, when solved usng, Fnte Dfference Method(FDM), generate sets of lnear equatons. But the lnear systems of the most of the physcal problems yeld sparse structure after such transformaton. Several well-known memory savng schemes are developed to store such knd of system. But, soluton of these problems requres huge amount of computatons and becomes very dffcult by employng conventonal computers. So, to solve such problems effcently n parallel manner s stll an attractve ssue. Recently, hgh performance computng has emerged as a key technology nto dverse areas especally for the numercal soluton of large scale problems. Although, there exst several forms of parallelsm[5], but ntroducng data parallelsm usng clusterng wll be easer. A matrx s termed sparse, f majorty of ts entres are zero. As there s no reason to store and operate on huge number of zeros, t s often necessary to modfy the exstng algorthms to take the advantage of the sparse structure of the matrx. Such matrces can be easly compressed, yeldng sgnfcant reducton n memory usage. Several sparse matrx formats exst lke Compressed Sparse Row(CSR) Storage[1], Jagged Dagonal Format[2], Compressed Dagonal Storage Format[3] and Sparse Block Compressed Row Storage Format[4]. Each format takes advantage of a specfc property of the sparse matrx, and therefore acheves dfferent degree of space effcency. In ths work, the CSR storage format (dscussed n secton[2.1]) s used, as t s rather ntutve, straghtforward and more sutable for parallelzaton. The soluton of a lnear system of equatons can be accomplshed by ether of the two numercal methods: Drect or Iteratve. In Drect methods lke Gauss Elmnaton, Gauss Jordon (modfcaton of Gauss Elmnaton) and Matrx Inverson, the amount of computaton s fxed. However, Iteratve methods lke Jacob and Gauss Sedel yeld values whch are found teratvely startng from an approxmaton untl the requred accuracy s obtaned, and hence the amount of computaton depends on the accuracy requred. Further, the parallelzaton of teratve approaches becomes easer as compared to the drect approaches. But some teratve methods are sutable on Multple Instructon Stream and Multple Data stream(mimd) Dstrbuted Memory Machne.

2 For example, Jacob method n comparson to Gauss- Sedel method takes less communcaton tme because all the computatons for -th approxmaton must be ready before the computaton for (+1)-th approxmaton starts. In other words, Jacob teratve approach do not requre exchange of the most recent values of the varables, whereas, a subsequent teraton n Gauss-Sedel needs the values of some varables n that teraton too (.e., causes ntra-teraton dependences). Because of ths fact, Jacob approach s preferred for parallelzaton on Dstrbuted Memory computer as compared to Shared Memory computer. For parttonng data (nvolved wth lnear system) nto dfferent processors to mantan data localty aspect, there are several algorthms lke Mult-grd (Square Mesh Parttonng), Ellpack-Itpack(Row Partton Format), RGB(Recursve Graph Bsecton dscussed n secton[2.2]) etc; and all of whch are appled for parallel machnes. In ths nvestgaton, RGB n comparson to other technques s preferred as t nfluences to opt for less communcaton tme, achevng better statc load balancng of the sparse graph among the processors. Many researchers have concentrated to solve smultaneous system of lnear equatons sequentally and n parallel, usng Jacob and other approaches [5], [6],[7], [8],[9],[10], [11], [12], [13],[19] [20],[21][22]. Some knds of tllng technques[14] are developed for solvng set of lnear system. Tllng s a comple-tme transformaton whch subdvdes the teraton space for a regular computaton so that a new tle-based schedule(where each tle s executed atomcally) exhbts better data localty. So, tllng provdes a method of achevng nter-teraton localty. In[15], Communcaton optmzaton for rregular scentfc computatons on Dstrbuted Memory archtectures s focused. Although a number of technques has been developed tll today to solve set of lnear systems on Dstrbuted Memory machne tryng to reduce communcaton among processors, but only few of them such as [16][17][21] pay attenton to the amount of work done by ndvdual processors. 992 RGB approaches, s developed on Dstrbuted Memory Archtecture to stop unwanted computaton and communcaton among the processors (n order to reduce both the costs). We compare the analytcal results of the proposed work wth Tmng Models [17] and report that the proposed s a better choce. The present artcle s organzed as follows: secton-2 gves theoretcal background about CSR, graph parttonng technque, Jacob method, parallel computers. Secton-3 descrbes the modfed verson of Jacob approach. In secton-4, the proposed parallel algorthm and ts proof of correctness are descrbed. Secton-5 shows the analytcal results. Fnally, secton-6 exhbts the future scope of the work. 2. THEORETICAL BACKGROUND 2.1. COMPRESSED SPARSE ROW(CSR) FORMAT Maxmum storage schemes for sparse matrx employ the technque as follows. Compress all the non-zero elements of the sparse matrx (say, A) n a lnear array and then use some number of auxlary arrays to descrbe the locatons of the non-zeros of the orgnal matrx A. The CSR format uses three arrays to store an n n sparse matrx wth m nonzero entres. () An m 1 array, nonzero[ ], contans the nonzero elements of the lnear system. These are stored n the order of ther rows from 0 to (n 1). However, elements of the same row can be stored n any order. () An m 1 array, col_vector [ ], stores row-wse the respectve column number of each nonzero element. Indeed, each column number of a row represents also the varable wth non-zero co-effcent n that row. () An n 1 array row_vector[ ], and the content of row_vector[] ponts to the frst entry of the th row n nonzero[ ] and col_vector[ ]. One sparse matrx of the form AX=b and ths matrx mapped nto three arrays are shown n Fgs.1 and 2 respectvely. In partcular, n ths work, a very smple parallel verson based on the modfed Jacob teratve method[18] and combnng the capabltes of CSR and

3 x x x x x x x x x x x x x x x x Fgure 1: Sparse Matrx n Equaton form AX=b b 0 b1 b 2 b 3 b 4 b 5 b 6 b 7 = b 8 b 9 b10 b 11 b12 b 13 b14 b15 Fgure2: Sparse Matrx (shown n Fg.-1) Mapped nto three arrays

4 2.2. RECURSIVE GRAPH BISECTION(RGB) TECHNIQUE The RGB technque parttons the doman(graph) by recursvely subdvdng t nto two parts at each step. For p = 2 k processors, the doman yelds p parttons recursvely subdvdng k tmes. Ths bsecton nvolves three major steps. () Intally set the level(startng wth 0). () Then, fnd pseudoperpheral node. () Fnally, partton the graph recursvely. To determne pseudo-perpheral or perpheral node of a graph, dameter of the graph(here, graph s represented by matrx) s requred to be computed frst. The dameter of a graph s defned as follows. δ (G) = max { d(x,y) x ε V, y ε V }, where d (x,y) s the dstance (shortest path) between any two nodes n the graph(g) wth vertex set V. Ideally, one of the two nodes n par (x, y) that acheves the dameter can be used as startng node. These two nodes are called as perpheral nodes, and are very expensve to determne. A pseudo-perpheral node s often employed to partton the graph. For p =2 2 =4, applyng the above segment on the graph (represented by the matrx shown n Fg-1), the parttoned graph s shown n Fg To solve a lnear system, AX = b, through ths method, the soluton vector X must satsfy the equaton: X 1 = b a, j a, j X j () 1 In fact, to solve the system, one may start the process wth an ntal estmaton. However, the Jacob approach reles upon estmaton of every element of vector X to come up wth a new value of X. It uses values already computed for each varable X durng teraton (t+1): 1 X ( ) () t + 1 = b a, jx j t a, j After computng a new estmaton, the approach computes the new value of dff(dfference) based on the change n all elements of X(assume that the ntal value of dff s 0). Actually, the value of dff ensures to stop the approach. Now, dff s computed as: dff = max(abs(x 1 (t) - X 1 (t+1)), abs((x n (t) X n (t+1))...(2) Fgure- 3: Parttoned Graph usng RBG method(here, d 11, d 12. d 22, d 23 are the domans, as four processors are used) 2.3. JACOBI METHOD A set of lnear equatons s represented as AX=b where A s a matrx of sze n x n wth coeffcents a,j, X s an nx1 vector varable to be solved and b s an nx1 vector of rght sde values. Jacob method s an example of teratve method for solvng lnear system AX=b, typcally generated whle workng wth PDE PARALLEL COMPUTERS Parallel computers are those systems that emphasze parallel processng. Parallel computers are generally dvded nto three archtectural confguratons: Ppelne computers: whch belong to SISD(Sngle Instructon Stream and Sngle Data Stream) model computers and the parallelsm acheved through ths type of computers s called as temporal parallelsm. Array processors : whch belong to SIMD(Sngle Instructon Stream and Multple Data Stream) model computers and the parallelsm acheved through ths model s called spatal or synchronous or data parallelsm. The global CU dspatches the same nstructon to each PEs (whch are organzed by a partcular network ) and each executes the same nstructon on a dstnct data set. Multprocessor systems : whch belong to MIMD(Multple Instructon Stream and Multple Data Stream) model computers and the parallelsm acheved through ths type of computers s called as control or asynchronous parallelsm. Ths type of system s agan classfed nto two categores : (a) Shared Memory model computers(or Multprocessors) and (b) Dstrbuted Memory model

5 computers (Message Passng Parallel Computers or Mult-computers). Message Passng Model Computers are also called Loosely Coupled Computers as the degree of nteracton among the processors s not very hgh. A Message Passng Computer, on the other hand, s programmed usng Send-Receve prmtves. There are several send-receve used n practce. 3. MODIFIED JACOBI APPROACH From eqn(2) of secton-2.3, t s clear that n the standard Jacob teratve method, dff s computed as the maxmum among all the absolute dfferences of the values of respectve varables n the current teraton and the mmedate prevous teraton. As per the standard approach, n spte of achevng the desred accuracy by some of the varables n the current teraton, the same varables are updated agan n the next teraton to converge the remanng varables. Consequently, t causes unnecessary update of the converged varables. It s also true that the varables whch are converged to the desred soluton n the present teraton, are needed by the present not converged varables. Thus, the modfed verson stops the updatng of the converged varables n the next teraton to reduce executon tme but the non-converged varables use the values of the necessary converged varables by updatng ther current contents wth the dff value n the current teraton. For example, suppose varable X k s not converged at the present teraton but varable X m s converged. Then, X m s smply updated n the successve teratons as follows. X m = X m + dff.. (3) where dff represents the value of dff(computed from the rest non-converged varables followng eqn(2)) at the current teraton. In fact, X m s here not updated followng equaton-1 (mentoned n secton-2.3),.e., no multplcaton, dvson and more number of addtons are performed. However, X k s computed as per eq (1), usng the value of X m (calculated by eqn(3)). However, n the modfed approach, t s assumed that each row has some non-zero coeffcents excludng the dagonal one. Further, ths method requres that the dagonal elements are dagonally domnant, means that the dagonal element s greater than the sum of the absolute values of the other elements n the gven row. In ths artcle, a smple parallelzed verson of ths modfed approach, based on CSR storage format and RGB partton technque, s presented (n 995 secton-4) on the Multprocessors to solve a sparse lnear system. 4. PROPOSED PARALLEL ALGORITHM It s well-known that, n sequental teratve approaches, we concentrate on the approxmate values of the soluton vector X and these normally depend on certan degree of accuracy. In partcular, the varable dff s only used to contnue the specfed accuracy of the varables. However, nstead of global dff(whch s the maxmum among the computed dfferences of all the varables by followng eqn(2)), the local dff (whch s, ndeed, the maxmum among all the computed dfferences of the varables assgned to ndvdual processor, followng the same eqn(2)), can guarantee to acheve the same. Of greater nterest, the work[18] clams t. In ths secton, we present a smple parallel algorthm of the modfed Jacob method on Dstrbuted Memory Archtecture to solve sparse lnear system. Further, our algorthm s based on Compressed Sparse Row(CSR) storage format and Recursve Graph Bsecton(RGB). The goal of ths algorthm s to optmze the communcaton overhead among the processors, reducng computatonal cost too. However, n the desgned algorthm, the status varable for every processor fulflls such great role. Assumptons: ) All the dagonal elements of the matrx A must be non-zero values. ) The dagonal elements are dagonally domnant, means that the dagonal element s greater than the sum of the absolute values of the other elements n the gven row. ) Processors are represented by unque ds such as : 0, 1, 2, 3, etc. Bref descrpton of the used varables: (a) To represent soluton vector X, one array of structures(records) X[ ] s consdered. In fact, each element of X[ ] represents one varable, and conssts of two felds. In C lke language, such structure(record) can be declared as: struct varable { float value; nt source } X [ ]. Clearly, X[] represents here the th varable (lke X ). The mportance of each of the felds are dscussed below. ) value (ths feld stores the latest updated value of a partcular varable by a processor). ) source(feld manly stores the processor-d of the processor whch s updatng the partcular varable). Thus, t s clear that each varable keeps more nformaton except the value of varable, and each sub-scrpt value of X[ ] represents one varable.

6 Smultaneously, another array NewX[ ] s essental to store temporarly only the current contents of updated varables(.e., NewX[ndex] s used to store temporarly the updated content of X[ndex].value at the current teraton). (b) The 1-D array processor_status[ ] plays here the sgnfcant role to mantan status of the partcpatng processors. For example, processor_status[0] stores the status of 0 th processor and so on. However, ts content s ether 0(means ts work s not over) or 1 (means ts work s over). If p number of processors partcpate n the work, then ts sze wll be p. (c) Locaton[ ], a smple 1-D array, s used to store var_ndces(.e., varables) to be by a processor. So, If v number of varables are updated by a processor, then ts sze wll be v (.e., Locaton[v ]). (d) Three 1-D arrays : row_vector[ ], col_vector[ ] and nonzero[ ] are used to represent CSR storage format of sparse matrx (example shown through Fg-1 for sparse matrx and Fg-2 for ts equvalent CSR). Sub-scrpt of row_vector[ ] ndcates row number. (e) b[ ], 1-D array, s to represent the source vector,.e., each locaton of ths array stores the rght hand sde of a partcular equaton. (f) The varable dff (local dff) s responsble for checkng the desred accuracy of soluton of the assgned varables to each processor. Proposed Algorthm A bref sketch of the algorthm s outlned below. Step-1: Processor P 0 (root processor) ntalzes value 0(zero) to the value part of each element of the soluton vector (X) as well as the necessary values of the other felds(members) of X, and the value of the varable dff. It then broadcasts all these values to the rest processors partcpatng n ths work. Step-2: Assgn varables to be updated by each processor nto ts local varable Locaton[ ]. [Here, assgnng varables to processors s done, seeng the parttoned graph of the matrx A.] Step-3: for all the workng processors P, where 0 p-1 do the followng tasks. // P s the processor-d and p s the total number of workng processors. Step-3.1: for each varable X k assgned to P (K Locaton[ ]), perform the followng sub-steps to update the current retreved varable. Step-3.1.1: Frst retreve the necessary varables as well as ther respectve co-effcents smultaneously 996 accessng col_vector[ ] and nonzero[ ] arrays, followng eq-1. Next, collect the values of these necessary varables from the respectve processors(retreved through source feld of X[ ]) by passng message (f ther work s not over). However, f work of any one s over, then frst update the values of the necessary varables computed earler by that stopped processor, followng eq-3 (presented n secton-3), and use those. Step-3.1.2: Now, update X k (followng eq-1) and store the value of ths varable nto an ndex of the local array NewX[ ]. // For P, step-3.1 ends and step -3.2 starts Step-3.2: Update dff(local dff) followng the eqn (2) [mentoned n secton-2.3]. Step-3.3: Copy the updated values of the varables( stored n NewX[ ]) nto the value part of the respectve locatons of X[ ]. Step-3.4: If dff(local dff) reaches to the desred value (say ε: some value s set ntally), then assgn value 1 to ts processor_status[ ](.e., processor_status[1] = 1) and send ths value (to all other destnaton processors to stop further communcaton wth t) and the latest updated values of the assgned varables to the respectve processors as well as root processors else the processng goes back to step-3.1 for next teraton. Step-4 : If the algorthm termnates, then the root processor(p 0 ) gets the fnal solutons of the varables. 5. ANALYTICAL RESULTS Assume that the number of processors s p. Now, f k number of teratons are requred to acheve the desred accuracy n worst case and total number of nonzero elements n the nonzero array s m ( m << n), then maxmum number operatons lke multplcaton, addton etc, wll be mk n sequental approach whch can be expressed as O(mk). Clearly, the proposed parallel algorthm takes O(mk/p) computaton tme. Although, almost all other exstng parallel approaches also demand the same asymptotc runnng tme. But the status varable processor_status adopted n our algorthm ensures to stop processng of the varables assgned to the ndvdual processors when the desred accuracy computed from the respectve assgned varables to them s found. In other words, there s maxmum probablty to be converged the respectve assgned varables earler(whch s, n fact, less n number of teratons). Consequently, the processors whch

7 termnate ther respectve assgned tasks, can be employed to perform the task of another dstnct dfferent problem n mult-processor envronment. Thus, the present approach reduces executon tme, snce the processors(whose processng s over) stop computaton lke data access, multplcaton, addton, etc. For nstance, suppose processor P 0 acheves the desred accuracy over ts assgned varables after 8 teratons whereas processor P 1 after 12 teratons, P 2 after 14 teratons etc, then unnecessarly P 0, P 1 need not contnue executon up to maxmum teraton(14) over ther respectve assgned varables. Snce ths drawback s overcome here, so t ultmately saves sgnfcant amount of executon tme as compared to the exstng approaches. Also, the approach clams less communcaton cost between processors than any exstng parallel method, snce unwanted communcaton among processors stops. Now, we consder the Tmng Models [17], the total parallel processng tme(assumng same processng speed for each processor) can be expressed as T par = T master +T worker +T com, where T master defnes the master total computaton tme, T worker as the workers total computaton tme, and T com = T c_master + T c_worker. Here, T c_master ncludes broadcast of global geometry, dstrbuton of workng tuples and extracton of workng tuples, whereas T c_worker ncludes extracton of global nformaton, extracton of workng tuples, return of result tuples and ntermedate exchange of data wth the neghborng processors. Clearly, the proposed approach clams better system performance as compared to the mentoned approach, snce master(root) processor need not collect solutons from any processor to compute the global dff and to send the same to the other processors. Consequently, T c_worker does not nclude here extracton of global nformaton (dff ) from master processor and unnecessary extracton of workng tuples, return of result tuples and ntermedate exchange of data wth the neghborng processors whenever the work of the respectve processors s over. 6. CONCLUSION The artcle addresses parallelzaton of a varant of the Jacob method for lnear system soluton n dstrbuted memory computers. In ths varant, once a varable s detected to be converged t s not communcated afterwards. The status varable and graph parttonng technque used n the proposed algorthm balance the computatons and reduce the communcaton overhead. Ths novel dea s very much mportant for the cluster computng because the connecton between 997 processors n such envronment s often slower. The concept s verfed and valdated mathematcally. The proposed algorthm can be mplemented cluster of personal computers connected by hgh speed network. The mplementaton wll be usng the Message Passng Interface(MPI) lbrary as the parallel programmng platform. REFERENCES [1]. J.D.Z. Ba, J. Dongarra, A. Ruhe, H. van der Vorst, Templates for the soluton of algebrac egenvalue problems: A Practcal gude.siam [2]. Y. Saad, Krylov. Subspace methods on supercomputers, SIAM Journal on Scentfc and Statstcal Computng. (1989). [3]. J. Dongarra, In Templates for the Soluton of Algebrac Egenvalue Problems: A Practcal Gude, SIAM [4]. S. Vasslads, S. D. Cotofana, P. Staths, Block based compresson storage expected performance, In Proceedngs of 14th Int l Conf. on Hgh Performance Computng Systems and Applcatons (HPCS 2000) (June 2000). [5]. M. J. Qunn, Parallel Computng Theory and Practce, Oregon State Unversty, Second Edton [6]. Van der Vorst, H. A., Large sparse lnear systems on vector and parallel computers, Parallel Computng, 5, (1987). [7]. Bswa Nath. Datta, Jacob Iteratve Methods for Large Scale Matrx Problems n Control. Dept. Mathematcal Scences, pp.2-21, [8]. F. L. Alvarado, A. Pothen, R. Schreber, Hghly parallel sparse soluton, n Techncal Report, RIACS TR 92.11, NASA Ames Research Center, Moffet Feld,CA. May [9]. C.Aykanat, F. Ozguner, F. Ercal, P. Sadayappan, Iteratve algorthms for soluton of large sparse systems of lnear equatons, IEEE Transactons on Computers, 37(12): (1988). [10]. U. Meer, A parallel partton method for solvng sparse systems of lnear equatons, Parallel Computng, 2: (1985). [11]. M.T. Heath, E.G.Y. Ng, B.W. Peyton, Parallel algorthms for sparse lnear systems, SIAM Revew, 33 : (1991). [12]. Y. Saad, Iteratve Methods for Sparse Lnear Systems, Unversty of Mnnesota, Dept. of Computer Scence and Engneerng.

8 998 [13]. S.G. Petton, Massvely parallel sparse matrx computaton for teratve methods, n Techncal Report. YALEU/DCS/878, Yale Unversty, Department of Computer Scence, New Haven, CT, [14]. M.M. Strout, Larry Carter, Jeanne Ferrante,, Kreaseck Barbara, Sparse Tllng For Statonary Iteratve Methods, Internatonal Journal of Hgh Performance Computng Applcatons, vol-18, no-1, pp ( 2004). [15]. R. Das, M. Uysal, J. Saltz, Y.S.S. Hwang, Communcaton Optmzatons for rregular scentfc computatons on Dstrbuted Memory Archtectures, Journal of Parallel and Dstrbuted Computng, 22(3), pp (1994). [16]. X. Wang, M.M. Chen, X. Wu, and X. An, A Class of Parallel Algorthms for Solvng Large Sparse Lnear Systems on Multprocessors, n Proceedngs of Int. Conf. on Hgh Performance Computng(IEEE-CS), vol-2, ,(2000) [17]. Kostas Blathras, Danel B. Szyld1, and Yuan Sh, Tmng Models and Local Stoppng Crtera for Asynchronous Iteratve Algorthms, Journal of Parallel and Dstrbuted Computng, 58, (1999) [18]. B.K.Sarkar, S.S.Sana, and P.K. Mahant, A Modfed Verson of Jacob Approach, Internatonal Journal of Innovatve Computng and Applcatons, 2, 60-65, [19]. S. Kortas and P. Angot, A practcal and portable model for programmng for teratve solvers on dstrbuted memory machnes, Parallel Computng, 22 (1996), [20]. A.H. Sameh and D.J. Kuck, On Stable Parallel Lnear System Solvers, Journal of the Assocaton for Computng Machnery, 25(1), 81-91, [21]. R. Chandra and C. Sva Ram Murthy, A faster algorthm for solvng lnear algebrac equatons on the star graph, Journal of Parallel and Dstrbuted Computng, 63, , [22]. D. Heller, A survey of parallel algorthms n numercal lnear algebra, SIAM Rev. 20 (4) , 1978.

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain

AMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain AMath 483/583 Lecture 21 May 13, 2011 Today: OpenMP and MPI versons of Jacob teraton Gauss-Sedel and SOR teratve methods Next week: More MPI Debuggng and totalvew GPU computng Read: Class notes and references

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation

Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation Precondtonng Parallel Sparse Iteratve Solvers for Crcut Smulaton A. Basermann, U. Jaekel, and K. Hachya 1 Introducton One mportant mathematcal problem n smulaton of large electrcal crcuts s the soluton

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introducton 1.1 Parallel Processng There s a contnual demand for greater computatonal speed from a computer system than s currently possble (.e. sequental systems). Areas need great computatonal

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Wavefront Reconstructor

Wavefront Reconstructor A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe CSCI 104 Sortng Algorthms Mark Redekopp Davd Kempe Algorthm Effcency SORTING 2 Sortng If we have an unordered lst, sequental search becomes our only choce If we wll perform a lot of searches t may be benefcal

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids)

Very simple computational domains can be discretized using boundary-fitted structured meshes (also called grids) Structured meshes Very smple computatonal domans can be dscretzed usng boundary-ftted structured meshes (also called grds) The grd lnes of a Cartesan mesh are parallel to one another Structured meshes

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Intro. Iterators. 1. Access

Intro. Iterators. 1. Access Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Multiblock method for database generation in finite element programs

Multiblock method for database generation in finite element programs Proc. of the 9th WSEAS Int. Conf. on Mathematcal Methods and Computatonal Technques n Electrcal Engneerng, Arcachon, October 13-15, 2007 53 Multblock method for database generaton n fnte element programs

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005 Exercses (Part 4) Introducton to R UCLA/CCPR John Fox, February 2005 1. A challengng problem: Iterated weghted least squares (IWLS) s a standard method of fttng generalzed lnear models to data. As descrbed

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution Real-tme Moton Capture System Usng One Vdeo Camera Based on Color and Edge Dstrbuton YOSHIAKI AKAZAWA, YOSHIHIRO OKADA, AND KOICHI NIIJIMA Graduate School of Informaton Scence and Electrcal Engneerng,

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan

More information

A SYSTOLIC APPROACH TO LOOP PARTITIONING AND MAPPING INTO FIXED SIZE DISTRIBUTED MEMORY ARCHITECTURES

A SYSTOLIC APPROACH TO LOOP PARTITIONING AND MAPPING INTO FIXED SIZE DISTRIBUTED MEMORY ARCHITECTURES A SYSOLIC APPROACH O LOOP PARIIONING AND MAPPING INO FIXED SIZE DISRIBUED MEMORY ARCHIECURES Ioanns Drosts, Nektaros Kozrs, George Papakonstantnou and Panayots sanakas Natonal echncal Unversty of Athens

More information

On Some Entertaining Applications of the Concept of Set in Computer Science Course

On Some Entertaining Applications of the Concept of Set in Computer Science Course On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Parallel Solutions of Indexed Recurrence Equations

Parallel Solutions of Indexed Recurrence Equations Parallel Solutons of Indexed Recurrence Equatons Yos Ben-Asher Dep of Math and CS Hafa Unversty 905 Hafa, Israel yos@mathcshafaacl Gad Haber IBM Scence and Technology 905 Hafa, Israel haber@hafascvnetbmcom

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Solving Planted Motif Problem on GPU

Solving Planted Motif Problem on GPU Solvng Planted Motf Problem on GPU Naga Shalaja Dasar Old Domnon Unversty Norfolk, VA, USA ndasar@cs.odu.edu Ranjan Desh Old Domnon Unversty Norfolk, VA, USA dranjan@cs.odu.edu Zubar M Old Domnon Unversty

More information

Newton-Raphson division module via truncated multipliers

Newton-Raphson division module via truncated multipliers Newton-Raphson dvson module va truncated multplers Alexandar Tzakov Department of Electrcal and Computer Engneerng Illnos Insttute of Technology Chcago,IL 60616, USA Abstract Reducton n area and power

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

A fast algorithm for color image segmentation

A fast algorithm for color image segmentation Unersty of Wollongong Research Onlne Faculty of Informatcs - Papers (Arche) Faculty of Engneerng and Informaton Scences 006 A fast algorthm for color mage segmentaton L. Dong Unersty of Wollongong, lju@uow.edu.au

More information

arxiv: v3 [cs.na] 18 Mar 2015

arxiv: v3 [cs.na] 18 Mar 2015 A Fast Block Low-Rank Dense Solver wth Applcatons to Fnte-Element Matrces AmrHossen Amnfar a,1,, Svaram Ambkasaran b,, Erc Darve c,1 a 496 Lomta Mall, Room 14, Stanford, CA, 9435 b Warren Weaver Hall,

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Floating-Point Division Algorithms for an x86 Microprocessor with a Rectangular Multiplier

Floating-Point Division Algorithms for an x86 Microprocessor with a Rectangular Multiplier Floatng-Pont Dvson Algorthms for an x86 Mcroprocessor wth a Rectangular Multpler Mchael J. Schulte Dmtr Tan Carl E. Lemonds Unversty of Wsconsn Advanced Mcro Devces Advanced Mcro Devces Schulte@engr.wsc.edu

More information

Evaluation of Parallel Processing Systems through Queuing Model

Evaluation of Parallel Processing Systems through Queuing Model ISSN 2278-309 Vkas Shnde, Internatonal Journal of Advanced Volume Trends 4, n Computer No.2, March Scence - and Aprl Engneerng, 205 4(2), March - Aprl 205, 36-43 Internatonal Journal of Advanced Trends

More information

Positive Semi-definite Programming Localization in Wireless Sensor Networks

Positive Semi-definite Programming Localization in Wireless Sensor Networks Postve Sem-defnte Programmng Localzaton n Wreless Sensor etworks Shengdong Xe 1,, Jn Wang, Aqun Hu 1, Yunl Gu, Jang Xu, 1 School of Informaton Scence and Engneerng, Southeast Unversty, 10096, anjng Computer

More information

Vectorization of Image Outlines Using Rational Spline and Genetic Algorithm

Vectorization of Image Outlines Using Rational Spline and Genetic Algorithm 01 Internatonal Conference on Image, Vson and Computng (ICIVC 01) IPCSIT vol. 50 (01) (01) IACSIT Press, Sngapore DOI: 10.776/IPCSIT.01.V50.4 Vectorzaton of Image Outlnes Usng Ratonal Splne and Genetc

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vdyanagar Faculty Name: Am D. Trved Class: SYBCA Subject: US03CBCA03 (Advanced Data & Fle Structure) *UNIT 1 (ARRAYS AND TREES) **INTRODUCTION TO ARRAYS If we want

More information

A parallel Poisson solver using the fast multipole method on networks of workstations

A parallel Poisson solver using the fast multipole method on networks of workstations A parallel Posson solver usng the fast multpole method on networks of workstatons June-Yub Lee (jylee@math.ewha.ac.kr, jylee@cms.nyu.edu) Dept. of Math, Ewha Womans Unversty, Seoul120-750, KOREA, Karpjoo

More information

Harvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6)

Harvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6) Harvard Unversty CS 101 Fall 2005, Shmon Schocken Assembler Elements of Computng Systems 1 Assembler (Ch. 6) Why care about assemblers? Because Assemblers employ some nfty trcks Assemblers are the frst

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

Polyhedral Compilation Foundations

Polyhedral Compilation Foundations Polyhedral Complaton Foundatons Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty Feb 8, 200 888., Class # Introducton: Polyhedral Complaton Foundatons

More information

Lecture 4: Principal components

Lecture 4: Principal components /3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness

More information

Loop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation

Loop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation Loop Transformatons for Parallelsm & Localty Last week Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Scalar expanson for removng false dependences Loop nterchange Loop

More information

A One-Sided Jacobi Algorithm for the Symmetric Eigenvalue Problem

A One-Sided Jacobi Algorithm for the Symmetric Eigenvalue Problem P-Q- A One-Sded Jacob Algorthm for the Symmetrc Egenvalue Problem B. B. Zhou, R. P. Brent E-mal: bng,rpb@cslab.anu.edu.au Computer Scences Laboratory The Australan Natonal Unversty Canberra, ACT 000, Australa

More information

Querying by sketch geographical databases. Yu Han 1, a *

Querying by sketch geographical databases. Yu Han 1, a * 4th Internatonal Conference on Sensors, Measurement and Intellgent Materals (ICSMIM 2015) Queryng by sketch geographcal databases Yu Han 1, a * 1 Department of Basc Courses, Shenyang Insttute of Artllery,

More information

An Accurate Evaluation of Integrals in Convex and Non convex Polygonal Domain by Twelve Node Quadrilateral Finite Element Method

An Accurate Evaluation of Integrals in Convex and Non convex Polygonal Domain by Twelve Node Quadrilateral Finite Element Method Internatonal Journal of Computatonal and Appled Mathematcs. ISSN 89-4966 Volume, Number (07), pp. 33-4 Research Inda Publcatons http://www.rpublcaton.com An Accurate Evaluaton of Integrals n Convex and

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

c 2009 Society for Industrial and Applied Mathematics

c 2009 Society for Industrial and Applied Mathematics SIAM J. MATRIX ANAL. APPL. Vol. 31, No. 3, pp. 1382 1411 c 2009 Socety for Industral and Appled Mathematcs SUPERFAST MULTIFRONTAL METHOD FOR LARGE STRUCTURED LINEAR SYSTEMS OF EQUATIONS JIANLIN XIA, SHIVKUMAR

More information