Preconditioning Parallel Sparse Iterative Solvers for Circuit Simulation
A. Basermann, U. Jaekel, and K. Hachiya

A. Basermann, U. Jaekel: C & C Research Laboratories, NEC Europe Ltd., D Sankt Augustin, Germany, {basermann, jaekel}@ccrl-nece.de
K. Hachiya: High-End Design Technology Development Group, Technology Foundation Development Division, NEC Electronics Corp., Japan, k-hachya@ax.jp.nec.com

1 Introduction

One important mathematical problem in the simulation of large electrical circuits is the solution of high-dimensional linear equation systems. The corresponding matrices are real, non-symmetric, very ill-conditioned, have an irregular sparsity pattern, and include a few dense rows and columns. When the systems become large, iterative solvers are very likely to outperform direct methods. Parallelization and appropriate preconditioning, which accelerates convergence, are suitable techniques for reducing the execution time of iterative solvers.

We present a parallel iterative algorithm with distributed Schur complement (DSC) preconditioning [5] which achieves an accuracy of the solution similar to that of a direct solver but is usually distinctly faster for large problems. The parallel efficiency of the method is increased by transforming the equation system into a problem without dense rows and columns, which results in a system reduction, as well as by exploiting parallel graph partitioning methods. The costs of local, incomplete LU decompositions are decreased by fill-in reducing reordering of the matrix and a threshold strategy for the factorization.

2 Problem of Dense Rows and Columns

Matrices from circuit simulation problems are usually very sparse but include a few (nearly) dense rows and columns. In the parallel case, dense rows and columns are difficult to handle for partitioning methods since they result in couplings between all equations. In addition, good load balance is hard to achieve if the matrix is distributed row-wise. Fill-in reducing ordering methods may become very costly due to a few dense rows and columns, and the matrices may get very ill-conditioned.

Fortunately, dense rows and columns are usually easy to remove from circuit simulation matrices since the corresponding columns or rows as a rule have only one non-zero entry, on the diagonal. Such equations normally include voltage sources (constraints). A dense column whose corresponding row has only one diagonal entry can be removed since the corresponding unknown can be determined from the row equation and substituted in all other equations. On the other hand, a dense row (equation) whose corresponding column has merely one diagonal entry is only responsible for the corresponding unknown. All other equations can be solved independently of it. Fig. 1 illustrates the principle of the matrix reduction in both cases.

Figure 1. Removal of (nearly) dense rows and columns.

If the corresponding columns or rows of dense rows and columns do not have one diagonal entry only, such rows and columns can be handled by using the Woodbury formula [3]. This case is rare for circuit simulation problems and does not occur for the matrices investigated here.

3 Distributed Schur Complement Techniques

In the following, techniques for the iterative solution of circuit simulation equation systems with DSC preconditioning are sketched. More details, in particular on the theory, can be found in [5].

3.1 Definitions

Fig. 2 schematically displays the row-wise distribution of a matrix $A$ to two processors. Each processor owns its local row block. The square matrices $A_i$ are the local diagonal blocks of $A$. We assume that the local rows are arranged in such a way that the rows without couplings to the other processor(s) come first, followed by the rows with couplings. The former are called internal rows; they have entries only in the $A_i$ part of the local rows and are not coupled with rows of other processors. The latter additionally have entries outside the $A_i$ part or are coupled with rows of other processors. These local rows are named local interface rows.
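The split into internal and local interface rows can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function name `classify_rows`, the dictionary-of-sets sparsity pattern, and the `owner` list are hypothetical, and the whole pattern is held in one place, whereas in the distributed setting the second kind of coupling requires communication. A row counts as an interface row if it has an entry in a column owned by another processor, or if a row of another processor has an entry in its column.

```python
def classify_rows(pattern, owner):
    """Split the rows of a row-distributed sparse matrix into internal
    and local interface rows (illustrative sketch, hypothetical names).

    pattern: dict mapping row index -> set of column indices with non-zeros
    owner:   list mapping row/unknown index -> id of the owning processor
    Returns a dict: processor id -> (internal_rows, interface_rows).
    """
    coupled = set()
    for i, cols in pattern.items():
        for j in cols:
            if owner[i] != owner[j]:
                coupled.add(i)  # row i has an entry outside its diagonal block
                coupled.add(j)  # row j is referenced by a row of another processor
    result = {}
    for i in range(len(owner)):
        internal, interface = result.setdefault(owner[i], ([], []))
        (interface if i in coupled else internal).append(i)
    return result


# Six unknowns, rows 0-2 on processor 0 and rows 3-5 on processor 1.
# Row 1 has entries only in its own diagonal block, but row 5 of
# processor 1 refers to unknown 1, so row 1 is still an interface row.
owner = [0, 0, 0, 1, 1, 1]
pattern = {
    0: {0, 1},
    1: {0, 1, 2},
    2: {2, 3},
    3: {3, 4},
    4: {4, 5},
    5: {1, 5},
}
rows = classify_rows(pattern, owner)
```

In this toy pattern only rows 0 and 4 are internal; a partitioner that reduces the number of cut couplings shrinks the interface sets accordingly.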
The part outside $A_i$ which represents couplings between the processors is called the local interface matrix. From the view of processor 2 in Fig. 2, the local interface rows of processor 1 with entries at column positions in the area of $A_2$ are external interface rows. Since the sparsity pattern of circuit simulation matrices is usually non-symmetric, local interface rows of processor $i$ may have entries in $A_i$ only but be uni-directionally coupled with rows of other processors. These rows are external interface rows from the view of the other processors. This cannot be determined locally on processor $i$; communication is necessary. Since each row of the matrix corresponds to a specific unknown of the equation system (e.g., row 1 to solution vector component 1 and to right hand side component 1), internal unknowns, local interface unknowns, and external interface unknowns can be defined correspondingly.

Figure 2. DSC definitions: Matrix distributed to two processors.

3.2 Algorithm

Fig. 3 gives a schematic survey of the DSC algorithm per processor. On each processor an outer Bi-CGSTAB iteration [6] is performed for all local rows (unknowns). As basic iterative method, a flexible variant of GMRES, FGMRES [4, 5], is also well suited for the DSC algorithm but is not considered here because of its higher storage requirements. The outer iteration contains a partial matrix-vector multiplication which requires communication since each processor only owns its local segment of the vector. It is necessary to exchange components of non-local vector segments which correspond to external interface unknowns (rows). Within the outer Bi-CGSTAB iteration, an inner Bi-CGSTAB iteration for the local interface rows (unknowns) only is performed. This includes a partial matrix-vector multiplication of the interface system, but the communication scheme is the same as for the outer matrix-vector multiplication and thus has to be implemented only once.

Figure 3. Schematic view of the DSC algorithm on each processor. [Outer Bi-CGSTAB iteration for all local rows (unknowns) enclosing an inner Bi-CGSTAB iteration for the local interface rows (unknowns); each matrix-vector multiplication involves communication of external interface unknowns.]

From the mathematical point of view, each processor solves the following equation:

\[ A_i x_i + y_{i,\mathrm{ext}} = b_i, \qquad x_i = \begin{pmatrix} u_i \\ y_i \end{pmatrix}, \quad b_i = \begin{pmatrix} f_i \\ g_i \end{pmatrix}. \tag{1} \]

$x_i$ are the local vector components, $y_{i,\mathrm{ext}}$ the external interface vector components, and $b_i$ is the local segment of the right hand side vector. $x_i$ is split into the internal vector components $u_i$ and the local interface vector components $y_i$, and $b_i$ accordingly. $A_i$ is then split (see [5] for details), and (1) is reformulated:

\[ A_i = \begin{pmatrix} B_i & F_i \\ E_i & C_i \end{pmatrix}, \qquad \begin{pmatrix} B_i & F_i \\ E_i & C_i \end{pmatrix} \begin{pmatrix} u_i \\ y_i \end{pmatrix} + \begin{pmatrix} 0 \\ \sum_{\text{Neighbours } j} E_{ij} y_j \end{pmatrix} = \begin{pmatrix} f_i \\ g_i \end{pmatrix}. \tag{2} \]

The result of the sum over all neighbouring processors $j$ with couplings to processor $i$ in (2) is the same as that of $y_{i,\mathrm{ext}}$ in (1). $E_{ij} y_j$ is the part of $y_{i,\mathrm{ext}}$ which reflects the contribution to the local equation from the neighbouring processor $j$. The matrix equation (2) represents two equations. From the first, we derive an expression for $u_i$, substitute $u_i$ in the second equation and get

\[ u_i = B_i^{-1} (f_i - F_i y_i), \qquad S_i y_i + \sum_{\text{Neighbours } j} E_{ij} y_j = g_i - E_i B_i^{-1} f_i. \tag{3} \]

$S_i = C_i - E_i B_i^{-1} F_i$ is the local Schur complement. Note that (3) is an equation for the interface vector components only. (3) can be rewritten as a block-Jacobi preconditioned Schur complement system [5]:

\[ y_i + S_i^{-1} \sum_{\text{Neighbours } j} E_{ij} y_j = S_i^{-1} (g_i - E_i B_i^{-1} f_i). \tag{4} \]

Figure 4. Principle of preconditioning within the DSC algorithm. [Processor 1: block incomplete LU for the local rows ($L_1$, $U_1$); block incomplete LU for the local interface rows ($L_{1,S}$, $U_{1,S}$).]

3.3 Preconditioning

Fig. 4 illustrates the principle of preconditioning within the DSC algorithm per processor. The outer iteration from Fig. 3 is preconditioned per processor by a block incomplete LU decomposition with threshold (ILUT) [4] of the local diagonal block ($L_1 U_1$ in Fig. 4). For preconditioning the inner iteration, a block ILUT for the local interface rows only is exploited. This factorization need not be computed separately but can be taken from the lower right part of the decomposition for the outer iteration ($L_{1,S} U_{1,S}$ in Fig. 4).

Mathematically speaking, we perform a block factorization of $A_i$ on processor $i$ using the splitting from (2):

\[ A_i = \begin{pmatrix} B_i & F_i \\ E_i & C_i \end{pmatrix} = \begin{pmatrix} B_i & 0 \\ E_i & S_i \end{pmatrix} \begin{pmatrix} I & B_i^{-1} F_i \\ 0 & I \end{pmatrix}. \tag{5} \]

We then assume that we have the LU decomposition $S_i = L_{i,S} U_{i,S}$ of the local Schur complement. With this, we formulate the LU factorization

\[ L_i U_i = \begin{pmatrix} L_{i,B} & 0 \\ E_i U_{i,B}^{-1} & L_{i,S} \end{pmatrix} \begin{pmatrix} U_{i,B} & L_{i,B}^{-1} F_i \\ 0 & U_{i,S} \end{pmatrix} \tag{6} \]

with $B_i = L_{i,B} U_{i,B}$ the LU decomposition of $B_i$. By transforming the right hand side of (6) into

\[ \left[ \begin{pmatrix} L_{i,B} & 0 \\ E_i U_{i,B}^{-1} & L_{i,S} \end{pmatrix} \begin{pmatrix} U_{i,B} & 0 \\ 0 & U_{i,S} \end{pmatrix} \right] \begin{pmatrix} I & U_{i,B}^{-1} L_{i,B}^{-1} F_i \\ 0 & I \end{pmatrix} = \begin{pmatrix} B_i & 0 \\ E_i & S_i \end{pmatrix} \begin{pmatrix} I & B_i^{-1} F_i \\ 0 & I \end{pmatrix} \]

we find after comparison with (5) that $L_i U_i$ is an LU factorization of $A_i$. Conversely, (6) also shows the practical advantage that the LU factorization $S_i = L_{i,S} U_{i,S}$ of the local Schur complement does not have to be computed explicitly if we already have an LU factorization of the local diagonal block $A_i$. If we perform incomplete decompositions, we get an approximate, preconditioned Schur complement system with the approximation $\tilde{S}_i$ of the local Schur complement $S_i$ (compare with (4)):

\[ y_i + \tilde{S}_i^{-1} \sum_{\text{Neighbours } j} E_{ij} y_j = \tilde{S}_i^{-1} (g_i - E_i B_i^{-1} f_i). \tag{7} \]

3.4 Repartitioning and Reordering

The distributed sparsity pattern of the matrix can be represented as a distributed graph with nodes and edges. Graph repartitioning can then be used to reduce the number of couplings between the distributed matrix row blocks. In graph theory terms, the reduction is achieved by minimizing the number of edges cut in the graph. This goal of graph partitioning corresponds to minimizing the number of interface unknowns in the DSC algorithm and thus keeps problem (7) small.

For graph partitioning, we use the ParMETIS software from the University of Minnesota [2]. Since ParMETIS requires an undirected graph as input, the non-symmetric pattern of the matrix has to be symmetrized for the matrix graph construction.

For local, incomplete decompositions, we use METIS nested dissection reordering to reduce fill-in in the factors [2]. Nested dissection reordering usually generates a similar sparsity pattern for the local diagonal blocks $A_i$ on each processor. This results in similar fill-in for each ILUT and thus supports load balancing.

4 Results

The following experiments were performed on NEC's PC cluster GRISU (32 2-way SMP nodes with AMD Athlon MP CPUs, 1.6 GHz, 1 GB main memory per node, Myrinet2000 interconnection network between the nodes).
For the following experiments, the equation systems ccp and circ2a, which stem from simulations of NEC circuits, the systems circuit_2 and circuit_3 from Philips circuits [1], as well as the systems hcircuit and scircuit from Motorola circuits are used. The latter four systems are available from the Matrix Market ( davis/sparse/{bomhof, Hamm}).

Table 1. Original and reduced systems: Partitioning into eight sub-domains.
Columns: Matrix | Original matrix: Order, Non-zeros, #If vars | Reduced matrix: Order, Non-zeros, #If vars
Rows: ccp, circ2a, circuit_2, circuit_3, hcircuit, scircuit

Table 2. DSC times on 8 GRISU processors: Effect of ordering and partitioning.
Columns: Matrix | Partitioning + ordering: #If vars, Time/s | Partitioning only: Time/s | No permutation: #If vars, Time/s
Rows: ccp, circ2a, circuit_2, circuit_3, hcircuit, scircuit

Table 1 presents orders and numbers of non-zeros for the original and the reduced matrices as well as the number of interface variables (If vars) for partitioning into eight sub-domains. Repartitioning is applied. The results in Table 1 show a distinctly smaller number of interface variables in the case of the reduced systems. Repartitioning is effective in this case and results in very low costs for the inner iteration from Fig. 3.

In Table 2, times of the DSC method on eight GRISU processors with both repartitioning and reordering, with repartitioning only, and without repartitioning and reordering are displayed for the reduced systems. The number of interface variables is given for the first and last scenario; for the second, the number is the same as for the first one. The shortest times by far in Table 2 are achieved for the DSC method with repartitioning and reordering. Repartitioning keeps the interface system small, while reordering usually reduces fill-in and improves the load balance of the processors (see 3.4).

Table 3 shows execution times of the DSC method on 1 to 16 GRISU processors for the largest reduced test case circ2a.

Table 3. DSC times on GRISU: Scalability.
Columns: Matrix | Time/s on p processors: p = 1, p = 2, p = 4, p = 8, p = 12, p = 16
Rows: circ2a

In Fig. 5, total sequential execution times on GRISU of NEC's circuit simulator MUSASI with the original direct solver (Schur complement method) and the developed iterative DSC algorithm are displayed for a medium size circuit problem (transient analysis) which results in equation systems of order [...]. The left times do not include the time for the setup phase while the right times do contain this time. The most expensive operation in the setup phase is the symbolic factorization with Markowitz ordering, which is necessary and state-of-the-art for a direct solver but can be left out for the iterative DSC solver developed.

Figure 5. Sequential execution times of NEC's circuit simulator MUSASI. [Times in seconds for the direct solver and the iterative solver: total time without setup (left) and total time with setup (right).]

The left times in Fig. 5 show a speedup of 2.3 of the DSC solver over the original direct solver. With the setup phase included, the iterative solver outperforms the direct one by a factor of 4.1 since the costly symbolic factorization with Markowitz ordering can be skipped. From first tests, a similar ratio of the simulation times with direct and iterative solver can be expected for the parallel case, but the integration of the iterative DSC solver into the parallel circuit simulator MUSASI is not complete yet.

5 Conclusions

For equation systems from real problems, we demonstrated that the DSC algorithm combined with repartitioning and reordering is a well suited iterative solver for circuit simulation. For the performance of iterative solvers, a system reduction by the removal of trivial equations is crucial. Moreover, a distinct reduction of simulation time can be expected if commonly used direct methods in circuit simulation codes are replaced with the iterative DSC method presented.
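As a small numerical illustration of the local Schur complement reduction in (3), the following sketch solves one block system exactly. It rests on assumptions that are not part of the paper: a single processor without neighbours (so the sum over neighbouring processors is empty), tiny dense blocks with made-up values, exact arithmetic with fractions instead of incomplete factorizations, and the hypothetical helper names `solve`, `matmul`, `sub`, and `schur_solve`.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M X = rhs exactly by Gauss-Jordan elimination.
    M is n x n and rhs is n x k, both given as lists of rows."""
    n = len(M)
    aug = [[Fraction(v) for v in M[r]] + [Fraction(v) for v in rhs[r]]
           for r in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)  # non-zero pivot
        aug[c], aug[piv] = aug[piv], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def sub(X, Y):
    """Entry-wise difference X - Y."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def schur_solve(B, F, E, C, f, g):
    """Reduce [[B, F], [E, C]] (u; y) = (f; g) to the interface system."""
    Binv_F = solve(B, F)                    # B^{-1} F
    Binv_f = solve(B, f)                    # B^{-1} f
    S = sub(C, matmul(E, Binv_F))           # S = C - E B^{-1} F
    y = solve(S, sub(g, matmul(E, Binv_f))) # S y = g - E B^{-1} f
    u = sub(Binv_f, matmul(Binv_F, y))      # u = B^{-1} (f - F y)
    return S, u, y


# Two internal unknowns and one interface unknown (illustrative values).
B = [[4, 1], [1, 3]]
F = [[1], [0]]
E = [[2, 0]]
C = [[5]]
f = [[1], [2]]
g = [[3]]
S, u, y = schur_solve(B, F, E, C, f, g)
```

Only the interface system with `S` has to be solved for `y`; the internal unknowns `u` follow by back-substitution, which is why a small interface keeps the inner iteration cheap.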
Bibliography

[1] C. W. Bomhof and H. A. Van der Vorst, A parallel linear system solver for circuit simulation problems, Numerical Linear Algebra with Applications, 7 (2000), pp
[2] G. Karypis and V. Kumar, ParMETIS: Parallel graph partitioning and sparse matrix ordering library, Tech. rep. # , University of Minnesota,
[3] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, 2nd ed., Cambridge University Press,
[4] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston,
[5] Y. Saad and M. Sosonkina, Distributed Schur complement techniques for general sparse linear systems, SISC, 21 (1999), pp
[6] H. Van der Vorst, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 13 (1992), pp
More informationA Parallel Gauss-Seidel Algorithm for Sparse Power System. Matrices. D. P. Koester, S. Ranka, and G. C. Fox
A Parallel Gauss-Sedel Algorthm for Sparse Power System Matrces D. P. Koester, S. Ranka, and G. C. Fox School of Computer and Informaton Scence and The Northeast Parallel Archtectures Center (NPAC) Syracuse
More informationTPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints
TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process
More informationProjection-Based Performance Modeling for Inter/Intra-Die Variations
Proecton-Based Performance Modelng for Inter/Intra-De Varatons Xn L, Jayong Le 2, Lawrence. Plegg and Andrze Strowas Dept. of Electrcal & Computer Engneerng Carnege Mellon Unversty Pttsburgh, PA 523, USA
More informationConditional Speculative Decimal Addition*
Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant
More informationExplicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements
Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley
More informationRepeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits
Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.
More informationVectorization in the Polyhedral Model
Vectorzaton n the Polyhedral Model Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty October 200 888. Introducton: Overvew Vectorzaton: Detecton
More informationAPPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT
3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ
More informationA fault tree analysis strategy using binary decision diagrams
Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:
More informationVirtual Machine Migration based on Trust Measurement of Computer Node
Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on
More informationAn Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed
More informationELEC 377 Operating Systems. Week 6 Class 3
ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationHelsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)
Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute
More informationA mathematical programming approach to the analysis, design and scheduling of offshore oilfields
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and
More informationA One-Sided Jacobi Algorithm for the Symmetric Eigenvalue Problem
P-Q- A One-Sded Jacob Algorthm for the Symmetrc Egenvalue Problem B. B. Zhou, R. P. Brent E-mal: bng,rpb@cslab.anu.edu.au Computer Scences Laboratory The Australan Natonal Unversty Canberra, ACT 000, Australa
More informationToday s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.
Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationHigh-Boost Mesh Filtering for 3-D Shape Enhancement
Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,
More informationLobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide
Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.
More informationLearning the Kernel Parameters in Kernel Minimum Distance Classifier
Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department
More informationQuality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation
Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on
More informationSupport Vector Machines
Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationCSE 326: Data Structures Quicksort Comparison Sorting Bound
CSE 326: Data Structures Qucksort Comparson Sortng Bound Bran Curless Sprng 2008 Announcements (5/14/08) Homework due at begnnng of class on Frday. Secton tomorrow: Graded homeworks returned More dscusson
More informationSolitary and Traveling Wave Solutions to a Model. of Long Range Diffusion Involving Flux with. Stability Analysis
Internatonal Mathematcal Forum, Vol. 6,, no. 7, 8 Soltary and Travelng Wave Solutons to a Model of Long Range ffuson Involvng Flux wth Stablty Analyss Manar A. Al-Qudah Math epartment, Rabgh Faculty of
More informationSENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR
SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu
More informationTopology Design using LS-TaSC Version 2 and LS-DYNA
Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool
More informationStructure from Motion
Structure from Moton Structure from Moton For now, statc scene and movng camera Equvalentl, rgdl movng scene and statc camera Lmtng case of stereo wth man cameras Lmtng case of multvew camera calbraton
More informationReading. 14. Subdivision curves. Recommended:
eadng ecommended: Stollntz, Deose, and Salesn. Wavelets for Computer Graphcs: heory and Applcatons, 996, secton 6.-6., A.5. 4. Subdvson curves Note: there s an error n Stollntz, et al., secton A.5. Equaton
More informationETNA Kent State University
Electronc Transactons on Numercal Analyss Volume 22, pp 41-70, 2006 opyrght 2006, ISSN 1068-961 etna@mcskentedu A NETWORK PROGRAMMING APPROAH IN SOLVING DARY S EQUATIONS BY MIXED FINITE-ELEMENT METHODS
More informationHybrid Non-Blind Color Image Watermarking
Hybrd Non-Blnd Color Image Watermarkng Ms C.N.Sujatha 1, Dr. P. Satyanarayana 2 1 Assocate Professor, Dept. of ECE, SNIST, Yamnampet, Ghatkesar Hyderabad-501301, Telangana 2 Professor, Dept. of ECE, AITS,
More information