A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation
|
|
- Giles Warren
- 5 years ago
- Views:
Transcription
1 A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Jose-Luis Torres-Moreno, Jose-Luis Blanco, Javier López-Martínez, Antonio Giménez-Fernández 1-4 July, 2013 Zagreb (Croatia) ECCOMAS Multibody Dynamics 2013
2 Goal of this work Problem under study Multibody dynamic formulation: Best numerical solver algorithm? = = \ 2 It depends Ax b x A b
3 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 3
4 Problem introduction Modeling decisions: coordinates, formulation, etc. (,&, ) q&& t = f q t q t t ( ) ( ) ( ) (, &, ) && z t = f z t z t t ( ) ( ) ( ) Our choices for this work: Dependent coordinates vs. Independent coordinates Coordinate type for modeling: Dynamic formulation: Natural coordinates Index-1 Lagrange w. Baumgarte 4 1/4: Introduction
5 Problem introduction A x = b Problem structure? N dependent coordinates M constraints (N+M) x (N+M) (N+M M) x 1 = (N+M M) x 1 Square, symmetric, Dense columns psd and sparse 5 1/4: Introduction
6 Structure of matrix A A Most important features - M is constant - Static non-zero structure - SPARSITY 6 1/4: Introduction
7 Why sparsity matters? Naive matrix inversion: T M φq q&& Q = φq 0 λ c A q && λ 1 T M φ q Q = φq 0 c A 1 7 1/4: Introduction
8 Why sparsity matters? nnz=528 (93.76% are zeros) nnz=8266 (2.34% are zeros) M φq φ T q 0 8 1/4: Introduction M φq In general, the inverse of a sparse matrix is NOT sparse! ( 3 ( + ) ) Cost: O N M φ T q 0 1
9 Why sparsity matters? How to keep matrices sparse? Avoid direct matrix inversion. Alternatives? Sparse matrix factorization algorithms: 1) Method: LU, Cholesky. 2) Variable ordering 9 1/4: Introduction
10 Why variable reordering matters? nnz=528 (93.76% are zeros) ~ 94% zeros Natural ordering A = ~ 70% zeros nnz=2599 (69.29% are zeros) nnz=2456 (70.98% are zeros) A = L U = /4: Introduction
11 Why variable reordering matters? Reordering = permutation of variable indices nnz=528 (93.76% are zeros) nnz=528 (93.76% are zeros) A ( :, : ) A( p, p ) 11 1/4: Introduction
12 Why variable reordering matters? nnz=528 (93.76% are zeros) ~ 94% zeros Optimized ordering ~97% and ~93% zeros nnz=244 (97.12% are zeros) nnz=620 (92.67% are zeros) /4: Introduction
13 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 13
14 Tested algorithms Approaches: LU decomposition Cholesky decomposition 14 3/4: Algorithms
15 Tested algorithms Approaches: LU decomposition Cholesky decomposition Ax = b } A 1) L Ux { = L y= b, solve for y ( LU) x = b = y 2) Ux = y, solve for x nnz=244 (97.12% are zeros) nnz=620 (92.67% are zeros) /4: Algorithms
16 Tested algorithms Benchmarked LU implementations (6 variations): 1) Dense LU Eigen C++ library 2) Sparse LU Suitesparse s KLU a) AMD ordering b) COLAMD ordering 3) Sparse LU Suitesparse s UMFPACK a) Natural ordering b) AMD ordering c) METIS ordering Symbolic vs. numeric factorization! 16 3/4: Algorithms
17 Tested algorithms Approaches: LU decomposition Cholesky decomposition (only when M is definite positive) T M φq q&& Q = φq 0 λ c A Apply Cholesky to M. Use Schür complement to build two reduced systems of equations. [R. von Schwerin, Mathematical Methods for MBS in Descriptor Form, 1999] 17 3/4: Algorithms
18 Tested algorithms Approaches: LU decomposition Cholesky decomposition (only when M is definite positive) Benchmarked Cholesky implementations (3 variations): 1) Sparse Cholesky Suitesparse s CHOLMOD a) AMD ordering b) COLAMD ordering c) METIS ordering 18 3/4: Algorithms
19 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 19
20 Our benchmark model Test mechanism: Parameterized 2D lattice of N x N y articulated quadrangles. Example: Nx = 5 Ny = /4: Benchmark
21 Our benchmark model Irregular node locations to avoid singular configurations: 21 3/4: Benchmark
22 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 22 3/4: Benchmark
23 The benchmark For each solver For each ordering For N=1 32 Build mechanism with N x = N, N y =N. Run thousands of time steps for each configuration. Measure average execution time of each calculation. 23 3/4: Benchmark
24 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 24
25 25 4/4: Results
26 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 26 3/4: Benchmark
27 Results: Preprocessing (symbolic factorization) Time (ms) Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 27 4/4: Results
28 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 28 3/4: Benchmark
29 Results: Distribution of time (KLU+COLAMD) Update Jacobians Build RHS Solve Ax=b Triplet->CCS Numeric factoring 4% 7% 4% 4% Triplet -> CCS can be eliminated with smarter coding! 81% 29 4/4: Results
30 Results: num. factoring + solve Total numeric factorization & solve time Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS 200 Time (ms) UMFPACK with different orderings KLU with different orderings Number of redundant coordinates 30 4/4: Results
31 A comment on the Cholesky approach E matrix chol( E * E ) dense!! 31 4/4: Results
32 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 Time (ms) KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results
33 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 ~ 50 coords Time (ms) 10 0 ~ 2000 coords KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results
34 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 Time (ms) 10 0 T=1ms KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results
35 Conclusions MBS dynamics leads to very sparse systems. Exploiting sparsity is crucial for performance unless N<50 approx. 50<N<2000 Use KLU+COLAMD 2000<N Use UMFPACK+AMD In any case: reuse symbolic factorizations! Slides available online: Publications 35 4/4: Results
36 A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Thanks for your attention! Slides available online: Publications
(Sparse) Linear Solvers
(Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 2 Don t you just invert
More information(Sparse) Linear Solvers
(Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 1 Don t you just invert
More informationComputational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science
Computational Methods CMSC/AMSC/MAPL 460 Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Some special matrices Matlab code How many operations and memory
More informationLecture 9. Introduction to Numerical Techniques
Lecture 9. Introduction to Numerical Techniques Ivan Papusha CDS270 2: Mathematical Methods in Control and System Engineering May 27, 2015 1 / 25 Logistics hw8 (last one) due today. do an easy problem
More informationAdvanced Techniques for Mobile Robotics Graph-based SLAM using Least Squares. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz
Advanced Techniques for Mobile Robotics Graph-based SLAM using Least Squares Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz SLAM Constraints connect the poses of the robot while it is moving
More informationComputational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn
Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta and Paolo Mantegazza Dipartimento di Ingegneria
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26
More informationEfficient Iterative Semi-supervised Classification on Manifold
. Efficient Iterative Semi-supervised Classification on Manifold... M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani Sharif University of Technology, Tehran, Iran. Presented by Pooria Joulani
More informationThe Efficient Extension of Globally Consistent Scan Matching to 6 DoF
The Efficient Extension of Globally Consistent Scan Matching to 6 DoF Dorit Borrmann, Jan Elseberg, Kai Lingemann, Andreas Nüchter, Joachim Hertzberg 1 / 20 Outline 1 Introduction 2 Algorithm 3 Performance
More informationICRA 2016 Tutorial on SLAM. Graph-Based SLAM and Sparsity. Cyrill Stachniss
ICRA 2016 Tutorial on SLAM Graph-Based SLAM and Sparsity Cyrill Stachniss 1 Graph-Based SLAM?? 2 Graph-Based SLAM?? SLAM = simultaneous localization and mapping 3 Graph-Based SLAM?? SLAM = simultaneous
More informationAim. Structure and matrix sparsity: Part 1 The simplex method: Exploiting sparsity. Structure and matrix sparsity: Overview
Aim Structure and matrix sparsity: Part 1 The simplex method: Exploiting sparsity Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk What should a 2-hour PhD lecture on structure
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 5: Sparse Linear Systems and Factorization Methods Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 18 Sparse
More informationSparse Matrix Algorithms
Sparse Matrix Algorithms combinatorics + numerical methods + applications Math + X Tim Davis University of Florida June 2013 contributions to the field current work vision for the future Outline Math+X
More informationCVPR 2014 Visual SLAM Tutorial Efficient Inference
CVPR 2014 Visual SLAM Tutorial Efficient Inference kaess@cmu.edu The Robotics Institute Carnegie Mellon University The Mapping Problem (t=0) Robot Landmark Measurement Onboard sensors: Wheel odometry Inertial
More informationInternational Conference on Computational Science (ICCS 2017)
International Conference on Computational Science (ICCS 2017) Exploiting Hybrid Parallelism in the Kinematic Analysis of Multibody Systems Based on Group Equations G. Bernabé, J. C. Cano, J. Cuenca, A.
More informationLeast Squares and SLAM Pose-SLAM
Least Squares and SLAM Pose-SLAM Giorgio Grisetti Part of the material of this course is taken from the Robotics 2 lectures given by G.Grisetti, W.Burgard, C.Stachniss, K.Arras, D. Tipaldi and M.Bennewitz
More informationGTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013
GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»
More informationTHE procedure used to solve inverse problems in areas such as Electrical Impedance
12TH INTL. CONFERENCE IN ELECTRICAL IMPEDANCE TOMOGRAPHY (EIT 2011), 4-6 MAY 2011, UNIV. OF BATH 1 Scaling the EIT Problem Alistair Boyle, Andy Adler, Andrea Borsic Abstract There are a number of interesting
More informationComputational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science
Computational Methods CMSC/AMSC/MAPL 460 Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Zero elements of first column below 1 st row multiplying 1 st
More informationPreconditioning for linear least-squares problems
Preconditioning for linear least-squares problems Miroslav Tůma Institute of Computer Science Academy of Sciences of the Czech Republic tuma@cs.cas.cz joint work with Rafael Bru, José Marín and José Mas
More informationHIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach
HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach Mini-workshop PHyLeaS associated team J. Gaidamour, P. Hénon July 9, 28 HIPS : an hybrid direct/iterative solver /
More informationPARDISO Version Reference Sheet Fortran
PARDISO Version 5.0.0 1 Reference Sheet Fortran CALL PARDISO(PT, MAXFCT, MNUM, MTYPE, PHASE, N, A, IA, JA, 1 PERM, NRHS, IPARM, MSGLVL, B, X, ERROR, DPARM) 1 Please note that this version differs significantly
More informationGPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE)
GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) NATALIA GIMELSHEIN ANSHUL GUPTA STEVE RENNICH SEID KORIC NVIDIA IBM NVIDIA NCSA WATSON SPARSE MATRIX PACKAGE (WSMP) Cholesky, LDL T, LU factorization
More informationarxiv: v1 [cs.pl] 18 May 2017
Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis arxiv:705.06575v [cs.pl] 8 May 207 ABSTRACT Kazem Cheshmi Rutgers University Piscataway, NJ, US kazem.ch@rutgers.edu Michelle
More informationA GPU Sparse Direct Solver for AX=B
1 / 25 A GPU Sparse Direct Solver for AX=B Jonathan Hogg, Evgueni Ovtchinnikov, Jennifer Scott* STFC Rutherford Appleton Laboratory 26 March 2014 GPU Technology Conference San Jose, California * Thanks
More informationNull space computation of sparse singular matrices with MUMPS
Null space computation of sparse singular matrices with MUMPS Xavier Vasseur (CERFACS) In collaboration with Patrick Amestoy (INPT-IRIT, University of Toulouse and ENSEEIHT), Serge Gratton (INPT-IRIT,
More informationL15. POSE-GRAPH SLAM. NA568 Mobile Robotics: Methods & Algorithms
L15. POSE-GRAPH SLAM NA568 Mobile Robotics: Methods & Algorithms Today s Topic Nonlinear Least Squares Pose-Graph SLAM Incremental Smoothing and Mapping Feature-Based SLAM Filtering Problem: Motion Prediction
More informationStructure from Motion
Structure from Motion Outline Bundle Adjustment Ambguities in Reconstruction Affine Factorization Extensions Structure from motion Recover both 3D scene geoemetry and camera positions SLAM: Simultaneous
More informationGPU ACCELERATION OF CHOLMOD: BATCHING, HYBRID AND MULTI-GPU
April 4-7, 2016 Silicon Valley GPU ACCELERATION OF CHOLMOD: BATCHING, HYBRID AND MULTI-GPU Steve Rennich, Darko Stosic, Tim Davis, April 6, 2016 OBJECTIVE Direct sparse methods are among the most widely
More informationCS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018
CS 395T Lecture 12: Feature Matching and Bundle Adjustment Qixing Huang October 10 st 2018 Lecture Overview Dense Feature Correspondences Bundle Adjustment in Structure-from-Motion Image Matching Algorithm
More informationA parallel direct/iterative solver based on a Schur complement approach
A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008
More informationInterlude: Solving systems of Equations
Interlude: Solving systems of Equations Solving Ax = b What happens to x under Ax? The singular value decomposition Rotation matrices Singular matrices Condition number Null space Solving Ax = 0 under
More informationAccelerating the Iterative Linear Solver for Reservoir Simulation
Accelerating the Iterative Linear Solver for Reservoir Simulation Wei Wu 1, Xiang Li 2, Lei He 1, Dongxiao Zhang 2 1 Electrical Engineering Department, UCLA 2 Department of Energy and Resources Engineering,
More informationSparse LU Factorization for Parallel Circuit Simulation on GPUs
Department of Electronic Engineering, Tsinghua University Sparse LU Factorization for Parallel Circuit Simulation on GPUs Ling Ren, Xiaoming Chen, Yu Wang, Chenxi Zhang, Huazhong Yang Nano-scale Integrated
More informationComputational Methods CMSC/AMSC/MAPL 460. Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science
Computational Methods CMSC/AMSC/MAPL 460 Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Matrix norms Can be defined using corresponding vector norms Two norm One norm Infinity
More informationThe 3D DSC in Fluid Simulation
The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations
More informationParallel resolution of sparse linear systems by mixing direct and iterative methods
Parallel resolution of sparse linear systems by mixing direct and iterative methods Phyleas Meeting, Bordeaux J. Gaidamour, P. Hénon, J. Roman, Y. Saad LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix
More informationRobot Mapping. Least Squares Approach to SLAM. Cyrill Stachniss
Robot Mapping Least Squares Approach to SLAM Cyrill Stachniss 1 Three Main SLAM Paradigms Kalman filter Particle filter Graphbased least squares approach to SLAM 2 Least Squares in General Approach for
More informationGraphbased. Kalman filter. Particle filter. Three Main SLAM Paradigms. Robot Mapping. Least Squares Approach to SLAM. Least Squares in General
Robot Mapping Three Main SLAM Paradigms Least Squares Approach to SLAM Kalman filter Particle filter Graphbased Cyrill Stachniss least squares approach to SLAM 1 2 Least Squares in General! Approach for
More informationReal-Time Shape Editing using Radial Basis Functions
Real-Time Shape Editing using Radial Basis Functions, Leif Kobbelt RWTH Aachen Boundary Constraint Modeling Prescribe irregular constraints Vertex positions Constrained energy minimization Optimal fairness
More informationSparse Matrices Introduction to sparse matrices and direct methods
Sparse Matrices Introduction to sparse matrices and direct methods Iain Duff STFC Rutherford Appleton Laboratory and CERFACS Summer School The 6th de Brùn Workshop. Linear Algebra and Matrix Theory: connections,
More informationSparse Direct Solvers for Extreme-Scale Computing
Sparse Direct Solvers for Extreme-Scale Computing Iain Duff Joint work with Florent Lopez and Jonathan Hogg STFC Rutherford Appleton Laboratory SIAM Conference on Computational Science and Engineering
More informationCombinatorial problems in a Parallel Hybrid Linear Solver
Combinatorial problems in a Parallel Hybrid Linear Solver Ichitaro Yamazaki and Xiaoye Li Lawrence Berkeley National Laboratory François-Henry Rouet and Bora Uçar ENSEEIHT-IRIT and LIP, ENS-Lyon SIAM workshop
More informationExploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement
Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement Tim Davis (Texas A&M University) with Sanjay Ranka, Mohamed Gadou (University of Florida) Nuri Yeralan (Microsoft) NVIDIA
More informationAdvanced Computer Graphics
G22.2274 001, Fall 2009 Advanced Computer Graphics Project details and tools 1 Project Topics Computer Animation Geometric Modeling Computational Photography Image processing 2 Optimization All projects
More informationMATH 3511 Basics of MATLAB
MATH 3511 Basics of MATLAB Dmitriy Leykekhman Spring 2012 Topics Sources. Entering Matrices. Basic Operations with Matrices. Build in Matrices. Build in Scalar and Matrix Functions. if, while, for m-files
More informationIssues In Implementing The Primal-Dual Method for SDP. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM
Issues In Implementing The Primal-Dual Method for SDP Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu Outline 1. Cache and shared memory parallel computing concepts.
More informationEFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI
EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI 1 Akshay N. Panajwar, 2 Prof.M.A.Shah Department of Computer Science and Engineering, Walchand College of Engineering,
More informationMUMPS. The MUMPS library. Abdou Guermouche and MUMPS team, June 22-24, Univ. Bordeaux 1 and INRIA
The MUMPS library Abdou Guermouche and MUMPS team, Univ. Bordeaux 1 and INRIA June 22-24, 2010 MUMPS Outline MUMPS status Recently added features MUMPS and multicores? Memory issues GPU computing Future
More informationAlgorithm 8xx: SuiteSparseQR, a multifrontal multithreaded sparse QR factorization package
Algorithm 8xx: SuiteSparseQR, a multifrontal multithreaded sparse QR factorization package TIMOTHY A. DAVIS University of Florida SuiteSparseQR is an implementation of the multifrontal sparse QR factorization
More informationGRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS
Discussiones Mathematicae Graph Theory 30 (2010 ) 349 359 GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Pavla Kabelíková Department of Applied Mathematics FEI, VSB Technical University
More informationContents. I The Basic Framework for Stationary Problems 1
page v Preface xiii I The Basic Framework for Stationary Problems 1 1 Some model PDEs 3 1.1 Laplace s equation; elliptic BVPs... 3 1.1.1 Physical experiments modeled by Laplace s equation... 5 1.2 Other
More informationA Parallel Implementation of the BDDC Method for Linear Elasticity
A Parallel Implementation of the BDDC Method for Linear Elasticity Jakub Šístek joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík Institute of Mathematics of the AS CR, Prague
More informationCGO: G: Decoupling Symbolic from Numeric in Sparse Matrix Computations
CGO: G: Decoupling Symbolic from Numeric in Sparse Matrix Computations ABSTRACT Kazem Cheshmi PhD Student, Rutgers University kazem.ch@rutgers.edu Sympiler is a domain-specific code generator that optimizes
More informationPERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1
PERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1 J. Johnson, P. Vachranukunkiet, S. Tiwari, P. Nagvajara, C. Nwankpa Drexel University Philadelphia, PA Abstract Full-AC load flow constitutes
More informationBLAS and LAPACK + Data Formats for Sparse Matrices. Part of the lecture Wissenschaftliches Rechnen. Hilmar Wobker
BLAS and LAPACK + Data Formats for Sparse Matrices Part of the lecture Wissenschaftliches Rechnen Hilmar Wobker Institute of Applied Mathematics and Numerics, TU Dortmund email: hilmar.wobker@math.tu-dortmund.de
More informationScheduling Strategies for Parallel Sparse Backward/Forward Substitution
Scheduling Strategies for Parallel Sparse Backward/Forward Substitution J.I. Aliaga M. Bollhöfer A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ. Jaume I (Spain) {aliaga,martina,quintana}@icc.uji.es
More informationAutar Kaw Benjamin Rigsby. Transforming Numerical Methods Education for STEM Undergraduates
Autar Kaw Benjamin Rigsy http://nmmathforcollegecom Transforming Numerical Methods Education for STEM Undergraduates http://nmmathforcollegecom LU Decomposition is another method to solve a set of simultaneous
More informationICES REPORT December Chetan Jhurani and L. Demkowicz
ICES REPORT 09-36 December 2009 Multiscale Modeling Using Goal-oriented Adaptivity and Numerical Homogenization. Part III: Algorithms for Moore-Penrose Pseudoinverse by Chetan Jhurani and L. Demkowicz
More informationME964 High Performance Computing for Engineering Applications
ME964 High Performance Computing for Engineering Applications Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964
More informationAdvanced Computer Graphics
G22.2274 001, Fall 2010 Advanced Computer Graphics Project details and tools 1 Projects Details of each project are on the website under Projects Please review all the projects and come see me if you would
More informationGPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis
GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis Abstract: Lower upper (LU) factorization for sparse matrices is the most important computing step for circuit simulation
More informationMATH 5520 Basics of MATLAB
MATH 5520 Basics of MATLAB Dmitriy Leykekhman Spring 2011 Topics Sources. Entering Matrices. Basic Operations with Matrices. Build in Matrices. Build in Scalar and Matrix Functions. if, while, for m-files
More informationComparing Least Squares Calculations
Comparing Least Squares Calculations Douglas Bates R Development Core Team Douglas.Bates@R-project.org November 16, 2017 Abstract Many statistics methods require one or more least squares problems to be
More informationESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report
ESPRESO ExaScale PaRallel FETI Solver Hybrid FETI Solver Report Lubomir Riha, Tomas Brzobohaty IT4Innovations Outline HFETI theory from FETI to HFETI communication hiding and avoiding techniques our new
More informationOn the I/O Volume in Out-of-Core Multifrontal Methods with a Flexible Allocation Scheme
On the I/O Volume in Out-of-Core Multifrontal Methods with a Flexible Allocation Scheme Emmanuel Agullo 1,6,3,7, Abdou Guermouche 5,4,8, and Jean-Yves L Excellent 2,6,3,9 1 École Normale Supérieure de
More informationFigure 6.1: Truss topology optimization diagram.
6 Implementation 6.1 Outline This chapter shows the implementation details to optimize the truss, obtained in the ground structure approach, according to the formulation presented in previous chapters.
More informationIntroduction to Optimization
Introduction to Optimization Second Order Optimization Methods Marc Toussaint U Stuttgart Planned Outline Gradient-based optimization (1st order methods) plain grad., steepest descent, conjugate grad.,
More information2.7 Numerical Linear Algebra Software
2.7 Numerical Linear Algebra Software In this section we will discuss three software packages for linear algebra operations: (i) (ii) (iii) Matlab, Basic Linear Algebra Subroutines (BLAS) and LAPACK. There
More informationLecture 27: Fast Laplacian Solvers
Lecture 27: Fast Laplacian Solvers Scribed by Eric Lee, Eston Schweickart, Chengrun Yang November 21, 2017 1 How Fast Laplacian Solvers Work We want to solve Lx = b with L being a Laplacian matrix. Recall
More informationHigh-Performance Linear Algebra Processor using FPGA
High-Performance Linear Algebra Processor using FPGA J. R. Johnson P. Nagvajara C. Nwankpa 1 Extended Abstract With recent advances in FPGA (Field Programmable Gate Array) technology it is now feasible
More informationHigh-Performance Libraries and Tools. HPC Fall 2012 Prof. Robert van Engelen
High-Performance Libraries and Tools HPC Fall 2012 Prof. Robert van Engelen Overview Dense matrix BLAS (serial) ATLAS (serial/threaded) LAPACK (serial) Vendor-tuned LAPACK (shared memory parallel) ScaLAPACK/PLAPACK
More informationFast solvers for mesh-based computations
Fast solvers for mesh-based computations Maciej Paszyński Department of Computer Science AGH University, Krakow, Poland home.agh.edu.pl/paszynsk Collaborators: Anna Paszyńska (Jagiellonian University,Poland)
More informationHYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION
SIAM J. SCI. COMPUT. Vol. 32, No. 6, pp. 3426 3446 c 2010 Society for Industrial and Applied Mathematics HYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION LAURA GRIGORI,
More informationSUMMARY. solve the matrix system using iterative solvers. We use the MUMPS codes and distribute the computation over many different
Forward Modelling and Inversion of Multi-Source TEM Data D. W. Oldenburg 1, E. Haber 2, and R. Shekhtman 1 1 University of British Columbia, Department of Earth & Ocean Sciences 2 Emory University, Atlanta,
More informationApplications. Human and animal motion Robotics control Hair Plants Molecular motion
Multibody dynamics Applications Human and animal motion Robotics control Hair Plants Molecular motion Generalized coordinates Virtual work and generalized forces Lagrangian dynamics for mass points
More informationGeometric Modeling Assignment 3: Discrete Differential Quantities
Geometric Modeling Assignment : Discrete Differential Quantities Acknowledgements: Julian Panetta, Olga Diamanti Assignment (Optional) Topic: Discrete Differential Quantities with libigl Vertex Normals,
More informationBiostatistics 615/815 Lecture 13: Programming with Matrix
Computation Biostatistics 615/815 Lecture 13: Programming with Hyun Min Kang February 17th, 2011 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 1 / 28 Annoucements Computation Homework
More informationAMS527: Numerical Analysis II
AMS527: Numerical Analysis II A Brief Overview of Finite Element Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao SUNY Stony Brook AMS527: Numerical Analysis II 1 / 25 Overview Basic concepts Mathematical
More informationThe Alternating Direction Method of Multipliers
The Alternating Direction Method of Multipliers Customizable software solver package Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein April 27, 2016 1 / 28 Background The Dual Problem Consider
More informationAdvanced Numerical Techniques for Cluster Computing
Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers
More informationPARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures
PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures Solovev S. A, Pudov S.G sergey.a.solovev@intel.com, sergey.g.pudov@intel.com Intel Xeon, Intel Core 2 Duo are trademarks of
More informationContents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.
page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5
More informationHYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES
HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES Maciej Paszyński Department of Computer Science, AGH University of Science and Technology, Kraków, Poland email: paszynsk@agh.edu.pl
More informationReminder: Lecture 20: The Eight-Point Algorithm. Essential/Fundamental Matrix. E/F Matrix Summary. Computing F. Computing F from Point Matches
Reminder: Lecture 20: The Eight-Point Algorithm F = -0.00310695-0.0025646 2.96584-0.028094-0.00771621 56.3813 13.1905-29.2007-9999.79 Readings T&V 7.3 and 7.4 Essential/Fundamental Matrix E/F Matrix Summary
More informationOptimizing Sparse Matrix Vector Multiplication on Emerging Multicores
Optimizing Sparse Matrix Vector Multiplication on Emerging Multicores Orhan Kislal, Wei Ding, Mahmut Kandemir The Pennsylvania State University University Park, Pennsylvania, USA omk03, wzd09, kandemir@cse.psu.edu
More informationNumerical Analysis I - Final Exam Matrikelnummer:
Dr. Behrens Center for Mathematical Sciences Technische Universität München Winter Term 2005/2006 Name: Numerical Analysis I - Final Exam Matrikelnummer: I agree to the publication of the results of this
More informationSparse Training Data Tutorial of Parameter Server
Carnegie Mellon University Sparse Training Data Tutorial of Parameter Server Mu Li! CSD@CMU & IDL@Baidu! muli@cs.cmu.edu High-dimensional data are sparse Why high dimension?! make the classifier s job
More informationME305: Introduction to System Dynamics
ME305: Introduction to System Dynamics Using MATLAB MATLAB stands for MATrix LABoratory and is a powerful tool for general scientific and engineering computations. Combining with user-friendly graphics
More informationLARP / 2018 ACK : 1. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates
Triangular Factors and Row Exchanges LARP / 28 ACK :. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates Then there were three
More informationarxiv: v1 [cs.sy] 8 Feb 2016
An efficient null space inexact Newton method for hydraulic simulation of water distribution networks Edo Abraham a,1,, Ivan Stoianov a,1 a Dept. of Civil and Environmental Engineering, Imperial College
More informationWhen Sparsity Meets Low-Rankness: Transform Learning With Non-Local Low-Rank Constraint For Image Restoration
When Sparsity Meets Low-Rankness: Transform Learning With Non-Local Low-Rank Constraint For Image Restoration Bihan Wen, Yanjun Li and Yoram Bresler Department of Electrical and Computer Engineering Coordinated
More informationSparse Matrices Direct methods
Sparse Matrices Direct methods Iain Duff STFC Rutherford Appleton Laboratory and CERFACS Summer School The 6th de Brùn Workshop. Linear Algebra and Matrix Theory: connections, applications and computations.
More informationNEW ADVANCES IN GPU LINEAR ALGEBRA
GTC 2012: NEW ADVANCES IN GPU LINEAR ALGEBRA Kyle Spagnoli EM Photonics 5/16/2012 QUICK ABOUT US» HPC/GPU Consulting Firm» Specializations in:» Electromagnetics» Image Processing» Fluid Dynamics» Linear
More informationLAPACK. Linear Algebra PACKage. Janice Giudice David Knezevic 1
LAPACK Linear Algebra PACKage 1 Janice Giudice David Knezevic 1 Motivating Question Recalling from last week... Level 1 BLAS: vectors ops Level 2 BLAS: matrix-vectors ops 2 2 O( n ) flops on O( n ) data
More informationOptimizing the operations with sparse matrices on Intel architecture
Optimizing the operations with sparse matrices on Intel architecture Gladkikh V. S. victor.s.gladkikh@intel.com Intel Xeon, Intel Itanium are trademarks of Intel Corporation in the U.S. and other countries.
More informationNumerical Methods 5633
Numerical Methods 5633 Lecture 7 Marina Krstic Marinkovic mmarina@maths.tcd.ie School of Mathematics Trinity College Dublin Marina Krstic Marinkovic 1 / 10 5633-Numerical Methods Organisational To appear
More informationIN OUR LAST HOMEWORK, WE SOLVED LARGE
Y OUR HOMEWORK H A SSIGNMENT Editor: Dianne P. O Leary, oleary@cs.umd.edu FAST SOLVERS AND SYLVESTER EQUATIONS: BOTH SIDES NOW By Dianne P. O Leary IN OUR LAST HOMEWORK, WE SOLVED LARGE SPARSE SYSTEMS
More information2nd Introduction to the Matrix package
2nd Introduction to the Matrix package Martin Maechler and Douglas Bates R Core Development Team maechler@stat.math.ethz.ch, bates@r-project.org September 2006 (typeset on November 16, 2017) Abstract Linear
More informationReport of Linear Solver Implementation on GPU
Report of Linear Solver Implementation on GPU XIANG LI Abstract As the development of technology and the linear equation solver is used in many aspects such as smart grid, aviation and chemical engineering,
More information