A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation

Size: px
Start display at page:

Download "A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation"

Transcription

1 A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Jose-Luis Torres-Moreno, Jose-Luis Blanco, Javier López-Martínez, Antonio Giménez-Fernández 1-4 July, 2013 Zagreb (Croatia) ECCOMAS Multibody Dynamics 2013

2 Goal of this work Problem under study Multibody dynamic formulation: Best numerical solver algorithm? = = \ 2 It depends Ax b x A b

3 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 3

4 Problem introduction Modeling decisions: coordinates, formulation, etc. (,&, ) q&& t = f q t q t t ( ) ( ) ( ) (, &, ) && z t = f z t z t t ( ) ( ) ( ) Our choices for this work: Dependent coordinates vs. Independent coordinates Coordinate type for modeling: Dynamic formulation: Natural coordinates Index-1 Lagrange w. Baumgarte 4 1/4: Introduction

5 Problem introduction A x = b Problem structure? N dependent coordinates M constraints (N+M) x (N+M) (N+M M) x 1 = (N+M M) x 1 Square, symmetric, Dense columns psd and sparse 5 1/4: Introduction

6 Structure of matrix A A Most important features - M is constant - Static non-zero structure - SPARSITY 6 1/4: Introduction

7 Why sparsity matters? Naive matrix inversion: T M φq q&& Q = φq 0 λ c A q && λ 1 T M φ q Q = φq 0 c A 1 7 1/4: Introduction

8 Why sparsity matters? nnz=528 (93.76% are zeros) nnz=8266 (2.34% are zeros) M φq φ T q 0 8 1/4: Introduction M φq In general, the inverse of a sparse matrix is NOT sparse! ( 3 ( + ) ) Cost: O N M φ T q 0 1

9 Why sparsity matters? How to keep matrices sparse? Avoid direct matrix inversion. Alternatives? Sparse matrix factorization algorithms: 1) Method: LU, Cholesky. 2) Variable ordering 9 1/4: Introduction

10 Why variable reordering matters? nnz=528 (93.76% are zeros) ~ 94% zeros Natural ordering A = ~ 70% zeros nnz=2599 (69.29% are zeros) nnz=2456 (70.98% are zeros) A = L U = /4: Introduction

11 Why variable reordering matters? Reordering = permutation of variable indices nnz=528 (93.76% are zeros) nnz=528 (93.76% are zeros) A ( :, : ) A( p, p ) 11 1/4: Introduction

12 Why variable reordering matters? nnz=528 (93.76% are zeros) ~ 94% zeros Optimized ordering ~97% and ~93% zeros nnz=244 (97.12% are zeros) nnz=620 (92.67% are zeros) /4: Introduction

13 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 13

14 Tested algorithms Approaches: LU decomposition Cholesky decomposition 14 3/4: Algorithms

15 Tested algorithms Approaches: LU decomposition Cholesky decomposition Ax = b } A 1) L Ux { = L y= b, solve for y ( LU) x = b = y 2) Ux = y, solve for x nnz=244 (97.12% are zeros) nnz=620 (92.67% are zeros) /4: Algorithms

16 Tested algorithms Benchmarked LU implementations (6 variations): 1) Dense LU Eigen C++ library 2) Sparse LU Suitesparse s KLU a) AMD ordering b) COLAMD ordering 3) Sparse LU Suitesparse s UMFPACK a) Natural ordering b) AMD ordering c) METIS ordering Symbolic vs. numeric factorization! 16 3/4: Algorithms

17 Tested algorithms Approaches: LU decomposition Cholesky decomposition (only when M is definite positive) T M φq q&& Q = φq 0 λ c A Apply Cholesky to M. Use Schür complement to build two reduced systems of equations. [R. von Schwerin, Mathematical Methods for MBS in Descriptor Form, 1999] 17 3/4: Algorithms

18 Tested algorithms Approaches: LU decomposition Cholesky decomposition (only when M is definite positive) Benchmarked Cholesky implementations (3 variations): 1) Sparse Cholesky Suitesparse s CHOLMOD a) AMD ordering b) COLAMD ordering c) METIS ordering 18 3/4: Algorithms

19 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 19

20 Our benchmark model Test mechanism: Parameterized 2D lattice of N x N y articulated quadrangles. Example: Nx = 5 Ny = /4: Benchmark

21 Our benchmark model Irregular node locations to avoid singular configurations: 21 3/4: Benchmark

22 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 22 3/4: Benchmark

23 The benchmark For each solver For each ordering For N=1 32 Build mechanism with N x = N, N y =N. Run thousands of time steps for each configuration. Measure average execution time of each calculation. 23 3/4: Benchmark

24 Talk outline 1. Problem introduction 2. Tested algorithms 3. Benchmark 4. Results 24

25 25 4/4: Results

26 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 26 3/4: Benchmark

27 Results: Preprocessing (symbolic factorization) Time (ms) Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 27 4/4: Results

28 Benchmark C++ implementation MB model Solver-specific tasks to be profiled Assemble Solver Assembled model Preprocessing 1. Update Jacobians 2. Build RHS 3. Triplet -> CCS (*) 4. Numeric factoring 5. Solve Ax=b N.I.S. 28 3/4: Benchmark

29 Results: Distribution of time (KLU+COLAMD) Update Jacobians Build RHS Solve Ax=b Triplet->CCS Numeric factoring 4% 7% 4% 4% Triplet -> CCS can be eliminated with smarter coding! 81% 29 4/4: Results

30 Results: num. factoring + solve Total numeric factorization & solve time Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS 200 Time (ms) UMFPACK with different orderings KLU with different orderings Number of redundant coordinates 30 4/4: Results

31 A comment on the Cholesky approach E matrix chol( E * E ) dense!! 31 4/4: Results

32 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 Time (ms) KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results

33 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 ~ 50 coords Time (ms) 10 0 ~ 2000 coords KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results

34 Results: num. factoring + solve (log scale) 10 2 Dense LU UMFPACK+AMD 10 1 Time (ms) 10 0 T=1ms KLU+COLAMD Dense LU UMFPACK Natural UMFPACK AMD UMFPACK METIS KLU AMD KLU COLAMD CHOLMOD AMD/AMD CHOLMOD COLAMD/COLAMD CHOLMOD METIS/METIS Number of redundant coordinates 4/4: Results

35 Conclusions MBS dynamics leads to very sparse systems. Exploiting sparsity is crucial for performance unless N<50 approx. 50<N<2000 Use KLU+COLAMD 2000<N Use UMFPACK+AMD In any case: reuse symbolic factorizations! Slides available online: Publications 35 4/4: Results

36 A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Thanks for your attention! Slides available online: Publications

(Sparse) Linear Solvers

(Sparse) Linear Solvers (Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 2 Don t you just invert

More information

(Sparse) Linear Solvers

(Sparse) Linear Solvers (Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 1 Don t you just invert

More information

Computational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Some special matrices Matlab code How many operations and memory

More information

Lecture 9. Introduction to Numerical Techniques

Lecture 9. Introduction to Numerical Techniques Lecture 9. Introduction to Numerical Techniques Ivan Papusha CDS270 2: Mathematical Methods in Control and System Engineering May 27, 2015 1 / 25 Logistics hw8 (last one) due today. do an easy problem

More information

Advanced Techniques for Mobile Robotics Graph-based SLAM using Least Squares. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz

Advanced Techniques for Mobile Robotics Graph-based SLAM using Least Squares. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Advanced Techniques for Mobile Robotics Graph-based SLAM using Least Squares Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz SLAM Constraints connect the poses of the robot while it is moving

More information

Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn

Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta and Paolo Mantegazza Dipartimento di Ingegneria

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26

More information

Efficient Iterative Semi-supervised Classification on Manifold

Efficient Iterative Semi-supervised Classification on Manifold . Efficient Iterative Semi-supervised Classification on Manifold... M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani Sharif University of Technology, Tehran, Iran. Presented by Pooria Joulani

More information

The Efficient Extension of Globally Consistent Scan Matching to 6 DoF

The Efficient Extension of Globally Consistent Scan Matching to 6 DoF The Efficient Extension of Globally Consistent Scan Matching to 6 DoF Dorit Borrmann, Jan Elseberg, Kai Lingemann, Andreas Nüchter, Joachim Hertzberg 1 / 20 Outline 1 Introduction 2 Algorithm 3 Performance

More information

ICRA 2016 Tutorial on SLAM. Graph-Based SLAM and Sparsity. Cyrill Stachniss

ICRA 2016 Tutorial on SLAM. Graph-Based SLAM and Sparsity. Cyrill Stachniss ICRA 2016 Tutorial on SLAM Graph-Based SLAM and Sparsity Cyrill Stachniss 1 Graph-Based SLAM?? 2 Graph-Based SLAM?? SLAM = simultaneous localization and mapping 3 Graph-Based SLAM?? SLAM = simultaneous

More information

Aim. Structure and matrix sparsity: Part 1 The simplex method: Exploiting sparsity. Structure and matrix sparsity: Overview

Aim. Structure and matrix sparsity: Part 1 The simplex method: Exploiting sparsity. Structure and matrix sparsity: Overview Aim Structure and matrix sparsity: Part 1 The simplex method: Exploiting sparsity Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk What should a 2-hour PhD lecture on structure

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 5: Sparse Linear Systems and Factorization Methods Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 18 Sparse

More information

Sparse Matrix Algorithms

Sparse Matrix Algorithms Sparse Matrix Algorithms combinatorics + numerical methods + applications Math + X Tim Davis University of Florida June 2013 contributions to the field current work vision for the future Outline Math+X

More information

CVPR 2014 Visual SLAM Tutorial Efficient Inference

CVPR 2014 Visual SLAM Tutorial Efficient Inference CVPR 2014 Visual SLAM Tutorial Efficient Inference kaess@cmu.edu The Robotics Institute Carnegie Mellon University The Mapping Problem (t=0) Robot Landmark Measurement Onboard sensors: Wheel odometry Inertial

More information

International Conference on Computational Science (ICCS 2017)

International Conference on Computational Science (ICCS 2017) International Conference on Computational Science (ICCS 2017) Exploiting Hybrid Parallelism in the Kinematic Analysis of Multibody Systems Based on Group Equations G. Bernabé, J. C. Cano, J. Cuenca, A.

More information

Least Squares and SLAM Pose-SLAM

Least Squares and SLAM Pose-SLAM Least Squares and SLAM Pose-SLAM Giorgio Grisetti Part of the material of this course is taken from the Robotics 2 lectures given by G.Grisetti, W.Burgard, C.Stachniss, K.Arras, D. Tipaldi and M.Bennewitz

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

THE procedure used to solve inverse problems in areas such as Electrical Impedance

THE procedure used to solve inverse problems in areas such as Electrical Impedance 12TH INTL. CONFERENCE IN ELECTRICAL IMPEDANCE TOMOGRAPHY (EIT 2011), 4-6 MAY 2011, UNIV. OF BATH 1 Scaling the EIT Problem Alistair Boyle, Andy Adler, Andrea Borsic Abstract There are a number of interesting

More information

Computational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Vectors, Matrices, Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Zero elements of first column below 1 st row multiplying 1 st

More information

Preconditioning for linear least-squares problems

Preconditioning for linear least-squares problems Preconditioning for linear least-squares problems Miroslav Tůma Institute of Computer Science Academy of Sciences of the Czech Republic tuma@cs.cas.cz joint work with Rafael Bru, José Marín and José Mas

More information

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach Mini-workshop PHyLeaS associated team J. Gaidamour, P. Hénon July 9, 28 HIPS : an hybrid direct/iterative solver /

More information

PARDISO Version Reference Sheet Fortran

PARDISO Version Reference Sheet Fortran PARDISO Version 5.0.0 1 Reference Sheet Fortran CALL PARDISO(PT, MAXFCT, MNUM, MTYPE, PHASE, N, A, IA, JA, 1 PERM, NRHS, IPARM, MSGLVL, B, X, ERROR, DPARM) 1 Please note that this version differs significantly

More information

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE)

GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) GPU ACCELERATION OF WSMP (WATSON SPARSE MATRIX PACKAGE) NATALIA GIMELSHEIN ANSHUL GUPTA STEVE RENNICH SEID KORIC NVIDIA IBM NVIDIA NCSA WATSON SPARSE MATRIX PACKAGE (WSMP) Cholesky, LDL T, LU factorization

More information

arxiv: v1 [cs.pl] 18 May 2017

arxiv: v1 [cs.pl] 18 May 2017 Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis arxiv:705.06575v [cs.pl] 8 May 207 ABSTRACT Kazem Cheshmi Rutgers University Piscataway, NJ, US kazem.ch@rutgers.edu Michelle

More information

A GPU Sparse Direct Solver for AX=B

A GPU Sparse Direct Solver for AX=B 1 / 25 A GPU Sparse Direct Solver for AX=B Jonathan Hogg, Evgueni Ovtchinnikov, Jennifer Scott* STFC Rutherford Appleton Laboratory 26 March 2014 GPU Technology Conference San Jose, California * Thanks

More information

Null space computation of sparse singular matrices with MUMPS

Null space computation of sparse singular matrices with MUMPS Null space computation of sparse singular matrices with MUMPS Xavier Vasseur (CERFACS) In collaboration with Patrick Amestoy (INPT-IRIT, University of Toulouse and ENSEEIHT), Serge Gratton (INPT-IRIT,

More information

L15. POSE-GRAPH SLAM. NA568 Mobile Robotics: Methods & Algorithms

L15. POSE-GRAPH SLAM. NA568 Mobile Robotics: Methods & Algorithms L15. POSE-GRAPH SLAM NA568 Mobile Robotics: Methods & Algorithms Today s Topic Nonlinear Least Squares Pose-Graph SLAM Incremental Smoothing and Mapping Feature-Based SLAM Filtering Problem: Motion Prediction

More information

Structure from Motion

Structure from Motion Structure from Motion Outline Bundle Adjustment Ambguities in Reconstruction Affine Factorization Extensions Structure from motion Recover both 3D scene geoemetry and camera positions SLAM: Simultaneous

More information

GPU ACCELERATION OF CHOLMOD: BATCHING, HYBRID AND MULTI-GPU

GPU ACCELERATION OF CHOLMOD: BATCHING, HYBRID AND MULTI-GPU April 4-7, 2016 Silicon Valley GPU ACCELERATION OF CHOLMOD: BATCHING, HYBRID AND MULTI-GPU Steve Rennich, Darko Stosic, Tim Davis, April 6, 2016 OBJECTIVE Direct sparse methods are among the most widely

More information

CS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018

CS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018 CS 395T Lecture 12: Feature Matching and Bundle Adjustment Qixing Huang October 10 st 2018 Lecture Overview Dense Feature Correspondences Bundle Adjustment in Structure-from-Motion Image Matching Algorithm

More information

A parallel direct/iterative solver based on a Schur complement approach

A parallel direct/iterative solver based on a Schur complement approach A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008

More information

Interlude: Solving systems of Equations

Interlude: Solving systems of Equations Interlude: Solving systems of Equations Solving Ax = b What happens to x under Ax? The singular value decomposition Rotation matrices Singular matrices Condition number Null space Solving Ax = 0 under

More information

Accelerating the Iterative Linear Solver for Reservoir Simulation

Accelerating the Iterative Linear Solver for Reservoir Simulation Accelerating the Iterative Linear Solver for Reservoir Simulation Wei Wu 1, Xiang Li 2, Lei He 1, Dongxiao Zhang 2 1 Electrical Engineering Department, UCLA 2 Department of Energy and Resources Engineering,

More information

Sparse LU Factorization for Parallel Circuit Simulation on GPUs

Sparse LU Factorization for Parallel Circuit Simulation on GPUs Department of Electronic Engineering, Tsinghua University Sparse LU Factorization for Parallel Circuit Simulation on GPUs Ling Ren, Xiaoming Chen, Yu Wang, Chenxi Zhang, Huazhong Yang Nano-scale Integrated

More information

Computational Methods CMSC/AMSC/MAPL 460. Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Linear Systems, LU Decomposition, Ramani Duraiswami, Dept. of Computer Science Matrix norms Can be defined using corresponding vector norms Two norm One norm Infinity

More information

The 3D DSC in Fluid Simulation

The 3D DSC in Fluid Simulation The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations

More information

Parallel resolution of sparse linear systems by mixing direct and iterative methods

Parallel resolution of sparse linear systems by mixing direct and iterative methods Parallel resolution of sparse linear systems by mixing direct and iterative methods Phyleas Meeting, Bordeaux J. Gaidamour, P. Hénon, J. Roman, Y. Saad LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix

More information

Robot Mapping. Least Squares Approach to SLAM. Cyrill Stachniss

Robot Mapping. Least Squares Approach to SLAM. Cyrill Stachniss Robot Mapping Least Squares Approach to SLAM Cyrill Stachniss 1 Three Main SLAM Paradigms Kalman filter Particle filter Graphbased least squares approach to SLAM 2 Least Squares in General Approach for

More information

Graphbased. Kalman filter. Particle filter. Three Main SLAM Paradigms. Robot Mapping. Least Squares Approach to SLAM. Least Squares in General

Graphbased. Kalman filter. Particle filter. Three Main SLAM Paradigms. Robot Mapping. Least Squares Approach to SLAM. Least Squares in General Robot Mapping Three Main SLAM Paradigms Least Squares Approach to SLAM Kalman filter Particle filter Graphbased Cyrill Stachniss least squares approach to SLAM 1 2 Least Squares in General! Approach for

More information

Real-Time Shape Editing using Radial Basis Functions

Real-Time Shape Editing using Radial Basis Functions Real-Time Shape Editing using Radial Basis Functions, Leif Kobbelt RWTH Aachen Boundary Constraint Modeling Prescribe irregular constraints Vertex positions Constrained energy minimization Optimal fairness

More information

Sparse Matrices Introduction to sparse matrices and direct methods

Sparse Matrices Introduction to sparse matrices and direct methods Sparse Matrices Introduction to sparse matrices and direct methods Iain Duff STFC Rutherford Appleton Laboratory and CERFACS Summer School The 6th de Brùn Workshop. Linear Algebra and Matrix Theory: connections,

More information

Sparse Direct Solvers for Extreme-Scale Computing

Sparse Direct Solvers for Extreme-Scale Computing Sparse Direct Solvers for Extreme-Scale Computing Iain Duff Joint work with Florent Lopez and Jonathan Hogg STFC Rutherford Appleton Laboratory SIAM Conference on Computational Science and Engineering

More information

Combinatorial problems in a Parallel Hybrid Linear Solver

Combinatorial problems in a Parallel Hybrid Linear Solver Combinatorial problems in a Parallel Hybrid Linear Solver Ichitaro Yamazaki and Xiaoye Li Lawrence Berkeley National Laboratory François-Henry Rouet and Bora Uçar ENSEEIHT-IRIT and LIP, ENS-Lyon SIAM workshop

More information

Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement

Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement Tim Davis (Texas A&M University) with Sanjay Ranka, Mohamed Gadou (University of Florida) Nuri Yeralan (Microsoft) NVIDIA

More information

Advanced Computer Graphics

Advanced Computer Graphics G22.2274 001, Fall 2009 Advanced Computer Graphics Project details and tools 1 Project Topics Computer Animation Geometric Modeling Computational Photography Image processing 2 Optimization All projects

More information

MATH 3511 Basics of MATLAB

MATH 3511 Basics of MATLAB MATH 3511 Basics of MATLAB Dmitriy Leykekhman Spring 2012 Topics Sources. Entering Matrices. Basic Operations with Matrices. Build in Matrices. Build in Scalar and Matrix Functions. if, while, for m-files

More information

Issues In Implementing The Primal-Dual Method for SDP. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM

Issues In Implementing The Primal-Dual Method for SDP. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM Issues In Implementing The Primal-Dual Method for SDP Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu Outline 1. Cache and shared memory parallel computing concepts.

More information

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI

EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI EFFICIENT SOLVER FOR LINEAR ALGEBRAIC EQUATIONS ON PARALLEL ARCHITECTURE USING MPI 1 Akshay N. Panajwar, 2 Prof.M.A.Shah Department of Computer Science and Engineering, Walchand College of Engineering,

More information

MUMPS. The MUMPS library. Abdou Guermouche and MUMPS team, June 22-24, Univ. Bordeaux 1 and INRIA

MUMPS. The MUMPS library. Abdou Guermouche and MUMPS team, June 22-24, Univ. Bordeaux 1 and INRIA The MUMPS library Abdou Guermouche and MUMPS team, Univ. Bordeaux 1 and INRIA June 22-24, 2010 MUMPS Outline MUMPS status Recently added features MUMPS and multicores? Memory issues GPU computing Future

More information

Algorithm 8xx: SuiteSparseQR, a multifrontal multithreaded sparse QR factorization package

Algorithm 8xx: SuiteSparseQR, a multifrontal multithreaded sparse QR factorization package Algorithm 8xx: SuiteSparseQR, a multifrontal multithreaded sparse QR factorization package TIMOTHY A. DAVIS University of Florida SuiteSparseQR is an implementation of the multifrontal sparse QR factorization

More information

GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS

GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Discussiones Mathematicae Graph Theory 30 (2010 ) 349 359 GRAPH CENTERS USED FOR STABILIZATION OF MATRIX FACTORIZATIONS Pavla Kabelíková Department of Applied Mathematics FEI, VSB Technical University

More information

Contents. I The Basic Framework for Stationary Problems 1

Contents. I The Basic Framework for Stationary Problems 1 page v Preface xiii I The Basic Framework for Stationary Problems 1 1 Some model PDEs 3 1.1 Laplace s equation; elliptic BVPs... 3 1.1.1 Physical experiments modeled by Laplace s equation... 5 1.2 Other

More information

A Parallel Implementation of the BDDC Method for Linear Elasticity

A Parallel Implementation of the BDDC Method for Linear Elasticity A Parallel Implementation of the BDDC Method for Linear Elasticity Jakub Šístek joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík Institute of Mathematics of the AS CR, Prague

More information

CGO: G: Decoupling Symbolic from Numeric in Sparse Matrix Computations

CGO: G: Decoupling Symbolic from Numeric in Sparse Matrix Computations CGO: G: Decoupling Symbolic from Numeric in Sparse Matrix Computations ABSTRACT Kazem Cheshmi PhD Student, Rutgers University kazem.ch@rutgers.edu Sympiler is a domain-specific code generator that optimizes

More information

PERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1

PERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1 PERFORMANCE ANALYSIS OF LOAD FLOW COMPUTATION USING FPGA 1 J. Johnson, P. Vachranukunkiet, S. Tiwari, P. Nagvajara, C. Nwankpa Drexel University Philadelphia, PA Abstract Full-AC load flow constitutes

More information

BLAS and LAPACK + Data Formats for Sparse Matrices. Part of the lecture Wissenschaftliches Rechnen. Hilmar Wobker

BLAS and LAPACK + Data Formats for Sparse Matrices. Part of the lecture Wissenschaftliches Rechnen. Hilmar Wobker BLAS and LAPACK + Data Formats for Sparse Matrices Part of the lecture Wissenschaftliches Rechnen Hilmar Wobker Institute of Applied Mathematics and Numerics, TU Dortmund email: hilmar.wobker@math.tu-dortmund.de

More information

Scheduling Strategies for Parallel Sparse Backward/Forward Substitution

Scheduling Strategies for Parallel Sparse Backward/Forward Substitution Scheduling Strategies for Parallel Sparse Backward/Forward Substitution J.I. Aliaga M. Bollhöfer A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ. Jaume I (Spain) {aliaga,martina,quintana}@icc.uji.es

More information

Autar Kaw Benjamin Rigsby. Transforming Numerical Methods Education for STEM Undergraduates

Autar Kaw Benjamin Rigsby.   Transforming Numerical Methods Education for STEM Undergraduates Autar Kaw Benjamin Rigsy http://nmmathforcollegecom Transforming Numerical Methods Education for STEM Undergraduates http://nmmathforcollegecom LU Decomposition is another method to solve a set of simultaneous

More information

ICES REPORT December Chetan Jhurani and L. Demkowicz

ICES REPORT December Chetan Jhurani and L. Demkowicz ICES REPORT 09-36 December 2009 Multiscale Modeling Using Goal-oriented Adaptivity and Numerical Homogenization. Part III: Algorithms for Moore-Penrose Pseudoinverse by Chetan Jhurani and L. Demkowicz

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964

More information

Advanced Computer Graphics

Advanced Computer Graphics G22.2274 001, Fall 2010 Advanced Computer Graphics Project details and tools 1 Projects Details of each project are on the website under Projects Please review all the projects and come see me if you would

More information

GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis

GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis Abstract: Lower upper (LU) factorization for sparse matrices is the most important computing step for circuit simulation

More information

MATH 5520 Basics of MATLAB

MATH 5520 Basics of MATLAB MATH 5520 Basics of MATLAB Dmitriy Leykekhman Spring 2011 Topics Sources. Entering Matrices. Basic Operations with Matrices. Build in Matrices. Build in Scalar and Matrix Functions. if, while, for m-files

More information

Comparing Least Squares Calculations

Comparing Least Squares Calculations Comparing Least Squares Calculations Douglas Bates R Development Core Team Douglas.Bates@R-project.org November 16, 2017 Abstract Many statistics methods require one or more least squares problems to be

More information

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report

ESPRESO ExaScale PaRallel FETI Solver. Hybrid FETI Solver Report ESPRESO ExaScale PaRallel FETI Solver Hybrid FETI Solver Report Lubomir Riha, Tomas Brzobohaty IT4Innovations Outline HFETI theory from FETI to HFETI communication hiding and avoiding techniques our new

More information

On the I/O Volume in Out-of-Core Multifrontal Methods with a Flexible Allocation Scheme

On the I/O Volume in Out-of-Core Multifrontal Methods with a Flexible Allocation Scheme On the I/O Volume in Out-of-Core Multifrontal Methods with a Flexible Allocation Scheme Emmanuel Agullo 1,6,3,7, Abdou Guermouche 5,4,8, and Jean-Yves L Excellent 2,6,3,9 1 École Normale Supérieure de

More information

Figure 6.1: Truss topology optimization diagram.

Figure 6.1: Truss topology optimization diagram. 6 Implementation 6.1 Outline This chapter shows the implementation details to optimize the truss, obtained in the ground structure approach, according to the formulation presented in previous chapters.

More information

Introduction to Optimization

Introduction to Optimization Introduction to Optimization Second Order Optimization Methods Marc Toussaint U Stuttgart Planned Outline Gradient-based optimization (1st order methods) plain grad., steepest descent, conjugate grad.,

More information

2.7 Numerical Linear Algebra Software

2.7 Numerical Linear Algebra Software 2.7 Numerical Linear Algebra Software In this section we will discuss three software packages for linear algebra operations: (i) (ii) (iii) Matlab, Basic Linear Algebra Subroutines (BLAS) and LAPACK. There

More information

Lecture 27: Fast Laplacian Solvers

Lecture 27: Fast Laplacian Solvers Lecture 27: Fast Laplacian Solvers Scribed by Eric Lee, Eston Schweickart, Chengrun Yang November 21, 2017 1 How Fast Laplacian Solvers Work We want to solve Lx = b with L being a Laplacian matrix. Recall

More information

High-Performance Linear Algebra Processor using FPGA

High-Performance Linear Algebra Processor using FPGA High-Performance Linear Algebra Processor using FPGA J. R. Johnson P. Nagvajara C. Nwankpa 1 Extended Abstract With recent advances in FPGA (Field Programmable Gate Array) technology it is now feasible

More information

High-Performance Libraries and Tools. HPC Fall 2012 Prof. Robert van Engelen

High-Performance Libraries and Tools. HPC Fall 2012 Prof. Robert van Engelen High-Performance Libraries and Tools HPC Fall 2012 Prof. Robert van Engelen Overview Dense matrix BLAS (serial) ATLAS (serial/threaded) LAPACK (serial) Vendor-tuned LAPACK (shared memory parallel) ScaLAPACK/PLAPACK

More information

Fast solvers for mesh-based computations

Fast solvers for mesh-based computations Fast solvers for mesh-based computations Maciej Paszyński Department of Computer Science AGH University, Krakow, Poland home.agh.edu.pl/paszynsk Collaborators: Anna Paszyńska (Jagiellonian University,Poland)

More information

HYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION

HYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION SIAM J. SCI. COMPUT. Vol. 32, No. 6, pp. 3426 3446 c 2010 Society for Industrial and Applied Mathematics HYPERGRAPH-BASED UNSYMMETRIC NESTED DISSECTION ORDERING FOR SPARSE LU FACTORIZATION LAURA GRIGORI,

More information

SUMMARY. solve the matrix system using iterative solvers. We use the MUMPS codes and distribute the computation over many different

SUMMARY. solve the matrix system using iterative solvers. We use the MUMPS codes and distribute the computation over many different Forward Modelling and Inversion of Multi-Source TEM Data D. W. Oldenburg 1, E. Haber 2, and R. Shekhtman 1 1 University of British Columbia, Department of Earth & Ocean Sciences 2 Emory University, Atlanta,

More information

Applications. Human and animal motion Robotics control Hair Plants Molecular motion

Applications. Human and animal motion Robotics control Hair Plants Molecular motion Multibody dynamics Applications Human and animal motion Robotics control Hair Plants Molecular motion Generalized coordinates Virtual work and generalized forces Lagrangian dynamics for mass points

More information

Geometric Modeling Assignment 3: Discrete Differential Quantities

Geometric Modeling Assignment 3: Discrete Differential Quantities Geometric Modeling Assignment : Discrete Differential Quantities Acknowledgements: Julian Panetta, Olga Diamanti Assignment (Optional) Topic: Discrete Differential Quantities with libigl Vertex Normals,

More information

Biostatistics 615/815 Lecture 13: Programming with Matrix

Biostatistics 615/815 Lecture 13: Programming with Matrix Computation Biostatistics 615/815 Lecture 13: Programming with Hyun Min Kang February 17th, 2011 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 1 / 28 Annoucements Computation Homework

More information

AMS527: Numerical Analysis II

AMS527: Numerical Analysis II AMS527: Numerical Analysis II A Brief Overview of Finite Element Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao SUNY Stony Brook AMS527: Numerical Analysis II 1 / 25 Overview Basic concepts Mathematical

More information

The Alternating Direction Method of Multipliers

The Alternating Direction Method of Multipliers The Alternating Direction Method of Multipliers Customizable software solver package Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein April 27, 2016 1 / 28 Background The Dual Problem Consider

More information

Advanced Numerical Techniques for Cluster Computing

Advanced Numerical Techniques for Cluster Computing Advanced Numerical Techniques for Cluster Computing Presented by Piotr Luszczek http://icl.cs.utk.edu/iter-ref/ Presentation Outline Motivation hardware Dense matrix calculations Sparse direct solvers

More information

PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures

PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures Solovev S. A, Pudov S.G sergey.a.solovev@intel.com, sergey.g.pudov@intel.com Intel Xeon, Intel Core 2 Duo are trademarks of

More information

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited. page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5

More information

HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES

HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES HYBRID DIRECT AND ITERATIVE SOLVER FOR H ADAPTIVE MESHES WITH POINT SINGULARITIES Maciej Paszyński Department of Computer Science, AGH University of Science and Technology, Kraków, Poland email: paszynsk@agh.edu.pl

More information

Reminder: Lecture 20: The Eight-Point Algorithm. Essential/Fundamental Matrix. E/F Matrix Summary. Computing F. Computing F from Point Matches

Reminder: Lecture 20: The Eight-Point Algorithm. Essential/Fundamental Matrix. E/F Matrix Summary. Computing F. Computing F from Point Matches Reminder: Lecture 20: The Eight-Point Algorithm F = -0.00310695-0.0025646 2.96584-0.028094-0.00771621 56.3813 13.1905-29.2007-9999.79 Readings T&V 7.3 and 7.4 Essential/Fundamental Matrix E/F Matrix Summary

More information

Optimizing Sparse Matrix Vector Multiplication on Emerging Multicores

Optimizing Sparse Matrix Vector Multiplication on Emerging Multicores Optimizing Sparse Matrix Vector Multiplication on Emerging Multicores Orhan Kislal, Wei Ding, Mahmut Kandemir The Pennsylvania State University University Park, Pennsylvania, USA omk03, wzd09, kandemir@cse.psu.edu

More information

Numerical Analysis I - Final Exam Matrikelnummer:

Numerical Analysis I - Final Exam Matrikelnummer: Dr. Behrens Center for Mathematical Sciences Technische Universität München Winter Term 2005/2006 Name: Numerical Analysis I - Final Exam Matrikelnummer: I agree to the publication of the results of this

More information

Sparse Training Data Tutorial of Parameter Server

Sparse Training Data Tutorial of Parameter Server Carnegie Mellon University Sparse Training Data Tutorial of Parameter Server Mu Li! CSD@CMU & IDL@Baidu! muli@cs.cmu.edu High-dimensional data are sparse Why high dimension?! make the classifier s job

More information

ME305: Introduction to System Dynamics

ME305: Introduction to System Dynamics ME305: Introduction to System Dynamics Using MATLAB MATLAB stands for MATrix LABoratory and is a powerful tool for general scientific and engineering computations. Combining with user-friendly graphics

More information

LARP / 2018 ACK : 1. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates

LARP / 2018 ACK : 1. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates Triangular Factors and Row Exchanges LARP / 28 ACK :. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates Then there were three

More information

arxiv: v1 [cs.sy] 8 Feb 2016

arxiv: v1 [cs.sy] 8 Feb 2016 An efficient null space inexact Newton method for hydraulic simulation of water distribution networks Edo Abraham a,1,, Ivan Stoianov a,1 a Dept. of Civil and Environmental Engineering, Imperial College

More information

When Sparsity Meets Low-Rankness: Transform Learning With Non-Local Low-Rank Constraint For Image Restoration

When Sparsity Meets Low-Rankness: Transform Learning With Non-Local Low-Rank Constraint For Image Restoration When Sparsity Meets Low-Rankness: Transform Learning With Non-Local Low-Rank Constraint For Image Restoration Bihan Wen, Yanjun Li and Yoram Bresler Department of Electrical and Computer Engineering Coordinated

More information

Sparse Matrices Direct methods

Sparse Matrices Direct methods Sparse Matrices Direct methods Iain Duff STFC Rutherford Appleton Laboratory and CERFACS Summer School The 6th de Brùn Workshop. Linear Algebra and Matrix Theory: connections, applications and computations.

More information

NEW ADVANCES IN GPU LINEAR ALGEBRA

NEW ADVANCES IN GPU LINEAR ALGEBRA GTC 2012: NEW ADVANCES IN GPU LINEAR ALGEBRA Kyle Spagnoli EM Photonics 5/16/2012 QUICK ABOUT US» HPC/GPU Consulting Firm» Specializations in:» Electromagnetics» Image Processing» Fluid Dynamics» Linear

More information

LAPACK. Linear Algebra PACKage. Janice Giudice David Knezevic 1

LAPACK. Linear Algebra PACKage. Janice Giudice David Knezevic 1 LAPACK Linear Algebra PACKage 1 Janice Giudice David Knezevic 1 Motivating Question Recalling from last week... Level 1 BLAS: vectors ops Level 2 BLAS: matrix-vectors ops 2 2 O( n ) flops on O( n ) data

More information

Optimizing the operations with sparse matrices on Intel architecture

Optimizing the operations with sparse matrices on Intel architecture Optimizing the operations with sparse matrices on Intel architecture Gladkikh V. S. victor.s.gladkikh@intel.com Intel Xeon, Intel Itanium are trademarks of Intel Corporation in the U.S. and other countries.

More information

Numerical Methods 5633

Numerical Methods 5633 Numerical Methods 5633 Lecture 7 Marina Krstic Marinkovic mmarina@maths.tcd.ie School of Mathematics Trinity College Dublin Marina Krstic Marinkovic 1 / 10 5633-Numerical Methods Organisational To appear

More information

IN OUR LAST HOMEWORK, WE SOLVED LARGE

IN OUR LAST HOMEWORK, WE SOLVED LARGE Y OUR HOMEWORK H A SSIGNMENT Editor: Dianne P. O Leary, oleary@cs.umd.edu FAST SOLVERS AND SYLVESTER EQUATIONS: BOTH SIDES NOW By Dianne P. O Leary IN OUR LAST HOMEWORK, WE SOLVED LARGE SPARSE SYSTEMS

More information

2nd Introduction to the Matrix package

2nd Introduction to the Matrix package 2nd Introduction to the Matrix package Martin Maechler and Douglas Bates R Core Development Team maechler@stat.math.ethz.ch, bates@r-project.org September 2006 (typeset on November 16, 2017) Abstract Linear

More information

Report of Linear Solver Implementation on GPU

Report of Linear Solver Implementation on GPU Report of Linear Solver Implementation on GPU XIANG LI Abstract As the development of technology and the linear equation solver is used in many aspects such as smart grid, aviation and chemical engineering,

More information