Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn


Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta and Paolo Mantegazza
Dipartimento di Ingegneria Aerospaziale

Multibody Dynamics 2005, International Conference on Advances in Computational Multibody Dynamics
ECCOMAS Thematic Conference, Madrid, June

Outline

1. MBDyn (Free) Multibody Software Description
2. Computational Aspects & Improvements
   - Linear Solution Strategies
   - Matrix Assembly Strategies
   - Nonlinear Solution Strategies
   - Parallelization Strategies
3. Applications
   - Fast-Prototyping of Problems without Jacobian: Landing Gear Simulation
   - Fast Solution of Small Problems: Real-Time Simulation
   - Efficient Solution of Medium/Large Problems: Rotorcraft Simulation
4. Conclusions

MBDyn (Free) Multibody Software

MBDyn is a free, general-purpose multibody software:
- Freely available in source form
- Licensed under the GNU General Public License
- Developed at the Dipartimento di Ingegneria Aerospaziale of the Politecnico di Milano
- Solves initial-value problems in DAE form
- Biased toward aeroservoelastic simulation of rotorcraft
- Features real-time simulation capabilities by way of Linux RTAI
- Features a selection of linear solvers tailored for different problem sizes
- Object-oriented: components can be easily replaced and new features added

Computational Aspects & Improvements

Some of the recent computational improvements were driven by performance requirements originating from:
- Simulation of medium/large size problems
- Real-time simulation of small, yet non-trivial models

These requirements may conflict and may need different designs.

Key development directions are:
- Linear Solution Strategies
- Matrix Assembly Strategies
- Nonlinear Solution Strategies
- Parallelization Strategies

Computational Aspects (cntd.)

Object-Oriented Programming: allows generic programming

    NLS::Solve() {                      // Newton-Raphson
        while (true) {
            if (DM->Residual()->Test()) {
                return;
            }
            if (new_jacobian) {
                DM->Jacobian();
            }
            LS->Solve();
            DM->Update();
        }
    }

    SS::Advance() {
        SS->Predict();
        NLS->Solve();
    }

NLS: nonlinear solver
LS: linear solver
DM: data manager
SS: step solver
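The slide's Newton-Raphson loop can be written against abstract roles so that any linear solver or problem can be plugged in. The following is a minimal sketch of that idea, not MBDyn's actual classes: the interfaces `DataManager` and `LinearSolver`, the dense `DenseSolver`, the free function `NewtonSolve` and the toy problem `SqrtTwo` are all hypothetical names introduced for illustration, and the Jacobian is rebuilt at every iteration (full Newton).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row-major; enough for a sketch

// Hypothetical data manager role: owns the state, assembles r(x) and J(x).
struct DataManager {
    virtual ~DataManager() = default;
    virtual Vec Residual() const = 0;
    virtual Mat Jacobian() const = 0;
    virtual void Update(const Vec& dx) = 0;  // x += dx
};

// Hypothetical linear solver role: returns x such that A x = b.
struct LinearSolver {
    virtual ~LinearSolver() = default;
    virtual Vec Solve(Mat A, Vec b) const = 0;
};

// Dense Gaussian elimination with partial pivoting.
struct DenseSolver : LinearSolver {
    Vec Solve(Mat A, Vec b) const override {
        const std::size_t n = b.size();
        for (std::size_t k = 0; k < n; ++k) {
            std::size_t p = k;
            for (std::size_t i = k + 1; i < n; ++i)
                if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
            if (A[p][k] == 0.0) throw std::runtime_error("singular matrix");
            std::swap(A[k], A[p]);
            std::swap(b[k], b[p]);
            for (std::size_t i = k + 1; i < n; ++i) {
                const double f = A[i][k] / A[k][k];
                for (std::size_t j = k; j < n; ++j) A[i][j] -= f * A[k][j];
                b[i] -= f * b[k];
            }
        }
        Vec x(n);
        for (std::size_t k = n; k-- > 0;) {  // back-substitution
            double s = b[k];
            for (std::size_t j = k + 1; j < n; ++j) s -= A[k][j] * x[j];
            x[k] = s / A[k][k];
        }
        return x;
    }
};

// Newton-Raphson loop written only against the two interfaces.
// Returns the iteration at which the residual test passed, or -1 otherwise.
int NewtonSolve(DataManager& dm, const LinearSolver& ls, double tol, int max_iter) {
    assert(tol > 0.0);
    for (int it = 0; it < max_iter; ++it) {
        Vec r = dm.Residual();
        double norm2 = 0.0;
        for (double v : r) norm2 += v * v;
        if (std::sqrt(norm2) < tol) return it;   // convergence test
        for (double& v : r) v = -v;              // right-hand side is -r
        dm.Update(ls.Solve(dm.Jacobian(), r));   // solve J dx = -r, then x += dx
    }
    return -1;
}

// Toy problem: r(x) = x^2 - 2, whose positive root is sqrt(2).
struct SqrtTwo : DataManager {
    double x = 1.0;
    Vec Residual() const override { return Vec{x * x - 2.0}; }
    Mat Jacobian() const override { return Mat{Vec{2.0 * x}}; }
    void Update(const Vec& dx) override { x += dx[0]; }
};
```

Calling `NewtonSolve(problem, DenseSolver{}, 1e-12, 20)` on a `SqrtTwo` instance drives `problem.x` to the square root of 2; swapping in another `LinearSolver` implementation requires no change to the loop.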

Linear Solution Strategies

Redundant approach: large, sparse matrices => sparse solver
Sparse solvers are usually optimized for min. space AND min. operations => room for trading space vs. speed
Default solver: Umfpack (3.0 => 4.4)
Many other solvers available:
- Lapack dense solver: optimal for problems < 60 eqns.
- WSMP (non-free software) sparse parallel solver: optimal for > eqns.
- SuperLU, Y12, HSL (non-free software), Meschach, TAUCS, ...
Custom solver "naive":
- dense storage
- sparse operations
- aggressive pivoting (min. fill-in of factored matrix)
- multithread implementation (limited performance improvements for target problems)
- dramatically improves performance (in eqns. range)

Assembly Strategies

1. Lapack, ...
   - Dense: may be relevant for very small, almost dense problems
2. Umfpack, y12m, ...
   - SpMap: array of map objects (binary trees), needs packing
   - CC: initialized as SpMap, preserves packing
   - DIR: initialized as SpMap, dense index table
   - CC/DIR-MT: CC/DIR may be efficiently parallelized for SMP
3. naive
   - Sparse-dense: dense storage and fill-in tables for assembly; sparse indices for sparse factorization

Assembly Strategies: SpMap

The matrix is an array of binary trees:

    typedef std::map<int, double> row_cont_type;
    std::vector<row_cont_type> col_indices;

    double& operator()(int i_row, int i_col) {
        row_cont_type::iterator i;
        row_cont_type& row = col_indices[i_col];
        i = row.find(i_row);
        if (i == row.end()) {
            return row[i_row] = 0.;     // entry not present: create it as zero
        }
        return i->second;
    }

Note: fills in with zeros; the real implementation may handle this.
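The outline of assembly strategies says that SpMap "needs packing" before the matrix is handed to a sparse solver. A minimal sketch of such a packing step is given below, assuming 0-based indices and a `column_start` array with one entry more than the number of columns; the names `CCMatrix`, `ColumnMap` and `Pack` are hypothetical and do not come from MBDyn.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Column-compressed storage (0-based; column_start has n_cols + 1 entries).
struct CCMatrix {
    std::vector<double> values;
    std::vector<int> row_indices;
    std::vector<int> column_start;
};

// One map per column: row index -> value, kept sorted by row index.
using ColumnMap = std::map<int, double>;

// Packs an array of per-column maps into column-compressed arrays.
CCMatrix Pack(const std::vector<ColumnMap>& columns) {
    CCMatrix cc;
    cc.column_start.reserve(columns.size() + 1);
    cc.column_start.push_back(0);
    for (const ColumnMap& col : columns) {
        // std::map iterates in ascending key order, so rows come out sorted.
        for (const auto& entry : col) {
            cc.row_indices.push_back(entry.first);
            cc.values.push_back(entry.second);
        }
        cc.column_start.push_back(static_cast<int>(cc.values.size()));
    }
    assert(cc.row_indices.size() == cc.values.size());
    return cc;
}
```

Because each column map is already ordered by row index, the packed row indices are sorted within every column, which is what a binary search over a column relies on.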

Assembly Strategies: CC

The matrix is in Column-Compressed form after initialization as SpMap:

    std::vector<double>& values;
    const std::vector<int>& row_indices;
    const std::vector<int>& column_start;

    double& operator()(int i_row, int i_col) {
        int row_begin = column_start[i_col - 1];
        int row_end = column_start[i_col] - 1;
        int idx, row;

        if (OutOfRange(i_row)) {
            throw ErrRebuildMatrix();       // out of range: rebuild
        }
        while (row_end >= row_begin) {      // binary search
            idx = (row_begin + row_end)/2;
            row = row_indices[idx];
            if (i_row < row) {
                row_end = idx - 1;
            } else if (i_row > row) {
                row_begin = idx + 1;
            } else {
                return values[idx];
            }
        }
        throw ErrRebuildMatrix();           // not found: rebuild
    }

Assembly Strategies: CC (contd.)

Advantages:
- After first assembly, saves packing into column-compressed form
- Column access cost: 1 (array)
- Row access cost: log2(N) (binary search)
Drawbacks:
- Needs a matrix rebuild when fill-in changes (worked around by allowing zeros at first assembly)

Assembly Strategies: DIR

The matrix is in Column-Compressed form after initialization as SpMap; the indices are dense:

    std::vector<double>& values;
    const std::vector<std::vector<int> >& indices;

    double& operator()(int i_row, int i_col) {
        return values[indices[i_row][i_col]];
    }

Advantages:
- After first assembly, saves data packing into column-compressed form
- Column access cost: 1 (array)
- Row access cost: 1 (array)
Drawbacks:
- Needs a matrix rebuild when fill-in changes (worked around by allowing zeros at first assembly)
- Memory occupation N^2
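The dense index table used by DIR can be derived from the column-compressed arrays once the sparsity pattern is known. The sketch below shows one way to do this under stated assumptions: indices are 0-based, entries outside the pattern are marked with a hypothetical sentinel `kAbsent`, and access to such an entry throws, standing in for the slide's rebuild exception. `BuildIndexTable` and `At` are illustrative names, not MBDyn functions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical marker for entries that are not in the sparsity pattern.
constexpr int kAbsent = -1;

// Builds table[row][col] = position of (row, col) in the values array.
// Costs n_rows * n_cols integers, which is the N^2 memory drawback.
std::vector<std::vector<int>> BuildIndexTable(int n_rows,
                                              const std::vector<int>& row_indices,
                                              const std::vector<int>& column_start) {
    assert(n_rows >= 0 && !column_start.empty());
    const int n_cols = static_cast<int>(column_start.size()) - 1;
    std::vector<std::vector<int>> table(n_rows, std::vector<int>(n_cols, kAbsent));
    for (int col = 0; col < n_cols; ++col) {
        for (int idx = column_start[col]; idx < column_start[col + 1]; ++idx) {
            table[row_indices[idx]][col] = idx;
        }
    }
    return table;
}

// Constant-time element access through the table.
// Throws when the entry is not stored, i.e. when a rebuild would be needed.
double& At(std::vector<double>& values,
           const std::vector<std::vector<int>>& table, int row, int col) {
    const int idx = table.at(row).at(col);
    if (idx == kAbsent) throw std::out_of_range("entry not in sparsity pattern");
    return values[idx];
}
```

Compared with the binary search of the CC strategy, each access here is two array lookups, paid for with a table whose size grows with the square of the problem size.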

Assembly Strategies: CC/DIR-MultiThread

- One data array per thread
- Indices shared by all threads

    int N,      // row number
        Nthr;   // thread number
    std::vector<std::vector<double> >& A;   // storages

    // thr: index of the current thread
    for (int row = thr; row < N; row += Nthr) {
        for (int t = 1; t < Nthr; t++) {
            A[0][row] += A[t][row];
        }
    }

Advantages:
- No preliminary partitioning: assembly on a first-in basis
- Final packing is also parallel
Drawbacks:
- Process scheduling overhead for small problems
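The final packing loop of the multithread strategy can be sketched with standard threads as follows. This is an illustration only: it spawns one `std::thread` per data array, whereas the slides do not say how MBDyn manages its threads, and `ReduceThreadCopies` is a hypothetical name. Each thread handles the entries `thr, thr + Nthr, ...`, so no two threads write the same element of `A[0]`.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sums the per-thread value arrays A[1..Nthr-1] into A[0].
// Thread `thr` reduces entries thr, thr + Nthr, thr + 2*Nthr, ...
void ReduceThreadCopies(std::vector<std::vector<double>>& A) {
    const std::size_t n_thr = A.size();
    assert(n_thr > 0);
    const std::size_t n = A[0].size();
    std::vector<std::thread> workers;
    workers.reserve(n_thr);
    for (std::size_t thr = 0; thr < n_thr; ++thr) {
        workers.emplace_back([&A, thr, n, n_thr] {
            for (std::size_t row = thr; row < n; row += n_thr) {
                for (std::size_t t = 1; t < n_thr; ++t) {
                    A[0][row] += A[t][row];   // disjoint entries per thread
                }
            }
        });
    }
    for (std::thread& w : workers) w.join();
}
```

For small arrays the cost of creating and joining threads can exceed the cost of the sums themselves, which matches the scheduling-overhead drawback listed on the slide.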

Nonlinear Solution Strategies

Solution strategy based on Newton iteration:
- Exact Newton (default)
- Inexact Newton: GMRES, BiCGStab

$$ J(x)\, w = \frac{\partial r}{\partial x}\, w \approx \frac{r(x + h w) - r(x)}{h} $$

- No need to build the matrix: matrix-free
- The matrix is actually built to reinitialize the method, acting as a preconditioner; more efficient preconditioners will be implemented
- Essential feature: Newton-like convergence without implementing the Jacobian, more efficient than numerical differentiation => fast prototyping
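Krylov methods such as GMRES and BiCGStab only need products of the Jacobian with a vector, which is why no matrix has to be built. A minimal sketch of such a product, approximated by a forward difference of the residual, is shown below. The function `JacobianTimesVector`, the fixed step `h` passed by the caller, and the test residual `ExampleResidual` are assumptions made for illustration; the slides do not state how MBDyn chooses the step or evaluates the product.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using ResidualFn = std::function<Vec(const Vec&)>;

// Approximates J(x) * w as (r(x + h w) - r(x)) / h.
// Only two residual evaluations are needed; no matrix is assembled.
Vec JacobianTimesVector(const ResidualFn& r, const Vec& x, const Vec& w, double h) {
    assert(x.size() == w.size() && h != 0.0);
    Vec x_pert(x);
    for (std::size_t i = 0; i < x.size(); ++i) x_pert[i] += h * w[i];
    const Vec r0 = r(x);
    const Vec r1 = r(x_pert);
    Vec jw(r0.size());
    for (std::size_t i = 0; i < r0.size(); ++i) jw[i] = (r1[i] - r0[i]) / h;
    return jw;
}

// Hypothetical residual used to check the product:
// r(x) = [x0^2 + x1, sin(x0) * x1], with
// J(x) = [[2 x0, 1], [cos(x0) x1, sin(x0)]].
Vec ExampleResidual(const Vec& x) {
    return Vec{x[0] * x[0] + x[1], std::sin(x[0]) * x[1]};
}
```

An iterative linear solver would call `JacobianTimesVector` once per Krylov iteration in place of a matrix-vector product; the solver itself is not part of this sketch.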

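Parallel Solution Strategies

Domain partitioning: equally scaled subproblems, minimal interface => METIS
Subdomain/interface solution: => Schur

$$
\begin{bmatrix}
B_1 &        & 0   & E_1    \\
    & \ddots &     & \vdots \\
0   &        & B_s & E_s    \\
F_1 & \cdots & F_s & C
\end{bmatrix}
\begin{Bmatrix} x_1 \\ \vdots \\ x_s \\ y \end{Bmatrix}
=
\begin{Bmatrix} f_1 \\ \vdots \\ f_s \\ g \end{Bmatrix}
$$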

Parallel Solution Strategies (cntd.)

1. The local matrices are factored, exploiting subdomain sparsity
2. The local parts of the right-hand side of the reduced problem are computed and sent to the master node
3. If required, the local parts of the Schur complement matrices are assembled and sent to the master node as well
4. The reduced system is solved <= bottleneck
5. The other unknowns are computed by back-substitution

Only step 4 cannot be parallelized.
Mostly suited for specific topologies (e.g. helicopter rotors)
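The reduced problem mentioned in the steps follows from block elimination of the partitioned system. As a worked derivation, using the slide's symbols B_i, E_i, F_i, C for the matrix blocks, x_i and y for the subdomain and interface unknowns, and f_i, g for the right-hand sides, and assuming each local block B_i is nonsingular:

```latex
% Block rows of the partitioned system (i = 1, ..., s):
%   B_i x_i + E_i y = f_i,      \sum_i F_i x_i + C y = g
\begin{align}
x_i &= B_i^{-1} \left( f_i - E_i\, y \right), \qquad i = 1, \dots, s \\
\Big( C - \sum_{i=1}^{s} F_i B_i^{-1} E_i \Big)\, y
    &= g - \sum_{i=1}^{s} F_i B_i^{-1} f_i
\end{align}
```

The local terms F_i B_i^{-1} E_i and F_i B_i^{-1} f_i correspond to steps 1 to 3 and can be computed independently on each subdomain; the second equation is the reduced system of step 4, and the first is the back-substitution of step 5.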

Applications: Landing Gear

Gear-walk instability of commercial aircraft landing gear:
- Specially implemented shock absorber and tire models
- ABS model by explicit feedback => Jacobian is incomplete
- When the problem is dominated by the dynamics of the portion with no Jacobian, the matrix-free nonlinear solver allows nearly-quadratic convergence

Applications: Real-Time simulation

- The use of general-purpose multibody code for real-time simulation sets very stringent requirements on (worst-case) performance
- Real-time requirements were in fact the initial motivation for this activity on performance improvement
- None of the improvements described in this work were significant for real-time because of the very limited size of models ( eqns.)
- The naive solver (not discussed here) gave the most significant improvements: 2 to 5 times faster linear solution compared to other sparse solvers for the class of matrices < 4000 eqns, < 5% fill-in
- Significant reductions in complete multibody analysis time => 6 dof robot with friction (~120 eqns.) runs at > 2 kHz on a 2.4 GHz PC
- All the improvements discussed so far have proved beneficial for regular, non real-time simulations

Applications: Rotorcraft Analysis

Typical problems solved by MBDyn:
- Rotorcraft trim and stability
- Tiltrotor trim, stability and maneuvers
- Aircraft/rotorcraft landing and ground maneuvers
- Robotics

Typical rotorcraft models for stability: eqns. per (deformable) blade, eqns. for rotor hub/controls/airframe

Applications: Rotorcraft Analysis

Isolated rotor with control system, no aerodynamics

Table columns: Model | Equations | Baseline | Column-compressed (CC) | Assembly parallelization + CC (2 CPU) | Naïve solver
Table rows: Coarse, realistic | Refined, realistic | Overrefined, unrealistic

The Schur solver cannot be used with this helicopter model because of some interactions with the control system model; the fix is under development

Applications: Beam benchmark

Straight beam, clamped at one end and impulsively loaded at the other.

Table columns: Model | Equations | Baseline | Column-compressed (CC) | Assembly parallelization + CC (2 CPU) | Naïve solver | Solution parallelization (Schur + CC, 2 CPU)
Table rows: Modified Newton | Modified Newton | Full Newton

For yet unclear reasons, Schur does not run with the naive solver; further improvements are expected

Conclusions

- The free general-purpose multibody software MBDyn has undergone some performance improvement work.
- All the above improvements are available in the latest release.
- Most of the performance improvement investigations were dictated by the need to run real-time simulations with general-purpose software.
- However, only the dedicated sparse solver was beneficial for real-time simulation.
- Nonetheless, the other improvements were beneficial for regular, general-purpose analysis.
- The improvements to the software were facilitated by its object-oriented design.
- A few conflicting interactions remain to be resolved; they will be addressed in future releases.

Dipartimento di Ingegneria Aerospaziale Politecnico di Milano (Italy)

Dipartimento di Ingegneria Aerospaziale Politecnico di Milano (Italy) MultiBody Dynamics Analysis Software on Real Time Distributed Systems Pierangelo Masarati Marco Morandini Dipartimento di Ingegneria Aerospaziale Politecnico di Milano (Italy) One-day meeting on: RTAI,

More information

Dipartimento di Ingegneria Aerospaziale Politecnico di Milano

Dipartimento di Ingegneria Aerospaziale Politecnico di Milano Trajectory optimization and real-time simulation for robotics applications Michele Attolico Pierangelo Masarati Paolo Mantegazza Dipartimento di Ingegneria Aerospaziale Politecnico di Milano Multibody

More information

A REAL-TIME HARDWARE-IN-THE-LOOP SIMULATOR FOR ROBOTICS APPLICATIONS

A REAL-TIME HARDWARE-IN-THE-LOOP SIMULATOR FOR ROBOTICS APPLICATIONS MULTIBODY DYNAMICS 2005, ECCOMAS Thematic Conference J.M. Goicolea, J. Cuadrado, J.C. García Orden (eds.) Madrid, Spain, 21 24 June 2005 A REAL-TIME HARDWARE-IN-THE-LOOP SIMULATOR FOR ROBOTICS APPLICATIONS

More information

TRAJECTORY OPTIMIZATION AND REAL-TIME SIMULATION FOR ROBOTICS APPLICATIONS

TRAJECTORY OPTIMIZATION AND REAL-TIME SIMULATION FOR ROBOTICS APPLICATIONS MULTIBODY DYNAMICS 25, ECCOMAS Thematic Conference J.M. Goicolea, J. Cuadrado, J.C. García Orden (eds.) Madrid, Spain, 21 24 June 25 TRAJECTORY OPTIMIZATION AND REAL-TIME SIMULATION FOR ROBOTICS APPLICATIONS

More information

Integration of automatic differentiation tools within object-oriented codes: accuracy and timings

Integration of automatic differentiation tools within object-oriented codes: accuracy and timings Integration of automatic differentiation tools within object-oriented codes: accuracy and timings Deliverable 2.2 Marco Morandini Dipartimento di Ingegneria Aerospaziale, Politecnico di Milano Introduction

More information

MBDyn Installation Manual Version 1.2.1

MBDyn Installation Manual Version 1.2.1 MBDyn Installation Manual Version 1.2.1 Pierangelo Masarati Dipartimento di Ingegneria Aerospaziale Politecnico di Milano Automatically generated August 16, 2004 Contents 1 Introduction 3 2 Getting the

More information

MBDyn Installation Manual Version 1.X-Devel

MBDyn Installation Manual Version 1.X-Devel MBDyn Installation Manual Version 1.X-Devel Pierangelo Masarati Dipartimento di Ingegneria Aerospaziale Politecnico di Milano Automatically generated May 18, 2007 Contents 1 Introduction 3 2 Getting the

More information

A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation

A comparison of Algorithms for Sparse Matrix. Real-time Multibody Dynamic Simulation A comparison of Algorithms for Sparse Matrix Factoring and Variable Reordering aimed at Real-time Multibody Dynamic Simulation Jose-Luis Torres-Moreno, Jose-Luis Blanco, Javier López-Martínez, Antonio

More information

OPEN-SOURCE MULTIBODY ANALYSIS SOFTWARE. Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta, and Paolo Mantegazza

OPEN-SOURCE MULTIBODY ANALYSIS SOFTWARE. Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta, and Paolo Mantegazza MULTIBODY DYNAMICS 2003 Jorge A.C. Ambrósio(Ed.) IDMEC/IST, Lisbon, Portugal, July 1-4 2003 OPEN-SOURCE MULTIBODY ANALYSIS SOFTWARE Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta, and Paolo Mantegazza

More information

PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures

PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures PARDISO - PARallel DIrect SOlver to solve SLAE on shared memory architectures Solovev S. A, Pudov S.G sergey.a.solovev@intel.com, sergey.g.pudov@intel.com Intel Xeon, Intel Core 2 Duo are trademarks of

More information

Real-Time Aeroservoelastic Analysis of Wind-Turbines by Free Multibody Software

Real-Time Aeroservoelastic Analysis of Wind-Turbines by Free Multibody Software Real-Time Aeroservoelastic Analysis of Wind-Turbines by Free Multibody Software Luca Cavagna, Alessandro Fumagalli, Pierangelo Masarati, Marco Morandini, and Paolo Mantegazza Politecnico di Milano, Dipartimento

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26

More information

Accelerating the Iterative Linear Solver for Reservoir Simulation

Accelerating the Iterative Linear Solver for Reservoir Simulation Accelerating the Iterative Linear Solver for Reservoir Simulation Wei Wu 1, Xiang Li 2, Lei He 1, Dongxiao Zhang 2 1 Electrical Engineering Department, UCLA 2 Department of Energy and Resources Engineering,

More information

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE

HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER PROF. BRYANT PROF. KAYVON 15618: PARALLEL COMPUTER ARCHITECTURE HYPERDRIVE IMPLEMENTATION AND ANALYSIS OF A PARALLEL, CONJUGATE GRADIENT LINEAR SOLVER AVISHA DHISLE PRERIT RODNEY ADHISLE PRODNEY 15618: PARALLEL COMPUTER ARCHITECTURE PROF. BRYANT PROF. KAYVON LET S

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Outlining Midterm Projects Topic 3: GPU-based FEA Topic 4: GPU Direct Solver for Sparse Linear Algebra March 01, 2011 Dan Negrut, 2011 ME964

More information

International Conference on Computational Science (ICCS 2017)

International Conference on Computational Science (ICCS 2017) International Conference on Computational Science (ICCS 2017) Exploiting Hybrid Parallelism in the Kinematic Analysis of Multibody Systems Based on Group Equations G. Bernabé, J. C. Cano, J. Cuenca, A.

More information

SCALABLE ALGORITHMS for solving large sparse linear systems of equations

SCALABLE ALGORITHMS for solving large sparse linear systems of equations SCALABLE ALGORITHMS for solving large sparse linear systems of equations CONTENTS Sparse direct solvers (multifrontal) Substructuring methods (hybrid solvers) Jacko Koster, Bergen Center for Computational

More information

Finite Element Implementation

Finite Element Implementation Chapter 8 Finite Element Implementation 8.1 Elements Elements andconditions are the main extension points of Kratos. New formulations can be introduced into Kratos by implementing a new Element and its

More information

A High-Order Accurate Unstructured GMRES Solver for Poisson s Equation

A High-Order Accurate Unstructured GMRES Solver for Poisson s Equation A High-Order Accurate Unstructured GMRES Solver for Poisson s Equation Amir Nejat * and Carl Ollivier-Gooch Department of Mechanical Engineering, The University of British Columbia, BC V6T 1Z4, Canada

More information

THE procedure used to solve inverse problems in areas such as Electrical Impedance

THE procedure used to solve inverse problems in areas such as Electrical Impedance 12TH INTL. CONFERENCE IN ELECTRICAL IMPEDANCE TOMOGRAPHY (EIT 2011), 4-6 MAY 2011, UNIV. OF BATH 1 Scaling the EIT Problem Alistair Boyle, Andy Adler, Andrea Borsic Abstract There are a number of interesting

More information

CS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018

CS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018 CS 395T Lecture 12: Feature Matching and Bundle Adjustment Qixing Huang October 10 st 2018 Lecture Overview Dense Feature Correspondences Bundle Adjustment in Structure-from-Motion Image Matching Algorithm

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 5: Sparse Linear Systems and Factorization Methods Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 18 Sparse

More information

Analysis and control of wind turbine generators

Analysis and control of wind turbine generators Analysis and control of wind turbine generators Eolica Expo 2004 Roma,, September 30 October 2, 2004 Carlo L. Bottasso, Lorenzo Trainelli, Alessandro Croce, Walter Sirchi, Barbara Savini Dipartimento di

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

Parallel resolution of sparse linear systems by mixing direct and iterative methods

Parallel resolution of sparse linear systems by mixing direct and iterative methods Parallel resolution of sparse linear systems by mixing direct and iterative methods Phyleas Meeting, Bordeaux J. Gaidamour, P. Hénon, J. Roman, Y. Saad LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix

More information

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice

Lecture 11: Randomized Least-squares Approximation in Practice. 11 Randomized Least-squares Approximation in Practice Stat60/CS94: Randomized Algorithms for Matrices and Data Lecture 11-10/09/013 Lecture 11: Randomized Least-squares Approximation in Practice Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these

More information

Krishnan Suresh Associate Professor Mechanical Engineering

Krishnan Suresh Associate Professor Mechanical Engineering Large Scale FEA on the GPU Krishnan Suresh Associate Professor Mechanical Engineering High-Performance Trick Computations (i.e., 3.4*1.22): essentially free Memory access determines speed of code Pick

More information

Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems

Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems Block Distributed Schur Complement Preconditioners for CFD Computations on Many-Core Systems Dr.-Ing. Achim Basermann, Melven Zöllner** German Aerospace Center (DLR) Simulation- and Software Technology

More information

Solving Large Complex Problems. Efficient and Smart Solutions for Large Models

Solving Large Complex Problems. Efficient and Smart Solutions for Large Models Solving Large Complex Problems Efficient and Smart Solutions for Large Models 1 ANSYS Structural Mechanics Solutions offers several techniques 2 Current trends in simulation show an increased need for

More information

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation

Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation Multi-GPU Scaling of Direct Sparse Linear System Solver for Finite-Difference Frequency-Domain Photonic Simulation 1 Cheng-Han Du* I-Hsin Chung** Weichung Wang* * I n s t i t u t e o f A p p l i e d M

More information

A parallel direct/iterative solver based on a Schur complement approach

A parallel direct/iterative solver based on a Schur complement approach A parallel direct/iterative solver based on a Schur complement approach Gene around the world at CERFACS Jérémie Gaidamour LaBRI and INRIA Bordeaux - Sud-Ouest (ScAlApplix project) February 29th, 2008

More information

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach

HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach HIPS : a parallel hybrid direct/iterative solver based on a Schur complement approach Mini-workshop PHyLeaS associated team J. Gaidamour, P. Hénon July 9, 28 HIPS : an hybrid direct/iterative solver /

More information

Lecture 27: Fast Laplacian Solvers

Lecture 27: Fast Laplacian Solvers Lecture 27: Fast Laplacian Solvers Scribed by Eric Lee, Eston Schweickart, Chengrun Yang November 21, 2017 1 How Fast Laplacian Solvers Work We want to solve Lx = b with L being a Laplacian matrix. Recall

More information

The 3D DSC in Fluid Simulation

The 3D DSC in Fluid Simulation The 3D DSC in Fluid Simulation Marek K. Misztal Informatics and Mathematical Modelling, Technical University of Denmark mkm@imm.dtu.dk DSC 2011 Workshop Kgs. Lyngby, 26th August 2011 Governing Equations

More information

Sparse Linear Solver for Power System Analyis using FPGA

Sparse Linear Solver for Power System Analyis using FPGA Sparse Linear Solver for Power System Analyis using FPGA J. R. Johnson P. Nagvajara C. Nwankpa 1 Extended Abstract Load flow computation and contingency analysis is the foundation of power system analysis.

More information

Data mining with sparse grids using simplicial basis functions

Data mining with sparse grids using simplicial basis functions Data mining with sparse grids using simplicial basis functions Jochen Garcke and Michael Griebel Institut für Angewandte Mathematik Universität Bonn Part of the work was supported within the project 03GRM6BN

More information

Performance Evaluation of Multiple and Mixed Precision Iterative Refinement Method and its Application to High-Order Implicit Runge-Kutta Method

Performance Evaluation of Multiple and Mixed Precision Iterative Refinement Method and its Application to High-Order Implicit Runge-Kutta Method Performance Evaluation of Multiple and Mixed Precision Iterative Refinement Method and its Application to High-Order Implicit Runge-Kutta Method Tomonori Kouya Shizuoa Institute of Science and Technology,

More information

Approaches to Parallel Implementation of the BDDC Method

Approaches to Parallel Implementation of the BDDC Method Approaches to Parallel Implementation of the BDDC Method Jakub Šístek Includes joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík. Institute of Mathematics of the AS CR, Prague

More information

Figure 6.1: Truss topology optimization diagram.

Figure 6.1: Truss topology optimization diagram. 6 Implementation 6.1 Outline This chapter shows the implementation details to optimize the truss, obtained in the ground structure approach, according to the formulation presented in previous chapters.

More information

Fast Tridiagonal Solvers on GPU

Fast Tridiagonal Solvers on GPU Fast Tridiagonal Solvers on GPU Yao Zhang John Owens UC Davis Jonathan Cohen NVIDIA GPU Technology Conference 2009 Outline Introduction Algorithms Design algorithms for GPU architecture Performance Bottleneck-based

More information

ASIMULATION program with integrated circuit emphasis

ASIMULATION program with integrated circuit emphasis IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 32, NO. 2, FEBRUARY 2013 261 NICSLU: An Adaptive Sparse Matrix Solver for Parallel Circuit Simulation Xiaoming Chen,

More information

High Performance Computing: Tools and Applications

High Performance Computing: Tools and Applications High Performance Computing: Tools and Applications Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology Lecture 15 Numerically solve a 2D boundary value problem Example:

More information

CSE 599 I Accelerated Computing - Programming GPUS. Parallel Pattern: Sparse Matrices

CSE 599 I Accelerated Computing - Programming GPUS. Parallel Pattern: Sparse Matrices CSE 599 I Accelerated Computing - Programming GPUS Parallel Pattern: Sparse Matrices Objective Learn about various sparse matrix representations Consider how input data affects run-time performance of

More information

Towards Approximate Computing: Programming with Relaxed Synchronization

Towards Approximate Computing: Programming with Relaxed Synchronization Towards Approximate Computing: Programming with Relaxed Synchronization Lakshminarayanan Renganarayana Vijayalakshmi Srinivasan Ravi Nair (presenting) Dan Prener IBM T.J. Watson Research Center October

More information

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview Arun Das 05/09/2017 Arun Das Waterloo Autonomous Vehicles Lab Introduction What s in a name? Arun Das Waterloo Autonomous

More information

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators Karl Rupp, Barry Smith rupp@mcs.anl.gov Mathematics and Computer Science Division Argonne National Laboratory FEMTEC

More information

Predictive Engineering and Computational Sciences. Data Structures and Methods for Unstructured Distributed Meshes. Roy H. Stogner

Predictive Engineering and Computational Sciences. Data Structures and Methods for Unstructured Distributed Meshes. Roy H. Stogner PECOS Predictive Engineering and Computational Sciences Data Structures and Methods for Unstructured Distributed Meshes Roy H. Stogner The University of Texas at Austin May 23, 2012 Roy H. Stogner Distributed

More information

Control of industrial robots. Kinematic redundancy

Control of industrial robots. Kinematic redundancy Control of industrial robots Kinematic redundancy Prof. Paolo Rocco (paolo.rocco@polimi.it) Politecnico di Milano Dipartimento di Elettronica, Informazione e Bioingegneria Kinematic redundancy Direct kinematics

More information

Computing the rank of big sparse matrices modulo p using gaussian elimination

Computing the rank of big sparse matrices modulo p using gaussian elimination Computing the rank of big sparse matrices modulo p using gaussian elimination Charles Bouillaguet 1 Claire Delaplace 2 12 CRIStAL, Université de Lille 2 IRISA, Université de Rennes 1 JNCF, 16 janvier 2017

More information

Simulation in Computer Graphics. Deformable Objects. Matthias Teschner. Computer Science Department University of Freiburg

Simulation in Computer Graphics. Deformable Objects. Matthias Teschner. Computer Science Department University of Freiburg Simulation in Computer Graphics Deformable Objects Matthias Teschner Computer Science Department University of Freiburg Outline introduction forces performance collision handling visualization University

More information

Data mining with sparse grids

Data mining with sparse grids Data mining with sparse grids Jochen Garcke and Michael Griebel Institut für Angewandte Mathematik Universität Bonn Data mining with sparse grids p.1/40 Overview What is Data mining? Regularization networks

More information

PARDISO Version Reference Sheet Fortran

PARDISO Version Reference Sheet Fortran PARDISO Version 5.0.0 1 Reference Sheet Fortran CALL PARDISO(PT, MAXFCT, MNUM, MTYPE, PHASE, N, A, IA, JA, 1 PERM, NRHS, IPARM, MSGLVL, B, X, ERROR, DPARM) 1 Please note that this version differs significantly

More information

Report of Linear Solver Implementation on GPU

Report of Linear Solver Implementation on GPU Report of Linear Solver Implementation on GPU XIANG LI Abstract As the development of technology and the linear equation solver is used in many aspects such as smart grid, aviation and chemical engineering,

More information

Using RecurDyn. Contents

Using RecurDyn. Contents Using RecurDyn Contents 1.0 Multibody Dynamics Overview... 2 2.0 Multibody Dynamics Applications... 3 3.0 What is RecurDyn and how is it different?... 4 4.0 Types of RecurDyn Analysis... 5 5.0 MBD Simulation

More information

Dynamic Geometry Processing

Dynamic Geometry Processing Dynamic Geometry Processing EG 2012 Tutorial Will Chang, Hao Li, Niloy Mitra, Mark Pauly, Michael Wand Tutorial: Dynamic Geometry Processing 1 Articulated Global Registration Introduction and Overview

More information
