Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn


1 Computational Aspects and Recent Improvements in the Open-Source Multibody Analysis Software MBDyn
Pierangelo Masarati, Marco Morandini, Giuseppe Quaranta and Paolo Mantegazza
Dipartimento di Ingegneria Aerospaziale
Multibody Dynamics 2005, International Conference on Advances in Computational Multibody Dynamics
ECCOMAS Thematic Conference
Madrid, June

2 Outline
1. MBDyn (Free) Multibody Software Description
2. Computational Aspects & Improvements
   - Linear Solution Strategies
   - Matrix Assembly Strategies
   - Nonlinear Solution Strategies
   - Parallelization Strategies
3. Applications
   - Fast-Prototyping of Problems without Jacobian: Landing Gear Simulation
   - Fast Solution of Small Problems: Real-Time Simulation
   - Efficient Solution of Medium/Large Problems: Rotorcraft Simulation
4. Conclusions

3 MBDyn (Free) Multibody Software
MBDyn is a free, general-purpose multibody software package.
- Freely available in source form
- Licensed under the GNU General Public License
- Developed at the Dipartimento di Ingegneria Aerospaziale of the Politecnico di Milano
- Solves initial-value problems in DAE form
- Biased toward aeroservoelastic simulation of rotorcraft
- Features real-time simulation capabilities by way of Linux RTAI
- Features a selection of linear solvers tailored to different problem sizes
- Object-oriented: components are easy to replace and new features are easy to add

4 Computational Aspects & Improvements
Some of the recent computational improvements were driven by performance requirements originating from:
- Simulation of medium/large size problems
- Real-time simulation of small, yet non-trivial models
These requirements may conflict and may call for different designs.
Key development directions are:
- Linear Solution Strategies
- Matrix Assembly Strategies
- Nonlinear Solution Strategies
- Parallelization Strategies

5 Computational Aspects (contd.)
Object-Oriented Programming: allows generic programming

    NLS::Solve() {
        // Newton-Raphson
        while (true) {
            if (DM->Residual()->Test()) {
                return;
            }
            if (new_jacobian) {
                DM->Jacobian();
            }
            LS->Solve();
            DM->Update();
        }
    }

    SS::Advance() {
        SS->Predict();
        NLS->Solve();
    }

NLS: nonlinear solver
LS: linear solver
DM: data manager
SS: step solver
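The loop on this slide can be made concrete with a small, self-contained sketch. It is not MBDyn's implementation: `ScalarProblem` and `newtonSolve` are hypothetical names, the problem is reduced to a single scalar equation so that the "linear solve" is one division, and the `refresh` parameter is an assumed way of expressing the `new_jacobian` switch (a value above 1 reuses an old Jacobian, i.e. modified Newton).

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical stand-in for the data manager: it owns the state x and
// evaluates the residual r(x) and the Jacobian dr/dx of a scalar problem.
struct ScalarProblem {
    std::function<double(double)> residual;
    std::function<double(double)> jacobian;
    double x;
};

// Newton-Raphson iteration on r(x) = 0.
// The Jacobian is re-evaluated every `refresh` iterations (assumption:
// this plays the role of the slide's `new_jacobian` flag).
// Returns the number of iterations used, or -1 if not converged.
inline int newtonSolve(ScalarProblem& p, double tol, int max_iter, int refresh = 1) {
    assert(refresh >= 1);
    double J = 0.0;
    for (int it = 0; it < max_iter; ++it) {
        const double r = p.residual(p.x);
        if (std::fabs(r) < tol) {
            return it;                 // residual test passed
        }
        if (it % refresh == 0) {
            J = p.jacobian(p.x);       // assemble a fresh Jacobian
        }
        p.x -= r / J;                  // linear solve and state update
    }
    return -1;
}
```

For instance, starting from x = 1 with r(x) = x^2 - 2, the sketch should converge to sqrt(2) in a handful of iterations; a larger `refresh` trades Jacobian evaluations for additional iterations.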

6 Linear Solution Strategies
- Redundant approach: large, sparse matrices => sparse solver
- Sparse solvers are usually optimized for min. space AND min. operations => room for trading space vs. speed
- Default solver: Umfpack (3.0 => 4.4)
- Many other solvers available:
  - Lapack dense solver: optimal for problems < 60 eqns.
  - WSMP (non-free software) sparse parallel solver: optimal for > eqns.
  - SuperLU, Y12, HSL (non-free software), Meschach, TAUCS, ...
- Custom solver "naive":
  - dense storage
  - sparse operations
  - aggressive pivoting (min. fill-in of factored matrix)
  - multithread implementation (limited performance improvements for target problems)
  - dramatically improves performance (in eqns. range)

7 Assembly Strategies
1. Lapack, ...
   - Dense: may be relevant for very small, almost dense problems
2. Umfpack, y12m, ...
   - SpMap: array of map objects (binary trees), needs packing
   - CC: initialized as SpMap, preserves packing
   - DIR: initialized as SpMap, dense index table
   - CC/DIR-MT: CC/DIR may be efficiently parallelized for SMP
3. naive
   - Sparse-dense: dense storage and fill-in tables for assembly; sparse indices for sparse factorization

8 Assembly Strategies: SpMap
The matrix is an array of binary trees:

    typedef std::map<int, double> row_cont_type;
    std::vector<row_cont_type> col_indices;

    double& operator()(int i_row, int i_col) {
        row_cont_type::iterator i;
        row_cont_type& row = col_indices[i_col];
        i = row.find(i_row);
        if (i == row.end()) {
            // entry not present yet: create it, initialized to zero
            return row[i_row] = 0.;
        }
        return i->second;
    }

Note: fills in with zeros; the real implementation may handle this.
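The "needs packing" remark for SpMap can be illustrated with a short sketch that converts the map-based storage into column-compressed arrays. The names `SpMap`, `CCArrays` and `pack` are hypothetical, indices are assumed 0-based, and the array names follow the CC slide (`values`, `row_indices`, `column_start`); this is not MBDyn's actual packing routine.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical container mirroring the slide: one ordered map per column,
// keyed by row index (0-based indices assumed).
struct SpMap {
    explicit SpMap(int n_cols) : cols(n_cols) {}
    double& operator()(int i_row, int i_col) {
        assert(i_col >= 0 && i_col < static_cast<int>(cols.size()));
        return cols[i_col][i_row];   // creates a zero entry if missing
    }
    std::vector<std::map<int, double>> cols;
};

// Column-compressed arrays: entries of column j occupy positions
// column_start[j] .. column_start[j + 1] - 1.
struct CCArrays {
    std::vector<double> values;
    std::vector<int> row_indices;
    std::vector<int> column_start;   // size n_cols + 1
};

// Packs the maps column by column; std::map iterates in ascending key
// order, so row indices come out sorted within each column.
inline CCArrays pack(const SpMap& m) {
    CCArrays cc;
    cc.column_start.push_back(0);
    for (const auto& col : m.cols) {
        for (const auto& entry : col) {
            cc.row_indices.push_back(entry.first);
            cc.values.push_back(entry.second);
        }
        cc.column_start.push_back(static_cast<int>(cc.values.size()));
    }
    return cc;
}
```

Because the row indices are sorted within each column, the packed arrays can later be searched by bisection, which is what the CC strategy relies on.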

9 Assembly Strategies: CC
The matrix is in Column-Compressed form after initialization as SpMap:

    std::vector<double>& values;
    const std::vector<int> &row_indices, &column_start;

    double& operator()(int i_row, int i_col) {
        int row_begin = column_start[i_col - 1];
        int row_end = column_start[i_col] - 1;
        int idx, row;

        if (OutOfRange(i_row)) {
            throw ErrRebuildMatrix();    // out of range: rebuild
        }
        while (row_end >= row_begin) {   // binary search
            idx = (row_begin + row_end)/2;
            row = row_indices[idx];
            if (i_row < row) {
                row_end = idx - 1;
            } else if (i_row > row) {
                row_begin = idx + 1;
            } else {
                return values[idx];
            }
        }
        throw ErrRebuildMatrix();        // not found: rebuild
    }

10 Assembly Strategies: CC (contd.)
Advantages:
- After first assembly, saves packing into column-compressed form
- Column access cost: 1 (array)
- Row access cost: log2(N) (binary search)
Drawbacks:
- Needs a matrix rebuild when fill-in changes (worked around by allowing zeros at first assembly)
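A self-contained variant of the CC access pattern is sketched below, using `std::lower_bound` for the bisection over the sorted row indices of a column. `CCMatrix` and `RebuildMatrix` are hypothetical names (the latter stands in for `ErrRebuildMatrix`), and indices are assumed 0-based, unlike the 1-based column index on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical exception standing in for the slide's ErrRebuildMatrix.
struct RebuildMatrix : std::runtime_error {
    RebuildMatrix() : std::runtime_error("sparsity pattern changed") {}
};

// Column-compressed matrix with a fixed sparsity pattern (0-based indices,
// row indices sorted within each column).
class CCMatrix {
public:
    CCMatrix(std::vector<int> row_indices, std::vector<int> column_start)
        : values_(row_indices.size(), 0.0),
          row_indices_(std::move(row_indices)),
          column_start_(std::move(column_start)) {
        assert(!column_start_.empty());
    }

    double& operator()(int i_row, int i_col) {
        if (i_col < 0 || i_col + 1 >= static_cast<int>(column_start_.size())) {
            throw RebuildMatrix();                        // column out of range
        }
        auto first = row_indices_.begin() + column_start_[i_col];
        auto last = row_indices_.begin() + column_start_[i_col + 1];
        auto it = std::lower_bound(first, last, i_row);   // O(log n) bisection
        if (it == last || *it != i_row) {
            throw RebuildMatrix();                        // entry not in pattern
        }
        return values_[it - row_indices_.begin()];
    }

private:
    std::vector<double> values_;
    std::vector<int> row_indices_;
    std::vector<int> column_start_;
};
```

Writing to an entry that exists in the pattern costs one bisection; writing to a missing entry throws, which in the scheme described on the slide would trigger a rebuild through SpMap.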

11 Assembly Strategies: DIR
The matrix is in Column-Compressed form after initialization as SpMap; the indices are dense:

    std::vector<double>& values;
    const std::vector<std::vector<int> >& indices;

    double& operator()(int i_row, int i_col) {
        return values[indices[i_row][i_col]];
    }

Advantages:
- After first assembly, saves data packing into column-compressed form
- Column access cost: 1 (array)
- Row access cost: 1 (array)
Drawbacks:
- Needs a matrix rebuild when fill-in changes (worked around by allowing zeros at first assembly)
- Memory occupation: N^2
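How the dense index table might be filled from an existing column-compressed pattern is sketched below. `buildDirectIndex` is a hypothetical helper, indices are assumed 0-based, and the use of -1 to mark entries outside the pattern is an assumption of this sketch, not something stated on the slide.

```cpp
#include <cassert>
#include <vector>

// Builds a dense n x n table mapping (row, col) to the position of that
// entry in the column-compressed value array. Entries that are not in the
// sparsity pattern are marked with -1 (assumed convention).
inline std::vector<std::vector<int>> buildDirectIndex(
        int n,
        const std::vector<int>& row_indices,
        const std::vector<int>& column_start) {
    assert(static_cast<int>(column_start.size()) == n + 1);
    std::vector<std::vector<int>> indices(n, std::vector<int>(n, -1));
    for (int col = 0; col < n; ++col) {
        for (int k = column_start[col]; k < column_start[col + 1]; ++k) {
            indices[row_indices[k]][col] = k;   // O(1) lookup later on
        }
    }
    return indices;
}
```

The table makes every access a pair of array lookups, at the price of n^2 integers of storage, which matches the trade-off listed on the slide.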

12 Assembly Strategies: CC/DIR-MultiThread
- One data array per thread
- Indices shared by all threads

    int N,       // number of rows
        Nthr;    // number of threads
    std::vector<std::vector<double> > &A;   // storages, one per thread

    // executed by thread thr: sum all storages into A[0]
    for (int row = thr; row < N; row += Nthr) {
        for (int t = 1; t < Nthr; t++) {
            A[0][row] += A[t][row];
        }
    }

Advantages:
- No preliminary partitioning: assembly on a first-in basis
- Final packing is also parallel
Drawbacks:
- Process scheduling overhead for small problems

13 Nonlinear Solution Strategies
Solution strategy based on Newton iteration:
- Exact Newton (default)
- Inexact Newton: GMRES, BiCGStab

\[ J(x)\, w \approx \frac{r(x + h\, w) - r(x)}{h} \]

where r is the residual, J its Jacobian, w a direction vector and h a small perturbation.
- No need to build the matrix: matrix-free
- The matrix is actually built to reinitialize the method, acting as a preconditioner; more efficient preconditioners will be implemented
- Essential feature: Newton-like convergence without implementing the Jacobian, more efficient than numerical differentiation => fast prototyping
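The matrix-free idea rests on approximating the product of the Jacobian with a vector by a finite difference of the residual, J(x) w ≈ (r(x + h w) - r(x)) / h, which is all a Krylov method such as GMRES or BiCGStab needs. A minimal sketch follows; `jacobianTimes`, the `Vec` alias and the default step `h` are assumptions made for illustration, not MBDyn's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Approximates J(x) * w with a forward difference of the residual r,
// without ever assembling J. The step h is a fixed, assumed value; a real
// solver would typically scale it with the norms of x and w.
inline Vec jacobianTimes(const std::function<Vec(const Vec&)>& r,
                         const Vec& x, const Vec& w, double h = 1e-7) {
    assert(x.size() == w.size());
    Vec xp(x);
    for (std::size_t i = 0; i < x.size(); ++i) {
        xp[i] += h * w[i];            // perturbed state x + h w
    }
    const Vec r0 = r(x);
    Vec jw = r(xp);
    for (std::size_t i = 0; i < jw.size(); ++i) {
        jw[i] = (jw[i] - r0[i]) / h;  // (r(x + h w) - r(x)) / h
    }
    return jw;
}
```

Each product costs one extra residual evaluation, whereas building the full Jacobian by numerical differentiation would cost one evaluation per unknown.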

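14 Parallel Solution Strategies
- Domain partitioning: equally sized subproblems, minimal interface => METIS
- Subdomain/interface solution => Schur

\[
0 =
\begin{Bmatrix} f_1 \\ \vdots \\ f_s \\ g \end{Bmatrix}
-
\begin{bmatrix}
B_1 &        & 0   & E_1    \\
    & \ddots &     & \vdots \\
0   &        & B_s & E_s    \\
F_1 & \cdots & F_s & C
\end{bmatrix}
\begin{Bmatrix} x_1 \\ \vdots \\ x_s \\ y \end{Bmatrix}
\]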

15 Parallel Solution Strategies (contd.)
1. The local matrices are factored, exploiting subdomain sparsity
2. The local parts of the right-hand side of the reduced problem are computed and sent to the master node
3. If required, the local parts of the Schur complement matrices are assembled and sent to the master node as well
4. The reduced system is solved <= bottleneck
5. The other unknowns are computed by back-substitution
Only step 4 cannot be parallelized.
Mostly suited for specific topologies (e.g. helicopter rotors).
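The five steps can be traced on a toy problem in which every subdomain has a single unknown, so that each block B_i, E_i, F_i is a scalar and "factoring" is a division. With the block naming used on the previous slide, the reduced system is (C - sum_i F_i B_i^{-1} E_i) y = g - sum_i F_i B_i^{-1} f_i, followed by x_i = B_i^{-1} (f_i - E_i y). The names `Subdomain`, `SchurSolution` and `schurSolve` are hypothetical, and the serial loop only stands in for work that would be distributed across nodes.

```cpp
#include <cassert>
#include <vector>

// One subdomain with a single local unknown x_i (scalar blocks, assumed
// for brevity):  B * x_i + E * y = f, with coupling F * x_i in the
// interface equation.
struct Subdomain {
    double B, E, F, f;
};

struct SchurSolution {
    std::vector<double> x;   // local unknowns, one per subdomain
    double y;                // interface unknown
};

// Solves  B_i x_i + E_i y = f_i  (i = 1..s)  and  sum_i F_i x_i + C y = g.
inline SchurSolution schurSolve(const std::vector<Subdomain>& subs,
                                double C, double g) {
    double S = C;      // Schur complement:  C - sum_i F_i B_i^{-1} E_i
    double rhs = g;    // reduced rhs:       g - sum_i F_i B_i^{-1} f_i
    for (const Subdomain& d : subs) {
        assert(d.B != 0.0);
        S -= d.F * d.E / d.B;     // steps 1 and 3: local contribution to S
        rhs -= d.F * d.f / d.B;   // step 2: local contribution to the rhs
    }
    SchurSolution sol;
    sol.y = rhs / S;              // step 4: reduced system (serial part)
    for (const Subdomain& d : subs) {
        sol.x.push_back((d.f - d.E * sol.y) / d.B);   // step 5
    }
    return sol;
}
```

Both loops are independent per subdomain, which is why only the reduced solve in the middle remains serial.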

16 Applications: Landing Gear
Gear-walk instability of commercial aircraft landing gear:
- Specially implemented shock absorber and tire models
- ABS model by explicit feedback => Jacobian is incomplete
When the problem is dominated by the dynamics of the portion with no Jacobian, the matrix-free nonlinear solver allows nearly quadratic convergence.

17 Applications: Real-Time Simulation
- Using general-purpose multibody code for real-time simulation sets very stringent requirements on (worst-case) performance
- Real-time requirements were in fact the initial motivation for this work on performance improvement
- None of the improvements described in this work were significant for real-time simulation, because of the very limited size of the models ( eqns.)
- The naive solver (not discussed here) gave the most significant improvements: linear solution 2 to 5 times faster than other sparse solvers for matrices with < 4000 eqns. and < 5% fill-in
- Significant reductions in complete multibody analysis time => a 6 dof robot with friction (~120 eqns.) runs at > 2 kHz on a 2.4 GHz PC
- All the improvements discussed so far have proved beneficial for regular, non-real-time simulations

18 Applications: Rotorcraft Analysis
Typical problems solved by MBDyn:
- Rotorcraft trim and stability
- Tiltrotor trim, stability and maneuvers
- Aircraft/rotorcraft landing and ground maneuvers
- Robotics
Typical rotorcraft models for stability: eqns. per (deformable) blade, eqns. for rotor hub/controls/airframe

19 Applications: Rotorcraft Analysis
Isolated rotor with control system, no aerodynamics.
Comparison table (numeric entries not preserved in the source):
- Columns: Model; Equations; Baseline; Column-compressed (CC); Assembly parallelization + CC (2 CPU); Naïve solver
- Rows: Coarse, realistic; Refined, realistic; Overrefined, unrealistic
The Schur solver cannot be used with this helicopter model because of some interactions with the control system model; the fix is under development.

20 Applications: Beam Benchmark
Straight beam, clamped at one end and impulsively loaded at the other.
Comparison table (numeric entries not preserved in the source):
- Columns: Model; Equations; Baseline; Column-compressed (CC); Assembly parallelization + CC (2 CPU); Naïve solver; Solution parallelization (Schur + CC, 2 CPU)
- Rows: Modified Newton; Modified Newton; Full Newton
For reasons not yet understood, Schur does not run with the naive solver; further improvements are expected.

21 Conclusions
- The free general-purpose multibody software MBDyn has undergone some performance improvement work.
- All the above improvements are available in the latest release.
- Most of the performance improvement investigations were dictated by the need to run real-time simulations with general-purpose software.
- However, only the dedicated sparse solver was beneficial for real-time simulation.
- Nonetheless, the other improvements were beneficial for regular, general-purpose analysis.
- The improvements to the software were facilitated by its object-oriented design.
- A few conflicting interactions remain to be resolved; they will be addressed in future releases.
