Second Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering



1 State-of-the-art distributed parallel computational techniques in industrial finite element analysis
Second Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering
Ajaccio, France, April -5,
Dr.
Siemens PLM Software, USA
PARENG-

2 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions and future work

3 Industrial complexity constantly increasing
- Jet Engine, parts
- Engine block,, elements
- Car 3, parts
- Factory, machines

4 Computer hardware constantly changing
- Cray Computer: $5 million; O() gigaflops; sold
- Multi-core CPU: $5; O() gigaflops; million sold

5 Lifecycle simulations
[Figure: designer view and analyst view]

6 Multidisciplinary solutions
[Figure: designer view and analyst view]


7 High performance requirements
The constrained stiffness matrix of an analysis problem:
- Number of rows: 35,734,79
- Nonzero terms: ,384,35,995
- Nonzero terms in sparse factor matrix: 43,87,4,
- Memory used during factorization: ,8,73, (4-byte) words
- Actual elapsed time of sparse factorization on a single high-performance processor: 335 minutes

8 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions

9 Single-level geometric domain decomposition
- Subdivide large geometry domains into a limited number of partitions
- [Figure: partitions labeled Proc ... Proc k]
- Computations in the geometry partitions are dependent
- Minimize the boundary size of each partition with respect to its interior
- Minimize the total boundary size, as communication is needed
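The two minimization goals on this slide can be made concrete by counting, for a given partition of a mesh graph, how many edges are cut and how many nodes end up on a partition boundary. The following is a minimal Python sketch under stated assumptions: the 4-by-4 grid mesh, the two candidate partitions, and the function names `grid_edges` and `boundary_stats` are hypothetical illustrations, not taken from the presentation.

```python
def grid_edges(nx, ny):
    """Edges of an nx-by-ny grid graph; node id = ix + nx * iy."""
    edges = []
    for iy in range(ny):
        for ix in range(nx):
            n = ix + nx * iy
            if ix + 1 < nx:
                edges.append((n, n + 1))   # horizontal neighbor
            if iy + 1 < ny:
                edges.append((n, n + nx))  # vertical neighbor
    return edges


def boundary_stats(edges, part):
    """Return (number of cut edges, boundary node count per partition)."""
    cut = 0
    boundary = {}
    for a, b in edges:
        if part[a] != part[b]:
            cut += 1
            boundary.setdefault(part[a], set()).add(a)
            boundary.setdefault(part[b], set()).add(b)
    return cut, {p: len(nodes) for p, nodes in boundary.items()}


edges = grid_edges(4, 4)
# Partition A: left half versus right half (one straight cut).
part_a = [0 if (n % 4) < 2 else 1 for n in range(16)]
# Partition B: checkerboard (a deliberately poor partition).
part_b = [(n % 4 + n // 4) % 2 for n in range(16)]

cut_a, bnd_a = boundary_stats(edges, part_a)
cut_b, bnd_b = boundary_stats(edges, part_b)
print("straight cut:", cut_a, bnd_a)
print("checkerboard:", cut_b, bnd_b)
```

Both partitions are perfectly balanced (8 nodes each), yet the checkerboard puts every node on a boundary, which is the situation the slide's criteria are meant to avoid.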

10 Multi-level geometry domain decomposition
- Subdivide large geometry domains into a limited number of partitions
- Subdivide the partitions into sub-partitions and dynamically reduce them to their collectors
- Assemble the multilevel substructures to obtain the engineering solution
- The total number of substructures may exceed the number of processors
- [Figure: single-level partitioning]

11 Finite element problem domain decomposition
Based on model or matrices:

Graph        Matrix            FE model
Vertices     Diagonal terms    Node points
Edges        Off-diagonals     Elements
Undirected   Symmetric         Linear
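The correspondence between finite element model, graph and matrix can be illustrated by deriving a graph and a matrix sparsity pattern from element connectivity. This is a minimal Python sketch, assuming one degree of freedom per node point and a hypothetical two-element mesh; the names `element_graph` and `sparsity_pattern` are illustrative only.

```python
from itertools import combinations


def element_graph(elements):
    """Adjacency sets: two node points are connected if they share an element."""
    adj = {}
    for elem in elements:
        for n in elem:
            adj.setdefault(n, set())
        for a, b in combinations(elem, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def sparsity_pattern(adj):
    """Nonzero (row, col) positions: one diagonal term per vertex,
    one pair of off-diagonal terms per undirected edge."""
    pattern = set()
    for n, neighbors in adj.items():
        pattern.add((n, n))
        for m in neighbors:
            pattern.add((n, m))
    return pattern


# Hypothetical mesh: two quadrilateral elements sharing the edge (1, 4).
elements = [(0, 1, 4, 3), (1, 2, 5, 4)]
adj = element_graph(elements)
pattern = sparsity_pattern(adj)
is_symmetric = all((c, r) in pattern for r, c in pattern)
print(sorted(adj[1]), len(pattern), is_symmetric)
```

Because the graph is undirected, the resulting pattern is symmetric by construction, matching the last row of the table.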

12 Graphs and matrices
- Graph model and its Laplacian matrix $L$
- Finite element model and its stiffness matrix $K$
- [Figure: membrane elements with the entries of $K$ and $L$]

13 Partitioning technology
- Spectral bisection method: $L u = \lambda u$
- Vertex cut result
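Spectral bisection splits a graph according to the signs of the eigenvector belonging to the second-smallest eigenvalue of the Laplacian (the Fiedler vector). The sketch below is a small pure-Python illustration, not the solver used in the presentation: it substitutes a shifted power iteration with deflation of the constant vector for a production eigensolver, and the six-vertex example graph and all function names are assumptions made for the example.

```python
def laplacian(n, edges):
    """Dense graph Laplacian: degree on the diagonal, -1 per edge."""
    L = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        L[a][a] += 1.0
        L[b][b] += 1.0
        L[a][b] -= 1.0
        L[b][a] -= 1.0
    return L


def fiedler_vector(L, iters=300):
    """Power iteration on (shift*I - L), kept orthogonal to the constant vector."""
    n = len(L)
    # Gershgorin bound: every eigenvalue of L is at most twice the max degree.
    shift = 2.0 * max(L[i][i] for i in range(n))
    # Start vector with zero mean, i.e. orthogonal to the constant vector.
    u = [float(i) - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        v = [shift * u[i] - sum(L[i][j] * u[j] for j in range(n)) for i in range(n)]
        mean = sum(v) / n
        v = [x - mean for x in v]          # deflate the constant eigenvector
        norm = sum(x * x for x in v) ** 0.5
        u = [x / norm for x in v]
    return u


# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by the edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
u = fiedler_vector(laplacian(6, edges))
part = [0 if x < 0.0 else 1 for x in u]
cut_edges = [(a, b) for a, b in edges if part[a] != part[b]]
print(part, cut_edges)
```

For larger graphs a Lanczos-type eigensolver would normally replace the dense power iteration; the sign-based split itself is unchanged.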

14 Recursive graph partitioning
- Coarsening, partitioning and refining phases
- [Figure: coarsening, partitioning into two partitions, refining]

15 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions and future work

16 Distributed memory parallel architecture
- Cluster of high-performance workstations
- Distributed memory workstation
- Dedicated I/O devices
- High-level parallelism
- Feasible number of nodes:

17 Recursive matrix partitioning
[Figure: geometric problem and its partitioning hierarchy]

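18 Distributed normal modes analysis
- Physical problem: $(K - \lambda M)\Phi = 0$
- Partitioned form:

$$
\begin{bmatrix}
K^{1}_{oo}-\lambda M^{1}_{oo} & & K^{1,3}_{ot}-\lambda M^{1,3}_{ot} & & & & \\
 & K^{2}_{oo}-\lambda M^{2}_{oo} & K^{2,3}_{ot}-\lambda M^{2,3}_{ot} & & & & \\
 & & K^{3}_{tt}-\lambda M^{3}_{tt} & & & & K^{3,7}_{tt}-\lambda M^{3,7}_{tt} \\
 & & & K^{4}_{oo}-\lambda M^{4}_{oo} & & K^{4,6}_{ot}-\lambda M^{4,6}_{ot} & \\
 & & & & K^{5}_{oo}-\lambda M^{5}_{oo} & K^{5,6}_{ot}-\lambda M^{5,6}_{ot} & \\
 & & & & & K^{6}_{tt}-\lambda M^{6}_{tt} & K^{6,7}_{tt}-\lambda M^{6,7}_{tt} \\
\text{sym.} & & & & & & K^{7}_{tt}-\lambda M^{7}_{tt}
\end{bmatrix}
\begin{bmatrix}
\phi^{1}_{o} \\ \phi^{2}_{o} \\ \phi^{3}_{t} \\ \phi^{4}_{o} \\ \phi^{5}_{o} \\ \phi^{6}_{t} \\ \phi^{7}_{t}
\end{bmatrix}
= 0
$$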

19 Phase 1
[Figure: Start; Processor 1, Processor 2, Processor 3, Processor 4; Communicate]

20 Phase 2
[Figure: Start; Processors 1-2, Processors 3-4; Communicate]

21 Phase 3
- [Figure: Start; Processors]
- Solve reduced order problem: $(\tilde{K} - \lambda \tilde{M})\tilde{\Phi} = 0$
- Recover physical solution
- [Equations: recovery of the $o$ and $t$ partitions of $\Phi$ from the reduced coordinates $\tilde{q}$]
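The link between the physical and the reduced order problem can be written out under one generic assumption: that the physical solution is recovered through a transformation matrix $T$ assembled from the partition-level reductions. The symbol $T$ is hypothetical and does not appear in the presentation, which does not state the specific reduction used.

```latex
% Assumed recovery relation (T is a hypothetical transformation matrix):
\Phi = T \tilde{\Phi}

% Substituting into (K - \lambda M)\Phi = 0 and premultiplying by T^{T}:
T^{T} (K - \lambda M) T \, \tilde{\Phi} = 0
\quad\Longrightarrow\quad
(\tilde{K} - \lambda \tilde{M}) \tilde{\Phi} = 0,
\qquad
\tilde{K} = T^{T} K T, \quad \tilde{M} = T^{T} M T
```

Under this assumption the reduced matrices stay symmetric whenever $K$ and $M$ are, and the recovery step is a single multiplication by $T$ per partition.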

22 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions and future work

23 Shared memory parallel architecture
- Multi-core processors
- Shared cache
- Shared memory
- Low-level parallelism
- Feasible number of cores: -6

24 Sparse factorization
- Matrix connectivity
- Reordering
- Elimination tree
- Factorization
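The elimination tree step can be sketched directly from the matrix connectivity. The Python below follows the widely used ancestor-compression scheme attributed to Liu; it is an illustration under stated assumptions rather than the kernel used in the presentation, and the input format (`upper[k]` lists the rows `i < k` with a nonzero in column `k`) and the example pattern are hypothetical.

```python
def elimination_tree(n, upper):
    """Elimination tree of a symmetric sparse matrix.

    upper[k] holds the row indices i < k with A[i][k] != 0.
    Returns parent[], where parent[j] == -1 marks a root.
    """
    parent = [-1] * n
    ancestor = [-1] * n  # compressed paths toward the current subtree roots
    for k in range(n):
        for i in upper.get(k, ()):
            # Climb from i to the root of its subtree, re-pointing at k.
            while i != -1 and i < k:
                nxt = ancestor[i]
                ancestor[i] = k
                if nxt == -1:
                    parent[i] = k  # i was a root: k becomes its parent
                i = nxt
    return parent


# Hypothetical pattern: columns 0 and 1 couple to 2, columns 2 and 3 couple to 4.
upper = {2: [0, 1], 4: [2, 3]}
parent = elimination_tree(5, upper)
print(parent)

# A tridiagonal pattern gives a chain, i.e. no tree-level parallelism.
chain = elimination_tree(5, {1: [0], 2: [1], 3: [2], 4: [3]})
print(chain)
```

Columns in disjoint subtrees of the elimination tree can be factorized independently, which is why the reordering step that shapes the tree precedes the factorization.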

25 Multifrontal factorization
- Sparsity pattern
- Frontal steps
- Front amalgamation

26 Supernodal approach
- Symbolic reordering
- Consecutive columns
- Same sparsity pattern
- Cache fitting size
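Grouping consecutive columns with the same sparsity pattern into supernodes, capped at a width that fits the cache, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the factor's column structures are given explicitly, and the function name `find_supernodes`, the `max_width` parameter and the six-column example are hypothetical.

```python
def find_supernodes(col_struct, max_width):
    """Group consecutive factor columns into supernodes.

    col_struct[j] is the sorted list of row indices i >= j that are
    nonzero in column j of the factor (diagonal included).
    Column j joins the supernode of column j-1 when the pattern of
    column j-1 below its diagonal equals the pattern of column j,
    and the supernode is still narrower than max_width.
    Returns a list of (first_column, last_column) pairs.
    """
    supernodes = []
    start = 0
    n = len(col_struct)
    for j in range(1, n + 1):
        same = (
            j < n
            and col_struct[j - 1][1:] == col_struct[j]
            and j - start < max_width
        )
        if not same:
            supernodes.append((start, j - 1))
            start = j
    return supernodes


# Hypothetical factor structure with six columns.
col_struct = [
    [0, 4, 5],
    [1, 2, 3, 5],
    [2, 3, 5],
    [3, 5],
    [4, 5],
    [5],
]
wide = find_supernodes(col_struct, max_width=4)
narrow = find_supernodes(col_struct, max_width=2)
print(wide)
print(narrow)
```

Lowering `max_width` splits the three-column supernode, trading larger dense blocks for blocks that fit a smaller cache.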

27 Matrix update
- Panel selection
- Downstream columns
- Different sparsity pattern
- BLAS.5 operation

28 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions and future work

29 High performance workstation cluster
- IBM P575 nodes with .9 GHz 4 dual-core POWER5 CPUs per node
- 3.5 Terabyte aggregate memory
- Terabyte total disk space
- IBM High Performance Switch (HPS)
- 8 GB/sec bidirectional bandwidth
- AIX OS Version 5.3
- Parallel Environment (PE) V4.

30 Trimmed car body application
- Shell element model
- .3 M grid points
- . M shell elements
- 7.9 M degrees of freedom
- Normal modes analysis
- Frequency 3 Hz
- ~ normal modes
- 5 partitions

31 Shortening solution time
[Figure: speed-up versus number of DMP processes, with the serial run as reference]

32 Increased fidelity of analysis
[Figure: normalized solution time and normalized number of modes versus frequency range (Hz)]

33 Distributed memory workstation
- HP ProLiant DL3G5 server
- 64 dual-core (.85 GHz) Xeon CPUs
- 5GB local SATA disks per node
- 4 GB memory per node
- GigE interconnect with HP MPI
- SUSE Linux Version.3

34 Automotive engine application
- Solid element model
- 3.6 M grid points
- .3 M tetrahedral elements
- .8 M degrees of freedom
- Normal modes analysis
- Frequency: , Hz
- ~ 5 normal modes
- 56 partitions

35 Shortening solution time
[Figure: speed-up versus number of DMP processes, with the serial run as reference]

36 Increased fidelity of analysis
[Figure: normalized solution time and normalized number of modes versus frequency range (Hz)]

37 Scope of presentation
- Introduction to industrial analysis
- Geometric domain decomposition
- Distributed computational solutions
- Parallel computational kernels
- Application case studies
- Conclusions and future work

38 Conclusions
- Geometric domain decomposition technologies provide the basis for distributed solutions on modern hardware
- Recursive computational solutions can support a wide range of engineering analyses with practically acceptable accuracy
- The handling of the local matrix operations with multi-core processors contributes to the overall performance gain
- The performance advantages of distributed computational solutions are significant and tremendously accelerate the engineering work

39 Future work
- Extending the distributed finite element technology to a grid computing environment
- Overcoming the lack of a node-to-node communication mechanism with a high-speed network
- Minimizing the need for a high-bandwidth connection between the local nodes and storage devices
- Synchronizing completion of components of similar computational complexity on a non-homogeneous grid environment

40 Thank you for your attention!
Siemens and the Siemens logo are registered trademarks of Siemens AG. NX is a registered trademark of Siemens PLM Software Inc. in the United States and in other countries. NASTRAN is a registered trademark of the National Aeronautics and Space Administration. SpaceShip One pictures by courtesy and permission of Quartus Engineering Inc.
