High Performance Computing

Optimization of time dependent adaptive finite element methods

K.-H. Elmer
Curt-Risch-Institut, Universität Hannover, Appelstr. 9a, D-30167 Hannover, Germany

Abstract

To obtain reliable numerical solutions of transient dynamic problems and wave propagation problems with high accuracy, it is necessary to use models with many degrees of freedom and many time steps. New algorithms and methods are developed to optimize the finite element model and to minimize the computational time on high-performance computers. The moving wave front is used for a broad time dependent mesh refinement several time steps in advance, based on intensity vectors and the speed of wave propagation (intensity indicator). The a-posteriori Zienkiewicz-Zhu error indicator controls the adaptive mesh refinement. The implicit-explicit algorithm for direct time integration is based upon operator splitting and mesh partitions. The algorithm avoids subcycling on vector- and parallel-computers and uses the same time step for the coarse mesh and the local mesh refinement.

1 Introduction

Like analytical methods, numerical simulations use idealized models, but they allow the investigation of realistic systems with complex behaviour. Among other things, the advantage of numerical simulation in computational dynamics is the investigation and visualization of complex time dependent processes and interactions in dynamic systems with complicated initial and boundary conditions.

Better understanding of the mechanical behaviour is of special interest in many fields of application like earthquake engineering, soil dynamics, nondestructive testing and acoustics, where it is not possible or too expensive to obtain all the desired information by measurements: stresses and strains within a continuous system, or the energy flow and intensity within a structure. However, the complexity of many realistic problems in transient dynamics requires very large systems. To obtain reliable numerical solutions of transient dynamic problems with high accuracy, it is necessary to use models with many degrees of freedom and many time steps. New algorithms and methods are therefore developed to optimize time dependent finite element models and to minimize the computational time on high-performance computers.

The idea is to use the Zienkiewicz-Zhu error estimator [6] for wave propagation problems with h-adaptive time dependent FE-methods. As local mesh refinement and system setup after each time step are very expensive, a method is developed to estimate all regions that are to be refined several time steps in advance. The a-posteriori Zienkiewicz-Zhu error estimator is then only used to check whether the refinement has been sufficient. Knowing the expected direction of the energy flow from intensity vectors, the moving wave front allows a broad time dependent mesh refinement several time steps in advance.

Because of its high accuracy and small demand on computational time, explicit time integration is used for all elements of the coarse mesh. To avoid subcycling on vector- and parallel-computers, all parts of the mesh with local mesh refinement and a Courant-number larger than 1 are treated implicitly with the same time step using an implicit-explicit algorithm.

2 Implicit-Explicit Time Integration

The solution of the initial value problem of the semidiscrete equation of motion

$$\mathbf{M}\ddot{\mathbf{u}} + \mathbf{C}\dot{\mathbf{u}} + \mathbf{K}\mathbf{u} = \mathbf{F} \tag{1}$$

is the displacement $\mathbf{u} = \mathbf{u}(t)$ that fulfills the differential equation and the initial conditions.
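Time integration procedures only consider the differential equation at discrete times $t_n$, with the approximations $\mathbf{u}_n$, $\mathbf{v}_n$ and $\mathbf{a}_n$ for the functions $\mathbf{u}(t_n)$, $\dot{\mathbf{u}}(t_n)$ and $\ddot{\mathbf{u}}(t_n)$ and an approximation error depending on the difference procedure [4]. The integration procedures of the Newmark family use the following relations to get a solution at the new time step $t_{n+1}$:

$$\mathbf{M}\mathbf{a}_{n+1} + \mathbf{C}\mathbf{v}_{n+1} + \mathbf{K}\mathbf{u}_{n+1} = \mathbf{F}_{n+1} \tag{2}$$

$$\mathbf{u}_{n+1} = \mathbf{u}_n + \Delta t\,\mathbf{v}_n + \frac{\Delta t^2}{2}\left[(1-2\beta)\,\mathbf{a}_n + 2\beta\,\mathbf{a}_{n+1}\right] \tag{3}$$

$$\mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\left[(1-\gamma)\,\mathbf{a}_n + \gamma\,\mathbf{a}_{n+1}\right] \tag{4}$$

with the Newmark parameters $\gamma$ and $\beta$. The unknown acceleration $\mathbf{a}_{n+1}$ of the implicit scheme results from the solution of the equation

$$(\mathbf{M} + \gamma\Delta t\,\mathbf{C} + \beta\Delta t^2\,\mathbf{K})\,\mathbf{a}_{n+1} = \mathbf{F}_{n+1} - \mathbf{C}\tilde{\mathbf{v}}_{n+1} - \mathbf{K}\tilde{\mathbf{u}}_{n+1} \tag{5}$$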
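with the predictor term:

$$\tilde{\mathbf{u}}_{n+1} = \mathbf{u}_n + \Delta t\,\mathbf{v}_n + \frac{\Delta t^2}{2}(1-2\beta)\,\mathbf{a}_n \tag{6}$$

$$\tilde{\mathbf{v}}_{n+1} = \mathbf{v}_n + \Delta t\,(1-\gamma)\,\mathbf{a}_n \tag{7}$$

and the corrector term:

$$\mathbf{u}_{n+1} = \tilde{\mathbf{u}}_{n+1} + \beta\Delta t^2\,\mathbf{a}_{n+1} \tag{8}$$

$$\mathbf{v}_{n+1} = \tilde{\mathbf{v}}_{n+1} + \gamma\Delta t\,\mathbf{a}_{n+1} \tag{9}$$

For $\beta = 0$ and $\gamma = 1/2$ the Newmark predictor-corrector scheme becomes an explicit scheme identical to the central difference method, with the temporally discrete equation of motion

$$\mathbf{M}\mathbf{a}_{n+1} + \mathbf{C}\tilde{\mathbf{v}}_{n+1} + \mathbf{K}\tilde{\mathbf{u}}_{n+1} = \mathbf{F}_{n+1} \tag{10}$$

with the diagonal matrix $\mathbf{M}$. As long as $\mathbf{M}$ is diagonal, $\mathbf{a}_{n+1}$ may be determined from this without solving a system of equations.

The implementation of the Newmark method as an implicit-explicit scheme allows part of the mesh to be treated implicitly and part to be treated explicitly [3]. This has considerable practical advantages in that 'stiff' subdomains of large finite element models can be treated economically with an implicit integrator if the Courant-number is larger than 1.

When using the implicit-explicit method, the elements of a finite element model are divided into two groups: the implicit elements and the explicit elements. The system matrices contain implicit and explicit groups, denoted by the superscripts $I$ and $E$:

$$\mathbf{M} = \mathbf{M}^I + \mathbf{M}^E \tag{11}$$

$$\mathbf{C} = \mathbf{C}^I + \mathbf{C}^E \tag{12}$$

$$\mathbf{K} = \mathbf{K}^I + \mathbf{K}^E \tag{13}$$

$$\mathbf{F} = \mathbf{F}^I + \mathbf{F}^E \tag{14}$$

with the equation of motion

$$\mathbf{M}\mathbf{a}_{n+1} + \mathbf{C}^I\mathbf{v}_{n+1} + \mathbf{C}^E\tilde{\mathbf{v}}_{n+1} + \mathbf{K}^I\mathbf{u}_{n+1} + \mathbf{K}^E\tilde{\mathbf{u}}_{n+1} = \mathbf{F}_{n+1} \tag{15}$$

where the implicit arrays multiply corrector values, whereas the explicit arrays multiply predictor values. With (6) and (7) the equation to determine the acceleration $\mathbf{a}_{n+1}$ is:

$$\mathbf{M}^{*}\mathbf{a}_{n+1} = \mathbf{F}_{n+1} - \mathbf{C}\tilde{\mathbf{v}}_{n+1} - \mathbf{K}\tilde{\mathbf{u}}_{n+1} \tag{16}$$

with

$$\mathbf{M}^{*} = \mathbf{M} + \gamma\Delta t\,\mathbf{C}^I + \beta\Delta t^2\,\mathbf{K}^I \tag{17}$$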
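One step of the implicit-explicit predictor-corrector scheme of equations (6) to (9), (16) and (17) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `implicit_explicit_step`, its argument names, the dense solve and the default parameters $\beta = 1/4$, $\gamma = 1/2$ are hypothetical choices, and the split matrices are assumed to be assembled beforehand.

```python
import numpy as np


def implicit_explicit_step(M, C_I, C_E, K_I, K_E, F_next, u, v, a, dt,
                           beta=0.25, gamma=0.5):
    """One Newmark predictor-corrector step with an implicit-explicit split.

    Hypothetical helper for illustration. M, C_I, C_E, K_I, K_E are dense
    (n, n) arrays; F_next, u, v, a are vectors of length n.
    """
    # Predictor values, eqs. (6) and (7)
    u_pred = u + dt * v + 0.5 * dt**2 * (1.0 - 2.0 * beta) * a
    v_pred = v + dt * (1.0 - gamma) * a

    # Effective mass matrix, eq. (17): only the implicit arrays enter
    M_eff = M + gamma * dt * C_I + beta * dt**2 * K_I

    # Right-hand side, eq. (16): the full C and K act on the predictors
    rhs = F_next - (C_I + C_E) @ v_pred - (K_I + K_E) @ u_pred

    # Assumption: a dense direct solve; a real code would exploit sparsity
    a_next = np.linalg.solve(M_eff, rhs)

    # Corrector values, eqs. (8) and (9)
    u_next = u_pred + beta * dt**2 * a_next
    v_next = v_pred + gamma * dt * a_next
    return u_next, v_next, a_next
```

If all elements are explicit, `C_I` and `K_I` vanish and `M_eff` reduces to the diagonal mass matrix, so the solve degenerates to a division by the diagonal entries. The effective matrix only couples degrees of freedom of the implicit group, which is why a single time step can be shared by the coarse and the refined parts of the mesh.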
Starting with a coarse reference mesh and a time step based upon the maximum frequency of the spectrum, the explicit time integration procedure is used because of its high accuracy and small computational costs. The time step of the explicit procedure must be equal to or less than the critical time step of the elements. Refined parts of the mesh with small elements and a Courant-number larger than 1 are treated implicitly with the same time step for the whole system.

Using the same time step without subcycling is an advantage especially on high-performance computers like vector- and parallel-computers. On vector-computers subcycling means additional calculations on a part of the whole mesh with short vector length, and on parallel-computers domain decomposition with local subcycling and data dependency is not very efficient.

3 Time Dependent Adaptive FEM

The local approximation errors of a FE-solution can be described with the numerical approximation $\hat{\mathbf{u}}$ of the exact displacement vector $\mathbf{u}$:

$$\mathbf{e}_u = \mathbf{u} - \hat{\mathbf{u}} \tag{18}$$

or in terms of stresses:

$$\mathbf{e}_\sigma = \boldsymbol{\sigma} - \hat{\boldsymbol{\sigma}} \tag{19}$$

The energy norm of the error is an integral scalar quantity of the domain $\Omega$ and is defined for elasticity problems in stresses as:

$$\|\mathbf{e}\| = \left[\int_\Omega (\boldsymbol{\sigma} - \hat{\boldsymbol{\sigma}})^T \mathbf{D}^{-1} (\boldsymbol{\sigma} - \hat{\boldsymbol{\sigma}})\, d\Omega\right]^{1/2} \tag{20}$$

with the elasticity matrix $\mathbf{D}$. This error is related to the strain energy of the problem. The approximation errors decrease as the size of the subdivision of the FE-mesh gets smaller with the so-called h-refinement.

As in most cases the exact stresses $\boldsymbol{\sigma}$ are not known, the error estimation after Zienkiewicz-Zhu [6] uses an improved approximation $\boldsymbol{\sigma}^{*}$ (e.g. obtained by averaging the discontinuous stresses $\hat{\boldsymbol{\sigma}}$), and the errors in stresses are estimated errors:

$$\mathbf{e}_\sigma^{*} = \boldsymbol{\sigma}^{*} - \hat{\boldsymbol{\sigma}} \tag{21}$$

In an optimal mesh the local energy norm error $\|\mathbf{e}\|_i$ should be equal for all elements $i$.
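To achieve this, each element $i$ of the $m$ elements is to be refined if the local error $\|\mathbf{e}\|_i$ is not smaller than the desired average error $\bar{e}_m$, that is, if it violates the condition

$$\|\mathbf{e}\|_i < \eta \left(\frac{\|\hat{\mathbf{u}}\|^2 + \|\mathbf{e}\|^2}{m}\right)^{1/2} = \bar{e}_m \tag{22}$$

with the desired relative percentage error $\eta$ of the energy norm and the relation:

$$\|\mathbf{e}\|^2 = \sum_{i=1}^{m} \|\mathbf{e}\|_i^2 \tag{23}$$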
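The refinement decision can be illustrated with a short Python sketch. It assumes the permissible element error of the Zienkiewicz-Zhu procedure [6], $\bar{e}_m = \eta\,[(\|\hat{\mathbf{u}}\|^2 + \|\mathbf{e}\|^2)/m]^{1/2}$, together with $\|\mathbf{e}\|^2 = \sum_i \|\mathbf{e}\|_i^2$. The function name `refinement_flags` and its arguments are hypothetical, and the element error norms are assumed to have been computed beforehand from the smoothed and the FE stresses.

```python
import numpy as np


def refinement_flags(elem_error_norms, solution_energy_norm, eta):
    """Flag elements whose estimated error reaches the permissible value.

    Hypothetical helper for illustration.
    elem_error_norms     : estimated energy norm error of each element
    solution_energy_norm : energy norm of the FE solution
    eta                  : desired relative error, e.g. 0.15 for 15 %
    """
    e_i = np.asarray(elem_error_norms, dtype=float)
    m = e_i.size

    # Global error from the element contributions
    total_error_sq = np.sum(e_i**2)

    # Permissible (desired average) element error
    e_bar = eta * np.sqrt((solution_energy_norm**2 + total_error_sq) / m)

    # An element is refined if its error is not smaller than e_bar
    return e_i >= e_bar, e_bar
```

For instance, `refinement_flags(errors, u_norm, 0.15)` would mark the elements to be refined for a target of 15 % relative error; how the flagged elements are then subdivided is left to the mesh generator.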
In dynamic problems the total error of the problem consists of the error of the potential energy and the error of the kinetic energy:

$$E = E_p + E_k \tag{24}$$

The error of the total energy of the dynamic problem mainly depends on the error of the FE-discretization if the time step of the integration procedure is small enough for all frequencies of the problem. It is also known that the error of the kinetic energy from spatial discretization is in general much smaller than the error of the potential energy. An error estimate for the semidiscrete hyperbolic problem is given in [3] and [5]. It shows that in the case of mesh refinements the rate of convergence of the potential energy is one order smaller than the rate of the kinetic energy. Thus the rate of convergence of the total energy depends mainly on the potential energy, and the Zienkiewicz-Zhu [6] error estimator is also applicable to hyperbolic problems with time dependent adaptive mesh refinement.

4 Intensity Indicator

As most of the computational time is spent on mesh refinement procedures and expensive system setup, a method is developed [2] to estimate all elements and regions that are to be refined several time steps in advance. The expected direction of the energy flow and the propagation of the different kinds of waves in 2- and 3-dimensional systems can be described by intensity vectors, and the moving wave front can be used for a broad time dependent mesh refinement in advance.

The intensity of a wave field describes the transport of energy per time and area. The intensity vector gives information about the local change of energy and the direction of the energy flow. The total energy $E$ of a domain $\Omega$ consists of the potential energy $E_p$ and the kinetic energy $E_k$:

$$E = E_p + E_k \tag{25}$$

With the potential energy

$$E_p = \frac{1}{2}\int_\Omega c_{ijkl}\, u_{i,j}\, u_{k,l}\, d\Omega \tag{26}$$

the kinetic energy

$$E_k = \frac{1}{2}\int_\Omega \rho\, \dot{u}_i \dot{u}_i\, d\Omega \tag{27}$$

and the elastic tensor $c_{ijkl}$ [1], the power of the wave front is:

$$\frac{dE}{dt} = \int_\Omega \left(\rho\, \dot{u}_i \ddot{u}_i + c_{ijkl}\, u_{i,j}\, \dot{u}_{k,l}\right) d\Omega \tag{28}$$
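Together with the fundamental equation

$$\rho\, \ddot{u}_i = \sigma_{ij,j} \tag{29}$$

and Hooke's law, equation (28) yields:

$$\frac{dE}{dt} = \int_\Omega \left(\sigma_{ij,j}\, \dot{u}_i + \sigma_{ij}\, \dot{u}_{i,j}\right) d\Omega \tag{30}$$

$$\frac{dE}{dt} = \int_\Omega \left(\sigma_{ij}\, \dot{u}_i\right)_{,j}\, d\Omega \tag{31}$$

$$\frac{dE}{dt} = \oint_\Gamma \sigma_{ij}\, \dot{u}_i\, n_j\, d\Gamma \tag{32}$$

where $\Gamma$ is the boundary of the domain $\Omega$ and $n_j$ its outward unit normal. With the definition [2] of the component $I_j$ of the intensity vector $\mathbf{I}$:

$$I_j = -\sigma_{ij}\, \dot{u}_i \tag{33}$$

it follows

$$\frac{dE}{dt} = -\oint_\Gamma I_j\, n_j\, d\Gamma \tag{34}$$

In the case of stationary processes it is usual to use time averages. For transient dynamic problems and wave propagation problems it is more suitable to use the instantaneous intensity vector. The intensity components $I_x^k$, $I_y^k$ and $I_z^k$ of element $k$ of a 3-dimensional FE-model are:

$$\begin{pmatrix} I_x \\ I_y \\ I_z \end{pmatrix}^{k} = -\begin{pmatrix} \sigma_x & \tau_{xy} & \tau_{xz} \\ \tau_{yx} & \sigma_y & \tau_{yz} \\ \tau_{zx} & \tau_{zy} & \sigma_z \end{pmatrix}^{k} \begin{pmatrix} \dot{u}_x \\ \dot{u}_y \\ \dot{u}_z \end{pmatrix}^{k} \tag{35}$$

In x-direction this is:

$$I_x^k = -\left(\sigma_x\, \dot{u}_x + \tau_{xy}\, \dot{u}_y + \tau_{xz}\, \dot{u}_z\right)^{k} \tag{36}$$

These intensity components describe the direction of the instantaneous energy flow of each element. The wave front, and with it the zone of mesh refinement, propagates in this direction during the next time steps. This makes it possible to estimate, several time steps in advance, the zone containing all elements that are to be refined. With the maximum wave propagation velocity $c_L$, expressed by Young's modulus $E$, Poisson's ratio $\nu$ and the density $\rho$,

$$c_L = \sqrt{\frac{E\,(1-\nu)}{\rho\,(1-\nu-2\nu^2)}} \tag{37}$$

the propagation of the refinement zone is

[...] (38)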
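The ingredients of the intensity indicator can be sketched in Python with NumPy. The sketch assumes the intensity definition $I_j = -\sigma_{ij}\dot{u}_i$ and the longitudinal wave speed $c_L = \sqrt{E(1-\nu)/(\rho(1-\nu-2\nu^2))}$. The rule for advancing the refinement zone is an assumption made only for this illustration: elements are flagged if their centroid lies in the half-space ahead of a front point, within the distance $c_L\, n\, \Delta t$ travelled in $n$ time steps. All function names are hypothetical.

```python
import numpy as np


def longitudinal_wave_speed(E, nu, rho):
    """Maximum wave propagation velocity c_L from E, nu and rho."""
    return np.sqrt(E * (1.0 - nu) / (rho * (1.0 - nu - 2.0 * nu**2)))


def intensity_vector(stress, velocity):
    """Instantaneous intensity I_j = -sigma_ij v_i of one element."""
    stress = np.asarray(stress, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    return -stress.T @ velocity


def elements_ahead_of_front(centroids, front_point, intensity, c_L, dt, n_steps):
    """Flag elements the wave front may reach within n_steps time steps.

    Assumed rule (not from the paper): centroid in the half-space ahead
    of front_point along the intensity direction, at most c_L*n_steps*dt away.
    """
    norm = np.linalg.norm(intensity)
    if norm == 0.0:
        # No energy flow: nothing to refine in advance
        return np.zeros(len(centroids), dtype=bool)
    direction = intensity / norm
    reach = c_L * dt * n_steps
    offsets = np.asarray(centroids, dtype=float) - front_point
    ahead = offsets @ direction >= 0.0
    within_reach = np.linalg.norm(offsets, axis=1) <= reach
    return ahead & within_reach
```

With typical values for steel, E = 210 GPa and ρ = 7850 kg/m³ (assumed here, not taken from the paper) and ν = 0.3, `longitudinal_wave_speed` gives roughly 6000 m/s, so a time step of 0.001 ms and 5 steps in advance correspond to a reach of about 30 mm.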
The a-posteriori Zienkiewicz-Zhu error indicator is only used to check whether the refinement is sufficient. This does not lead to optimized FE-meshes like those of adaptive FE-methods for static problems, but it saves a lot of computational time if the estimated mesh is sufficient for several time steps. One broad time dependent mesh refinement covering several time steps is more efficient than several separate steps of mesh refinement.

5 Example

The example of Fig. 1 shows a cantilever beam of steel with W / H / L = 10 / 250 / 1000 mm, whose free right-hand end is subjected to a vertical pulse load of P = 2.0 kN and $T_S$ = 0.040 ms duration with a broad frequency range. The material constants include the Poisson ratio ν = 0.3. The initial 2-dimensional FE-mesh from automatic mesh generation in Fig. 1 has 142 nodes and 232 isoparametric elements. With regard to the frequency range, the time step is set to 0.001 ms with explicit time integration for all elements.

The aim of the time dependent FE-mesh refinement is to limit the relative energy norm error η to about 15 % in each element for all time steps. The mesh refinement procedure is only activated every 5 time steps. After this, the a-posteriori Zienkiewicz-Zhu error estimator is used to check the last mesh refinement. If it is not sufficient, the last 5 time steps are repeated with a corrected mesh refinement. To avoid these additional costs, a broad refinement is used.

Figure 2 shows the adaptive FE-mesh with 260 nodes and 450 elements after 20 time steps, with refined elements in the upper right corner of the system below the driving point of the load. The instantaneous intensity vectors of Fig. 3 visualize the energy flow at that time.

For direct time integration the implicit-explicit algorithm is used, based upon operator splitting and mesh partitions. Domains with the original coarse mesh are treated explicitly, whereas all refined elements are treated implicitly with the same original time step.
This avoids expensive subcycling on vector- and parallel-computers. The simulation in Fig. 4 shows the horizontal normal stresses of the beam within the travelling waves after 50 time steps. The time dependent adaptive mesh in Fig. 5 shows 504 nodes and 919 elements. It is obvious that this adaptive FE-method with a-priori intensity indicator does not lead to optimized FE-meshes with an equal distribution of the
discretization error, but it is a very efficient solution method if the estimated mesh is sufficient for several time steps.

The demand on CPU-time and on storage depends on the number of degrees of freedom of the system. In Fig. 6 the CPU-time of the time dependent adaptive FE-method with intensity indicator and implicit-explicit time integration for 100 time steps is compared to the CPU-time of a completely refined system (= 100 %). The costs of the adaptive solution depend on the number of intermediate time steps between the steps of mesh refinement. Figure 6 shows that adaptive mesh refinement after each time step leads to optimal meshes but is very inefficient and takes even more time than the completely refined problem without adaptivity. There is a minimum of only 25 % CPU-time in this example when using about 5 to 10 intermediate time steps.

In the case of large problems it is possible to reduce the computational time of transient adaptive solutions with intensity indicator and Zienkiewicz-Zhu error indicator on high-performance computers down to about 10 to 20 % of a conventional FE-solution.

Figure 1: Initial FE-mesh with 142 nodes and 232 triangle elements

Figure 2: Adaptive refined mesh 20 time steps after the pulse load with 260 nodes and 450 elements
Figure 3: Instantaneous intensity vectors and energy flow after 20 time steps

Figure 4: Horizontal normal stresses in N/m² after 50 time steps

Figure 5: Refined mesh after 50 time steps with 504 nodes and 919 elements
[Plot: relative CPU-time in % over the number of intermediate time steps; title: TOTAL CPU-TIME, adaptive transient FEM]

Figure 6: Relative costs of the adaptive solution to the conventional solution

References

[1] Achenbach, J.D.: Wave Propagation in Elastic Solids, North-Holland Publishing Company, Amsterdam, New York, Oxford (1980).

[2] Elmer, K.-H.: Optimierung numerischer FE-Modelle zur Simulation der Wellenausbreitung mit Vektor- und Parallelrechnern (Optimization of Numerical FE-Models for Simulating Wave Propagation on Vector and Parallel Computers), DFG-Report NA 139/17-2, Curt-Risch-Institut, Hannover (1996).

[3] Hughes, T.J.R.: The Finite Element Method, Prentice-Hall, Englewood Cliffs, N.J. (1989).

[4] Natke, H.G.: Baudynamik, B.G. Teubner, Stuttgart (1989).

[5] Strang, G. and G.J. Fix: An Analysis of the Finite Element Method, Prentice-Hall, Englewood Cliffs, N.J. (1973).

[6] Zienkiewicz, O.C. and J.Z. Zhu: A Simple Error Estimator and Adaptive Procedure for Practical Engineering Analysis, Int. J. Numer. Methods Eng., 24, 333-357 (1987).