IMPLEMENTATION OF IMPLICIT FINITE ELEMENT METHODS FOR INCOMPRESSIBLE FLOWS ON THE CM-5
Computer Methods in Applied Mechanics and Engineering, (1994)

J.G. Kennedy
Thinking Machines Corporation, 245 First Street, Cambridge, MA 02142, USA

M. Behr, V. Kalro, and T.E. Tezduyar
AEM/AHPCRC Supercomputer Institute, University of Minnesota, 1200 Washington Avenue South, Minneapolis, MN 55415, USA

March 20, 1994. Revised: March 27, 1994

Abstract

A parallel implementation of an implicit finite element formulation for incompressible fluids on a distributed-memory massively parallel computer is presented. The dominant issue that distinguishes the implementation of finite element problems on distributed-memory computers from that on traditional shared-memory scalar or vector computers is the distribution of data (and hence workload) to the processors and the nonuniform memory hierarchy associated with the processors, particularly the nonuniform costs associated with on-processor and off-processor memory references. Accessing data stored in a remote processor requires computing resources an order of magnitude greater than accessing data locally in a processor. This distribution of data motivates alternatives to the traditional algorithms and data structures designed for shared-memory computers; the alternatives must account for distributed-memory architectures. Data structures as well as data decomposition and data communication algorithms designed for distributed-memory computers are presented in the context of high level language constructs from High Performance Fortran. The discussion relies primarily on abstract features of the hardware and software environment and should be applicable, in principle, to a variety of distributed-memory systems. The actual implementation is carried out on a Connection Machine CM-5 system with high performance communication functions.

1. Introduction

Distributed-memory, massively parallel computers are emerging as significant competitors to traditional vector supercomputers in the area of large-scale computational fluid dynamics. Fluid problems are particularly well suited to these parallel computers because parallelization occurs over large, uniform data structures. The continuing trend in fluid simulations toward significantly larger data sets, as well as the need for shorter solution times, leads naturally to distributed-memory massively parallel computers. Parallel
computers offer the potential for both higher sustained computational performance and substantially larger memory capacities than traditional vector computers.

This study focuses on a finite element formulation for the problem of an incompressible, viscous fluid. In spite of offering highly parallel data structures, finite element methods pose a challenge for distributed-memory parallel machines as a result of the irregular data communication patterns that arise in the context of unstructured meshes. The earliest implementations of finite element methods on parallel computers relied on a message passing programming model coupled with domain decomposition constructs [1-3]. Domain decomposition constructs subdivide the physical domain into $N_p$ subdomains, one for each of the $N_p$ processors, each subdomain typically consisting of a spatially contiguous set of elements. The subdomains then communicate data only through elements on the subdomain boundaries. More recently, data parallel implementations of finite element methods emerged [4-9]. Johan et al. [10] coupled traditional notions of domain decomposition with a data parallel finite element implementation.

A variety of methods have been used to construct the element subdomains to maintain favorable load balancing characteristics and low network communication requirements. The recursive spectral bisection (RSB) algorithm due to Pothen et al. [11] and Simon [12] provides a systematic and robust methodology for domain partitioning. Johan et al. [10] provided the first parallel implementation of RSB for unstructured meshes. Behr et al. [9] provide parallel implementation constructs for two incompressible flow formulations (based on velocity-pressure, and stress-velocity-pressure as primary variables) along with two-dimensional flow simulations.
These parallel implementation constructs are extended here to include a detailed discussion of the GMRES solver, including a comparison between a traditional matrix-based algorithm and a matrix-free algorithm, domain decomposition, and three-dimensional flow simulations. In addition, the issues associated with the coupling between domain decomposition and gather-scatter communication performance are discussed. The domain decomposition strategy is based on the parallel implementation of RSB provided in Johan et al. [10].

The paper is organized as follows. An abstraction of the hardware and software characteristics of distributed-memory massively parallel computers is provided in Section 2. A statement of the finite element problem is provided in Section 3. The parallel implementation of this problem is presented in Section 4. Data decomposition and associated communication issues are discussed in Section 5. Numerical simulations are presented in Section 6. Finally, conclusions are provided in Section 7.

2. Parallel Computer Model

Implementation of the finite element method discussed here is carried out on a Connection Machine CM-5 system in the data parallel language Connection Machine Fortran (CMF). The discussion here relies primarily on general features of both the CM-5 and CMF. The CM-5 is a distributed-memory, massively parallel computer system. Like the emerging High Performance Fortran (HPF) standard, CMF is a language based on Fortran 90 with additional data layout compiler directives. In principle, the discussion here applies to other distributed-memory, massively parallel computer systems using other programming models and languages. Programming language specifics discussed here are immediately accessible
from both CMF and HPF. The primary programming language constructs used here are the data distribution (data layout) constructs that distribute Fortran array elements to memory within the distributed processors². The syntax :SERIAL and :PARALLEL is used here to denote serial (in local processor memory) and parallel (across processor memory) array dimensions. For example, consider the arrays A, B, and C on an $N_p$-processor machine with cyclic data layout as follows:

    REAL A(N_p), B(3*N_p), C(5, N_p)
    CMPLR$ LAYOUT A(:PARALLEL), B(:PARALLEL)
    CMPLR$ LAYOUT C(:SERIAL, :PARALLEL)

Array A has a single parallel dimension whose number of entries matches the number of processors. Cyclic layout of the parallel dimension places a single entry of A in each processor. Array B, on the other hand, has three times as many entries as there are processors. Cyclic layout of B places B($1 : N_p$) one per processor. Similarly, B($N_p+1 : 2N_p$) and B($2N_p+1 : 3N_p$) are distributed one per processor, so that each processor is assigned three entries of B. Array C has both a serial and a parallel axis. The parallel axis of C is distributed identically to that in A. The serial axis of C is distributed such that, for the $k$th parallel entry of C, a serial vector of length 5 is placed in the processor associated with the $k$th parallel axis entry of C. Further discussion of these constructs may be found in [13, 14].

For the case in which the syntax CMPLR$ LAYOUT C(:SERIAL, :PARALLEL) is assumed to imply cyclic distribution of data along parallel axes, the equivalent syntax in HPF is

    !HPF$ DISTRIBUTE C(*, CYCLIC)

Currently CMF supports only block layout (described in [13]).
In the case in which the syntax CMPLR$ LAYOUT C(:SERIAL, :PARALLEL) is assumed to imply block distribution of data, the equivalent syntax in HPF is

    !HPF$ DISTRIBUTE C(*, BLOCK)

whereas the equivalent case in CMF is

    CMF$ LAYOUT C(:SERIAL, :NEWS)

For either block or cyclic data distribution, parallel array operations may be invoked using simple array syntax in Fortran 90. For example, in the expression

    A(:) = A(:) + 3/C(4, :)

the colon denotes "do for all entries of the axis", and the assignment statement is invoked in parallel for all parallel entries in A and C simultaneously.

² In the case of the CM-5, the term processor is used to mean a single vector unit. There are four vector units per processing node on a CM-5. On parallel architectures composed of processing nodes containing only one vector or superscalar processor, the term processor is unambiguous.
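The ownership rules described above for cyclic and block layouts can be illustrated with a small serial sketch. It is written in Python rather than CMF or HPF, the helper names `cyclic_owner` and `block_owner` are hypothetical, and the block size of ceil(extent / number of processors) is an assumption made for illustration; an actual compiler is free to choose its own mapping.

```python
def cyclic_owner(i, n_procs):
    """Processor (0-based) that owns 1-based parallel index i under cyclic layout."""
    return (i - 1) % n_procs


def block_owner(i, extent, n_procs):
    """Processor (0-based) that owns 1-based index i under block layout.

    Assumes equal blocks of size ceil(extent / n_procs).
    """
    block = -(-extent // n_procs)  # ceiling division
    return (i - 1) // block


n_procs = 4              # stands in for N_p
extent = 3 * n_procs     # extent of B(3*N_p)

# Entries of B held by each processor under the two layouts.
cyclic_B = {p: [i for i in range(1, extent + 1) if cyclic_owner(i, n_procs) == p]
            for p in range(n_procs)}
block_B = {p: [i for i in range(1, extent + 1) if block_owner(i, extent, n_procs) == p]
           for p in range(n_procs)}

# Serial analogue of A(:) = A(:) + 3/C(4, :): one independent update per
# parallel entry, so no entry needs data held by another processor.
A = [1.0] * n_procs
C = [[float(row + 1)] * n_procs for row in range(5)]  # C(5, N_p); row 4 holds 4.0
A = [a + 3.0 / c for a, c in zip(A, C[3])]
```

Under the cyclic layout each processor holds entries spaced $N_p$ apart, whereas under the block layout it holds a contiguous run; in both cases the array statement touches only data that share a parallel index.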
3. Finite Element Formulation

Here we consider the isothermal transient response of an incompressible fluid. The initial/boundary-value problem is represented in Box 1, where $\mathbf{u}$ is the velocity, $p$ denotes pressure, $\rho$ is the density, $\boldsymbol{\sigma}$ is the Cauchy stress, $\mathbf{f}$ is the body force, and $\mathbf{g}$ and $\mathbf{h}$ are the Dirichlet and Neumann boundary condition values, respectively, enforced on the subsets $(\Gamma_t)_g$ and $(\Gamma_t)_h$ of the boundary $\Gamma_t$ of the possibly evolving domain $\Omega_t$. In the case of a fixed domain, the subscript $t$ denoting time on the domains is dropped. The stress response is assumed to be Newtonian, characterized by the fluid viscosity $\mu$.

Box 1: Initial/Boundary-Value Problem.

1. Momentum balance on $\Omega_t$:
$$\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - \mathbf{f}\right) - \nabla\cdot\boldsymbol{\sigma} = \mathbf{0}$$

2. Mass balance (incompressible) on $\Omega_t$:
$$\nabla\cdot\mathbf{u} = 0$$

3. Initial and boundary conditions:
$$\mathbf{u} = \mathbf{g} \ \text{on } (\Gamma_t)_g, \qquad \boldsymbol{\sigma}\cdot\mathbf{n} = \mathbf{h} \ \text{on } (\Gamma_t)_h, \qquad \mathbf{u}(\mathbf{x}, 0) = \mathbf{u}_0 \ \text{on } \Omega_0$$

4. Stress response (Newtonian):
$$\boldsymbol{\sigma} = -p\mathbf{I} + \mathbf{T}, \qquad \mathbf{T} = 2\mu\,\boldsymbol{\varepsilon}(\mathbf{u}), \qquad \boldsymbol{\varepsilon}(\mathbf{u}) = \frac{1}{2}\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)$$

A stabilized, space-time, velocity-pressure formulation is then summarized in Box 2. Here, $(\cdot,\cdot)_{Q_n}$, $(\cdot,\cdot)_{Q_n^e}$ and $(\cdot,\cdot)_{\Omega_n}$ denote $L_2$ inner products over the space-time slab $Q_n$, the single space-time slab element $Q_n^e$ and the spatial domain $\Omega_n$, respectively. The surface $P_n$ is traced by $\Gamma_t$ as $t$ traverses the time interval associated with slab $n$, and $(P_n)_h$ is the subset of $P_n$ corresponding to $(\Gamma_t)_h$. The $(\cdot)_n^+$ and $(\cdot)_n^-$ denote the values of a variable at level $n$ as it is approached from the top and the bottom, respectively. The $\mathcal{Q}^h$ and $\mathcal{V}^h$ are suitable spaces for pressure and velocity functions, and $\tau_{MOM}$ and $\tau_{CONT}$ are stabilization parameters. Further details on this formulation, although not central to the discussions here, may be found in Tezduyar et al. [15, 16].
The space-time formulation is used in the next section because of its notational simplicity, but the parallel implementation issues are the same for a semi-discrete formulation, in which the jump term (Box 2, item 2, term 4 on right-hand side) is dropped, the integration takes place over the spatial domain only, and
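the time derivatives are replaced by appropriate expansions.

Box 2: Stabilized u-p Space-Time Finite Element Formulation.

1. Finite element form:
$$B(p^h, \mathbf{u}^h; q^h, \mathbf{v}^h) = F(q^h, \mathbf{v}^h) \qquad \forall\, (q^h, \mathbf{v}^h) \in \mathcal{Q}^h \times \mathcal{V}^h$$

2. $B(p^h, \mathbf{u}^h; q^h, \mathbf{v}^h)$:
$$
\begin{aligned}
B(p^h,\mathbf{u}^h;q^h,\mathbf{v}^h) ={}& \left(\frac{\partial \mathbf{u}^h}{\partial t} + \mathbf{u}^h\cdot\nabla\mathbf{u}^h,\ \rho\mathbf{v}^h\right)_{Q_n} + \left(\boldsymbol{\sigma}(p^h,\mathbf{u}^h),\ \boldsymbol{\varepsilon}(\mathbf{v}^h)\right)_{Q_n} + \left(\rho\,\nabla\cdot\mathbf{u}^h,\ q^h\right)_{Q_n} \\
&+ \left((\mathbf{u}^h)_n^+ - (\mathbf{u}^h)_n^-,\ \rho(\mathbf{v}^h)_n^+\right)_{\Omega_n} \\
&+ \sum_{e=1}^{(n_{el})_n}\left(\left[\rho\left(\frac{\partial\mathbf{u}^h}{\partial t} + \mathbf{u}^h\cdot\nabla\mathbf{u}^h\right) - \nabla\cdot\boldsymbol{\sigma}(p^h,\mathbf{u}^h)\right],\ \tau_{MOM}\frac{1}{\rho}\left[\rho\left(\frac{\partial\mathbf{v}^h}{\partial t} + \mathbf{u}^h\cdot\nabla\mathbf{v}^h\right) - \nabla\cdot\boldsymbol{\sigma}(q^h,\mathbf{v}^h)\right]\right)_{Q_n^e} \\
&+ \sum_{e=1}^{(n_{el})_n}\left(\tau_{CONT}\,\nabla\cdot\mathbf{u}^h,\ \rho\,\nabla\cdot\mathbf{v}^h\right)_{Q_n^e}
\end{aligned}
$$

3. $F(q^h, \mathbf{v}^h)$:
$$
F(q^h,\mathbf{v}^h) = \left(\mathbf{f},\ \rho\mathbf{v}^h\right)_{Q_n} + \left(\mathbf{h},\ \mathbf{v}^h\right)_{(P_n)_h} + \sum_{e=1}^{(n_{el})_n}\left(\mathbf{f},\ \tau_{MOM}\left[\rho\left(\frac{\partial\mathbf{v}^h}{\partial t} + \mathbf{u}^h\cdot\nabla\mathbf{v}^h\right) - \nabla\cdot\boldsymbol{\sigma}(q^h,\mathbf{v}^h)\right]\right)_{Q_n^e}
$$

A Galerkin/least-squares problem like the one shown in Box 2 leads to a nonlinear coupled system of equations:

$$N(d_n) = F, \tag{1}$$

where $d_n$ is the vector of unknowns associated with marching from time step $n-1$ to $n$ in a semi-discrete formulation, or associated with time slab $n$ in a space-time formulation. For the nonlinear system of equations (1), the Newton-Raphson iterations each require the solution of a linear equation system

$$\left.\frac{\partial N}{\partial d}\right|_{d_n^k}\left(\Delta d_n^k\right) = F - N\left(d_n^k\right), \tag{2}$$

$$A_n^k x_n^k = R_n^k, \tag{3}$$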
where $k$ is the nonlinear iteration counter, $A_n^k = \left.\partial N/\partial d\right|_{d_n^k}$ is the nonsymmetric Jacobian operator, $x_n^k = \Delta d_n^k$ is the vector of increments for unknown solution values and $R_n^k = F - N\left(d_n^k\right)$ is the vector of residuals. When discussing the process of the solution of the linear equation system (3), the sub- and superscripts identifying the time step and nonlinear iteration will be dropped, as only one such system is solved at a given time. An outline of the implicit solution to the finite element problem is shown in Box 3.

Box 3: Outline of Finite Element Solution.

1. Preprocessing and initial conditions
2. PARTITION data to processors
3. a. Time step loop ($n_{start} = 0$)
   b. Nonlinear iteration loop ($k_{start} = 0$)
4. GATHER nodal $x_n^k$ to elements
5. FORM element matrices and residuals $A_n^{e,k}$, $R_n^{e,k}$
6. SCATTER $R_n^{e,k}$ to assembled $R_n^k$
7. SOLVE $A_n^k x_n^k = R_n^k$
8. a. End $k$ loop ($k \leftarrow k+1$, go to 3b)
   b. End $n$ loop ($n \leftarrow n+1$, go to 3a)
9. Postprocessing and visualization

4. Parallel Implementation

Here, the global programming model described in Section 2 is used to implement the finite element method. The key features of the current finite element implementation on a distributed-memory massively parallel computer are (1) constructing data structures which circumvent unneeded communication of data between processors, (2) mapping the data associated with these data structures to the processors in a manner which efficiently exploits data locality, (3) using efficient gather and scatter algorithms which distinguish on-processor and off-processor data transfers and (4) maintaining favorable load balancing and scaling properties.
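As a brief illustration of the nonlinear iteration loop in the solution outline and of the Newton-Raphson update in equations (2) and (3), the following Python sketch solves a two-unknown toy system. The operator `N`, the right-hand side `F`, and the direct 2x2 solve are assumptions chosen only to keep the example self-contained; they stand in for the discretized flow equations and for the GMRES solver, and the gather, FORM and scatter steps collapse into a direct evaluation of the residual.

```python
import math


def N(d):
    """Toy nonlinear operator standing in for the discretized flow equations."""
    return [d[0] ** 2 + d[1], d[0] + d[1] ** 2]


def jacobian(d):
    """A = dN/dd evaluated at d."""
    return [[2.0 * d[0], 1.0], [1.0, 2.0 * d[1]]]


def solve_2x2(A, R):
    """Direct solve of A x = R by Cramer's rule (an iterative solver plays this role in practice)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(R[0] * A[1][1] - A[0][1] * R[1]) / det,
            (A[0][0] * R[1] - A[1][0] * R[0]) / det]


def newton(F, d, tol=1e-12, max_iter=20):
    """Nonlinear iteration loop for a single time step."""
    for k in range(max_iter):
        R = [f - n for f, n in zip(F, N(d))]      # residual R = F - N(d)
        if math.hypot(R[0], R[1]) < tol:          # converged
            break
        x = solve_2x2(jacobian(d), R)             # SOLVE A x = R
        d = [di + xi for di, xi in zip(d, x)]     # update d <- d + x
    return d


F = [3.0, 5.0]               # chosen so that d = (1, 2) satisfies N(d) = F
d = newton(F, [1.0, 1.0])
```

In a time-dependent computation this loop would sit inside the time step loop, with the converged `d` of one step providing the starting guess for the next.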
Two naturally parallel data structures emerge from the finite element problem: the first associated with the FORM phase (element-ordered data set corresponding to Step 5, the FORM step, in Box 3), the second associated with the SOLVE phase (node-ordered data structure corresponding to Step 7, the SOLVE step, in Box 3). Using the serial and parallel layout constructs described in Section 2, the element-level residual vector $R^e$ and its global counterpart $R$ are represented in these two data structures as shown in Box 4, where $n_{dof}$ is the number of degrees of freedom per node, $n_{en}$ is the number of local nodes residing in an element, $n_{nodes}$ is the number of global nodes and $n_{el}$ is the number of elements. The idea is to construct a parallel array axis of length $n_{el}$ for the FORM data structure and $n_{nodes}$ for the SOLVE data structure. Element information in a FORM array associated with each element is then accumulated by indexing along the serial dimension(s) of the array. Similarly, node information in a SOLVE array associated with each node is also accumulated by indexing along the serial dimension(s).
Box 4: FORM and SOLVE Data Structures.

1. FORM (element based):

    REAL R_e(n_dof, n_en, n_el)
    CMPLR$ LAYOUT R_e(:SERIAL, :SERIAL, :PARALLEL)

2. SOLVE (node based):

    REAL R(n_dof, n_nodes)
    CMPLR$ LAYOUT R(:SERIAL, :PARALLEL)

3. Communication (gather / scatter):

    R_e(n_dof, n_en, n_el)  ↔  R(n_dof, n_nodes)

These FORM and SOLVE data structures exhibit natural parallelism in that they enable the FORM step and a number of phases of the SOLVE step of the solution outlined in Box 3 to take place in parallel without communication between processors. With these two data structures, communication between processors within the time step loop occurs predominantly between the FORM and SOLVE data structures, that is, in the GATHER and SCATTER steps.

Pseudo-code evaluating the boxed terms in Box 2 for the FORM phase is shown in Box 5. Note that repeated indices imply summation, $j_\sigma$ is an index of the stress tensor component, and $i_{sd}$ identifies the space dimension. Here, integration over the space-time slabs $Q_n^e$ is taken as the usual sum over quadrature points. That is,

$$\int_{Q_n^e} \chi(:)\, dQ = \sum_{l=1}^{n_{int}} [\chi(:)]_l\, J_l(:)\, w_l, \tag{4}$$

where $n_{int}$ is the number of integration points, $J_l$ is the determinant of the Jacobian of the finite element mapping, and $w_l$ is the weight. Here, and in the Box 5 pseudo-code, the colon implies "do for all elements $i_{el} = 1 : n_{el}$" in a FORM-based array. By definition of the CMPLR$ LAYOUT constructs, for a given element $i_{el}$, the element-level vector $R^e(1 : n_{dof}, 1 : n_{en}, i_{el})$ of $n_{dof} \times n_{en}$ components resides in the memory of processor $p(i_{el})$, where $p(i_{el})$ is a mapping provided by the compiler. Furthermore, an element-level vector $v^e(1 : m, i_{el})$, for any $m$ along a serial dimension and $i_{el}$ along a parallel dimension with the same extent $1 : n_{el}$ as the one in $R^e$, resides in the memory of the same processor $p(i_{el})$.
Consequently, $R^e(1 : n_{dof}, 1 : n_{en}, i_{el})$ and $v^e(1 : m, i_{el})$ reside in the same (virtual) processor for each $i_{el} \in [1, n_{el}]$. With this in mind, it is evident from Box 5 that no inter-processor communication occurs in the FORM phase.

Box 5: Pseudo Code: FORM Phase.

1. $B_{\sigma\varepsilon}$ formation:
$$B_{\sigma\varepsilon}(p^h, \mathbf{u}^h; q^h, \mathbf{v}^h) := \int_{Q_n} \boldsymbol{\sigma}(p^h, \mathbf{u}^h) : \boldsymbol{\varepsilon}(\mathbf{v}^h)\, dQ$$
$B_{\sigma\varepsilon}$ comprises element-level contributions:
$$B^e_{\sigma\varepsilon}(:) = \int_{Q_n^e} \sigma(j_\sigma, :)\, \varepsilon(j_\sigma, :)\, dQ$$

2. $B_t$ formation:
$$B_t(p^h, \mathbf{u}^h; q^h, \mathbf{v}^h) := \int_{Q_n} \rho\, \frac{\partial \mathbf{u}^h}{\partial t} \cdot \mathbf{v}^h\, dQ$$
$B_t$ comprises element-level contributions:
$$B^e_t(:) = \int_{Q_n^e} \rho(:)\, \frac{\partial u}{\partial t}(i_{sd}, :)\, v(i_{sd}, :)\, dQ$$

The SOLVE phase, on the other hand, does require communication. A summary of the GMRES algorithm used in the SOLVE phase is shown in Box 6. All quantities in the SOLVE phase are stored in the SOLVE data structure with the exception of the element-level Jacobian matrices $a^e$ (and, as a result, two element-level vectors required to interact with $a^e$), which are stored on the element level for performance reasons. From the perspective of a parallel implementation, the SOLVE phase consists primarily of dot products ($\alpha = p \cdot q$), SAXPY operations ($p = p + \alpha q$), matrix-vector products ($q = Ap$) and a preconditioning step. Here, only diagonal preconditioning is considered, so that the preconditioning step requires strictly inexpensive on-processor operations, with computing costs not significantly beyond that of a dot product or a SAXPY operation. Pseudo-code for such steps of the SOLVE phase is shown in Box 7.

The dominant computational portion of the GMRES algorithm is the matrix-vector product. Item 3 in Box 7 highlights a matrix-vector product ($q = Ap$) scheme which consists of three steps: (1) a gather of $p$ to $p^e$ on the element level, (2) an on-processor matrix-vector product ($q^e = a^e p^e$) involving no inter-processor communication and (3) a scatter of $q^e$ to $q$ on the global assembled level. In the gather and scatter steps, iconn($1 : n_{en}$, $1 : n_{el}$) is the nodal connectivity array. This matrix-vector product scheme was initially proposed by Johnsson and Mathur [17] and demonstrates favorable performance characteristics on Connection Machine systems. Note that communication in the SOLVE phase occurs in the global sums within dot products and in the gather/scatter steps of the matrix-vector product, the latter being the dominant communication steps.

Expressed in terms of the High Performance Fortran FORALL construct, the gather step takes the form

    FORALL (i_dof = 1:n_dof, i_en = 1:n_en, i_el = 1:n_el)
        v_e(i_dof, i_en, i_el) = v(i_dof, iconn(i_en, i_el))                (5)

A scatter, on the other hand, must account for collisions of data at the destination and hence takes the form

    DO i_dof = 1, n_dof
      FORALL (i_node = 1:n_nodes)
        v(i_dof, i_node) = v(i_dof, i_node)
                         + SUM(v_e(i_dof, 1:n_en, :),
                               MASK = iconn(1:n_en, :) .EQ. i_node)
    END DO                                                                  (6)
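The gather of equation (5) and the collision-summing scatter of equation (6) can be mimicked serially as follows. This is a minimal Python sketch with 0-based node numbers, not the CM-5 communication routines; the function names `gather` and `scatter_add` and the two-element mesh are assumptions made for illustration.

```python
def gather(v, iconn):
    """Equation (5): v_e[i_dof][i_en][i_el] = v[i_dof][iconn[i_en][i_el]]."""
    return [[[v_dof[node] for node in row] for row in iconn] for v_dof in v]


def scatter_add(v, v_e, iconn):
    """Equation (6): accumulate element values into v, summing collisions."""
    for i_dof, block in enumerate(v_e):
        for i_en, row in enumerate(block):
            for i_el, value in enumerate(row):
                v[i_dof][iconn[i_en][i_el]] += value
    return v


# Two 2-node line elements that share node 1 (0-based node numbers).
iconn = [[0, 1],    # first local node of elements 0 and 1
         [1, 2]]    # second local node of elements 0 and 1
v = [[10.0, 20.0, 30.0]]            # n_dof = 1, n_nodes = 3

v_e = gather(v, iconn)              # element-ordered copy of v
assembled = scatter_add([[0.0, 0.0, 0.0]], v_e, iconn)
```

Node 1 belongs to both elements, so the scatter receives two contributions for it and must add them; a plain assignment would keep only one of the two.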
Box 6: GMRES Algorithm: Algorithm Summary.

    DO l = 1, n_outer                                  GMRES outer iterations
      $r_0 := R - A x_0$                               compute initial residual
      $\beta := \|r_0\|_2$                             compute initial residual norm
      $v_1 := r_0 / \beta$                             define first Krylov vector
      DO j = 1, m                                      GMRES inner iterations
        $z_j := M_j^{-1} v_j$                          preconditioning step
        $w := A z_j$                                   matrix-vector product
        DO i = 1, j                                    Gram-Schmidt orthogonalization
          $h_{i,j} := (w, v_i)$
          $w := w - h_{i,j} v_i$
        END DO
        $h_{j+1,j} := \|w\|_2$
        $v_{j+1} := w / h_{j+1,j}$                     define next Krylov vector
      END DO
      $H := \{h_{i,j}\}$                               define reduced system matrix
      $y := \arg\min_{\hat{y}} \|\beta e_1 - H\hat{y}\|_2$   solve reduced system
      $x := x_0 + \sum_{i=1}^{m} y_i z_i$              form approximate solution
      IF $\|\beta e_1 - H y\|_2 \le \epsilon$ EXIT     convergence check
      $x_0 := x$                                       restart
    END DO

Box 7: Pseudo Code: SOLVE Phase.

1. Dot product, $\alpha = p \cdot q$:

    α = SUM(p(i_dof, :) * q(i_dof, :))

2. SAXPY, $p = p + \alpha q$:

    p(i_dof, :) = p(i_dof, :) + α * q(i_dof, :)

3. Matrix-vector multiply, $q = A p$:

    p_e(i_dof, i_en, :) = p(i_dof, iconn(i_en, :))                            (Gather)
    q_e(i_dof, i_en, :) = a_e(i_dof, i_en, j_dof, j_en, :) * p_e(j_dof, j_en, :)   (Local Mult)
    q(i_dof, iconn(i_en, :)) = q_e(i_dof, i_en, :)                            (Scatter) [Add Collisions]

In the numerical implementation, for performance reasons, the gather/scatter steps are implemented on the CM-5 using high performance communication algorithms which replace the FORALL statements above with single function calls. The gather and scatter steps are discussed further in the following section.

Box 8: Matrix-Free Linearization.

$$Ax = \frac{\partial N}{\partial d}\, x \approx \frac{N(d + \varepsilon x) - N(d)}{\varepsilon}$$
$$R_\varepsilon = R(d + \varepsilon x) = F - N(d + \varepsilon x), \qquad R = R(d) = F - N(d)$$
$$Ax \approx -\frac{R_\varepsilon - R}{\varepsilon}$$

An alternative matrix-free GMRES solution scheme may be used, based on a matrix-free linearization of the residual as represented in Box 8. In the matrix-free linearization, which is due to Johan [18], the linear part of the residual represented as a matrix-vector product $Ax$ is approximated by $-(R_\varepsilon - R)/\varepsilon$, where $R_\varepsilon = R(d + \varepsilon x)$, $R = R(d)$, and $\varepsilon$ is a suitably small number [18]. As a result, in the parallel implementation, the GMRES algorithm differs only in replacing the matrix-vector product in the above algorithm with this simple difference formula between the residuals. In particular, the above three-step matrix-vector product is replaced by the steps (1) gather the current solution vector $d$ to the element level $d^e$, (2) FORM the updated element-level residual $R^e_\varepsilon$ on the element level (without inter-processor communication) based upon $d + \varepsilon x$ and (3) scatter $R^e_\varepsilon$ to the global assembled level and perform the difference formula. That is, from a computational perspective, this scheme differs from the matrix-vector product scheme primarily in that the on-processor matrix-vector product is replaced with formation of the element-level residual $R^e_\varepsilon$. Notice that the element-level Jacobian matrices $a^e$ from Box 7 need not be stored in the matrix-free case, resulting in substantial memory savings, since storage of these matrices dominates the memory requirements in the original formulation.
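The difference formula of the matrix-free linearization can be checked on a small example. In the Python sketch below, the operator `N`, the vector `F` and the value of `eps` are assumptions chosen for illustration (the source states only that $\varepsilon$ is suitably small); the exact Jacobian product is formed alongside purely to show that the two agree.

```python
def N(d):
    """Toy nonlinear operator (an assumption, not the flow equations)."""
    return [d[0] ** 2 + d[1], d[0] + d[1] ** 2]


F = [3.0, 5.0]


def residual(d):
    """R(d) = F - N(d)."""
    return [f - n for f, n in zip(F, N(d))]


def matvec_free(d, x, eps=1e-6):
    """Approximate A x by -(R_eps - R) / eps, without forming A."""
    R = residual(d)
    R_eps = residual([di + eps * xi for di, xi in zip(d, x)])
    return [-(re - r) / eps for re, r in zip(R_eps, R)]


def matvec_exact(d, x):
    """A x with the analytic Jacobian A = dN/dd of the toy operator."""
    return [2.0 * d[0] * x[0] + x[1], x[0] + 2.0 * d[1] * x[1]]


d = [1.5, 0.5]
x = [1.0, -2.0]
approx = matvec_free(d, x)
exact = matvec_exact(d, x)
```

Each matrix-free product costs one extra residual evaluation but requires no storage for the Jacobian, which mirrors the trade-off between memory and on-processor work described in the text.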
These memory savings are accompanied by additional on-processor computational requirements, however, since computing the residual $R^e_\varepsilon$ on subsequent GMRES iterations typically requires more on-processor computation than does the on-processor matrix-vector multiply discussed above. A comparison of the matrix-free and original GMRES solvers is provided in Section 6.

5. Data Decomposition and Communication

Partitioning of the data associated with the FORM and SOLVE data structures into groups, each group being associated with a single processor of the parallel computer, is used to increase the efficiency of the GATHER and SCATTER steps by attempting to
minimize the off-processor communication in these communication steps, taking maximum advantage of data locality. A parallel implementation of the RSB algorithm is used to decompose and distribute the element data (FORM data structure) to the processors based on the modal analysis of the graph of the connectivity array describing the connectivity between the elements (dual connectivity). The RSB algorithm, with origins due to Pothen et al. [11] and Simon [12], provides a robust, systematic tool for generating efficient data decompositions in parallel. The parallel implementation of the RSB algorithm used here is due to Johan [18], and is available in the Connection Machine Scientific Software Library.

The data decomposition generated by the bisection algorithm is exploited in efficient gather and scatter communication algorithms which account for locality of data residing in a given processor by breaking each communication step (gather or scatter) into two distinct phases: an on-processor communication step (with communication speeds on the order of the memory bandwidth) and an off-processor communication step (with communication rates on the order of the network bandwidth). Such a two-step algorithm is natural within a message passing programming model. The data parallel implementation of the two-step algorithm is more subtle due to the high-level language constructs. The data parallel two-step algorithm used here is due to Johan et al. [10] and exhibits favorable load balancing and scaling properties for large classes of problems.
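The idea of splitting a scatter into an on-processor phase and an off-processor phase can be sketched serially. The Python fragment below is a toy under stated assumptions, not the algorithm of Johan et al. [10]: the function name `two_step_scatter`, the one-dimensional mesh, the element-to-partition map and the node ownership are all invented for illustration, and one degree of freedom per node is assumed.

```python
def two_step_scatter(q_e, elem_nodes, elem_part, node_owner, n_nodes):
    """Scatter element values to global nodes in two phases.

    Returns the assembled vector and the number of partition-local node
    values delivered on-processor and off-processor in the second phase.
    """
    n_parts = max(elem_part) + 1

    # Phase 1: each partition sums its own element data into
    # partition-local node accumulators (strictly on-processor).
    local = [dict() for _ in range(n_parts)]
    for e, nodes in enumerate(elem_nodes):
        p = elem_part[e]
        for a, node in enumerate(nodes):
            local[p][node] = local[p].get(node, 0.0) + q_e[e][a]

    # Phase 2: partition-local sums are added into the global nodes; a value
    # travels off-processor only if another partition owns the node.
    q = [0.0] * n_nodes
    on_words = off_words = 0
    for p in range(n_parts):
        for node, value in local[p].items():
            q[node] += value
            if node_owner[node] == p:
                on_words += 1
            else:
                off_words += 1
    return q, on_words, off_words


# Four 2-node line elements on five nodes, split into two partitions.
elem_nodes = [(0, 1), (1, 2), (2, 3), (3, 4)]
elem_part = [0, 0, 1, 1]            # elements 0-1 on partition 0, 2-3 on partition 1
node_owner = [0, 0, 0, 1, 1]        # shared node 2 is assigned to partition 0
q_e = [[1.0, 1.0] for _ in elem_nodes]

q, on_words, off_words = two_step_scatter(q_e, elem_nodes, elem_part, node_owner, 5)
```

Only the single shared boundary node generates off-processor traffic here; every other contribution is combined locally before anything is sent, which is the property the partitioning is meant to exploit.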
The performance advantages of the data decomposition and two-step communication strategies depend on the amount of data gathered (or scattered) from the surface elements in one partition to those in another partition (hereafter referred to as surface data) relative to the amount of data gathered (or scattered) within the internal volume of an element partition (hereafter referred to as volume data). Provided the mesh partitioning algorithm produces suitably compact, contiguous element groups, the ratio of surface data to volume data becomes small as the number of elements in typical partitions becomes large. Hence, the amount of surface data communicated at network bandwidth speeds is small relative to the amount of volume data communicated at memory bandwidth speeds.

The relative amounts of surface data and volume data in a mesh, and hence the performance improvements available from the mesh partitioning and communication schemes, depend on mesh geometry. Three-dimensional meshes typically exhibit more favorable surface data to volume data ratios than do two-dimensional meshes and hence experience more pronounced speed-ups from the data decomposition strategies. To illustrate this, it is useful to look in detail at the amount of surface and volume data which exists first within a general finite element mesh and next within simple illustrative meshes.

To begin, assume that the distribution of the global nodes to the processors in the SOLVE data structure is such that (1) nodes internal to the element partition (nodes not on the element partition boundary) are assigned to the processor associated with that element partition and (2) nodes on the element partition boundary (nodes associated with surface data) are assigned such that two element partitions sharing a set of nodes each receive a random subset of those shared nodes. Such a node distribution is in fact the one used in the mesh partitioning scheme used here.
With this in mind, for this discussion, it is reasonable to characterize the amount of data sent off-processor from an element partition in the gather or scatter operation (equations (5) and (6)) as roughly half the data associated with the partition boundary nodes. Consequently, the number of array elements of $v(1 : n_{dof}, 1 : n_{nodes})$ sent off-processor from a single element partition is roughly half the number of partition boundary nodes times $n_{dof}$.

Footnote on minimizing off-processor communication: obtaining a true minimum is an NP-complete (i.e., intractable) problem.

The two-step scatter described in [10] is composed of (1) a scatter of element data within a partition to an intermediate set of (pseudo-global) nodes local to that partition and (2) a scatter of the data associated with this intermediate set of nodes to the global nodes. The first step involves strictly on-processor data motion ($n_{dof}\, n_{en}\, n_{el}^{partition}$ words per partition, where $n_{el}^{partition}$ is the number of elements in the partition). The second step involves both on-processor data motion ($n_{dof}\, n_{np}^{V partition} + \frac{1}{2} n_{dof}\, n_{np}^{\partial partition}$ words per partition, where $n_{np}^{V partition} := n_{np}^{partition} - n_{np}^{\partial partition}$) and off-processor data motion ($\frac{1}{2} n_{dof}\, n_{np}^{\partial partition}$ words per partition). Here, $n_{np}^{partition}$ is the number of nodes in the partition and $n_{np}^{\partial partition}$ is the number of nodes on the partition boundary.

Next, consider the four simple meshes shown in Figure 1: (1) a two-dimensional square mesh of quadrilateral elements (4 nodes per element, $N \times N$ elements), (2) a two-dimensional square mesh of triangular elements (3 nodes per element, $2 \times N \times N$ elements), (3) a three-dimensional cubic mesh of brick elements (8 nodes per element, $N \times N \times N$ elements) and (4) a three-dimensional cubic mesh of tetrahedral elements (4 nodes per element, $5 \times N \times N \times N$ elements). We constrain the tetrahedral mesh to be that generated from the brick mesh by decomposing each brick into 5 tetrahedral elements containing only those nodes which exist in the brick elements (see Figure 1), and place a similar constraint on the pair of two-dimensional meshes.
We also require that the elements of each mesh are partitioned into identical rectangular (two-dimensional) or rectangular parallelepiped (three-dimensional) partitions of elements on each processor, so that identical nodes comprise corresponding partitions in each mesh.

Figure 1. Meshes for communication bandwidth tests.

In the event that these simple meshes are partitioned into $N_p$ partitions of $n \times n$ quadrilateral elements in the two-dimensional case (each quadrilateral subdivided into 2 triangles for the triangular mesh) and $n \times n \times n$ hexahedral elements in the three-dimensional case (each hexahedron subdivided into 5 tetrahedra for the tetrahedral mesh), the number of array elements associated with on-processor (volume data) and off-processor (surface data) data motion is shown as a function of $n$ in Table 1. Steps 1 and 2 in Table 1 refer to the steps in the two-step gather-scatter. Noting that $n$ is a linear function of $N$ for each mesh, it is evident from Table 1 that the amount of surface data is $O(N^{d-1})$ whereas the amount of volume data is $O(N^d)$, where $d$ is the dimensionality of the mesh. As the size of the mesh, and hence $N$, increases, the amount of volume data quickly dominates the surface data.

Footnote: As is described in Johan et al. [10], the off-processor communication occurs on a node-to-node basis as opposed to an element-to-node basis.

Table 1. Communication load for square and cubic partitions ($n_x = n_y = n_z = n$).

                 Step 1                            Step 2
    Mesh         On-PN                   Off-PN    On-PN                           Off-PN
    Quads        $n_{dof} n_{en} n^2$    0         $n_{dof}((n-1)^2 + 2n)$         $n_{dof}\, 2n$
    Triangles    $n_{dof} n_{en} 2n^2$   0         $n_{dof}((n-1)^2 + 2n)$         $n_{dof}\, 2n$
    Hexahedra    $n_{dof} n_{en} n^3$    0         $n_{dof}((n-1)^3 + 3n^2 + 1)$   $n_{dof}(3n^2 + 1)$
    Tetrahedra   $n_{dof} n_{en} 5n^3$   0         $n_{dof}((n-1)^3 + 3n^2 + 1)$   $n_{dof}(3n^2 + 1)$

Ratios of surface data to volume data for meshes associated with specific values of $N$ are shown in Tables 2 and 3, where $n_{dof}$ is assumed to be 4. Here the partitions are not square as they are in Table 1, however. Table 2 corresponds to two-dimensional quadrilateral meshes of (1) a mesh of 16 × 8 partitions of 8 × 16 elements and (2) a mesh of 16 × 8 partitions of elements. Table 3 corresponds to three-dimensional hexahedral meshes of (1) a mesh of partitions of 3 × 6 × 6 elements and (2) a mesh of partitions of elements. The triangular and tetrahedral meshes are generated from the quadrilateral and hexahedral meshes as described above. From Tables 2 and 3, one can see that the ratio of surface data sent off-processor (at network bandwidth rates) to volume data sent on-processor (at memory bandwidth rates) is quite small, even for these moderately sized meshes.

The degree to which a two-step gather or scatter will experience speed-ups due to a particular data partitioning strategy on a particular computer system is a function of both the memory and the network bandwidths of the computer system. In the case of the CM-5, the speed-ups for the two-dimensional meshes considered in Table 2 are shown in Table 4, while the speed-ups for the three-dimensional meshes considered in Table 3 are shown in Table 5.
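The entries of Table 1 are simple enough to evaluate directly, which makes the surface-to-volume argument easy to check for a given partition size. The Python sketch below encodes the Table 1 expressions; the function names and the particular definition of the ratio (off-processor words divided by all on-processor words from both steps) are assumptions made for illustration and are not necessarily the definition used for Tables 2 and 3.

```python
def comm_words(n, dim, n_dof, n_en, elems_per_cell=1):
    """Array elements moved per partition, following Table 1.

    n is the partition edge length in quadrilateral (dim=2) or hexahedral
    (dim=3) cells; elems_per_cell is 2 for triangles and 5 for tetrahedra.
    Returns (step1_on, step2_on, step2_off).
    """
    if dim == 2:
        boundary_half = 2 * n               # half of the 4n boundary nodes
    elif dim == 3:
        boundary_half = 3 * n ** 2 + 1      # half of the 6n^2 + 2 boundary nodes
    else:
        raise ValueError("dim must be 2 or 3")
    step1_on = n_dof * n_en * elems_per_cell * n ** dim
    step2_on = n_dof * ((n - 1) ** dim + boundary_half)
    step2_off = n_dof * boundary_half
    return step1_on, step2_on, step2_off


def off_to_on_ratio(n, dim, n_dof, n_en, elems_per_cell=1):
    """Off-processor words divided by all on-processor words (assumed definition)."""
    step1_on, step2_on, step2_off = comm_words(n, dim, n_dof, n_en, elems_per_cell)
    return step2_off / (step1_on + step2_on)


quads = comm_words(8, 2, n_dof=4, n_en=4)
hexes = comm_words(8, 3, n_dof=4, n_en=8)
tets = comm_words(8, 3, n_dof=4, n_en=4, elems_per_cell=5)
```

A useful consistency check is that the Step 2 on-processor and off-processor counts add up to $n_{dof}(n+1)^d$, the total number of nodal values in a partition of $n^d$ cells, and that the ratio shrinks as $n$ grows.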
In Tables 4 and 5 the non-partitioned results are based on the communication strategy outlined in [19], with random distribution of the nodal points to the processors. Notice that the two-step scheme with partitioning offers dramatic speed-ups and that the speed-ups for the three-dimensional problems exceed those for the two-dimensional problems.

6. Numerical Simulations

6.1 3D flow around a cylinder: matrix-free vs. non-matrix-free performance

In this section we consider a three-dimensional simulation of the flow past a circular cylinder. The simple problem geometry allows us to generate meshes of hexahedral and tetrahedral elements with relative ease. Here we use the two meshes shown in Figure 2. The mesh shown in Figure 2 a) consists of 100,907 tetrahedral elements and 21,188 nodes, while the mesh in Figure 2 b) consists of 18,396 hexahedral elements and 21,460 nodes. The steady flow field at Re = 100 is obtained on both meshes with each technique (matrix-free and non-matrix-free). Figure 3 shows the steady-state pressure field around the cylinder obtained with the tetrahedral mesh.
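To make the distinction between the two techniques concrete, the sketch below contrasts a matrix-vector product with a stored matrix against a matrix-free product obtained from residual evaluations. It is only an illustration: the finite-difference form $(R(u + \epsilon v) - R(u))/\epsilon$ is a common way of replacing the product by residual evaluations and is assumed here rather than taken from the paper, and the two-equation `residual` function is a hypothetical stand-in for an element-assembled residual.

```python
def residual(u):
    """Hypothetical nonlinear residual R(u) of a tiny two-unknown system."""
    return [u[0] ** 2 + u[1] - 3.0,
            u[0] + u[1] ** 2 - 5.0]


def matvec_stored(u, v):
    """Non-matrix-free: form the Jacobian dR/du once, then multiply."""
    jac = [[2.0 * u[0], 1.0],
           [1.0, 2.0 * u[1]]]
    return [sum(jac[i][j] * v[j] for j in range(2)) for i in range(2)]


def matvec_matrix_free(res, u, v, eps=1.0e-7):
    """Matrix-free: approximate (dR/du) v by a difference of two residuals.

    Each product costs one extra residual evaluation, so the cost grows
    with the number of inner (Krylov) iterations.
    """
    r0 = res(u)
    r1 = res([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(r1, r0)]


if __name__ == "__main__":
    u, v = [1.0, 2.0], [0.5, -1.0]
    print(matvec_stored(u, v))
    print(matvec_matrix_free(residual, u, v))
```

In the stored variant the matrix is formed once and reused for every Krylov vector, whereas the matrix-free variant pays for a residual evaluation per product; this trade-off is what the break-even points reported for this test case reflect.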
| Mesh | Step 1 On-PN | Step 1 Off-PN | Step 2 On-PN | Step 2 Off-PN | Off-PN to On-PN Ratio |
|---|---|---|---|---|---|
| Quads (N = 128) | | | | | |
| Triangles (N = 128) | | | | | |
| Quads (N = 256) | | | | | |
| Triangles (N = 256) | | | | | |

Table 2. Surface to volume data ratios for the square meshes.

| Mesh | Step 1 On-PN | Step 1 Off-PN | Step 2 On-PN | Step 2 Off-PN | Off-PN to On-PN Ratio |
|---|---|---|---|---|---|
| Hexahedra (N = 24) | | | | | |
| Tetrahedra (N = 24) | | | | | |
| Hexahedra (N = 32) | | | | | |
| Tetrahedra (N = 32) | | | | | |

Table 3. Surface to volume data ratios for the cubic meshes.

| Mesh | Gather, non-partitioned | Gather, partitioned | Scatter, non-partitioned | Scatter, partitioned |
|---|---|---|---|---|
| Quads (N = 128) | 2.1 (30.8 ms) | 7.9 (8.4 ms) | 2.1 (31.3 ms) | 6.4 (10.2 ms) |
| Triangles (N = 128) | 2.0 (48.9 ms) | 12.4 (7.9 ms) | 2.0 (50.7 ms) | 9.5 (10.4 ms) |
| Quads (N = 256) | 1.8 (147.0 ms) | 10.6 (24.7 ms) | 1.7 (154.6 ms) | 8.9 (29.9 ms) |
| Triangles (N = 256) | 1.6 (239.7 ms) | 13.5 (29.4 ms) | 1.6 (247.3 ms) | 12.6 (32.1 ms) |

Table 4. Bandwidth comparison for the square mesh in MB/s/PN.

| Mesh | Gather, non-partitioned | Gather, partitioned | Scatter, non-partitioned | Scatter, partitioned |
|---|---|---|---|---|
| Hexahedra (N = 24) | 1.8 (61.3 ms) | 8.3 (13.3 ms) | 2.1 (51.9 ms) | 8.0 (13.8 ms) |
| Tetrahedra (N = 24) | 1.8 (155.8 ms) | 18.8 (14.7 ms) | 1.7 (159.1 ms) | 14.2 (19.5 ms) |
| Hexahedra (N = 32) | 2.0 (132.6 ms) | 9.3 (28.3 ms) | 1.7 (151.4 ms) | 8.1 (32.3 ms) |
| Tetrahedra (N = 32) | 1.5 (438.1 ms) | 20.2 (32.5 ms) | 1.5 (450.2 ms) | 19.7 (33.3 ms) |

Table 5. Bandwidth comparison for the cubic mesh in MB/s/PN.
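The structure of the two-step scatter timed in Tables 4 and 5 can be illustrated on a single process. The sketch below is an assumption-laden toy, not the library routines used for the measurements: it sums element-level values first into partition-local node copies (the on-processor step) and then into the global node vector (the step in which only nodes shared between partitions would require off-processor traffic). The function names and the 2 × 2 quadrilateral test mesh are hypothetical.

```python
def two_step_scatter(elem_values, connectivity, elem_partition, n_nodes):
    """Sum element-level values into global nodes in two steps."""
    # Step 1 (on-processor): each partition accumulates the contributions
    # of its own elements into partition-local copies of the nodes.
    local = {}  # (partition, node) -> partial sum
    for e, nodes in enumerate(connectivity):
        p = elem_partition[e]
        for a, node in enumerate(nodes):
            local[(p, node)] = local.get((p, node), 0.0) + elem_values[e][a]

    # Step 2: combine the partition-local sums into the global vector.
    # Only nodes touched by more than one partition would need
    # off-processor communication on a distributed-memory machine.
    global_values = [0.0] * n_nodes
    owners = {}
    for (p, node), value in local.items():
        global_values[node] += value
        owners.setdefault(node, set()).add(p)
    shared_nodes = sorted(n for n, ps in owners.items() if len(ps) > 1)
    return global_values, shared_nodes


def direct_scatter(elem_values, connectivity, n_nodes):
    """Reference single-step scatter for comparison."""
    out = [0.0] * n_nodes
    for e, nodes in enumerate(connectivity):
        for a, node in enumerate(nodes):
            out[node] += elem_values[e][a]
    return out


# Hypothetical 2 x 2 mesh of quadrilaterals on a 3 x 3 grid of nodes,
# split into a left partition (0) and a right partition (1).
CONNECTIVITY = [[0, 1, 4, 3], [1, 2, 5, 4], [3, 4, 7, 6], [4, 5, 8, 7]]
PARTITION = [0, 1, 0, 1]

if __name__ == "__main__":
    ones = [[1.0] * 4 for _ in CONNECTIVITY]
    print(two_step_scatter(ones, CONNECTIVITY, PARTITION, 9))
```

With a partitioning that keeps neighbouring elements together, the list of shared nodes stays short relative to the number of nodes handled entirely in the first step, which is the surface-to-volume argument made for Table 1.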
Figure 2. a) Surface of the tetrahedral and b) hexahedral cylinder mesh.

Figure 3. Surface steady pressure field at Reynolds number 100.

The parameter which influences the relative performance of the two techniques is the size of the Krylov space. Since in the matrix-free technique the matrix-vector products are replaced by residual evaluations, it is computation dominated; hence increasing the size of the Krylov space would result in a relative slow-down of the matrix-free technique. Table 6 indicates the time required per non-linear iteration, as well as the overall communication percentage, for different numbers of inner iterations (i.e. Krylov space sizes) for the tetrahedral mesh. Table 7 shows the same data for the hexahedral mesh. The measurements were taken on a CM-5 with 128 processing nodes for the tetrahedral mesh, and on a CM-5 with 32 processing nodes for the hexahedral mesh, resulting in similar subgrid lengths for
the two problems. We observe that for smaller Krylov spaces the matrix-free technique is faster, with a break-even point at around 8 inner iterations in the case of the tetrahedral mesh, and around 30 inner iterations in the case of the hexahedral mesh. The tetrahedral result is similar to findings by Johan [18] for compressible flows. We use 4 Gauss points for the tetrahedral mesh and 8 for the hexahedral mesh. Note that in the current matrix-free implementation we store the values of the shape functions and Jacobians of the element domain transformation. At most (in the case of deforming meshes) they are computed once every non-linear iteration.

| Krylov space size | Matrix-free iteration cost (sec) | Matrix-free comm. percentage | Non-matrix-free iteration cost (sec) | Non-matrix-free comm. percentage |
|---|---|---|---|---|
| | | 18.2% | 1.15 | 18.7% |
| | | 21.8% | 1.19 | 21.3% |
| | | 19.3% | 1.31 | 20.5% |
| | | 21.8% | 1.87 | 27.6% |
| | | 22.8% | 2.51 | 32.1% |
| | | 22.2% | 3.24 | 32.1% |

Table 6. Matrix-free vs. non-matrix-free comparison for the tetrahedral mesh.

| Krylov space size | Matrix-free iteration cost (sec) | Matrix-free comm. percentage | Non-matrix-free iteration cost (sec) | Non-matrix-free comm. percentage |
|---|---|---|---|---|
| | | 17.2% | 4.24 | 9.2% |
| | | 20.5% | 4.62 | 12.2% |
| | | 21.0% | 4.84 | 13.8% |
| | | 21.1% | 6.06 | 21.0% |
| | | 22.9% | 7.68 | 23.1% |
| | | 22.1% | 9.08 | 25.8% |

Table 7. Matrix-free vs. non-matrix-free comparison for the hexahedral mesh.

6.2 Flow around a submarine: partitioning benefits

This simulation involves three-dimensional flow around a Los Angeles-class submarine. The ability to handle completely unstructured meshes is important when studying flows around complex shapes, since it is difficult to construct a structured mesh around a complex three-dimensional object. The semi-automatic structured-mesh generators are generally less flexible and require more user intervention than fully automatic mesh generators designed for unstructured meshes.
An example of the latter is the finite octree tetrahedral mesh generator developed by Shephard [20]. Here, this mesh generator was used to create a mesh around a Los Angeles-class submarine. The input to the mesh generator consisted of a geometric definition of the bounding surfaces of the mesh, including the outer rectangular box and a surface model of the submarine hull. The hull geometric model was digitized from commercially available data and was composed of a number of triangular and rectangular Bézier
patches. The mesh used for the current computations consisted of 86,111 nodes and 428,157 tetrahedral elements. Selected surfaces of that mesh are shown in Figure 4.

Figure 4. Surface of the submarine mesh.

In these initial computations the domain was stationary and therefore a more computationally efficient semi-discrete implementation (Tezduyar et al. [21]) was used in place of the space-time formulation. The boundary conditions consisted of a specified uniform inflow velocity, zero-normal-velocity/zero-shear-stress boundary conditions at the external lateral boundaries, a traction-free outflow boundary, and a no-slip condition on the submarine hull. The Reynolds number is based on the free-stream fluid velocity and submarine length. The computations were restarted from a steady-state solution at Reynolds number

The Reynolds stress was modeled using a Smagorinsky turbulence model after Kato [22]. In this model, the kinematic viscosity $\nu$ is augmented by an eddy viscosity

$$\nu_T = (C h)^2 \left( 2\, \boldsymbol{\varepsilon}(\mathbf{u}) : \boldsymbol{\varepsilon}(\mathbf{u}) \right)^{1/2}, \qquad (7)$$

where $C = 0.15$ is the model constant and $h$ is the element length. In the transient phase of the solution, a Krylov space of size 50 was used in the GMRES solver with no restarts. At each time step 4 nonlinear iterations were performed. A representative result from this preliminary computation is presented in Figure 5, which shows the pressure field on the submarine hull. At this point in the computation, the drag coefficient remained at

The overall sustained performance and communication performance for this simulation are shown in Table 8. The communication performance is shown both for the case of the two-step communication of partitioned data (see Section 5) and for the case of a single-step communication (see Mathur and Johnsson [19]) with random distribution of the nodes. Figure 6 shows the partitioning for 2048 vector units on the surface mesh of the submarine
hull. Table 8 shows, with and without partitioning, the overall speed in GigaFLOPS, the time taken per nonlinear iteration, as well as the gather and scatter bandwidths attained in the GMRES solver. All measurements were taken on a CM-5 computer with 512 processing nodes and 2048 vector execution units. Note that the difference in the FORM phase speed between the partitioned and non-partitioned cases is statistical and/or possibly due to the load on the front end. The partitioning is observed to more than double the overall speed, by decreasing the gather cost by a factor of 7 and the scatter cost by a factor of 3.5.

Figure 5. Pressure distribution on the submarine hull.

Figure 6. Partitioning of the submarine mesh for 2048 vector units.

| | Non-partitioned | Partitioned |
|---|---|---|
| FORM phase speed | 11.5 GigaFLOPS | 12.3 GigaFLOPS |
| Overall speed | 2.4 GigaFLOPS | 5.4 GigaFLOPS |
| Time per iteration | 9.9 sec | 4.4 sec |
| Gather bandwidth | 1.5 MB/s/PN | 10.4 MB/s/PN |
| Scatter bandwidth | 1.8 MB/s/PN | 6.4 MB/s/PN |

Table 8. Performance with and without mesh partitioning.
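The eddy viscosity of equation (7), used in this simulation, is straightforward to evaluate once the velocity gradient in an element is known. The sketch below is illustrative only: it assumes that $\boldsymbol{\varepsilon}(\mathbf{u})$ denotes the symmetric part of the velocity gradient (the usual strain-rate tensor, not defined in this section), and the function names and the simple-shear test values are hypothetical.

```python
import math


def strain_rate(grad_u):
    """Symmetric part of the velocity gradient (assumed meaning of eps(u))."""
    n = len(grad_u)
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(n)]
            for i in range(n)]


def eddy_viscosity(grad_u, h, c=0.15):
    """nu_T = (C h)^2 * sqrt(2 eps(u) : eps(u)), following equation (7).

    grad_u -- velocity gradient as a nested list, grad_u[i][j] = du_i/dx_j
    h      -- element length
    c      -- model constant (0.15 in the text)
    """
    eps = strain_rate(grad_u)
    contraction = sum(e * e for row in eps for e in row)  # eps : eps
    return (c * h) ** 2 * math.sqrt(2.0 * contraction)


if __name__ == "__main__":
    # Hypothetical simple shear du_x/dy = 2 on an element of length 0.1.
    shear = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(eddy_viscosity(shear, h=0.1))
```

For simple shear with rate $\dot{\gamma}$ the expression reduces to $(Ch)^2\,\dot{\gamma}$, which gives a convenient hand check of the implementation.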
7. Concluding Remarks

We have discussed various aspects of a data parallel implementation of finite element methods for computational fluid dynamics. The foundation for such an implementation is the existence of high-level data parallel programming languages such as Connection Machine Fortran or High Performance Fortran. These languages are ideal for exploiting the fine-grain parallelism occurring naturally in finite element problems on large meshes. We based the implementation discussion on a space-time velocity-pressure formulation of the incompressible Navier-Stokes equations, and noted that this discussion is equally relevant to many other formulations, including those that employ conventional time-stepping methods. The issues covered include the selection of two principal data storage modes, the formation of element-level residual vectors, and the iterative solution process used to solve the linear system of equations arising at each nonlinear iteration step. Subsequently we investigated how additional control over the distribution of the data elements in the two storage modes can be used to significantly reduce the cost of communication between these storage sets. Here we used the two-step gather and scatter routines from the Connection Machine Scientific Software Library. Using a 3D flow past a cylinder as an example, we compared the performance of this implementation with a standard GMRES solver against that of its matrix-free version. Finally we presented some results from a 3D simulation of a flow past a complex submarine model, and compared the throughput of both the standard and the two-step communication routines on this practical problem.
The preconditioning of the linear system arising from the finite element formulation is still an open issue, especially significant in the incompressible case, where some degree of global (i.e., not local to element or node) preconditioning can dramatically improve convergence. In the examples presented here, only a diagonal preconditioning/scaling has been used.

8. Acknowledgments

This research was sponsored by NASA-JSC under grant NAG 9-449, by NSF under grants CTS and ASC, by ARPA under NIST contract 60NANB2D1272, and by ARO under grant DAAH04-93-G. Partial support for this work has also come from the ARO contract number DAAL03-89-C-0038 with the AHPCRC at the University of Minnesota. We are indebted to Zdenek Johan for helpful comments and for providing access to his CM-5 implementations of both the RSB algorithm for data decomposition and the two-step gather and scatter algorithms. We are also indebted to Kapil Mathur for helpful comments and his contributions to the two-step gather and scatter algorithms.

References

[1] J.G. Malone, Automatic mesh decomposition and concurrent finite element analysis for hypercube multiprocessor computers, Computer Methods in Applied Mechanics and Engineering, 70 (1988)
[2] C. Farhat and E. Wilson, A new finite element concurrent computer program architecture, International Journal for Numerical Methods in Engineering, 24 (1987)
[3] G.A. Lyzenga, A. Raefsky, and B.H. Hager, Finite elements and the method of conjugate gradients on concurrent processors, Report C3P-119, California Institute of Technology, Pasadena, CA,
[4] K.K. Mathur and S.L. Johnsson, The finite element method on a data parallel computing system, International Journal of High Speed Computing, 1 (1989)
[5] T. Belytschko, E.J. Plaskacz, J.M. Kennedy, and D.L. Greenwell, Finite element analysis on the Connection Machine, Computer Methods in Applied Mechanics and Engineering, 81 (1990)
[6] R.A. Shapiro, Implementation of an Euler/Navier-Stokes finite element algorithm on the Connection Machine, in AIAA , AIAA 29th Aerospace Sciences Meeting, (1991).
[7] C. Farhat, N. Sobh, and K.C. Park, Transient finite element computations on 65,536 processors: The Connection Machine, International Journal for Numerical Methods in Engineering, 30 (1990)
[8] Z. Johan, T.J.R. Hughes, K.K. Mathur, and S.L. Johnsson, A data parallel finite element method for computational fluid dynamics on the Connection Machine system, Computer Methods in Applied Mechanics and Engineering, 99 (1992)
[9] M. Behr, A. Johnson, J. Kennedy, S. Mittal, and T.E. Tezduyar, Computation of incompressible flows with implicit finite element implementations on the Connection Machine, Computer Methods in Applied Mechanics and Engineering, 108 (1993)
[10] Z. Johan, K.K. Mathur, S.L. Johnsson, and T.J.R. Hughes, An efficient communications strategy for finite element methods on the Connection Machine CM-5 system, Computer Methods in Applied Mechanics and Engineering, 113 (1994)
[11] A. Pothen, H.D. Simon, and K.P. Liou, Partitioning sparse matrices with eigenvectors of graphs, SIAM Journal on Matrix Analysis and Applications, 11 (1990)
[12] H.D.
Simon, Partitioning of unstructured problems for parallel processing, Computing Systems in Engineering, 2 (1991)
[13] C.H. Koelbel, D.B. Loveman, R.S. Schreiber, G.L. Steele Jr., and M.E. Zosel, The High Performance Fortran Handbook. MIT Press, Cambridge, MA, 1994, ISBN
[14] Thinking Machines Corporation, 245 First Street, Cambridge, MA 02142, CM Fortran Reference Manual, Versions 1.0 and 1.1, 1991.
[15] T.E. Tezduyar, M. Behr, and J. Liou, A new strategy for finite element computations involving moving boundaries and interfaces - the deforming-spatial-domain/space-time procedure: I. The concept and the preliminary tests, Computer Methods in Applied Mechanics and Engineering, 94 (1992)
[16] T.E. Tezduyar, M. Behr, S. Mittal, and J. Liou, A new strategy for finite element computations involving moving boundaries and interfaces - the deforming-spatial-domain/space-time procedure: II. Computation of free-surface flows, two-liquid flows, and flows with drifting cylinders, Computer Methods in Applied Mechanics and Engineering, 94 (1992)
[17] S.L. Johnsson and K.K. Mathur, Experience with the conjugate gradient method for stress analysis on a data parallel supercomputer, International Journal for Numerical Methods in Engineering, 27 (1989)
[18] Z. Johan, Data Parallel Finite Element Techniques for Large-Scale Computational Fluid Dynamics, Ph.D. thesis, Department of Mechanical Engineering, Stanford University,
[19] K.K. Mathur and S.L. Johnsson, Communication primitives for unstructured finite element simulations on data parallel architectures, Computing Systems in Engineering, 3 (1992)
[20] M.S. Shephard and M.K. Georges, Automatic three-dimensional mesh generation by the finite octree technique, International Journal for Numerical Methods in Engineering, 32 (1991)
[21] T.E. Tezduyar, S. Mittal, S.E. Ray, and R. Shih, Incompressible flow computations with stabilized bilinear and linear equal-order-interpolation velocity-pressure elements, Computer Methods in Applied Mechanics and Engineering, 95 (1992)
[22] C. Kato and M. Ikegawa, Large eddy simulation of unsteady turbulent wake of a circular cylinder using the finite element method, in I. Celik, T. Kobayashi, K.N. Ghia, and J. Kurokawa, editors, Advances in Numerical Simulation of Turbulent Flows, FED-Vol.117, ASME, New York, (1991)
More informationImprovements in Dynamic Partitioning. Aman Arora Snehal Chitnavis
Improvements in Dynamic Partitioning Aman Arora Snehal Chitnavis Introduction Partitioning - Decomposition & Assignment Break up computation into maximum number of small concurrent computations that can
More informationOptimization to Reduce Automobile Cabin Noise
EngOpt 2008 - International Conference on Engineering Optimization Rio de Janeiro, Brazil, 01-05 June 2008. Optimization to Reduce Automobile Cabin Noise Harold Thomas, Dilip Mandal, and Narayanan Pagaldipti
More informationApplication of GPU-Based Computing to Large Scale Finite Element Analysis of Three-Dimensional Structures
Paper 6 Civil-Comp Press, 2012 Proceedings of the Eighth International Conference on Engineering Computational Technology, B.H.V. Topping, (Editor), Civil-Comp Press, Stirlingshire, Scotland Application
More information(LSS Erlangen, Simon Bogner, Ulrich Rüde, Thomas Pohl, Nils Thürey in collaboration with many more
Parallel Free-Surface Extension of the Lattice-Boltzmann Method A Lattice-Boltzmann Approach for Simulation of Two-Phase Flows Stefan Donath (LSS Erlangen, stefan.donath@informatik.uni-erlangen.de) Simon
More informationHPC Algorithms and Applications
HPC Algorithms and Applications Dwarf #5 Structured Grids Michael Bader Winter 2012/2013 Dwarf #5 Structured Grids, Winter 2012/2013 1 Dwarf #5 Structured Grids 1. dense linear algebra 2. sparse linear
More informationPARALLEL COMPUTING. Tayfun E. Tezduyar. Ahmed Sameh
1 FINITE ELEMENT METHODS: 1970 s AND BEYOND L.P. Franca, T.E. Tezduyar and A. Masud (Eds.) c CIMNE, Barcelona, Spain 2004 PARALLEL COMPUTING Tayfun E. Tezduyar Mechanical Engineering Rice University MS
More informationStream Function-Vorticity CFD Solver MAE 6263
Stream Function-Vorticity CFD Solver MAE 66 Charles O Neill April, 00 Abstract A finite difference CFD solver was developed for transient, two-dimensional Cartesian viscous flows. Flow parameters are solved
More informationSubdivision-stabilised immersed b-spline finite elements for moving boundary flows
Subdivision-stabilised immersed b-spline finite elements for moving boundary flows T. Rüberg, F. Cirak Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K. Abstract
More informationHigh-Order Numerical Algorithms for Steady and Unsteady Simulation of Viscous Compressible Flow with Shocks (Grant FA )
High-Order Numerical Algorithms for Steady and Unsteady Simulation of Viscous Compressible Flow with Shocks (Grant FA9550-07-0195) Sachin Premasuthan, Kui Ou, Patrice Castonguay, Lala Li, Yves Allaneau,
More informationContents. I The Basic Framework for Stationary Problems 1
page v Preface xiii I The Basic Framework for Stationary Problems 1 1 Some model PDEs 3 1.1 Laplace s equation; elliptic BVPs... 3 1.1.1 Physical experiments modeled by Laplace s equation... 5 1.2 Other
More informationElement Quality Metrics for Higher-Order Bernstein Bézier Elements
Element Quality Metrics for Higher-Order Bernstein Bézier Elements Luke Engvall and John A. Evans Abstract In this note, we review the interpolation theory for curvilinear finite elements originally derived
More informationApplication of Finite Volume Method for Structural Analysis
Application of Finite Volume Method for Structural Analysis Saeed-Reza Sabbagh-Yazdi and Milad Bayatlou Associate Professor, Civil Engineering Department of KNToosi University of Technology, PostGraduate
More informationSteady Flow: Lid-Driven Cavity Flow
STAR-CCM+ User Guide Steady Flow: Lid-Driven Cavity Flow 2 Steady Flow: Lid-Driven Cavity Flow This tutorial demonstrates the performance of STAR-CCM+ in solving a traditional square lid-driven cavity
More informationThe Immersed Interface Method
The Immersed Interface Method Numerical Solutions of PDEs Involving Interfaces and Irregular Domains Zhiiin Li Kazufumi Ito North Carolina State University Raleigh, North Carolina Society for Industrial
More informationEfficiency Aspects for Advanced Fluid Finite Element Formulations
Proceedings of the 5 th International Conference on Computation of Shell and Spatial Structures June 1-4, 2005 Salzburg, Austria E. Ramm, W. A. Wall, K.-U. Bletzinger, M. Bischoff (eds.) www.iassiacm2005.de
More informationLab 9: FLUENT: Transient Natural Convection Between Concentric Cylinders
Lab 9: FLUENT: Transient Natural Convection Between Concentric Cylinders Objective: The objective of this laboratory is to introduce how to use FLUENT to solve both transient and natural convection problems.
More informationIntroduction to C omputational F luid Dynamics. D. Murrin
Introduction to C omputational F luid Dynamics D. Murrin Computational fluid dynamics (CFD) is the science of predicting fluid flow, heat transfer, mass transfer, chemical reactions, and related phenomena
More informationPredicting Tumour Location by Modelling the Deformation of the Breast using Nonlinear Elasticity
Predicting Tumour Location by Modelling the Deformation of the Breast using Nonlinear Elasticity November 8th, 2006 Outline Motivation Motivation Motivation for Modelling Breast Deformation Mesh Generation
More informationReproducibility of Complex Turbulent Flow Using Commercially-Available CFD Software
Reports of Research Institute for Applied Mechanics, Kyushu University No.150 (71 83) March 2016 Reproducibility of Complex Turbulent Flow Using Commercially-Available CFD Software Report 3: For the Case
More informationAN IMPROVED METHOD TO MODEL SEMI-ELLIPTICAL SURFACE CRACKS USING ELEMENT MISMATCH IN ABAQUS
AN IMPROVED METHOD TO MODEL SEMI-ELLIPTICAL SURFACE CRACKS USING ELEMENT MISMATCH IN ABAQUS R. H. A. Latiff and F. Yusof School of Mechanical Engineering, UniversitiSains, Malaysia E-Mail: mefeizal@usm.my
More informationTHE EFFECTS OF THE PLANFORM SHAPE ON DRAG POLAR CURVES OF WINGS: FLUID-STRUCTURE INTERACTION ANALYSES RESULTS
March 18-20, 2013 THE EFFECTS OF THE PLANFORM SHAPE ON DRAG POLAR CURVES OF WINGS: FLUID-STRUCTURE INTERACTION ANALYSES RESULTS Authors: M.R. Chiarelli, M. Ciabattari, M. Cagnoni, G. Lombardi Speaker:
More informationANSYS FLUENT. Airfoil Analysis and Tutorial
ANSYS FLUENT Airfoil Analysis and Tutorial ENGR083: Fluid Mechanics II Terry Yu 5/11/2017 Abstract The NACA 0012 airfoil was one of the earliest airfoils created. Its mathematically simple shape and age
More informationLagrangian and Eulerian Representations of Fluid Flow: Kinematics and the Equations of Motion
Lagrangian and Eulerian Representations of Fluid Flow: Kinematics and the Equations of Motion James F. Price Woods Hole Oceanographic Institution Woods Hole, MA, 02543 July 31, 2006 Summary: This essay
More informationCFD Analysis of 2-D Unsteady Flow Past a Square Cylinder at an Angle of Incidence
CFD Analysis of 2-D Unsteady Flow Past a Square Cylinder at an Angle of Incidence Kavya H.P, Banjara Kotresha 2, Kishan Naik 3 Dept. of Studies in Mechanical Engineering, University BDT College of Engineering,
More informationFrom Hyperbolic Diffusion Scheme to Gradient Method: Implicit Green-Gauss Gradients for Unstructured Grids
Preprint accepted in Journal of Computational Physics. https://doi.org/10.1016/j.jcp.2018.06.019 From Hyperbolic Diffusion Scheme to Gradient Method: Implicit Green-Gauss Gradients for Unstructured Grids
More informationIncompressible Viscous Flow Simulations Using the Petrov-Galerkin Finite Element Method
Copyright c 2007 ICCES ICCES, vol.4, no.1, pp.11-18, 2007 Incompressible Viscous Flow Simulations Using the Petrov-Galerkin Finite Element Method Kazuhiko Kakuda 1, Tomohiro Aiso 1 and Shinichiro Miura
More informationCase C1.3: Flow Over the NACA 0012 Airfoil: Subsonic Inviscid, Transonic Inviscid, and Subsonic Laminar Flows
Case C1.3: Flow Over the NACA 0012 Airfoil: Subsonic Inviscid, Transonic Inviscid, and Subsonic Laminar Flows Masayuki Yano and David L. Darmofal Aerospace Computational Design Laboratory, Massachusetts
More informationA MULTI-DOMAIN ALE ALGORITHM FOR SIMULATING FLOWS INSIDE FREE-PISTON DRIVEN HYPERSONIC TEST FACILITIES
A MULTI-DOMAIN ALE ALGORITHM FOR SIMULATING FLOWS INSIDE FREE-PISTON DRIVEN HYPERSONIC TEST FACILITIES Khalil Bensassi, and Herman Deconinck Von Karman Institute for Fluid Dynamics Aeronautics & Aerospace
More informationThe Development of a Navier-Stokes Flow Solver with Preconditioning Method on Unstructured Grids
Proceedings of the International MultiConference of Engineers and Computer Scientists 213 Vol II, IMECS 213, March 13-15, 213, Hong Kong The Development of a Navier-Stokes Flow Solver with Preconditioning
More informationCase C2.2: Turbulent, Transonic Flow over an RAE 2822 Airfoil
Case C2.2: Turbulent, Transonic Flow over an RAE 2822 Airfoil Masayuki Yano and David L. Darmofal Aerospace Computational Design Laboratory, Massachusetts Institute of Technology I. Code Description ProjectX
More informationA COUPLED FINITE VOLUME SOLVER FOR THE SOLUTION OF LAMINAR TURBULENT INCOMPRESSIBLE AND COMPRESSIBLE FLOWS
A COUPLED FINITE VOLUME SOLVER FOR THE SOLUTION OF LAMINAR TURBULENT INCOMPRESSIBLE AND COMPRESSIBLE FLOWS L. Mangani Maschinentechnik CC Fluidmechanik und Hydromaschinen Hochschule Luzern Technik& Architektur
More informationUnstructured Mesh Generation for Implicit Moving Geometries and Level Set Applications
Unstructured Mesh Generation for Implicit Moving Geometries and Level Set Applications Per-Olof Persson (persson@mit.edu) Department of Mathematics Massachusetts Institute of Technology http://www.mit.edu/
More informationEulerian Techniques for Fluid-Structure Interactions - Part II: Applications
Published in Lecture Notes in Computational Science and Engineering Vol. 103, Proceedings of ENUMATH 2013, pp. 755-762, Springer, 2014 Eulerian Techniques for Fluid-Structure Interactions - Part II: Applications
More informationPRACE Workshop, Worksheet 2
PRACE Workshop, Worksheet 2 Stockholm, December 3, 2013. 0 Download files http://csc.kth.se/ rvda/prace files ws2.tar.gz. 1 Introduction In this exercise, you will have the opportunity to work with a real
More informationEDICT for 3D computation of two- uid interfaces q
Comput. Methods Appl. Mech. Engrg. 190 (2000) 403±410 www.elsevier.com/locate/cma EDICT for 3D computation of two- uid interfaces q Tayfun E. Tezduyar a, *, Shahrouz Aliabadi b a Mechanical Engineering
More informationComputational Study of Laminar Flowfield around a Square Cylinder using Ansys Fluent
MEGR 7090-003, Computational Fluid Dynamics :1 7 Spring 2015 Computational Study of Laminar Flowfield around a Square Cylinder using Ansys Fluent Rahul R Upadhyay Master of Science, Dept of Mechanical
More informationCalculate a solution using the pressure-based coupled solver.
Tutorial 19. Modeling Cavitation Introduction This tutorial examines the pressure-driven cavitating flow of water through a sharpedged orifice. This is a typical configuration in fuel injectors, and brings
More informationISSN(PRINT): ,(ONLINE): ,VOLUME-1,ISSUE-1,
NUMERICAL ANALYSIS OF THE TUBE BANK PRESSURE DROP OF A SHELL AND TUBE HEAT EXCHANGER Kartik Ajugia, Kunal Bhavsar Lecturer, Mechanical Department, SJCET Mumbai University, Maharashtra Assistant Professor,
More informationUsing Semi-Regular 4 8 Meshes for Subdivision Surfaces
Using Semi-Regular 8 Meshes for Subdivision Surfaces Luiz Velho IMPA Instituto de Matemática Pura e Aplicada Abstract. Semi-regular 8 meshes are refinable triangulated quadrangulations. They provide a
More informationModeling Unsteady Compressible Flow
Tutorial 4. Modeling Unsteady Compressible Flow Introduction In this tutorial, FLUENT s density-based implicit solver is used to predict the timedependent flow through a two-dimensional nozzle. As an initial
More information