Technische Universität München, Fakultät für Informatik. Computational Science and Engineering (Int. Master's Program)


Technische Universität München, Fakultät für Informatik
Computational Science and Engineering (Int. Master's Program)

Parallel Refinement and Coarsening of Recursively Structured Adaptive Triangular Grids

Master's Thesis
Anas Obeidat

1st examiner: Jun.-Prof. Dr. Michael Bader
2nd examiner: Univ.-Prof. Dr. Hans-Joachim Bungartz
Assistant advisor: Csaba Vigh, M.Sc.
Thesis handed in on:

I hereby declare that this thesis is entirely the result of my own work except where otherwise indicated. I have only used the resources given in the list of references.

Date                                        Anas Obeidat

Abstract

This master's thesis is concerned with the parallel implementation of refinement and coarsening schemes for recursively structured adaptive triangular grids. The implementation extends previous work on simulating the propagation of oceanic waves (tsunami simulation), which uses the Sierpinski space-filling curve and a system of stacks to exchange information.

Acknowledgment

First of all I would like to thank Jun.-Prof. Michael Bader for his guidance on this interesting and challenging topic. I also want to acknowledge the cooperation and support I received from Csaba Vigh, M.Sc.: I thank him for his strong support and advice during the last six months, even though he was in the US for the last three of them. I likewise thank my partner in this thesis, Ömer Demirel, for his friendship and collaboration. Finally, I would like to express my respect and appreciation to Univ.-Prof. Hans-Joachim Bungartz and his Scientific Computing chair at TU München for giving me the opportunity to take part in this international master's program, where I obtained deep knowledge and advanced skills in computational science.

Contents

1 Introduction 1
2 Mathematical Modeling
  2.1 Shallow water equation
  2.2 Vector form of SWE
  2.3 Weak form of the Shallow Water Equations
  2.4 Discretization
  2.5 Numerical Flux
3 The Grid
  3.1 Grid Generation
  3.2 Saving the Grid
  3.3 Traversing the Grid
    3.3.1 Sierpinski filling curve
4 Triangle Types and Stack System
  4.1 Triangle Types
    Colouring
  4.2 The Stacks System
    The New Stacks
5 Parallel grid and load redistribution 24

  5.1 Parallel grid
    Implementing the parallel grid
  5.2 Load redistribution
    Diffusion
    All-to-All
6 Test cases and results
  6.1 Dynamic Symmetric Circle
  6.2 Dynamic non-symmetric Circle
  6.3 Moving Target
  6.4 Results
7 Implementation
  Fork and Join
  Initial Traversal
  Empty Traversal
  Adapt_mark Traversal
  Edge_mark Traversal
  Adapt Traversal
  The implementation
Outlook 39
Bibliography 40

1 Introduction

This work is part of the Propagation of Oceanic Waves project (tsunami simulation). The thesis implements a parallel refinement and coarsening framework, the next step after a previous framework that provided refinement and coarsening of a recursively structured adaptive triangular grid; the main focus of my work is to run this framework in parallel. Some extra algorithms and schemes were implemented in the new framework, taking care not to lose the basic ideas of the old one: the recursively structured grid, traversing the grid via the Sierpinski curve, and the stack system.

Chapter 2 covers the mathematical modeling; although I did not contribute to this part, as my focus was on the grid itself, I give a short overview of the model, such as the shallow water equations and their vector form. Chapter 3 discusses the grid: generating, saving, and traversing it. In Chapter 4 I explain the old stack system, the new one, and the triangle types. Implementing the grid in parallel and the schemes used for load redistribution are covered in Chapter 5. Then I present the test cases and results, followed by the implementation, and finally a short outlook on the next steps.

2 Mathematical Modeling

As my work contributes to the simulation of the propagation of oceanic waves (tsunami simulation) and is concerned with the data structures and the grid itself, I only describe the mathematical modeling briefly here; it was presented by Schweiger [12], Böck [7], and finally by Demirel [8]. The modeling is concerned with solving the shallow water equations using a Discontinuous Galerkin Solver (DGS).

2.1 Shallow water equation

The shallow water equations (SWE) are widely used to model the dynamics of incompressible fluids, in our situation water. They are a nonlinear set of hyperbolic partial differential equations (PDEs) expressing mass and momentum conservation laws [8]. The SWE can be used to model flows in the ocean, since the depth scale of the flow is very short compared to its length; the same holds for tsunami waves. The shallow water equations are obtained by depth-integrating the three-dimensional Navier-Stokes equations, as follows [1]:

    \frac{\partial \xi}{\partial t} + \nabla \cdot (H v) = 0    (2.1)

    \frac{\partial v}{\partial t} + (v \cdot \nabla) v + \tau_{bf} v + f_c \, k \times v + g \nabla \xi - \frac{\nu_T}{H} \nabla \cdot (H \nabla v) = \frac{1}{H} F    (2.2)

where

ξ: the vertical deviation from the flat ocean surface.
v: the velocities in x- and y-directions, v = (v_x, v_y).
H: the total height of the wave, modeled as H = ξ + h_b.

F: external forcing terms, such as atmospheric pressure.
τ_bf: the friction coefficient on the bottom of the ocean.
f_c: the fictitious (Coriolis) force.
k: the respective local normal vector.
g: gravitational acceleration.
ν_T: the depth-averaged viscosity of the fluid.

For more details see [8].

2.2 Vector form of SWE

To be able to solve the SWE, a vector form can be derived in several ways, see [1], [2]; the method presented by [10] was used to obtain the following vector form [8]:

    \frac{\partial u}{\partial t} + \operatorname{div} F(u) = 0    (2.3)

2.3 Weak form of the Shallow Water Equations

Since we want to avoid the strong form of the SWE, a Discontinuous Galerkin Solver is used, as it works with the weak form. The derivation of the weak form can be found in [8], [12]; the SWE in weak form read:

    \int_\Omega \frac{\partial \xi_j}{\partial t} \varphi \, dx\,dy + \int_{\partial\Omega} F_j(\xi) \cdot n \, \varphi \, ds - \int_\Omega F_j(\xi) \cdot \nabla\varphi \, dx\,dy = 0, \quad \forall \varphi \in V, \; j = 1:3    (2.4)

where the integer j denotes the jth component of the flux vector F, ∂Ω is the boundary of Ω, and n is the outward normal vector on that boundary [8].

2.4 Discretization

A triangulation scheme was chosen in the previous work presented by Bader [4] to discretize the domain, in order to solve the problem numerically with the DGS; more details about the domain and the triangulation are given in the next chapter. Each triangular element T_k carries a polynomial function p_k which we do not require to be continuous across shared edges. This gives us the test space V that the DGS needs, consisting of piecewise discontinuous polynomials p_k on their respective elements [8], [12]:

    V_m = \{ p(x, y) : T_k \to \mathbb{R} \mid p \text{ is a polynomial of degree } n, \; n \le m, \text{ on } T_k \}

2.5 Numerical Flux

In our domain, physical quantities such as mass and impulse need to be exchanged between neighbouring elements. Because of the discontinuity we cannot evaluate these quantities on the triangle boundaries directly; instead we approximate them on the boundaries using numerical fluxes F on every shared edge. These fluxes are exchanged during the traversals to ensure a correct simulation.
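The thesis does not spell out the flux formula at this point. As an illustration only, the following Python sketch shows a common textbook choice, the Lax-Friedrichs flux, which combines the two one-sided states on a shared edge with a stabilising jump term; this is a generic example, not necessarily the flux used in this work, and all names are illustrative:

```python
def lax_friedrichs_flux(f, u_left, u_right, wave_speed):
    """Generic Lax-Friedrichs numerical flux on a shared edge:
    the average of the two one-sided physical fluxes minus a
    stabilising term proportional to the jump in the states.
    (Illustrative only; the thesis does not fix the flux here.)"""
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * wave_speed * (u_right - u_left)
```

For a smooth state (u_left == u_right) the jump term vanishes and the numerical flux reduces to the physical flux, which is the consistency property any numerical flux must satisfy.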

3 The Grid

In this chapter we discuss the grid: how to generate, save, and traverse it. The propagation of ocean waves poses several challenges on the grid and data structure side, as we require:

- strong adaptivity;
- a huge number of cells;
- efficiency in storing the grid;
- load balancing and distribution of the grid's cells over the processes;
- acceptable speed when refining and coarsening the grid.

All of these challenges and requirements make us think carefully about how to generate and save the grid.

3.1 Grid Generation

Any successful computational science simulation needs an accurate and fast solution, and a good discretization of the geometry is crucial for a high-quality numerical solution. In our code we adopt a structured grid (Figure 3.1), which allows us to store the computational grid efficiently, as it reduces the amount of grid information that needs to be stored.

Figure 3.1: Structured Grid

A research group [5] has presented a method based on recursively substructured grids. We use this method because it allows us to implement an iterative multigrid solver, which is important for quickly solving the system of equations obtained from the discretization, with a minimal amount of memory. Our grid (the computational domain Ω) is therefore constructed by recursive bisection of a triangular grid: we start from a parent triangle cell, and each cell is recursively subdivided into two children until the desired resolution or level of adaptive refinement is reached (Figure 3.2).

Figure 3.2: Recursive Bisection of Triangular Grid
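The bisection step splits a right isosceles triangle at the midpoint of its hypotenuse into two similar children. The following Python sketch illustrates this; the vertex ordering (the two hypotenuse endpoints first, then the apex) is an assumption for the sketch, not the thesis's actual data layout:

```python
def bisect(triangle):
    """Split a triangle (hyp_a, hyp_b, apex) at the midpoint of its
    hypotenuse into two children.  In each child, the former apex
    becomes a hypotenuse endpoint and the midpoint becomes the apex."""
    a, b, apex = triangle
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    # each child again has the form (hyp_a, hyp_b, apex)
    return (a, apex, mid), (apex, b, mid)
```

Applying `bisect` repeatedly to each child yields exactly the recursive refinement of Figure 3.2.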

The grid is supposed to meet the following requirements:

1. Adaptivity: From the mathematical point of view we want the error to be as small as possible, and the finer the cells, the more accurate the results; from the computer science point of view, however, finer cells mean more memory. To balance both, we refine the mesh only in the places where more activity is expected, and coarsen as much as we can everywhere else.

2. Conformity [5]: No hanging nodes are allowed in the generated adaptive grid. Hanging nodes can appear when only one of the two cells adjacent to a marked edge is bisected. To avoid this, communication is required during the refinement of the adaptive grid: when a cell is chosen to be refined, its adjacent cells usually have to be refined too to keep the grid conforming. This forced refinement may in turn force the refinement of further cells, and so on, which we call a cascade of refinement, as shown in Figure 3.3.

Figure 3.3: Refinement cascade: the requested refinement of the dark-coloured cell (thick line) forces the refinement of four further cells (dashed lines) [5]

3.2 Saving the Grid

From the computer science point of view we try to save as little information about the grid as possible. A recursively structured adaptive grid, characterized by solid neighbour relations and recursive refinement, leaves us with very little information that actually has to be stored.

One way of representing such a grid is a binary tree, which we call the refinement tree (Figure 3.4).

Figure 3.4: Recursively constructed Triangular Grid and its corresponding Binary Tree [6]

We store only the neighbourhood relation on the coarsest level: the nodes of the tree represent triangles, the root being the parent cell (triangle). Refining the parent cell is represented by giving that node two children, and continuing until the desired level yields a binary tree that represents the refined grid (Figure 3.5). With this scheme, the triangles used for the numerical calculation are exactly the leaves of the refinement tree.

Figure 3.5: Step by step refinement of the grid and construction of its corresponding binary tree [16]

Traversing the refinement tree in depth-first order gives a sequential order of the grid cells that is equivalent to the Sierpinski space-filling curve order, which linearizes the grid. We can therefore represent the grid's cells as a bitstream (triangle_stream), needing only one bit per cell (triangle) to encode the refinement information.
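The one-bit-per-cell encoding can be sketched in a few lines. The following Python snippet is illustrative only (the thesis implementation stores the bits in a Fortran triangle stream): refined cells emit 1, leaves emit 0, in depth-first order, and the inverse traversal rebuilds the tree from the bits:

```python
class Cell:
    """A node of the refinement tree; leaves are the numerical cells."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

    @property
    def is_leaf(self):
        return self.left is None and self.right is None

def to_bitstream(cell):
    """Depth-first serialisation: refined cells emit 1, leaves emit 0."""
    if cell.is_leaf:
        return [0]
    return [1] + to_bitstream(cell.left) + to_bitstream(cell.right)

def from_bitstream(bits):
    """Rebuild the refinement tree from its bitstream (inverse traversal)."""
    it = iter(bits)
    def build():
        if next(it) == 0:
            return Cell()
        left = build()   # depth-first: first child subtree,
        right = build()  # then second child subtree
        return Cell(left, right)
    return build()
```

A grid whose first child is refined once more serialises to `[1, 1, 0, 0, 0]`, and `from_bitstream` recovers the identical tree, which is why no neighbour information needs to be stored alongside the stream.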

Here:

- refined cells are labeled 1;
- leaves are labeled 0.

Traversing the refinement tree of the adaptive grid in Figure 3.4 in depth-first order yields the corresponding bitstream.

3.3 Traversing the Grid

Since we use an iterative scheme, we want to run through this grid repeatedly. Doing so simultaneously for all cells would require storing the location of each cell together with its neighbours, which is expensive. Instead we traverse the grid in a cell-oriented way, cell by cell, forward and backward, performing as many adaptive refinement and coarsening traversals as required. The difficulty is then to know the order in which the elements are traversed, and to make sure that every element is visited exactly once. As a solution we use the algorithmic scheme presented by Bader and Zenger [4], which allows us to implement such an iterative solver without storing the neighbour relationships. The algorithm is based on a space-filling curve, and for a triangular grid this is the Sierpinski filling curve.

3.3.1 Sierpinski filling curve

Several space-filling curve techniques such as Peano, Hilbert, and Sierpinski are available, but the first two work on grids subdivided into squares, whereas the Sierpinski curve uses a triangular grid and therefore fits our domain perfectly [11]. Figure 3.6 shows the generation of the Sierpinski curve in one coarse triangle and the levels of the Sierpinski curve as the parent cell is refined.

Figure 3.6: The first six levels of the Sierpinski filling curve [14]

As illustrated in the figure, the curve runs through all the cells (elements), traversing the grid in a linear way. Traversing the refinement tree in depth-first order, which produces the linear bitstream, and traversing the grid using the Sierpinski curve give the same order (Figure 3.7); since we do not need to save the neighbourhood relations, no extra memory is needed.

Figure 3.7: Traversing the grid using the Sierpinski curve and the corresponding refinement tree [14]

4 Triangle Types and Stack System

4.1 Triangle Types

As mentioned by Schweiger [12], the Sierpinski curve traverses the grid parallel to the hypotenuse as it enters a grid cell. This rule divides the grid cells into three types according to how the Sierpinski curve visits them (Figure 4.1):

H: the SFC enters through the hypotenuse and exits through a leg.
K: the SFC enters through a leg and exits through the hypotenuse.
V: the SFC enters through one leg and exits through the other.

Figure 4.1: Grid's triangle types according to how the SFC visits them.

Refining these triangles generates six recursive relations, as shown in Figure 4.2:

H_o -> (V_o, K_o),  H_n -> (V_n, K_o),  V_o -> (H_o, K_o),  V_n -> (H_n, K_n),  K_o -> (H_n, V_o),  K_n -> (H_n, V_n),

where o stands for an old edge, the edge that carries the current flux, and n stands for a new edge, whose flux will be updated.
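The six relations can be written down directly as a lookup table. The following Python sketch (names are illustrative, not taken from the thesis code) expands a cell type recursively into the leaf types of a uniformly refined subtree:

```python
# The six recursive refinement relations of Figure 4.2 as a lookup table.
# Keys are (type, age): type 'H'/'K'/'V', age 'o' (old edge) / 'n' (new edge).
REFINE = {
    ('H', 'o'): [('V', 'o'), ('K', 'o')],
    ('H', 'n'): [('V', 'n'), ('K', 'o')],
    ('V', 'o'): [('H', 'o'), ('K', 'o')],
    ('V', 'n'): [('H', 'n'), ('K', 'n')],
    ('K', 'o'): [('H', 'n'), ('V', 'o')],
    ('K', 'n'): [('H', 'n'), ('V', 'n')],
}

def refine_to_depth(cell, depth):
    """Recursively expand a cell type into the leaf types obtained after
    `depth` uniform bisections, in Sierpinski (depth-first) order."""
    if depth == 0:
        return [cell]
    first, second = REFINE[cell]
    return refine_to_depth(first, depth - 1) + refine_to_depth(second, depth - 1)
```

Because the table is closed under refinement, every cell of an arbitrarily deep grid is again one of the six types, which is what makes the stack-based traversal possible.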

An old edge carries the current flux term; a new edge is one whose flux value will be updated in the next iteration [12].

Figure 4.2: Triangle types, six recursive relations

Colouring

The colouring scheme was introduced in order to know from which stack an edge should be pushed or popped. We use the triangle types to colour the edges (green or red), which tells us exactly to which stack the current element's edge should be pushed or from which it should be popped.

Node colouring

In Schraufstetter's work [14], the triangles' nodes were pushed/popped to the corresponding stack, since a classical finite element method was used, where unknowns are located at nodes in addition to cell- and edge-located unknowns. Node colouring can be seen in Figure 4.3.

Figure 4.3: Node Colouring [14]

As we use a DGS, where the data is situated on the edges (fluxes), and we also need to synchronize the refined edges, an edge colouring scheme was chosen instead.

Triangle's reference colour

Before discussing edge colouring we explain the triangle's reference colour, which is used in the edge colouring as we will see next. The triangle's reference colour was also described by Schraufstetter [14]: we start with four green coarse triangles, to which we assign the reference colour manually (Figure 4.4).

Figure 4.4: Four green coarse triangles

The scheme is simple: when we refine a parent triangle, the two children take the opposite reference colour of the parent, so if the parent's reference colour is green, the children's reference colour will be red (Figure 4.5). Using the triangle types alone to colour the edges would be enough, but in our code we still use the triangle's reference colour to help with that.

Figure 4.5: Triangle colouring

Edge colouring

The edge colouring scheme was introduced by Vigh [15]. We use the triangle's reference colour and the triangle type to colour the edges (Figure 4.6). The rule for colouring the edges is: the hypotenuse takes the opposite colour of the triangle, while the two legs take the triangle's colour.

What we are really interested in, however, is the edge that the Sierpinski curve does not cross; we call it the colour_edge. We nevertheless colour the other edges according to the rule as well. For example:

H_o/n: the colour_edge is the left leg, so it takes the same colour as the triangle.
K_o/n: the colour_edge is the right leg, so it takes the same colour as the triangle.
V_o/n: the colour_edge is the hypotenuse, so it takes the triangle's opposite colour.
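The colouring rule can be condensed into a small function. This Python sketch is illustrative (the dictionary layout and the names are assumptions for the sketch, not the thesis's data structures):

```python
def edge_colours(triangle_colour, triangle_type):
    """Colour the three edges by the rule: the hypotenuse takes the
    opposite colour of the triangle, the two legs take the triangle's
    colour.  Also return which edge is the colour_edge, i.e. the edge
    the Sierpinski curve does not cross (left leg for H, right leg
    for K, hypotenuse for V)."""
    other = 'red' if triangle_colour == 'green' else 'green'
    colours = {'hypotenuse': other,
               'left_leg': triangle_colour,
               'right_leg': triangle_colour}
    colour_edge = {'H': 'left_leg', 'K': 'right_leg', 'V': 'hypotenuse'}[triangle_type]
    return colours, colour_edge
```

For a green V triangle the colour_edge is the hypotenuse and therefore red, matching the rule's third case above.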

Figure 4.6: Edge colouring

In summary, our grid looks like Figure 4.7:

Figure 4.7: The grid with coloured triangles and edges

4.2 The Stacks System

In the previous work, Schweiger [12] and Böck [7] introduced four stack types (Figure 4.8):

Input: stores the elements before the traversal starts.
Output: stores the elements after the traversal ends.
Red: stores the red elements.
Green: stores the green elements.

Figure 4.8: The old stack system [6]

We introduce a new stack system to face the challenge of a fully parallel refinement and coarsening code.

The New Stacks

Building on how Bader [6] and Schweiger [12] classified the edges, we introduce new edge types (Figure 4.9):

Current_edge: the first edge of the element that the Sierpinski curve visits.
Next_edge: the next edge the Sierpinski curve visits.
Coloured_edge: the element's edge that the curve does not cross.
Domain_boundary_edge: it might be any of the previous types; it is an additional characteristic of the element's edge, stating that it is also a domain boundary edge.
Process_boundary_edge: it might be any of the first three types; it is an extra characteristic of the edge, stating that this edge lies between two different processes.

Figure 4.9: Edge Types

As a result of introducing new edge types, new data structures have been introduced too:

- Crossed_edge
- Colour_edge_output
- Colour_edge_input
- Colour_temp_edge
- Process_boundary

In total we have eight stacks, all Last In First Out (LIFO), as described in [15]: each of Colour_edge_input/output, Colour_temp_edge, and Process_boundary is in fact two stacks, a green one and a red one, depending on the colour of the edge, while the Crossed_edge is a linear array. One advantage of this new stack implementation is that no data is left in the colour stacks after finishing one full traversal, since we ensure that all edges end up on the output stacks; however, this implementation is more complicated than the previous one, needed more code, and generates two different output streams.

Crossed_edge

In the initial traversal the Sierpinski curve traverses the grid for the first time, and the crossed_edge array is filled, one by one, with the edges that the curve crosses. At the end of the traversal the crossed_edge array holds the current_edges and next_edges; these edges are then coloured and pushed to the corresponding stacks. The last pushed edge will be the first popped when the next traversal starts, as we go forward and backward.

Colour Stacks

The coloured stacks are:

colour_edge_output: at the end of the traversal, contains all the old coloured edges.
colour_edge_input: at the beginning of the traversal, contains all the new coloured edges.
colour_temp_edge: the non-domain_boundary_edges are pushed here, because they might be used again later.

process_boundary_edge: the edges that are shared between two processes are pushed here.

The algorithm for pushing/popping the colour_edge works the same way for triangles of type H_o/n and K_o/n, where we use the same colour reference as the parent to know to which stack the colour_edge is pushed/popped, while for V_o/n we use the opposite colour reference.

Pushing the colour_edge

Algorithm 4.1 H_o, K_o - Push
  colour = parent_colour_reference
  Push(colour_edge) -> colour_edge_output_stack(colour)

Algorithm 4.2 V_o - Push
  colour = NOT(parent_colour_reference)
  Push(colour_edge) -> colour_edge_output_stack(colour)

In an old triangle some calculations are done and the edge will not be used in the current traversal anymore, so the colour_edge is pushed to its corresponding coloured output stack.

Algorithm 4.3 H_n, K_n - Push
  if colour_edge is domain_boundary_edge then
    colour = parent_colour_reference
    Push(colour_edge) -> colour_edge_output_stack(colour)
  else
    colour = parent_colour_reference
    Push(colour_edge) -> colour_temp_edge_stack(colour)
  end if

Algorithm 4.4 V_n - Push
  if colour_edge is domain_boundary_edge then
    colour = NOT(parent_colour_reference)
    Push(colour_edge) -> colour_edge_output_stack(colour)
  else
    colour = NOT(parent_colour_reference)
    Push(colour_edge) -> colour_temp_edge_stack(colour)
  end if

A new triangle means that we need to check whether the edge is a domain_boundary_edge: if it is, we push it directly to the output stack, because we know that this edge will not be used by any other triangle; if it is not, we push it to the temp_edge stack, as it is shared with the neighbouring cell.

Popping the colour_edge

Algorithm 4.5 H_o, K_o - Pop
  if colour_edge is domain_boundary_edge then
    colour = parent_colour_reference
    colour_edge = Pop(colour_edge_input_stack(colour))
  else
    colour = parent_colour_reference
    colour_edge = Pop(colour_temp_edge_stack(colour))
  end if

Algorithm 4.6 V_o - Pop
  if colour_edge is domain_boundary_edge then
    colour = NOT(parent_colour_reference)
    colour_edge = Pop(colour_edge_input_stack(colour))
  else
    colour = NOT(parent_colour_reference)
    colour_edge = Pop(colour_temp_edge_stack(colour))
  end if

Popping is the opposite case: if the triangle is old and the colour_edge is a domain boundary, we pop it from the colour input stack; if it is not, we take it from the colour temp_edge stack.
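The stack-selection logic of Algorithms 4.1-4.8 can be condensed into one function. The following Python sketch is my reading of the rules (in particular, following the surrounding prose, pops take edges from the input stack; the function and stack names are illustrative, not the thesis's identifiers):

```python
def colour_stack_action(ttype, age, parent_colour, is_domain_boundary, phase):
    """Select the colour and the target stack for the colour_edge.
    `ttype` is 'H'/'K'/'V', `age` is 'o' (old) or 'n' (new),
    `phase` is 'push' or 'pop'.  V-type cells invert the parent's
    reference colour; H and K keep it."""
    colour = (parent_colour if ttype in ('H', 'K')
              else ('red' if parent_colour == 'green' else 'green'))
    if phase == 'push':
        if age == 'o':
            # old cell: the edge is done for this traversal
            stack = 'colour_edge_output'
        else:
            # new cell: boundary edges go straight out,
            # shared edges wait on the temp stack for the neighbour
            stack = 'colour_edge_output' if is_domain_boundary else 'colour_temp_edge'
    else:  # 'pop'
        if age == 'o':
            stack = 'colour_edge_input' if is_domain_boundary else 'colour_temp_edge'
        else:
            # new cell: the edge is used for the first time
            stack = 'colour_edge_input'
    return stack, colour
```

This makes the symmetry visible: the type decides only the colour, while age and the domain-boundary flag decide the stack.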

Algorithm 4.7 H_n, K_n - Pop
  colour = parent_colour_reference
  colour_edge = Pop(colour_edge_input_stack(colour))

Algorithm 4.8 V_n - Pop
  colour = NOT(parent_colour_reference)
  colour_edge = Pop(colour_edge_input_stack(colour))

For new triangles there is no need to check whether the colour_edge is a domain_boundary_edge, because the edge is used for the first time, so we pop it directly from the colour input stack.

Process_Boundary_edge

These stacks, also called the communication stacks, are used to exchange the refine/coarsen information and the fluxes across shared edges. The communication stacks are allocated with an over-estimated number of process_boundary_edges before we start visiting the cells, as follows:

in each process:
  allocate edge MPI structure
  crossed/coloured_edges = (1 + 2^(max_depth - 1)) * num_coarse_triangles
  colour_temp_edges = crossed/coloured_edges / 2
  do i = 0, MPI_size - 1
    allocate 4 Process_boundary_stacks(colour_temp_edges / 16)
  end do

After that we initialize four communication stacks, to which the process_boundary_edges are pushed/popped depending on their colour (green, red) and the current Sierpinski direction (forward, backward), as follows:

initialize process boundary:
  do i = 0, MPI_size - 1
    Process_boundary_stack(Forward, Red)
    Process_boundary_stack(Backward, Red)
    Process_boundary_stack(Forward, Green)
    Process_boundary_stack(Backward, Green)
  end do
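The over-estimation can be expressed as a sizing function. Note that the allocation listing is partly garbled in the source, so the capacity formula below, (1 + 2^(max_depth - 1)) edges per coarse triangle, is an interpretation, and the constants (the halving and the division by 16) are the thesis's heuristics as I read them:

```python
def stack_capacities(max_depth, num_coarse_triangles, mpi_size):
    """Over-estimate the per-process stack capacities before the first
    traversal (interpretation of the thesis's allocation listing)."""
    crossed = (1 + 2 ** (max_depth - 1)) * num_coarse_triangles
    colour_temp = crossed // 2
    boundary = colour_temp // 16  # capacity per neighbouring process
    return {
        'crossed_or_coloured_edges': crossed,
        'colour_temp_edges': colour_temp,
        # one set of boundary stacks per potential MPI neighbour
        'process_boundary_stacks': {rank: boundary for rank in range(mpi_size - 1)},
    }
```

The capacities only need to be upper bounds: the stacks are filled during the traversal, and over-allocation merely trades memory for the guarantee that no stack overflows mid-traversal.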

Initially, we traverse the grid, discover the shared edges, and push them to their corresponding process_boundary_stack. We then use the two functions synchronize/update process_boundary_edges to merge and update the edges with the process_boundary_edges.

5 Parallel grid and load redistribution

5.1 Parallel grid

The computational domain Ω is distributed over several processes, each owning its own part of the domain and using the communication stacks to ensure a correct exchange of the process_boundary_edges during the traversals (Figure 5.1). Adaptivity forces us to keep the process_boundary_edges in the communication stacks up to date, so we need to synchronize and update the communication stacks with the inter-process edges to maintain conformity and the correctness of the flux terms.

Figure 5.1: Process boundary between two processes

Implementing the parallel grid

To implement a parallel adaptive grid, Vigh introduced the process boundary stacks: four stacks to which we push/pop the process_boundary_edges according to their colour and the current traversal direction (forward/backward). The stacks were introduced in the previous chapter:

- Process_boundary_stack(Forward, Red, MPI_neighbour_rank)
- Process_boundary_stack(Backward, Red, MPI_neighbour_rank)
- Process_boundary_stack(Forward, Green, MPI_neighbour_rank)
- Process_boundary_stack(Backward, Green, MPI_neighbour_rank)

These stacks are called communication stacks, and MPI_neighbour_rank is the rank of the process with which the process_boundary_edge is shared. As we traverse the grid, some elements get refined and some get coarsened, so the processes need to communicate and exchange the process_boundary_edges, which carry important data such as the current depth and the refine/coarsen flags of the edge; these data are necessary to maintain the conformity of the grid.

The implementation

After allocating and initializing the communication stacks we split the domain over the processes, so each process knows which part of the grid it is responsible for. Then we start marking the edges to determine which need to be refined and which coarsened; here we check whether an edge belongs to the process's region, and if so we synchronize/update the new data of the edge with its corresponding communication stack, so that the stacks keep track of the edges that need to be refined and coarsened. At the end of the traversal, each process sends/receives the shared elements of its Process_boundary_stack to/from its neighbour; this is done using MPI_SENDRECV [17].
For instance:

send_recv(process_boundary_stack(Red, Forward), process_boundary_stack(Red, Backward))

In Figure 5.2, process one (P1) sends Process_boundary_stack(Red, Forward) and process two (P2) receives into Process_boundary_stack(Red, Backward).
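The pairing of forward and backward stacks in this exchange can be modelled without MPI. The following Python sketch stands in for the MPI_SENDRECV call: plain lists replace the edge stacks, each process's forward stack is delivered into the partner's backward stack, and the function name is illustrative:

```python
def exchange_boundary_stacks(p1_stacks, p2_stacks, colour):
    """Model the MPI_SENDRECV step between two neighbouring processes:
    P1's (colour, forward) stack is received into P2's (colour, backward)
    stack and vice versa.  In the thesis code this is a single
    MPI_SENDRECV on edge structs; here stacks are plain lists."""
    sent_by_p1 = p1_stacks[(colour, 'forward')]
    sent_by_p2 = p2_stacks[(colour, 'forward')]
    p2_stacks[(colour, 'backward')] = list(sent_by_p1)
    p1_stacks[(colour, 'backward')] = list(sent_by_p2)
```

Using one combined send/receive per pair (rather than separate sends and receives) mirrors MPI_SENDRECV's advantage: neither process can deadlock waiting on the other.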

Figure 5.2: Process boundary stack communication

5.2 Load redistribution

Our grid is adaptive: during the traversal we coarsen and refine the elements according to a specific geometry, which affects the size of the element stream as well as the distribution of the elements over the processes, and may lead to bad load balancing. An example of bad load balance can be seen in Figure 5.3, where the grid is deeply refined at one point and coarse elsewhere; distributing this grid over several processes without load balancing leaves one or two processes sharing the point where the grid is deeply refined, so they handle a much larger number of edges/elements than the other processes.

Figure 5.3: A refined grid that leads to bad load balance

Based on the data structures, we implemented two load redistribution schemes: Diffusion and All-to-All.

Diffusion

This scheme redistributes the grid's numerical cells (triangle_num_stream); these cells contain the following quantities:

ξ: the vertical deviation from the flat ocean surface.
v: the velocities in x- and y-directions, v = (v_x, v_y).

The implementation

Each process owns its region of the stream; during adaptation, the region's size may change because we append elements (refine) or delete some (coarsen). At the end of one full traversal, and before starting the next initial one, we redistribute the triangle_num_stream over the processes in a diffusion scheme, where we send/receive 40% of the difference between two direct neighbour processes in the Sierpinski order. This simplifies the load balancing step considerably, because the communication pattern becomes much simpler. To find the best percentage we tried several values and found that 40% works well in our test cases. Figure 5.4 illustrates the diffusion scheme with the triangle_num_stream distributed over four processes.

Figure 5.4: An example of the Diffusion scheme

All-to-All

This scheme redistributes the grid's cells (triangle_stream). All-to-All means that all processes contribute to the result and all processes receive the result. We use MPI_ALLGATHER and MPI_ALLGATHERV, which are MPI collective communication methods; more details about them can be found in [17].

The implementation

During adaptation, the region a process owns changes as a result of refining/coarsening the triangles, and each process marks this modified region. After the adaptation finishes and all of the triangle_stream's elements have been refined or coarsened, we redistribute the triangle_stream as follows:

1. Each process uses MPI_ALLGATHER to send the number of its modified elements to all other processes.

2. MPI_ALLGATHERV then uses this information so that each process can send its own modified region of triangles. MPI_ALLGATHERV concatenates these regions into the new triangle stream and distributes it to all processes.

The redistribution is not performed by one single process; it is implemented by the MPI_ALLGATHERV function and happens in parallel [17].
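The two collective steps can be modelled in plain Python to show what the concatenated stream looks like. This is only a stand-in for the MPI collectives (no communicator, no displacement arrays), with every process's modified region represented as a list:

```python
def allgatherv(modified_regions):
    """Model the two collective calls of the All-to-All scheme:
    MPI_ALLGATHER shares each process's element count, and
    MPI_ALLGATHERV concatenates every modified region into the new
    triangle stream that all processes receive."""
    counts = [len(region) for region in modified_regions]        # MPI_ALLGATHER
    stream = [t for region in modified_regions for t in region]  # MPI_ALLGATHERV
    return counts, stream
```

The counts gathered in the first step play the role of MPI_ALLGATHERV's receive-count array: they tell every process where each region starts inside the concatenated stream.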

6 Test cases and results

Three test cases were used to examine our load redistribution schemes, the maximum number of elements the code can handle without crashing, and of course whether we can run the test case in parallel. The idea of the test cases is to generate a geometry where we refine the elements that lie on the geometry to the maximum depth, while we coarsen the others to the minimum depth. All cases are built into the current code and can be selected via their compilation flags.

6.1 Dynamic Symmetric Circle

The compilation flag is !DEC$ DEFINE CIRCLE_ADAPTATION. A circle is created with center point (0.5, 0.5) and initial radius 0.25; in each iteration the radius is increased by 0.2. As this case is symmetric, it was difficult to see whether the load redistribution schemes work well; however, we could simulate this case on up to 16 processes with min/max depth = 20/30. The following figure shows the Dynamic Symmetric Circle with min/max depth = 16/20, simulated with 7 processes. What is interesting here, for example, is a look at the first two images, where the regions of the purple and blue processes change and become smaller as those two processes receive more elements when the circle passes through their regions.


Figure 6.1: Dynamic Symmetric Circle

6.2 Dynamic non-symmetric Circle

The compilation flag is !DEC$ DEFINE CIRCLE_ADAPTATION_01_CENTER. To avoid the symmetric case, a circle was created with center point (0.1, 0.1) and initial radius 0.25; in each iteration the radius is increased by 0.2. This case was created to examine the load redistribution as the wave enters the domain from the lower left corner and leaves through the upper right one. The following figure shows the Dynamic non-symmetric Circle with max/min depth = 16/20, simulated with 7 processes.

Figure 6.2: Dynamic non-symmetric Circle

6.3 Moving Target

The compilation flag is !DEC$ DEFINE MOVING_TARGET. For the same purpose mentioned in Section 6.2, a point target moves four degrees in each iteration along the circumference of a ghost circle with radius 0.25 and center (0.5, 0.5). I will not show images of this test case, because the moving target, being a single point in the grid, is difficult to see; however, the results of this case were presented successfully in my final presentation.

6.4 Results

Load redistribution

The following table shows the percentage of elements that each process owns:

Test case / step                   Process 1   Process 2   Process 3   Process 4
Dynamic non-symmetric Circle       25%         25%         25%         25%
Initialization                     30.5%       30.5%       19.5%       19.5%
Adapt_Traversal 1                  28%         28%         22%         22%
Redistribute                       31%         31%         19%         19%
Adapt_Traversal 2                  29%         29%         23%         23%
Redistribute                       -           -           -           -
Redistribute after 20 iterations   25%         25%         25%         25%
Adapt_Traversal                    -           19.5%       30.5%       30.5%
Redistribute after iteration 60    25%         24%         25%         25%

Speed up

All the test cases were run with max/min depth = 20/30.

Test case                       # of elements   2 processes   4 processes   8 processes
Dynamic Symmetric Circle        2,400,…         -             -             -
Dynamic non-symmetric Circle    2,097,…         -             -             -
Moving Target                   2,000,…         -             -             -
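The target trajectory of Section 6.3 (four degrees per iteration along a ghost circle of radius 0.25 around (0.5, 0.5)) can be computed as in this short sketch; the function name and signature are illustrative, not the thesis code:

```python
import math

def target_position(iteration, radius=0.25, center=(0.5, 0.5), step_deg=4.0):
    """Position of the moving point target after `iteration` steps of
    four degrees each along the ghost circle's circumference."""
    angle = math.radians(step_deg * iteration)
    return (center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle))

print(target_position(0))   # → (0.75, 0.5)
# after 90 iterations (360 degrees) the target returns to its start
x, y = target_position(90)
```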

7 Implementation

Before presenting a rough algorithm explaining the whole process, we first need to introduce the Fork and Join scheme; in addition, some keywords need to be introduced.

7.1 Fork and Join

The triangle_stream and the numeric_triangle_stream are allocated with an estimated size which depends on the current depth. This estimate is larger than the required size of the stream, because we do not yet know the number of refined/coarsened elements. So we fork the stream by allocating another triangle_stream; after the adaptation process has finished, we simply join the two streams by copying the stream obtained after the adaptation into the other one, and then we deallocate the old one.

7.2 Initial Traversal

This is the first traversal in our implementation; it is responsible for discovering the edges and assigning a region to each process. When called again, it is responsible for forking/joining the numeric_triangle_stream, and also for redistributing it with the diffusion scheme.

7.3 Empty Traversal

No adaptation happens here; this traversal is used to calculate the old/new flow and to synchronize the interprocess edges with the process_boundary stacks.
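The Fork and Join scheme of Section 7.1 can be sketched in Python as follows; the thesis code is Fortran, and the size estimate and names here are illustrative assumptions:

```python
def estimated_size(depth):
    # Illustrative over-estimate only; the exact formula used in the
    # thesis depends on the current depth and is not reproduced here.
    return 2 ** depth

def adapt_with_fork_join(triangle_stream, adapt, depth):
    """Fork: allocate a second stream large enough for the still-unknown
    number of refined/coarsened elements.  Join: copy the adapted stream
    back and discard the old allocation."""
    forked = []                      # Fork: the newly allocated stream
    for tri in triangle_stream:
        forked.extend(adapt(tri))    # adaptation writes into the fork
    assert len(forked) <= estimated_size(depth)
    triangle_stream[:] = forked      # Join: copy back, old data discarded
    return triangle_stream

# Refining every triangle into two children:
stream = ["t0", "t1"]
adapt_with_fork_join(stream, lambda t: [t + "a", t + "b"], depth=4)
print(stream)   # → ['t0a', 't0b', 't1a', 't1b']
```

The over-allocation is the price paid for not knowing the adapted stream's size in advance; the Join step restores a single, tightly-sized stream for the next traversal.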

7.4 Adapt_mark Traversal

Here we start marking the edges that need to be refined/coarsened, and also synchronize/update the interprocess edges with the process_boundary stacks.

7.5 Edge_mark Traversal

This traversal preserves the conformity of the grid: we mark the additional edges to be refined/coarsened as a consequence of the previous traversal, and then we update the process_boundary stacks.

7.6 Adapt Traversal

Performs the actual adaptation; it is also responsible for the All-to-All load redistribution (redistributing the triangle_stream).
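The Edge_mark traversal of Section 7.5 amounts to a fixpoint iteration: marking an edge may force further edges to be marked, and the traversal repeats until a sweep adds nothing new. A minimal sketch, assuming an illustrative neighbours map (the real dependency structure comes from the grid, not from a dictionary):

```python
def close_marks(marked, neighbours):
    """Propagate refinement marks until the grid would be conforming:
    marking an edge may force its neighbours to be marked as well, so
    iterate until a sweep adds nothing new (the Edge_mark loop)."""
    marked = set(marked)
    while True:
        newly = {n for e in marked for n in neighbours.get(e, ())} - marked
        if not newly:            # 'no more additional edges are marked'
            return marked
        marked |= newly

# Marking e0 forces e1, which in turn forces e2:
print(sorted(close_marks({"e0"}, {"e0": ["e1"], "e1": ["e2"]})))
# → ['e0', 'e1', 'e2']
```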

7.7 The implementation

    Initialize the geometry's test case
    allocate stack_system
    allocate process_boundary stacks
    call Initial Traversal
    call Empty Traversal
    do i = 0 to max_iteration
        call Adapt_mark Traversal
        edge_mark: do
            call Edge_mark Traversal
            if (no more additional edges are marked) exit edge_mark
        end do edge_mark
        call Adapt Traversal:
            Start:
                Fork triangle_stream
                Adapt
                Join triangle_stream
                redistribute triangle_stream
            End
        call Initial Traversal
            Start:
                if (second call)
                    Fork triangle_stream
                    Join triangle_stream
                redistribute numeric_triangle_stream
            End
        call Empty Traversal
    end do
    deallocate stack_system
    deallocate process_boundary stacks

8 Outlook

In the end, I was able to fully accomplish my task of parallel refinement and coarsening of recursively structured adaptive triangular grids. Demirel's [8] code and mine were supposed to be merged to obtain a complete Discontinuous Galerkin solver for the shallow water equation, working in parallel on an adaptive grid; unfortunately Demirel was unable to finish his task, so the next step should be to merge the two codes. Also, the speed-up that we reached with the current code was not as good as we had hoped, because each process traverses the whole grid, accessing parts that do not belong to it. The next essential step is therefore to increase the speed-up, for which several methods are possible. For example, we can coarsen the regions that do not belong to a process as much as possible; in that case the process will not spend much time traversing those regions. Another approach is traversal cutting, which means not allowing a process to access a region that does not belong to it at all.

Bibliography

[1] Aizinger, V. and Dawson, C.: A discontinuous Galerkin method for three-dimensional shallow water equations. J. Sci. Comput., 22(1).
[2] Ambati, V. and Bokhove, O.: Flooding and drying in discontinuous Galerkin discretizations of shallow water equations. In: Wesseling, P., Oñate, E. and Périaux, J. (eds.): ECCOMAS CFD 2006, European Conference on Computational Fluid Dynamics. TU Delft, September 2006.
[3] Bader, M.: Raumfüllende Kurven. Lecture notes, TU München.
[4] Bader, M. and Zenger, C.: Efficient storage and processing of adaptive triangular grids using Sierpinski curves. In: Computational Science - ICCS 2006, Lecture Notes in Computer Science, Springer, 2006.
[5] Bader, M., Schraufstetter, S., Vigh, C. and Behrens, J.: Memory efficient adaptive mesh generation and implementation of multigrid algorithms using Sierpinski curves. International Journal of Computational Science and Engineering, 4, 2008.
[6] Bader, M., Böck, C., Schweiger, J. and Vigh, C.: Dynamically adaptive simulation with minimal memory requirement - solving the shallow water equations using Sierpinski curves. International Journal of Computational Science and Engineering, 4, 2009.
[7] Böck, C.: Discontinuous-Galerkin-Verfahren zum Lösen der Flachwassergleichungen auf adaptiven Dreiecksgittern. Diplomarbeit, TU München, April.
[8] Demirel, Ö.: Parallelisation of a Discontinuous Galerkin Solver for the Shallow Water Equation. Master's thesis, TU München, December.
[9] Radzieowski, C.: Numerische Simulation zeitabhängiger Probleme auf dynamisch-adaptiven Dreiecksgittern. Diplomarbeit, TU München, November.
[10] Remacle, J.-F., Franzao, S., Li, X. and Shephard, M.: An adaptive discretization of shallow water equations based on discontinuous Galerkin methods. International Journal for Numerical Methods in Fluids, 52(8).

[11] Sagan, H.: Space-Filling Curves. Springer, New York, Heidelberg, Berlin.
[12] Schwaiger, J.: Adaptive Discontinuous-Galerkin-Verfahren zum Lösen der Flachwassergleichungen mit verschiedenen Randbedingungen. Diplomarbeit, TU München, September.
[13] Schwanenberg, D., Liem, R. and Köngeter, J.: Discontinuous Galerkin method for the shallow water equations. Hydroinformatics 2000.
[14] Schraufstetter, S.: Speichereffiziente Algorithmen zum Lösen partieller Differentialgleichungen auf adaptiven Dreiecksgittern. Diplomarbeit, TU München, July.
[15] Vigh, C.: Memory-efficient adaptive grid generation using Sierpinski curves. Master's thesis, TU München, January.
[16] Vigh, C.: Lehrstuhltreffen presentation, TU München, April.
[17] MPI: A Message-Passing Interface Standard, Version 2.1. University of Tennessee, June 23.


More information

Verification and Validation of Turbulent Flow around a Clark-Y Airfoil

Verification and Validation of Turbulent Flow around a Clark-Y Airfoil Verification and Validation of Turbulent Flow around a Clark-Y Airfoil 1. Purpose 58:160 Intermediate Mechanics of Fluids CFD LAB 2 By Tao Xing and Fred Stern IIHR-Hydroscience & Engineering The University

More information

Parallel Algorithms: Adaptive Mesh Refinement (AMR) method and its implementation

Parallel Algorithms: Adaptive Mesh Refinement (AMR) method and its implementation Parallel Algorithms: Adaptive Mesh Refinement (AMR) method and its implementation Massimiliano Guarrasi m.guarrasi@cineca.it Super Computing Applications and Innovation Department AMR - Introduction Solving

More information

1 The range query problem

1 The range query problem CS268: Geometric Algorithms Handout #12 Design and Analysis Original Handout #12 Stanford University Thursday, 19 May 1994 Original Lecture #12: Thursday, May 19, 1994 Topics: Range Searching with Partition

More information

Microwell Mixing with Surface Tension

Microwell Mixing with Surface Tension Microwell Mixing with Surface Tension Nick Cox Supervised by Professor Bruce Finlayson University of Washington Department of Chemical Engineering June 6, 2007 Abstract For many applications in the pharmaceutical

More information

Multiphase flow metrology in oil and gas production: Case study of multiphase flow in horizontal tube

Multiphase flow metrology in oil and gas production: Case study of multiphase flow in horizontal tube Multiphase flow metrology in oil and gas production: Case study of multiphase flow in horizontal tube Deliverable 5.1.2 of Work Package WP5 (Creating Impact) Authors: Stanislav Knotek Czech Metrology Institute

More information

Solving Partial Differential Equations on Overlapping Grids

Solving Partial Differential Equations on Overlapping Grids **FULL TITLE** ASP Conference Series, Vol. **VOLUME**, **YEAR OF PUBLICATION** **NAMES OF EDITORS** Solving Partial Differential Equations on Overlapping Grids William D. Henshaw Centre for Applied Scientific

More information

Realistic Animation of Fluids

Realistic Animation of Fluids Realistic Animation of Fluids p. 1/2 Realistic Animation of Fluids Nick Foster and Dimitri Metaxas Realistic Animation of Fluids p. 2/2 Overview Problem Statement Previous Work Navier-Stokes Equations

More information

Space-filling curves for 2-simplicial meshes created with bisections and reflections

Space-filling curves for 2-simplicial meshes created with bisections and reflections Space-filling curves for 2-simplicial meshes created with bisections and reflections Dr. Joseph M. Maubach Department of Mathematics Eindhoven University of Technology Eindhoven, The Netherlands j.m.l.maubach@tue.nl

More information

Research Article Parallel Adaptive Mesh Refinement Combined with Additive Multigrid for the Efficient Solution of the Poisson Equation

Research Article Parallel Adaptive Mesh Refinement Combined with Additive Multigrid for the Efficient Solution of the Poisson Equation International Scholarly Research Network ISRN Applied Mathematics Volume 2012, Article ID 246491, 24 pages doi:10.5402/2012/246491 Research Article Parallel Adaptive Mesh Refinement Combined with Additive

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction ME 475: Computer-Aided Design of Structures 1-1 CHAPTER 1 Introduction 1.1 Analysis versus Design 1.2 Basic Steps in Analysis 1.3 What is the Finite Element Method? 1.4 Geometrical Representation, Discretization

More information

The Shallow Water Equations and CUDA

The Shallow Water Equations and CUDA The Shallow Water Equations and CUDA Oliver Meister December 17 th 2014 Tutorial Parallel Programming and High Performance Computing, December 17 th 2014 1 Last Tutorial Discretized Heat Equation System

More information

The Shallow Water Equations and CUDA

The Shallow Water Equations and CUDA The Shallow Water Equations and CUDA Alexander Pöppl December 9 th 2015 Tutorial: High Performance Computing - Algorithms and Applications, December 9 th 2015 1 Last Tutorial Discretized Heat Equation

More information

Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of Earth s Mantle

Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of Earth s Mantle ICES Student Forum The University of Texas at Austin, USA November 4, 204 Parallel High-Order Geometric Multigrid Methods on Adaptive Meshes for Highly Heterogeneous Nonlinear Stokes Flow Simulations of

More information

First Steps - Ball Valve Design

First Steps - Ball Valve Design COSMOSFloWorks 2004 Tutorial 1 First Steps - Ball Valve Design This First Steps tutorial covers the flow of water through a ball valve assembly before and after some design changes. The objective is to

More information

Application of Finite Volume Method for Structural Analysis

Application of Finite Volume Method for Structural Analysis Application of Finite Volume Method for Structural Analysis Saeed-Reza Sabbagh-Yazdi and Milad Bayatlou Associate Professor, Civil Engineering Department of KNToosi University of Technology, PostGraduate

More information

Numerical studies for Flow Around a Sphere regarding different flow regimes caused by various Reynolds numbers

Numerical studies for Flow Around a Sphere regarding different flow regimes caused by various Reynolds numbers Numerical studies for Flow Around a Sphere regarding different flow regimes caused by various Reynolds numbers R. Jendrny, H. Damanik, O. Mierka, S. Turek Institute of Applied Mathematics (LS III), TU

More information

Driven Cavity Example

Driven Cavity Example BMAppendixI.qxd 11/14/12 6:55 PM Page I-1 I CFD Driven Cavity Example I.1 Problem One of the classic benchmarks in CFD is the driven cavity problem. Consider steady, incompressible, viscous flow in a square

More information

Introduction to ANSYS CFX

Introduction to ANSYS CFX Workshop 03 Fluid flow around the NACA0012 Airfoil 16.0 Release Introduction to ANSYS CFX 2015 ANSYS, Inc. March 13, 2015 1 Release 16.0 Workshop Description: The flow simulated is an external aerodynamics

More information

Nonlinear Potential Flow Solver Development in OpenFOAM

Nonlinear Potential Flow Solver Development in OpenFOAM Nonlinear Potential Flow Solver Development in OpenFOAM A. Mehmood Plymouth University, UK April 19,2016 A. Mehmood Table of Contents 1 Motivation 2 Solution Methodology Mathematical Formulation Sequence

More information

Differential Geometry: Circle Packings. [A Circle Packing Algorithm, Collins and Stephenson] [CirclePack, Ken Stephenson]

Differential Geometry: Circle Packings. [A Circle Packing Algorithm, Collins and Stephenson] [CirclePack, Ken Stephenson] Differential Geometry: Circle Packings [A Circle Packing Algorithm, Collins and Stephenson] [CirclePack, Ken Stephenson] Conformal Maps Recall: Given a domain Ω R 2, the map F:Ω R 2 is conformal if it

More information

GEOMETRY MODELING & GRID GENERATION

GEOMETRY MODELING & GRID GENERATION GEOMETRY MODELING & GRID GENERATION Dr.D.Prakash Senior Assistant Professor School of Mechanical Engineering SASTRA University, Thanjavur OBJECTIVE The objectives of this discussion are to relate experiences

More information

BACK AND FORTH ERROR COMPENSATION AND CORRECTION METHODS FOR REMOVING ERRORS INDUCED BY UNEVEN GRADIENTS OF THE LEVEL SET FUNCTION

BACK AND FORTH ERROR COMPENSATION AND CORRECTION METHODS FOR REMOVING ERRORS INDUCED BY UNEVEN GRADIENTS OF THE LEVEL SET FUNCTION BACK AND FORTH ERROR COMPENSATION AND CORRECTION METHODS FOR REMOVING ERRORS INDUCED BY UNEVEN GRADIENTS OF THE LEVEL SET FUNCTION TODD F. DUPONT AND YINGJIE LIU Abstract. We propose a method that significantly

More information

AMR Multi-Moment FVM Scheme

AMR Multi-Moment FVM Scheme Chapter 4 AMR Multi-Moment FVM Scheme 4.1 Berger s AMR scheme An AMR grid with the hierarchy of Berger s AMR scheme proposed in [13] for CFD simulations is given in Fig.4.1 as a simple example for following

More information

On the high order FV schemes for compressible flows

On the high order FV schemes for compressible flows Applied and Computational Mechanics 1 (2007) 453-460 On the high order FV schemes for compressible flows J. Fürst a, a Faculty of Mechanical Engineering, CTU in Prague, Karlovo nám. 13, 121 35 Praha, Czech

More information

Adaptive Mesh Refinement (AMR)

Adaptive Mesh Refinement (AMR) Adaptive Mesh Refinement (AMR) Carsten Burstedde Omar Ghattas, Georg Stadler, Lucas C. Wilcox Institute for Computational Engineering and Sciences (ICES) The University of Texas at Austin Collaboration

More information

The Study of Ship Motions in Regular Waves using a Mesh-Free Numerical Method

The Study of Ship Motions in Regular Waves using a Mesh-Free Numerical Method The Study of Ship Motions in Regular Waves using a Mesh-Free Numerical Method by Bruce Kenneth Cartwright, B. Eng., M. Sc. Submitted in fulfilment of the requirements for the Degree of Master of Philosophy

More information

Tutorial Two Built in Mesh

Tutorial Two Built in Mesh Built in Mesh 4 th edition, Jan. 2018 This offering is not approved or endorsed by ESI Group, ESI-OpenCFD or the OpenFOAM Foundation, the producer of the OpenFOAM software and owner of the OpenFOAM trademark.

More information

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids

A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids A Scalable GPU-Based Compressible Fluid Flow Solver for Unstructured Grids Patrice Castonguay and Antony Jameson Aerospace Computing Lab, Stanford University GTC Asia, Beijing, China December 15 th, 2011

More information

Massively Parallel Finite Element Simulations with deal.ii

Massively Parallel Finite Element Simulations with deal.ii Massively Parallel Finite Element Simulations with deal.ii Timo Heister, Texas A&M University 2012-02-16 SIAM PP2012 joint work with: Wolfgang Bangerth, Carsten Burstedde, Thomas Geenen, Martin Kronbichler

More information

Topology Preserving Tetrahedral Decomposition of Trilinear Cell

Topology Preserving Tetrahedral Decomposition of Trilinear Cell Topology Preserving Tetrahedral Decomposition of Trilinear Cell Bong-Soo Sohn Department of Computer Engineering, Kyungpook National University Daegu 702-701, South Korea bongbong@knu.ac.kr http://bh.knu.ac.kr/

More information