MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY

Size: px
Start display at page:

Download "MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY"

Transcription

1 MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY R.D. FALGOUT, T.A. MANTEUFFEL, B. O NEILL, AND J.B. SCHRODER Abstract. The need for paraeism in the time dimension is being driven by changes in computer architectures, where performance increases are now provided through greater concurrency, not faster cock speeds. This creates a botteneck for sequentia time marching schemes because they ack paraeism in the time dimension. Mutigrid Reduction in Time (MGRIT) is an iterative procedure that aows for tempora paraeism by utiizing mutigrid reduction techniques and a mutieve hierarchy of coarse time grids. MGRIT has been shown to be effective for inear probems, with speedups of up to 50 times. The goa of this work is the efficient soution of noninear probems with MGRIT, where efficiency is defined as achieving simiar performance when compared to an equivaent inear probem. The benchmark noninear probem is the p-lapacian, where p = 4 corresponds to a we-known noninear diffusion equation, and p = 2 corresponds to the standard inear diffusion operator, our benchmark inear probem. The key difficuty encountered is that the noninear time-step sover becomes progressivey more expensive on coarser time eves, as the time-step size increases. To overcome such difficuties, mutigrid research has historicay targeted an accumuated body of experience regarding how to choose an appropriate sover for a specific probem type. To that end, this paper deveops a ibrary of MGRIT optimizations and modifications, most importanty an aternate initia guess for the noninear time-step sover and deayed spatia coarsening, that wi aow many noninear paraboic probems to be soved with parae scaing behavior comparabe to the corresponding inear probem. Key words. mutigrid, mutigrid-in-time, paraboic probems, noninear, reduction-based mutigrid, pararea, high performance computing AMS subject cassifications. 65F10, 65M22, 65M55 1. Introduction. Previousy, increasing cock speeds aowed for the speedup of sequentia time integration simuations of a fixed size, and aowed simuations to be refined in both space and time without an increase in overa wa-cock time. However, increases in cock speed have stagnated, eading to a sequentia time integration botteneck. Future speedups wi be avaiabe from more concurrency, and hence, speedups for time marching simuations must aso come from increased concurrency. By aowing for paraeism in time, much greater computationa resources can be brought to bear and overa speedups can be achieved. Because of this, interest in parae-in-time methods has grown over the ast decade. Perhaps the most we known parae-in-time agorithm, Pararea [24], is equivaent [14] to a two-eve mutigrid scheme. This work focuses on the mutigrid reduction in time (MGRIT) method [10]. MGRIT is a true mutieve agorithm and has optima parae communication behavior, as opposed to a two-eve scheme, where the size of the coarse-eve imits concurrency. Work on parae-in-time methods actuay goes back at east 50 years [33] and incudes a variety of approaches. Work regarding direct methods incudes [32, 36, 27, Submitted to the editors 6/29/2016. Center for Appied Scientific Computing, Lawrence Livermore Nationa Laboratory, P.O. Box 808, L-561, Livermore,CA This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore Nationa Laboratory under Contract DE-AC52-07NA LLNL-JRNL Department of Appied Mathematics, University of Coorado at Bouder, Bouder, Coorado. This work was performed under the auspices of the U.S. Department of Energy under grant numbers (SC) DE-FC02-03ER25574 and (NNSA) DE-NA and Lawrence Livermore Nationa Laboratory under contract B

2 6, 15]. There are iterative approaches, as we, based on mutipe shooting, domain decomposition, waveform reaxation, and mutigrid, incuding [22, 16, 25, 1, 17, 18, 38, 5, 39, 37, 20, 19, 24, 7, 31, 9, 40, 10]. For a gente introduction to this history, pease see the review paper [12]. This work focuses on mutigrid approaches (and MGRIT in particuar) because of mutigrid s optima agorithmic scaing for both parae communication and number of operations. An additiona attraction of MGRIT is its non-intrusive nature, where the user empoys an existing sequentia time-stepping routine within the context of the MGRIT impementation. This work uses XBraid [41], an open source impementation of MGRIT deveoped at Lawrence Livermore Nationa Laboratory (LLNL). MGRIT soves, in parae, a genera first-order, ordinary differentia equation (ODE) and corresponding time discretization: (1.1) u t = f(u, t), u(0) = u 0, t [0, T ], (1.2) u(t + δt) = Φ(u(t), u(t + δt)) + g(t + δt), where Φ is a noninear operator that represents the chosen time-stepping routine and g is a time dependent function that incorporates a the soution independent terms. In the inear case, the appication of Φ is either a matrix vector mutipication, e.g. forward Euer, or a spatia sove, e.g. backward Euer. Sequentia time marching schemes are optima in that they move from time t = 0 to t = T using the fewest possibe appications of Φ. By appying Φ iterativey, in comparaby expensive but highy parae mutigrid cyces, MGRIT sacrifices additiona computation for tempora concurrency. Both methods are optima [10], i.e. O(N) where N is the tota number of time-steps, but the constant for MGRIT is higher. This creates a crossover point wherein the added concurrency overcomes the extra computationa work. Beyond this crossover point MGRIT provides a speedup over sequentia methods. Appication of MGRIT to inear paraboic probems was studied in [10]. Figure 1 shows a strong scaing study of MGRIT for inear diffusion on the machine Vucan, an IBM BG/Q machine at LLNL. The probem size was (257) (space time). Three data sets are presented, a standard sequentia time-stepping run (square markers), a time-ony parae run of MGRIT (circe markers), and a space-time parae run of MGRIT (triange markers). The space-time parae runs used an 8 8 processor grid in space, with a additiona processors added in time. Both MGRIT curves represent the use of tempora and spatia coarsening, so that the ratio of δt/h 2 is fixed on coarse time-grids, where h is the spatia mesh width. The maximum speedup achieved by the bue curve (triange markers) is approximatey 50, and the crossover point where MGRIT provides a speedup is at about 128 processors in time. Achieving a simiar efficiency for noninear paraboic probems is the chief goa of this paper. When considering the performance of MGRIT, the appication of Φ is the dominant process. For a inear probem with impicit time-stepping, each appication of Φ equates to soving one inear system. When an optima spatia sover such as cassica mutigrid in space [28, 4, 34, 21] is used, the work required for a time-step evauation is independent of the time-step size. However, for noninear probems, each appication of Φ invoves an iterative noninear sove, and when using a common method such as Newton s method, convergence can depend strongy on the initia approximation even when inear mutigrid is used as the inner sover. In particuar, the previous time-step 2

3 Fig. 1: Time to sove 2D inear diffusion on a (128) space-time grid using sequentia time-stepping and two different processor decompositions of MGRIT. [10] is farther away on coarser grids. To expore the effects of introducing a noninearity that exhibits this key difficuty, a mode noninear paraboic probem, known as the p-lapacian, is considered: (1.3) u t (x, t) ( u(x, t) p 2 u(x, t)) = b(x, t), x Ω, t [0, T ], subject to the foowing Neumann boundary and initia conditions: (1.4) (1.5) u(x, t) p 2 u(x, t) n = g(x, t), x Ω, t (0, T ], u(x, 0) = u 0 (x), x Ω. The p-lapacian for p = 4 is we-known as a means of modeing soi erosion and transport [2] and has aso found uses in image processing (denoising, segmentation and inpainting) and machine earning (see [8] for an overview and [23] for an introduction). In this paper, the mode noninear probem corresponds to p = 4, whie the comparabe inear probem corresponds to p = 2, which is the standard diffusion operator. Our study wi consist of investigating parae-in-time for Equation (1.3) in the context of XBraid and MGRIT. We wi baance user concerns such as overa timeto-soution, memory use, and non-intrusiveness. To begin the study, we consider a naive appication of MGRIT to (1.3), where a arge increase in the cost of a noninear sove (for Newton s method) on the coarser tempora grids is observed. This is caused by the reativey arge time-steps (compared to the finest grid) and the associated poor initia guess to the Newton sover on the coarse eves. These increases counteract the strength of mutigrid, where speedup is achieved by using cheap coarse grid probems to acceerate convergence on the fine grid. Therefore, our strategy is to minimize the cost of each noninear sove, which is measured with a cost estimate. The desired resut is to achieve simiar efficiencies for the noninear and inear versions of (1.3), whie aso taking into account common user concerns, such as non-intrusiveness and storage costs. Utimatey, this paper s goa is to present a detaied, experimentay backed, ibrary of optimizations and modifications that can be used to efficienty impement MGRIT for a arge range of noninear paraboic probems. Thus, we have carefuy chosen our mode noninear paraboic probem and we note that mutigrid research has historicay targeted such an accumuated body of experience regarding how to choose an appropriate sover for a specific probem type. 3

4 To reduce the average cost of each Φ evauation, an improved initia guess for each time-step is investigated. For sequentia time-stepping, a commony used approach is to take the previous time-step as the initia guess. However, in the parae-intime setting this approach is fawed because on coarse eves, where the time-step size is arge, this is a poor initia guess. The recent parae-in-time works of [35, 26, 30] consider another option based on the iterative nature of many parae-intime agorithms, where information from previous parae-in-time iterations is used as the initia guess. This increases storage requirements, but incurs no additiona computationa cost. For MGRIT, this entais using as the initia guess the soution from the previous evauation of Φ at each time point and on each MGRIT eve. With this initia guess, as MGRIT converges, the cost of a Newton sove on each eve goes to zero 1. However, this aso creates a tension between memory usage and computationa efficiency because MGRIT need not store the soution from the previous Φ evauation at a time points. Using the soution from the previous Φ evauation as the initia guess whenever it is avaiabe aows a user with memory constraints to reap the benefits of this improved initia guess, whie imiting per-processor storage costs. This reduced storage resut and the importance of improved initia guesses for MGRIT are both important contributions of this work. Foowing this, a spatia coarsening strategy is pursued to imit δt/h 2 on coarse time grids. The goa is to reduce the number of Newton iterations required on coarse eves by reducing the spatia variabiity of the noninearity whie aso improving the condition number of the inner inear probems. Recognizing this specific importance of spatia coarsening when controing the cost of noninear time-step sovers on coarse time grids is an important contribution of this work. In addition, the smaer probem sizes drasticay reduce coarse grid compute times and coarse grid storage costs. Two additiona strategies incude avoiding unnecessary work on the first MGRIT cyce and optimizing the number of eves in the MGRIT hierarchy. Further speedups can be obtained by oosening the Newton sover toerance during the first three MGRIT iterations on a eves, where the approximate soution is sti poor. Overa, the most effective strategies are spatia coarsening and the improved initia guess, but the other strategies combined have a simiary significant impact on runtime. Together, these strategies produce an MGRIT agorithm for noninear probems that has an efficiency simiar to that found for a corresponding inear probem. In Section 2, the genera MGRIT framework is discussed. In Section 3, some impementation detais are given. In Section 4, our strategy for improving the performance of MGRIT is proposed and justified. In Section 5, this strategy is impemented. In sections 7.1 and 7.2, weak and strong scaing resuts are presented. 2. MGRIT overview. First, a brief overview of the MGRIT agorithm for time independent inear probems is presented. The noninear, time dependent, extension foows in Section 2.1. Define a uniform tempora grid with time-step δt and nodes t j, j = 0,..., N t (non-uniform grids can easiy be accommodated). Further, define a coarse tempora grid with time-step T = mδt and nodes T j = j T, j = 0, 1,..., N t /m, for some coarsening factor, m. This is depicted in Figure 2. In bock 1 Assuming the Newton sover measures noninear convergence with a fixed absoute toerance. 4

5 T 0 T 1 t 0 t 1 t 2 t 3... tm T = mδt δt t Nt Fig. 2: Fine- and coarse-grid tempora meshes. Fine-grid points (back) are present on ony the fine-grid, whereas coarse-grid points (red) are on both the fine- and coarsegrid. trianguar form, the time-stepping probem (1.2) is I u 0 Φ u 1 (2.1) Au = I Φ I. u Nt = g 0 g 1. g Nt = g. Sequentia time marching is a forward bock sove of this system. MGRIT soves this system iterativey, in parae, using a coarse-grid correction scheme based on mutigrid reduction. Both are O(N) methods, but MGRIT is highy concurrent. Mutigrid reduction strategies are a variation of cycic reduction methods and, as such, successivey eiminate unknowns in the system. If the fine points are eiminated, the system becomes: I u,0 Φ m I u,1 (2.2) A u =.... = Rg = g. Φ m I u,nt/m Idea restriction, R, and interpoation, P, are defined as in [10], so that the system (2.2) can be formed in a mutigrid fashion. Let, I Φ m 1... Φ I (2.3a) R =..., Φ m 1... Φ I (2.3b) I Φ T... Φ m 1,T P T =.... I Φ T... Φ m 1,T The interpoation injects at coarse points before extending those vaues to fine points, i.e., it is injection from the coarse- to fine-grid foowed by F-reaxation (defined beow). With this, the idea coarse-grid operator is A = RAP. 2 This is referred to as idea because the soution of (2.2) yieds an exact soution at the coarse points. This 2 We note that RAP = R I AP, where R I is injection at the coarse points. Thus for efficiency, injection is aways used to map to the coarse-eve (ike Pararea). The exception is the spatia coarsening option where spatia restriction and interpoation functions, R x() and P x(), are used to coarsen in space as we as time. 5

6 Φ Φ Φ Φ Φ Φ g g g g g g F-reaxation Φ Φ g g C-reaxation Fig. 3: F- and C-reaxation for coarsening by factor of 4 coarse grid is essentiay a compressed version of the probem. If this is foowed by interpoation, then the exact soution is aso avaiabe at fine points. The imitation of this exact reduction method is that the coarse-grid probem is, in genera, as expensive to sove as the origina fine-grid probem (because of the Φ m evauations). Mutigrid reduction methods address this by approximating A with B, where (2.4) B = I Φ I... Φ I, and Φ is an approximate coarse-grid time-step operator. One obvious choice for defining Φ is to re-discretize the probem on the coarse grid so that a coarse-grid time-step is roughy as expensive as a fine-grid time-step. This paper makes that choice. For instance, with backward Euer, one simpy uses a arger time-step size. Convergence of MGRIT is governed by the approximation, A B, and this choice of using a re-discretization of Φ with T = mδt has proved effective [10, 11]. It is important to note that, whie the definition of this agorithm reies upon Φ and Φ, the internas of these functions need not be known. This is the non-intrusive aspect of MGRIT. The user defines the time-step operator and can wrap existing codes to work within the MGRIT framework. The coarse grid is used to compute an error correction based on the residua equation (see Agorithm 1). Reaxation, a oca fine-grid process, is used to resove fine-scae behavior. Figure 3 shows the actions of F- and C-reaxation on a tempora grid with m = 4. F-reaxation propagates the soution forward in time from each coarse point to the neighboring F -points. Overa, reaxation is highy parae. Each interva of F -points can be updated independenty during F-reaxation. Every C-point update (during C-reaxation) is simiary independent MGRIT agorithm for noninear probems. The inear MGRIT agorithm [10] is easiy extended to the noninear setting using fu approximation storage (FAS), a noninear mutigrid scheme [3]. The FAS description of MGRIT first appeared in [11]. Note that the F-reaxation two-grid variant of noninear MGRIT is equivaent to the Pararea agorithm [14]. The noninear MGRIT agorithm is presented in Agorithm 1 as a two-eve method, but can be used in a mutieve setting by recursivey appying the agorithm at Step 4. Idea interpoation (2.3) is carried out in two steps (7 and 8) for simpicity in impementation. 6

7 F-cyce t, x t, x 4 t, 2 x 4 t, x 16 t, 4 x 16 t, x Fig. 4: Exampe mutigrid V-cyce with space-time coarsening that hods δt/h 2 fixed on the eft, and then time-ony coarsening on the right. Prior to running the agorithm, initia vaues for u at the fine grid C-points must be set. In genera, the C-points are initiaized with the best avaiabe estimate of the soution. Aternativey, the initia guess can be obtained using a fu mutigrid methodoogy. In this case, the fine grid C-points are obtained by interpoating a cheap coarse grid soution to the fine grid (see Section 6.1). Agorithm 1 MGRIT(A, u, g) 1: Appy F- or FCF-reaxation to A(u) = g. 2: Inject the fine grid approximation and its residua to the coarse grid: u,i u mi, r,i g mi (A(u)) mi. 3: If Spatia coarsening, then u,i R x (u,i ), r,i R x (r,i ). 4: Sove B (v ) = B (u ) + r. 5: Compute the coarse grid error approximation: e v u. 6: If Spatia coarsening, then e,i P x (e,i ). 7: Correct u at C-points: u mi = u mi + e,i. 8: If converged, then update F -points: appy F-reaxation to A(u) = g. 9: Ese go to step 1. The reader wi note that with exact arithmetic, MGRIT with FCF-reaxation propagates the initia condition two fu coarse grid time intervas (2 T ) each cyce. Thus, MGRIT is equivaent to a sequentia direct sove in N t /(2m) iterations. With F-reaxation ony, the sequentia soution is achieved in N t /m iterations. The speedup comes from the fact that MGRIT converges in O(1) iterations, as shown in [10]. A variety of cycing strategies are avaiabe in mutigrid (e.g., V, W, F). A resuts presented here use the standard V-cyce depicted in Figure 4. This corresponds to Agorithm 1 with the Sove step turned into a singe recursive ca. The recursion ends when a triviay sized grid, of, say, 5 time points, is reached. At this point a sequentia sover is used. On the right in Figure 4, a V-cyce with ony tempora coarsening is shown. Fu spatia coarsening, depicted on the eft, fixes the paraboic ratio, δt/h 2, on a eves, but can degrade the MGRIT convergence rate for the noninear probem considered here. Deayed spatia coarsening, described in Section 5.3, proved to be a more effective strategy. The MGRIT agorithm is impemented in XBraid [41], an open source package deveoped at LLNL. XBraid conforms to MGRIT s non-intrusive phiosophy and requires the user to wrap an existing time-stepping routine, as we as define a few other basic operations ike a state-vector norm and inner-product. The key computationa 7

8 kerne is the time-stepping (i.e., Φ) routine, but a the specifics are opaque to XBraid and done in user code. This aows the user to add tempora paraeism to existing time-stepping codes with minima modifications. For more detais, see [10] and [41] Storage Costs. The reader shoud aso note the storage costs associated with MGRIT. At a minimum, the soution vaues u i must be stored at each C-point in the tempora hierarchy (the first term in the storage cost mode beow). In addition, some number k of auxiiary vectors must be stored at a points on the coarse eves (the second term beow). Let L be the number of tempora eves, p be the number of tempora processors, and s be the cost to store a vector on eve. Then the storage cost of MGRIT (per tempora processor) is as foows: L 1 (2.5) S C (k) i=0 Nt m i+1 p L 1 s i + i=1 Nt k m i s i. p By defaut, MGRIT stores two auxiiary vectors on coarse grids (k = 2): a restricted copy of the fine-grid soution and the right-hand-side for the coarse FAS system. However, in Section 5.2 we show how storing one additiona auxiiary vector, specificay the most recent soution from the corresponding Φ evauation, aows the user to dramaticay reduce the overa runtime. Equation (2.5) wi be used to compute a memory mutipier that gives an estimate of the per processor increase in memory usage when compared to sequentia timestepping. The memory mutipier used is defined as S C (k)/s Mode probem impementation. The weak form of equations (1.3)-(1.5) reads: find u V h such that (3.1) u t, v h + u p 2 u, v h = b, v h + g, v h Ω, v h V h, where V h is an appropriate finite eement space. Discretizing in time using a backward Euer method and the tempora mesh in Figure 2 gives (3.2) uk+1 u k, v h + u k+1 p 2 u k+1, v h = b k+1, v h + g k+1, v h Ω, v h V h, δt where k = 0, 1,..., N t 1, and u 0 is the initia condition, given in (1.5), projected onto the finite eement space. Define Ψ(u)(v) and f k (v) to be (3.3) (3.4) Ψ(u)(v) = u, v + δt u p 2 u, v, f k (v) = u k + δt b k+1, v + δt g k+1, v Ω. Then, the fina noninear weak form is: find u k+1 V h such that (3.5) Ψ(u k+1 )(v h ) = f k (v h ), v h V h, k = 0, 1, 2,..., N t 1. Each time-step corresponds to the soution of this noninear system (i.e., the inversion of Ψ). The Fréchet derivative of Ψ(u)(v), Ψ (u)(v)[w], is (3.6) (3.7) Ψ Ψ(u + aw)(v) Ψ(u)(v) (u)(v)[w] = im, a 0 a = w, v + δt [ u p 2 + (p 2)( u)( u) T ] w, v. 8

9 Hence, Newtons method for (3.5) is (3.8) u j+1 k+1 = uj k+1 δuj, where the subscripts on u are time-steps, the superscripts on u are the Newton iterations, and δu j is the unique eement of V h such that (3.9) Ψ (u j k+1 )(vh )[δu j ] = Ψ(u j k+1 )(vh ) f k (v h ), for every v h V h Numerica parameters. A tests were competed with T = 4 seconds and Ω = [0, 2] 2 on a reguar grid. The forcing function, b(x, t), was chosen such that the exact soution was u(x, y) = sin(κx) sin(κy) sin(τt), where κ = π and τ = (2 + 1/6)π. Uness otherwise stated, the p-lapacian was used with p = 4. The spatia discretization was computed using standard bi-inear quadriatera eements and MFEM [29], a parae finite eement code. The numerica testing parameters used throughout the paper (uness otherwise mentioned) were as foows. The Newton toerance was fixed at The spatia sover for each Newton iteration was BoomerAMG from hypre b [21]. The BoomerAMG parameters were: HMIS coarsening (coarsen-type 10), one eve of aggressive coarsening, symmetric L1 Gauss-Seide (reax-type 8), extended cassica modified interpoation (interp-type 6), and interpoation truncation equa to 4 nonzeros per row. The machine used for a numerica tests was Vucan, an IBM BG/Q machine at LLNL. Except for the scaing studies, the test probem size was a (64) space-time grid on the domain [0, 2] 2 [0, 4] using 4 processors in space and 128 processors in time. V-cyces and FCF-reaxation were empoyed in every test with a fixed stopping criteria of 10 9 /( δt h). This aowed the same toerance, reative to the fine-grid resoution, to be used in a cases. Note that this is an overy tight toerance with respect to discretization error, set in arge part because of the desire to investigate the agorithm s asymptotic convergence properties. The tempora coarsening factor was m = Estabishing baseines. The goa of this paper is to deveop experience with genera strategies that one can use to obtain an MGRIT agorithm for noninear probems that is as efficient as those previousy seen for inear probems. To that end, an efficiency metric for computationa cost is now presented. Two numerica baseines, through which a improvements wi be measured, are aso given. The MGRIT cost metric used here reies on c (j), the average cost of a Newton sove 3 on grid eve during MGRIT iteration j. This is a sensibe choice because the key computationa kerne for noninear probems is the Newton sove. Section 4.1 provides a detaied discussion of this metric and why it was chosen. Two baseine tests were competed, the first being the sequentia baseine. This baseine determined the average cost of a Newton sove when using a sequentia sover, on a variety of space-time grids. An efficient impementation of MGRIT for noninear 3 By Newton sove, we mean the whoe cost to take a noninear time step with Φ using the Newton sover, incuding its mutipe inner Newton iterations. 9

10 probems wi match (or improve on) these cost estimates. The second baseine test is caed the naive MGRIT baseine. For this baseine, MGRIT was appied to the mode noninear probem in an out of the box fashion. Typicay, time-stepping routines have a hard-coded initia guess equa to the previous time-step and do not impement spatia coarsening. Hence, the naive MGRIT baseine mimics this. Note that this approach did, given enough tempora processors, provide a sma speedup over the sequentia routine (see Section 7.2); however, these speedups were much ess than those seen for inear probems Cost Metric. We define the chosen cost metric, c (j), as the cost of a Newton sove (Φ appication) on grid eve during MGRIT iteration j, averaged over a time-steps on that eve, during that MGRIT iteration, i.e., (4.1) c (j) = a (j) w, where a (j) is the average number of Newton iterations required to take a noninear time-step on eve and MGRIT iteration j, and w is the number of spatia unknowns on eve. Thus, c (j) is the cost estimate of a singe Φ appication. As motivation for this choice, et us examine a simpe cost mode for the entire MGRIT agorithm. Restriction from eve to eve + 1 has the same cost as a C-reaxation on eve. Likewise, interpoation from eve to eve 1 has the same cost as an F-reaxation on eve. Hence, beyond the coarsest grid when using FCFreaxation, each MGRIT eve carries out 3 F-reaxations and 2 C-reaxations, for a tota of (2 + (m 1)/m)N t evauations of Φ, where for simpicity N t, the number of time-steps on eve, is assumed to be an exact mutipe of m. Next, consider an MGRIT V-cyce where eve 0 is the finest and eve L is the coarsest, and ν is the number of V-cyces required for MGRIT to converge to within the residua toerance. Then, the tota cost of the MGRIT agorithm is ( ) ν L 1 (4.2) C(L, ν) N tl c (k) L + c (k) (2 + (m 1)/m)N t. k=1 =1 Ceary, equation (4.2) impies that minimizing the average cost of a Newton sove on each eve and MGRIT iteration wi directy resut in a reduction in the overa cost of the MGRIT cyce, and hence, c (j) can be used to measure the efficiency of the agorithm. Section 5 examines the two most important strategies for minimizing c (j), an improved initia guess for Newton and spatia coarsening. Whie the metric shown here is c (j), another parae performance issue is the variance in the cost of a Newton sove over the tempora domain. On the coarser grids, where a processor might own a singe time-step, synchronization effects impy that an F- or C-reaxation cannot compete unti the sowest processor finishes. This in turn impies that the difference between the maximum and minimum cost per Newton sove woud indicate synchronization probems. We therefore note that c (j) aso tracks this spread between maximum and minimum for this probem, but since this fact is probem dependent, we draw the reader s attention to it Sequentia time-stepping baseine. We now estabish the sequentia time-stepping baseine. Tabe 1 shows estimates of the cost of a Newton sove on a of the space-time grids present in the space-time grid hierarchy when MGRIT is appied to the mode probem outined in Section 3.1. These estimates, cacuated 10

11 δt h a w c 1/1024 1/ e3 1/256 1/ e4 1/64 1/ e4 1/16 1/ e3 1/4 1/ e3 1/1 1/ e2 (a) Deayed spatia coarsening δt h a w c 1/1024 1/ e3 1/256 1/ e4 1/64 1/ e4 1/16 1/ e4 1/4 1/ e4 1/1 1/ e4 (b) No spatia Coarsening Tabe 1: Baseine Newton sover costs for sequentia time-stepping. using Equation (4.1), are the average cost of a Newton sove when using a sequentia sover. Tabe 1a depicts baseine estimates for MGRIT using deayed spatia coarsening, whie Tabe 1b provides baseine estimates for MGRIT with no spatia coarsening. An efficient impementation of MGRIT wi match (or improve on) these cost estimates across a eves. Tabe 1 indicates that the average cost of a Newton iteration is highy dependent on the space-time grid, with there being a considerabe advantage to coarsening simutaneousy in space and time. This is because spatia coarsening reduces both a (j) and w. The decrease in a (j) is expained by considering a standard backward Euer time-step, (4.3) ( I δt ) h 2 G (u k+1 ) = f(u k ), where G is a noninear diffusion operator. As δt/h 2 increases, the noninear operator moves away from the identity, becoming more expensive to sove. Coarsening in space and time bounds this ratio, making the noninear sove from Equation (4.3) cheaper on the coarse grid. A detaied investigation into using spatia coarsening is given in Section 5.3. The other important strategy expored here to reduce the cost of Newton is to improve the initia guess (see Section 5.2). This importance is aso evident in Tabe 1, where a continues to increase with δt, even when spatia coarsening is used. This is because δt is increasing, thus making the initia guess to Newton (the previous time-step) progressivey worse. On the coarse eves, where δt is arge, this is ceary a poor approximation to the soution. In Section 5.2, an aternate initia guess for Newton, where the initia guess is the soution obtained during a previous evauation of Φ at the current time-point, and on the current MGRIT eve, is pursued Naive MGRIT baseine. Next, the so caed naive MGRIT baseine is presented. The naive impementation is an unmodified, out of the box type impementation that wraps a sequentia time-stepping routine without any MGRIT optimizations. In particuar, the previous time-step is used as the initia guess and spatia coarsening is not impemented. Tabe 2 depicts the cost estimates, c (j), for this setting inside an MGRIT cyce. Each δt (coumn) vaue corresponds to a tempora eve, whie the rows represent different XBraid iterations. Due to increasing Newton iteration counts, the cost of a coarse grid Newton sove is, in some paces, 6 times more expensive than a fine grid sove. This can ead to 11

12 Iteration δt = 1/1024 1/256 1/64 1/16 1/ e4 2.10e4 2.20e4 3.16e4 5.86e4 4.92e e4 1.24e4 1.76e4 3.52e4 5.98e4 5.41e e3 1.19e4 1.70e4 3.35e4 6.06e4 5.73e e3 1.17e4 1.72e4 3.38e4 5.94e4 5.73e e3 1.17e4 1.72e4 3.39e4 5.94e4 5.73e e3 1.17e4 1.72e4 3.39e4 5.98e4 5.73e4 Tabe 2: Cost estimates, c (j), for the naive MGRIT baseine (sover 0). The cost estimates are given in raw, unscaed form across each tempora eve and MGRIT iteration. Iteration δt = 1/1024 1/256 1/64 1/16 1/4 1 Tota e5 4.20e4 4.39e4 6.32e4 1.17e5 9.83e4 5.38e e4 2.48e4 3.51e4 7.05e4 1.20e5 1.08e5 4.44e e4 2.38e4 3.41e4 6.71e4 1.21e5 1.15e5 4.31e5.... Tabe 3: For processes active on each tempora eve, estimated average cost to carry out FCF-reaxation, based on Tabe an inefficient method in parae. Consider the case where each MPI process owns m points in time on the finest eve, i.e., one CF-interva. In [10] it was shown that this is an efficient decomposition. On coarser eves, each process then owns at most one point in time. Tabe 3 gives, for the processes active on each tempora eve, the estimated cost of an FCF-reaxation. On the finest grid, these numbers are 2m times the cost estimates in Tabe 2 because FCF-reaxation (for arger m) invoves roughy 2m time-step evauations. Then, on coarser grids, the cost estimate is simpy twice what is in Tabe 2 because each processor owns at most one point, and a F -points are reaxed twice. In this setting, it is immediatey cear how expensive Newton soves on coarse eves can dominate the cost of a V-cyce because the eves must be traversed in order on each processor (i.e., one can sum each row in Tabe 3 for a cost estimate of that V-cyce). Since the dominant cost of a V-cyce is reaxation, this row-sum, given in the fina coumn, is then a cost estimate for each V-cyce. In contrast, for the inear setting (when using an optima spatia sover), the work required woud be independent of time-step size. When targeting an MGRIT efficiency simiar to a inear probem, this wi be a key issue. In concusion, our goa is to present a ibrary of MGRIT optimizations and modifications that can be extended to most noninear paraboic probems. These optimizations and modifications are deveoped with the goa of minimizing the average cost of a Newton sove, focusing on the coarse grids, whie addressing common user concerns of non-intrusiveness and memory usage. Controing any growth in the cost of a Newton sove across tempora eves wi be key to matching the resuts seen with MGRIT for inear probems. By itsef, the naive appication of MGRIT scaes very 12

13 poory when paced aongside the comparabe inear probem (see sections 7.1 and 7.2). 5. Efficient MGRIT for the mode noninear probem. In this section a variety of approaches designed to make MGRIT more efficient for the chosen noninear mode probem (1.3) are investigated. Improvements are measured by comparing to the naive MGRIT baseine of Section Sover ID tabe. Given the number of options considered, and to make discussion easier, each sover is given a numerica ID. Tabe 4 presents each sover considered and its runtime for the chosen test probem size. Aso given is the per processor memory mutipier that gives an estimate of the per processor increase in memory usage (see Section 2.2), when compared to sequentia time-stepping. The mutipier is given for the number of tempora processors p = 128 and p = 4096, which is the argest possibe p for this test probem size. For p = 128, the mutipier is quite arge, but the addition of more resources can make this number reativey sma. Each sover option is discussed in more detai in the foowing subsections. However, a brief description of each sover is presented here for the reader s convenience. Sover 0 refers to the naive MGRIT approach from Section 4.3. The coumn heading Spatia Grids refers to the number of spatia grids used and is the option introduced in Section 5.3. A vaue of 1 for Spatia Grids indicates no spatia coarsening, whie 4 means that the finest spatia grid is coarsened 3 times for a tota of 4 grids. The finest grid is 64 64, so coarsening further in space is not advantageous. The difference between sovers 1 and 2 is that sover 1 deays spatia coarsening so that it begins on the fourth tempora grid. Sover 2 begins spatia coarsening immediatey on the first coarse time grid. Remember, for this probem with 4096 time-steps and m = 4, there are ony 6 tempora grids. Given the bad effects on convergence visibe from no deay in sover 2, uness otherwise mentioned, spatia coarsening is aways deayed. See Section 5.3 for more detais. Sovers 3, 4, 5 and 6 correspond to the improved initia guess introduced in Section 5.2. Here, IIG means that the improved initia guess, specificay the soution from a previous evauation of Φ, is used as the initia guess for each Newton sove. This happens either at every point, or at a the coarse grid points and the fine grid C-points, depending on whether a the points or ony the C-points are stored. Regarding the memory mutipier, the use of IIG requires an additiona vector stored on coarse eves, hence k = 3 (instead of 2) in Equation 2.5. Sovers 7, 8, 9 and 10 correspond to turning on the Skip option introduced in Section 6.1. There, the concept of skipping unnecessary work during the first MGRIT down cyce is introduced. Sovers 11 and 12 correspond to having Cheap first three iters as introduced in Section 6.2. Here, the Newton toerance is reaxed during the first three MGRIT iterations. Sovers 13, 14, 15 and 16 reduce the number of eves in the hierarchy by setting a arger Coarsest grid size as introduced in Section 6.3. For instance with m = 4, using a coarsest grid size of 16 instead of 4 removes the coarsest eve in the hierarchy. This can improve the time-to-soution by avoiding cycing between very sma grids MGRIT with an improved initia guess for Newton s method. Section 4.2 showed that c (j) increases with the time-step size (and hence MGRIT eve) because using the previous time-step as the initia guess for Newton s methods becomes increasingy inaccurate. Thus, this section expores an improved initia guess 13

14 Sover ID Spatia grids IIG Skip Cheap first 3 iters Coarsest grid size Memory mut., p = Runtime per iter Iterations Runtime 0 1 Never No No s s 1 4 Never No No s 7 712s 2 4* Never No No s s 3 1 C points No No s 7 638s 4 4 C points No No s 7 579s 5 1 Aways No No s 7 647s 6 4 Aways No No s 7 591s 7 1 C points Yes No s 7 453s 8 4 C points Yes No s 7 410s 9 1 Aways Yes No s 7 429s 10 4 Aways Yes No s 7 391s 11 1 Aways Yes Yes s 7 395s 12 4 Aways Yes Yes s 7 360s 13 1 Aways Yes Yes s 7 408s 14 4 Aways Yes Yes s 7 354s 15 1 Aways Yes Yes s 8 506s 16 4 Aways Yes Yes s 8 383s Tabe 4: Overa runtimes, storage costs, iteration counts and average time per iteration for the various sover options, with a (64) space-time grid. A * indicates no deay in spatia coarsening. (IIG), which uses as the initia guess for Newton s method at a specific time point and MGRIT eve, the soution from the previous evauation of Φ at that point and eve. As XBraid converges, this vaue becomes an ever improving initia guess. The main drawback of using the IIG is that an extra auxiiary vector must be stored at each point on every coarse grid 4. The extra storage at coarse eves is required because the FAS soution, aready stored there, is not equa to the resut of the previous Φ evauation. This creates a tension between memory usage and computationa efficiency. To imit memory usage, one can aternativey choose to store the soution at the fine grid C-points ony. In this case, the IIG is avaiabe at a time steps except those competed during fine grid F-reaxation, where the previous time-step is used. This approach, referred to as IIG at C-points, is used by sovers 3, 4, 7 and 8. 4 The soution from the previous Φ evauation and the current FAS soution are equivaent on the fine-grid, so the IIG is aready avaiabe at stored fine-grid points. 14

15 Iteration δt = 1/1024 1/256 1/64 1/16 1/ Tabe 5: Reative cost estimates, c (j) /c (j),naive, for MGRIT with the improved initia guess at a points (sover 5), across each tempora eve and MGRIT iteration. The sover setup is otherwise identica to that used for Tabe 2. Cost estimates are scaed entry-by-entry with c (j),naive, the naive MGRIT baseine costs from Tabe 2. Tabe 5 shows c (j) over a eves and iterations, scaed entry-by-entry with the corresponding entry from the naive MGRIT baseine from Tabe 2. For exampe, entry (2,3) in Tabe 5 has been scaed by entry (2,3) from Tabe 2. Vaues ess than one indicate that the cost per Newton sove was reduced by using the improved initia guess, whie vaues greater than one indicate it increased. A arge reduction in c (j) is seen across most tempora grids and iterations. The exception to this is on the fine-grid, where costs increase during eary iterations when using the IIG. This is not surprising. During the first few iterations the IIG is either inaccurate or unavaiabe. 5 A consequence of these increased costs is seen in Tabe 4, where sover 3, which uses the IIG everywhere except the fine grid F-Points, was actuay sighty faster than when the IIG was used at a points. Despite the increased costs on the fine grid, sovers 3 and 5, where the IIG is used as the initia guess to the Newton sover at a points and ony at coarse points (C-points), show arge reductions in overa runtime, a direct consequence of the cost reductions seen in Tabe 5. Moreover, there is no degradation in MGRIT convergence. Sovers 4 and 6, where the IIG is used in conjunction with spatia coarsening, are discussed in Section 5.4. In concusion, these resuts indicate that using the IIG as the initia guess is very beneficia. Using the IIG ony at C-points is an exceent option, in this case reducing the runtime by 44%, with ony a 32% increase in storage costs over the naive approach, when comparing sovers 0 and 3 in Tabe 4. Using the IIG as the initia guess at a points doubed the storage costs of the method, but ony reduced the runtime by 43%. Optionay, a user coud opt to experimentay determine which initia guess to use on each eve and iteration. Such a strategy woud ikey resut in further cost savings, but be highy probem specific. Instead, we wi pursue more genera strategies that, when used in conjunction with the improved initia guess, wi reduce the cost of those first three iterations in Section 6. Either way, the improved initia guess is an effective, non-intrusive strategy that wi provide a arge reduction in runtime for most noninear paraboic impementations of MGRIT. This abiity to effectivey use reduced storage ony at C-points and 5 During the first iteration, at C-points, the user suppied initia guess is returned as the soution from a previous Φ evauation. The user does not define an initia guess at F -points, so the previous time-step must be used. 15

16 the overa importance of improved initia guesses for noninear MGRIT are both important contributions of this work MGRIT with spatia coarsening. Motivated by resuts seen in Section 4.2 and Tabe 1a, spatia coarsening is added to the naive sover from Section 4, as indicated in Agorithm 1. In genera, the user s code defines the separate spatia interpoation and restriction functions P x () and R x (). At first gance spatia coarsening seems intrusive, however, MGRIT semi-coarsens in time and is agnostic to the spatia discretization, thus spatia coarsening is an extra option that can easiy be impemented independenty of the time-integration scheme. Here, the natura finite eement spatia restriction operator (and its transpose) is used to interpoate between reguary refined spatia grids. This operator is provided by MFEM. Spatia interpoation equas the scaed (by 1/4) transpose of restriction so that R x P x I. The choice of spatia interpoation operators is an area of active research; however, it is not surprising that scaing so that RP resembes an obique projection heps MGRIT convergence. A better choice here coud ead to improved resuts beow. The benefit of spatia coarsening is vaidated by the improved timings in Tabe 4. Here, sovers 0, 1 and 2 appy varying eves of spatia coarsening. Beginning spatia coarsening on the first coarse grid (sover 2) eads to a degradation in MGRIT convergence. For practica purposes, this sover is unusabe as the convergence degradation continues for arger probems. This is unfortunate given that this approach has a much smaer runtime per iteration. It is important to note that this degradation is not restricted to noninear probems. For exampe, if we repace the noninear co-efficient of the p-lapacian with the exact, known soution, then the resuting inear probem sti suffers from the same MGRIT degradation due to spatia coarsening. This degradation is sti an area of active research, however recent anaysis suggests that it occurs because the eigenvaues of the time-integration operators (Φ and Φ ) are not bounded away from zero. Our goa is to deveop a spatia coarsening strategy that can be appied to a noninear probems. As such, we omit a detaied, probem specific convergence anaysis and instead provide a simpe strategy that aows spatia coarsening to be used with some noninear probems that exhibit this phenomenon. A simiar strategy was used in [13] to make use of spatia coarsening inside a fu space-time mutigrid method. The strategy, referred to as deayed spatia coarsening, is to imit spatia coarsening to the coarsest time grids. For exampe, sover 1 uses four spatia grids with spatia coarsening deayed unti the fourth tempora eve. Tabe 6 presents a cost anaysis of this strategy, scaing each c (j) with the corresponding cost estimates found using the naive MGRIT baseine in Tabe 2. To highight the cost saving benefits of deayed spatia coarsening, this tabe shows resuts that use the previous time-step as the initia guess for the Newton sove. Section 5.4 discusses the combined effect of spatia coarsening and the improved initia guess, introduced in Section 5.2. Deayed spatia coarsening reduced both the number and cost of Newton iterations on the coarse grids, In fact, on the coarsest grid the average cost per Newton Sove is 100 times smaer than the cost to compete the same sove using the naive baseine. Recognizing the specific importance of spatia coarsening when controing the noninear time-step sover iterations on coarse time grids is an important contribution. The benefit of these dramatic cost reductions is seen in Tabe 4. Sover 1, where spatia coarsening is deayed unti the fourth tempora eve, simutaneousy minimized 16

17 Iteration δt = 1/1024 1/256 1/64 1/16 1/ Tabe 6: Reative cost estimates, c (j) /c (j),naive, for MGRIT with spatia coarsening (sover 1), across each tempora eve and MGRIT iteration. Spatia coarsening begins on the fourth time eve, δt = 1/16. The previous time-step is used as the initia guess at a points. The sover setup is otherwise identica to that used for Tabe 2. Cost estimates are scaed entry-by-entry with c (j),naive, the naive MGRIT baseine costs from Tabe 2. Iteration δt = 1/1024 1/256 1/64 1/16 1/ Tabe 7: Reative cost estimates, c (j) /c (j),naive, for MGRIT when using spatia coarsening and the improved initia guess (sover 6), across each tempora eve and MGRIT iteration. The sover setup is otherwise identica to that used for Tabe 2. Cost estimates are scaed entry-by-entry with c (j),naive, the naive MGRIT baseine costs from Tabe 2. Spatia coarsening again begins on the fourth time eve, δt = 1/16. the MGRIT convergence rate and the coarse grid computationa costs, eading to a 37 % reduction in run-time when compared to sover 0. This is the strategy pursued in this paper and it has proven to be robust in our tests Combining the improved initia guess and spatia coarsening. Sections 5.2 and 5.3 introduced an improved initia guess and spatia coarsening. Individuay these improvements reduced the overa runtime by 43% and 37%, respectivey. Tabe 7 gives a cost anaysis of the MGRIT agorithm when these strategies are combined, using 4 eves of spatia coarsening and the IIG at a points. In this tabe each cost estimate is again scaed by the corresponding cost estimate cacuated using the naive MGRIT baseine. Sovers 4 and 6, from Tabe 4, show the effect of using both of these options. When compared to the naive baseine, sover 4, where the IIG is used ony at C- points, gives a 49% reduction in runtime with ony a 5% increase in storage, when compared to sover 0. Overa, deayed spatia coarsening, and spatia coarsening in genera, is an effective strategy that can be appied to most, if not a, noninear paraboic impe- 17

18 mentations of MGRIT. Ceary, no deay in spatia coarsening is best, however, in cases where this causes a arge degradation in the convergence rate, deayed spatia coarsening can sti be used to baance the computationa savings and convergence degradation. Mitigating this degradation in the convergence rate is an area of active research. 6. Further Cost Savings. Whie the improved initia guess and spatia coarsening are, by far, the most effective strategies presented, severa more strategies for reducing the cost of MGRIT are now considered. The effect of these strategies is significanty smaer than that for spatia coarsening and the improved initia guess; thus we wi now normaize the cost anaysis tabes with respect to the just previousy considered sover configuration. For exampe, the next tabe in Section 6.1 is scaed entry-by-entry by the raw cost estimates corresponding to Tabe 7 from Section 5.4. The tabe after that in Section 6.2 is scaed entry-by-entry by the raw cost estimates corresponding to the tabe from Section 6.1, and so on. An entry greater than 1 represents a deterioration reative to the previous sover configuration, whie a vaue ess than 1 represents an improvement Skipping Unnecessary Work. Even with the improvements so far, c (j) i remains arge during the first three MGRIT iterations. Consider iteration 0 and the down-cyce and up-cyce parts of Figure 4. For the mode probem, where no prior knowedge of the soution is avaiabe, it is cear that reaxation during the down cyce of iteration 0 provides no benefit. The first time that goba information is propagated is during the coarse grid sove, and the subsequent up cyce, of iteration 0. Therefore, a reaxation, and in fact work of any kind, during the down-cyce of iteration 0 can be omitted. In this setting, the initia condition on the finest-grid is injected to the coarsest-grid and then seriay propagated. Then, the soution is interpoated back to the finest-grid. Because we have no knowedge of the soution, the previous time step is used as the initia guess throughout this process. This strategy, simiar to those found in fu mutigrid methods, is a non-intrusive modification of the basic MGRIT agorithm. If some a priori knowedge of the soution is avaiabe to initiaize the finest-grid for MGRIT, then this work shoud not be skipped. In a other cases, this is an effective, cheap, approach to determining an initia guess at the fine-grid C-points. Tabe 8 shows the cost anaysis for this approach (sover 8) when it is added to the sover strategy from Section 5.4 (sover 6), where the IIG is used as the initia guess, aong with four eves of spatia coarsening. The cost anaysis is simiar if spatia coarsening is not used. To better discern the improvements, c (j) is scaed by the corresponding raw cost estimates generated when using sover 6. Skipping work can actuay increase the cost of a Newton sove on some eves during the first few iterations. However, the corresponding sovers 7 and 8 in Tabe 4 show that this strategy reduces the overa runtime with no degradation in MGRIT convergence. The reduced runtime is due to the fact that the time-stepping routine is caed far fewer times during iteration 0, despite being more expensive when it is caed. For instance, the denotes the fact that there is no work done on the first eve for iteration 0, by far the most expensive eve. On coarse eves, during iteration 0, ony interpoation (F-reaxation) is performed Cheap initia iterates. The initia three iterates are the most expensive, with c (j) significanty higher on a eves. The soution at this point is sti inaccurate, so another obvious modification is to reduce the Newton toerance during the first 18

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm

More information

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm Outine Parae Numerica Agorithms Chapter 8 Prof. Michae T. Heath Department of Computer Science University of Iinois at Urbana-Champaign CS 554 / CSE 512 1 2 3 4 Trianguar Matrices Michae T. Heath Parae

More information

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART 13 AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART Eva Vona University of Ostrava, 30th dubna st. 22, Ostrava, Czech Repubic e-mai: Eva.Vona@osu.cz Abstract: This artice presents the use of

More information

Analysis and parallelization strategies for Ruge-Stüben AMG on many-core processors

Analysis and parallelization strategies for Ruge-Stüben AMG on many-core processors Anaysis and paraeization strategies for Ruge-Stüben AMG on many-core processors P. Zaspe Departement Mathematik und Informatik Preprint No. 217-6 Fachbereich Mathematik June 217 Universität Base CH-451

More information

ETNA Kent State University

ETNA Kent State University Eectronic Transactions on Numerica Anaysis. Voume 6, pp. 162-181, December 1997. Copyright 1997,. ISSN 1068-9613. ETNA WAVE-RAY MULTIGRID METHOD FOR STANDING WAVE EQUATIONS A. BRANDT AND I. LIVSHITS Abstract.

More information

On-Chip CNN Accelerator for Image Super-Resolution

On-Chip CNN Accelerator for Image Super-Resolution On-Chip CNN Acceerator for Image Super-Resoution Jung-Woo Chang and Suk-Ju Kang Dept. of Eectronic Engineering, Sogang University, Seou, South Korea {zwzang91, sjkang}@sogang.ac.kr ABSTRACT To impement

More information

Load Balancing by MPLS in Differentiated Services Networks

Load Balancing by MPLS in Differentiated Services Networks Load Baancing by MPLS in Differentiated Services Networks Riikka Susitaiva, Jorma Virtamo, and Samui Aato Networking Laboratory, Hesinki University of Technoogy P.O.Box 3000, FIN-02015 HUT, Finand {riikka.susitaiva,

More information

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162 oward Efficient Spatia Variation Decomposition via Sparse Regression Wangyang Zhang, Karthik Baakrishnan, Xin Li, Duane Boning and Rob Rutenbar 3 Carnegie Meon University, Pittsburgh, PA 53, wangyan@ece.cmu.edu,

More information

Mobile App Recommendation: Maximize the Total App Downloads

Mobile App Recommendation: Maximize the Total App Downloads Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management

More information

A Fast Block Matching Algorithm Based on the Winner-Update Strategy

A Fast Block Matching Algorithm Based on the Winner-Update Strategy In Proceedings of the Fourth Asian Conference on Computer Vision, Taipei, Taiwan, Jan. 000, Voume, pages 977 98 A Fast Bock Matching Agorithm Based on the Winner-Update Strategy Yong-Sheng Chenyz Yi-Ping

More information

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming The First Internationa Symposium on Optimization and Systems Bioogy (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 267 279 Soving Large Doube Digestion Probems for DNA Restriction

More information

On Upper Bounds for Assortment Optimization under the Mixture of Multinomial Logit Models

On Upper Bounds for Assortment Optimization under the Mixture of Multinomial Logit Models On Upper Bounds for Assortment Optimization under the Mixture of Mutinomia Logit Modes Sumit Kunnumka September 30, 2014 Abstract The assortment optimization probem under the mixture of mutinomia ogit

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

A Petrel Plugin for Surface Modeling

A Petrel Plugin for Surface Modeling A Petre Pugin for Surface Modeing R. M. Hassanpour, S. H. Derakhshan and C. V. Deutsch Structure and thickness uncertainty are important components of any uncertainty study. The exact ocations of the geoogica

More information

Response Surface Model Updating for Nonlinear Structures

Response Surface Model Updating for Nonlinear Structures Response Surface Mode Updating for Noninear Structures Gonaz Shahidi a, Shamim Pakzad b a PhD Student, Department of Civi and Environmenta Engineering, Lehigh University, ATLSS Engineering Research Center,

More information

Endoscopic Motion Compensation of High Speed Videoendoscopy

Endoscopic Motion Compensation of High Speed Videoendoscopy Endoscopic Motion Compensation of High Speed Videoendoscopy Bharath avuri Department of Computer Science and Engineering, University of South Caroina, Coumbia, SC - 901. ravuri@cse.sc.edu Abstract. High

More information

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space Sensitivity Anaysis of Hopfied Neura Network in Cassifying Natura RGB Coor Space Department of Computer Science University of Sharjah UAE rsammouda@sharjah.ac.ae Abstract: - This paper presents a study

More information

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion.

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion. Lecture outine 433-324 Graphics and Interaction Scan Converting Poygons and Lines Department of Computer Science and Software Engineering The Introduction Scan conversion Scan-ine agorithm Edge coherence

More information

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER!

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER! [1,2] have, in theory, revoutionized cryptography. Unfortunatey, athough offer many advantages over conventiona and authentication), such cock synchronization in this appication due to the arge operand

More information

Solutions to the Final Exam

Solutions to the Final Exam CS/Math 24: Intro to Discrete Math 5//2 Instructor: Dieter van Mekebeek Soutions to the Fina Exam Probem Let D be the set of a peope. From the definition of R we see that (x, y) R if and ony if x is a

More information

Multigrid Reduction in Time for Nonlinear Parabolic Problems

Multigrid Reduction in Time for Nonlinear Parabolic Problems University of Colorado, Boulder CU Scholar Applied Mathematics Graduate Theses & Dissertations Applied Mathematics Spring 1-1-2017 Multigrid Reduction in Time for Nonlinear Parabolic Problems Ben O'Neill

More information

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 ECEn 528 Prof. Archibad Lab: Dynamic Scheduing Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 Overview This ab's purpose is to expore issues invoved in the design of out-of-order issue processors.

More information

arxiv: v2 [math.na] 7 Mar 2017

arxiv: v2 [math.na] 7 Mar 2017 An agebraic mutigrid method for Q 2 Q 1 mixed discretizations of the Navier-Stokes equations Andrey Prokopenko and Raymond S. Tuminaro arxiv:1607.02489v2 [math.na] 7 Mar 2017 Abstract Agebraic mutigrid

More information

Hiding secrete data in compressed images using histogram analysis

Hiding secrete data in compressed images using histogram analysis University of Woongong Research Onine University of Woongong in Dubai - Papers University of Woongong in Dubai 2 iding secrete data in compressed images using histogram anaysis Farhad Keissarian University

More information

Distance Weighted Discrimination and Second Order Cone Programming

Distance Weighted Discrimination and Second Order Cone Programming Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and

More information

A Fast Multigrid Algorithm for Mesh Deformation

A Fast Multigrid Algorithm for Mesh Deformation A Fast Mutigrid Agorithm for Mesh Deformation Lin Shi Yizhou Yu Nathan Be Wei-Wen Feng University of Iinois at Urbana-Champaign Figure : The ide CAMEL becomes a boxer with the hep of MOCAP data and our

More information

Automatic Grouping for Social Networks CS229 Project Report

Automatic Grouping for Social Networks CS229 Project Report Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct

More information

Priority Queueing for Packets with Two Characteristics

Priority Queueing for Packets with Two Characteristics 1 Priority Queueing for Packets with Two Characteristics Pave Chuprikov, Sergey I. Nikoenko, Aex Davydow, Kiri Kogan Abstract Modern network eements are increasingy required to dea with heterogeneous traffic.

More information

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code Further Optimization of the Decoding Method for Shortened Binary Cycic Fire Code Ch. Nanda Kishore Heosoft (India) Private Limited 8-2-703, Road No-12 Banjara His, Hyderabad, INDIA Phone: +91-040-3378222

More information

Quality of Service Evaluations of Multicast Streaming Protocols *

Quality of Service Evaluations of Multicast Streaming Protocols * Quaity of Service Evauations of Muticast Streaming Protocos Haonan Tan Derek L. Eager Mary. Vernon Hongfei Guo omputer Sciences Department University of Wisconsin-Madison, USA {haonan, vernon, guo}@cs.wisc.edu

More information

PHASE retrieval has been an active research topic for decades [1], [2]. The underlying goal is to estimate an unknown

PHASE retrieval has been an active research topic for decades [1], [2]. The underlying goal is to estimate an unknown DOLPHIn Dictionary Learning for Phase Retrieva Andreas M. Timann, Yonina C. Edar, Feow, IEEE, and Juien Maira, Member, IEEE arxiv:60.063v [math.oc] 3 Aug 06 Abstract We propose a new agorithm to earn a

More information

Chapter Multidimensional Direct Search Method

Chapter Multidimensional Direct Search Method Chapter 09.03 Mutidimensiona Direct Search Method After reading this chapter, you shoud be abe to:. Understand the fundamentas of the mutidimensiona direct search methods. Understand how the coordinate

More information

Neural Network Enhancement of the Los Alamos Force Deployment Estimator

Neural Network Enhancement of the Los Alamos Force Deployment Estimator Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the

More information

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS Pave Tchesmedjiev, Peter Vassiev Centre for Biomedica Engineering,

More information

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y.

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y. FORTH-ICS / TR-157 December 1995 Joint disparity and motion ed estimation in stereoscopic image sequences Ioannis Patras, Nikos Avertos and Georgios Tziritas y Abstract This work aims at determining four

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Scheduing Announcement Homework 2 due on October 25th Project 1 due on October 26th 2 CSE 120 Scheduing and Deadock Scheduing Overview In discussing

More information

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed

More information

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory 0 th Word Congress on Structura and Mutidiscipinary Optimization May 9 -, 03, Orando, Forida, USA A Design Method for Optima Truss Structures with Certain Redundancy Based on Combinatoria Rigidity Theory

More information

As Michi Henning and Steve Vinoski showed 1, calling a remote

As Michi Henning and Steve Vinoski showed 1, calling a remote Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware

More information

Fastest-Path Computation

Fastest-Path Computation Fastest-Path Computation DONGHUI ZHANG Coege of Computer & Information Science Northeastern University Synonyms fastest route; driving direction Definition In the United states, ony 9.% of the househods

More information

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method 297 Rea-Time Feature escriptor Matching via a Muti-Resoution Ehaustive Search Method Chi-Yi Tsai, An-Hung Tsao, and Chuan-Wei Wang epartment of Eectrica Engineering, Tamang University, New Taipei City,

More information

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM JOINT IMAGE REGISTRATION AND AMPLE-BASED SUPER-RESOLUTION ALGORITHM Hyo-Song Kim, Jeyong Shin, and Rae-Hong Park Department of Eectronic Engineering, Schoo of Engineering, Sogang University 35 Baekbeom-ro,

More information

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System GPU Impementation of Parae SVM as Appied to Intrusion Detection System Sudarshan Hiray Research Schoar, Department of Computer Engineering, Vishwakarma Institute of Technoogy, Pune, India sdhiray7@gmai.com

More information

Special Edition Using Microsoft Excel Selecting and Naming Cells and Ranges

Special Edition Using Microsoft Excel Selecting and Naming Cells and Ranges Specia Edition Using Microsoft Exce 2000 - Lesson 3 - Seecting and Naming Ces and.. Page 1 of 8 [Figures are not incuded in this sampe chapter] Specia Edition Using Microsoft Exce 2000-3 - Seecting and

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

AMR++: Object-Oriented Parallel Adaptive Mesh Refinement

AMR++: Object-Oriented Parallel Adaptive Mesh Refinement UCRL-ID-137373 AMR++: Object-Oriented Parae Adaptive Mesh Refinement 5. Phiip, D. Quinan February 2,200O U.S. Department of Energy Approved for pubic reease; further dissemination unimited DISCLAIMER This

More information

A Memory Grouping Method for Sharing Memory BIST Logic

A Memory Grouping Method for Sharing Memory BIST Logic A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5

More information

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES Eya En Gad, Akshay Gadde, A. Saman Avestimehr and Antonio Ortega Department of Eectrica Engineering University of Southern

More information

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby

More information

Space-Time Trade-offs.

Space-Time Trade-offs. Space-Time Trade-offs. Chethan Kamath 03.07.2017 1 Motivation An important question in the study of computation is how to best use the registers in a CPU. In most cases, the amount of registers avaiabe

More information

Alternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications

Alternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications Aternative Decompositions for Distributed Maximization of Network Utiity: Framework and Appications Danie P. Paomar and Mung Chiang Eectrica Engineering Department, Princeton University, NJ 08544, USA

More information

Complex Human Activity Searching in a Video Employing Negative Space Analysis

Complex Human Activity Searching in a Video Employing Negative Space Analysis Compex Human Activity Searching in a Video Empoying Negative Space Anaysis Shah Atiqur Rahman, Siu-Yeung Cho, M.K.H. Leung 3, Schoo of Computer Engineering, Nanyang Technoogica University, Singapore 639798

More information

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The

More information

Binarized support vector machines

Binarized support vector machines Universidad Caros III de Madrid Repositorio instituciona e-archivo Departamento de Estadística http://e-archivo.uc3m.es DES - Working Papers. Statistics and Econometrics. WS 2007-11 Binarized support vector

More information

Further Concepts in Geometry

Further Concepts in Geometry ppendix F Further oncepts in Geometry F. Exporing ongruence and Simiarity Identifying ongruent Figures Identifying Simiar Figures Reading and Using Definitions ongruent Trianges assifying Trianges Identifying

More information

Reference trajectory tracking for a multi-dof robot arm

Reference trajectory tracking for a multi-dof robot arm Archives of Contro Sciences Voume 5LXI, 5 No. 4, pages 53 57 Reference trajectory tracking for a muti-dof robot arm RÓBERT KRASŇANSKÝ, PETER VALACH, DÁVID SOÓS, JAVAD ZARBAKHSH This paper presents the

More information

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency Deay Budget Partitioning to Maximize Network Resource Usage Efficiency Kartik Gopaan Tzi-cker Chiueh Yow-Jian Lin Forida State University Stony Brook University Tecordia Technoogies kartik@cs.fsu.edu chiueh@cs.sunysb.edu

More information

MCSE Training Guide: Windows Architecture and Memory

MCSE Training Guide: Windows Architecture and Memory MCSE Training Guide: Windows 95 -- Ch 2 -- Architecture and Memory Page 1 of 13 MCSE Training Guide: Windows 95-2 - Architecture and Memory This chapter wi hep you prepare for the exam by covering the

More information

A MULTIGRID METHOD FOR ADAPTIVE SPARSE GRIDS

A MULTIGRID METHOD FOR ADAPTIVE SPARSE GRIDS A MULTIGRID METHOD FOR ADAPTIVE SPARSE GRIDS BENJAMIN PEHERSTORFER, STEFAN ZIMMER, CHRISTOPH ZENGER, AND HANS-JOACHIM BUNGARTZ Preprint December 17, 2014 Abstract. Sparse grids have become an important

More information

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Patrice Y. Simard, 1 Yann A. Le Cun, 2 John S. Denker, 2 Bernard Victorri 3 1 Microsoft Research, 1 Microsoft Way, Redmond,

More information

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802.

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802. A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irreguar LDPC Codes in the IEEE 82.16e Standards Yeong-Luh Ueng and Chung-Chao Cheng Dept. of Eectrica Engineering,

More information

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01 Page 1 of 15 Chapter 9 Chapter 9: Deveoping the Logica Data Mode The information requirements and business rues provide the information to produce the entities, attributes, and reationships in ogica mode.

More information

An Exponential Time 2-Approximation Algorithm for Bandwidth

An Exponential Time 2-Approximation Algorithm for Bandwidth An Exponentia Time 2-Approximation Agorithm for Bandwidth Martin Fürer 1, Serge Gaspers 2, Shiva Prasad Kasiviswanathan 3 1 Computer Science and Engineering, Pennsyvania State University, furer@cse.psu.edu

More information

Navigating and searching theweb

Navigating and searching theweb Navigating and searching theweb Contents Introduction 3 1 The Word Wide Web 3 2 Navigating the web 4 3 Hyperinks 5 4 Searching the web 7 5 Improving your searches 8 6 Activities 9 6.1 Navigating the web

More information

Optimized Base-Station Cache Allocation for Cloud Radio Access Network with Multicast Backhaul

Optimized Base-Station Cache Allocation for Cloud Radio Access Network with Multicast Backhaul Optimized Base-Station Cache Aocation for Coud Radio Access Network with Muticast Backhau Binbin Dai, Student Member, IEEE, Ya-Feng Liu, Member, IEEE, and Wei Yu, Feow, IEEE arxiv:804.0730v [cs.it] 28

More information

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING Binbin Dai and Wei Yu Ya-Feng Liu Department of Eectrica and Computer Engineering University of Toronto, Toronto ON, Canada M5S 3G4 Emais:

More information

Functions. 6.1 Modular Programming. 6.2 Defining and Calling Functions. Gaddis: 6.1-5,7-10,13,15-16 and 7.7

Functions. 6.1 Modular Programming. 6.2 Defining and Calling Functions. Gaddis: 6.1-5,7-10,13,15-16 and 7.7 Functions Unit 6 Gaddis: 6.1-5,7-10,13,15-16 and 7.7 CS 1428 Spring 2018 Ji Seaman 6.1 Moduar Programming Moduar programming: breaking a program up into smaer, manageabe components (modues) Function: a

More information

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation 1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER 2014 Bacward Fuzzy Rue Interpoation Shangzhu Jin, Ren Diao, Chai Que, Senior Member, IEEE, and Qiang Shen Abstract Fuzzy rue interpoation

More information

Understanding the Mixing Patterns of Social Networks: The Impact of Cores, Link Directions, and Dynamics

Understanding the Mixing Patterns of Social Networks: The Impact of Cores, Link Directions, and Dynamics Understanding the Mixing Patterns of Socia Networks: The Impact of Cores, Link Directions, and Dynamics [Last revised on May 22, 2011] Abedeaziz Mohaisen Huy Tran Nichoas Hopper Yongdae Kim University

More information

QoS-Aware Data Transmission and Wireless Energy Transfer: Performance Modeling and Optimization

QoS-Aware Data Transmission and Wireless Energy Transfer: Performance Modeling and Optimization QoS-Aware Data Transmission and Wireess Energy Transfer: Performance Modeing and Optimization Dusit Niyato, Ping Wang, Yeow Wai Leong, and Tan Hwee Pink Schoo of Computer Engineering, Nanyang Technoogica

More information

Language Identification for Texts Written in Transliteration

Language Identification for Texts Written in Transliteration Language Identification for Texts Written in Transiteration Andrey Chepovskiy, Sergey Gusev, Margarita Kurbatova Higher Schoo of Economics, Data Anaysis and Artificia Inteigence Department, Pokrovskiy

More information

Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient

Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient Origina rtice Efficient method to design RF puses for parae excitation MRI using gridding and conjugate gradient Shuo Feng, Jim Ji Department of Eectrica & Computer Engineering, Texas & M University, Texas,

More information

A Method for Calculating Term Similarity on Large Document Collections

A Method for Calculating Term Similarity on Large Document Collections $ A Method for Cacuating Term Simiarity on Large Document Coections Wofgang W Bein Schoo of Computer Science University of Nevada Las Vegas, NV 915-019 bein@csunvedu Jeffrey S Coombs and Kazem Taghva Information

More information

Geometric clustering for line drawing simplification

Geometric clustering for line drawing simplification Eurographics Symposium on Rendering (2005) Kavita Baa, Phiip Dutré (Editors) Geometric custering for ine drawing simpification P. Bara, J.Thoot and F. X. Siion ε ARTIS GRAVIR/IMAG INRIA Figure 1: The two

More information

Searching, Sorting & Analysis

Searching, Sorting & Analysis Searching, Sorting & Anaysis Unit 2 Chapter 8 CS 2308 Fa 2018 Ji Seaman 1 Definitions of Search and Sort Search: find a given item in an array, return the index of the item, or -1 if not found. Sort: rearrange

More information

Construction and refinement of panoramic mosaics with global and local alignment

Construction and refinement of panoramic mosaics with global and local alignment Construction and refinement of panoramic mosaics with goba and oca aignment Heung-Yeung Shum and Richard Szeisi Microsoft Research Abstract This paper presents techniques for constructing fu view panoramic

More information

Community-Aware Opportunistic Routing in Mobile Social Networks

Community-Aware Opportunistic Routing in Mobile Social Networks IEEE TRANSACTIONS ON COMPUTERS VOL:PP NO:99 YEAR 213 Community-Aware Opportunistic Routing in Mobie Socia Networks Mingjun Xiao, Member, IEEE Jie Wu, Feow, IEEE, and Liusheng Huang, Member, IEEE Abstract

More information

Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning

Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning Portabe Compier Optimisation Across Embedded Programs and Microarchitectures using Machine Learning Christophe Dubach, Timothy M. Jones, Edwin V. Bonia Members of HiPEAC Schoo of Informatics University

More information

tread base carc1 carc2 sidewall2 beadflat beadcushion treadfine basefine Y X Z

tread base carc1 carc2 sidewall2 beadflat beadcushion treadfine basefine Y X Z APPLICATION OF THE FINITE ELEMENT METHOD TO THE ANALYSIS OF AUTOMOBILE TIRES H.-J. PAYER, G. MESCHKE AND H.A. MANG Institute for Strength of Materias Vienna University of Technoogy Karspatz 13/E202, A-1040

More information

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters Optimization and Appication of Support Vector Machine Based on SVM Agorithm Parameters YAN Hui-feng 1, WANG Wei-feng 1, LIU Jie 2 1 ChongQing University of Posts and Teecom 400065, China 2 Schoo Of Civi

More information

Abstract. We investigate the parallel implementation of the diagonal{implicitly iterated Runge{

Abstract. We investigate the parallel implementation of the diagonal{implicitly iterated Runge{ Diagona{Impicity Iterated Runge{Kutta Methods on Distributed Memory Mutiprocessors Thomas Rauber Gudua Runger Computer Science Department Universitat des Saarandes Postfach 151150 66041 Saarbrucken, Germany

More information

Ad Hoc Networks 11 (2013) Contents lists available at SciVerse ScienceDirect. Ad Hoc Networks

Ad Hoc Networks 11 (2013) Contents lists available at SciVerse ScienceDirect. Ad Hoc Networks Ad Hoc Networks (3) 683 698 Contents ists avaiabe at SciVerse ScienceDirect Ad Hoc Networks journa homepage: www.esevier.com/ocate/adhoc Dynamic agent-based hierarchica muticast for wireess mesh networks

More information

A Multigrid Method for Adaptive Sparse Grids

A Multigrid Method for Adaptive Sparse Grids A Mutigrid Method for Adaptive Sparse Grids The MIT Facuty has made this artice openy avaiabe. Pease share how this access benefits you. Your story matters. Citation As Pubished Pubisher Peherstorfer,

More information

InnerSpec: Technical Report

InnerSpec: Technical Report InnerSpec: Technica Report Fabrizio Guerrini, Aessandro Gnutti, Riccardo Leonardi Department of Information Engineering, University of Brescia Via Branze 38, 25123 Brescia, Itay {fabrizio.guerrini, a.gnutti006,

More information

Incremental Discovery of Object Parts in Video Sequences

Incremental Discovery of Object Parts in Video Sequences Incrementa Discovery of Object Parts in Video Sequences Stéphane Drouin, Patrick Hébert and Marc Parizeau Computer Vision and Systems Laboratory, Department of Eectrica and Computer Engineering Lava University,

More information

Extended Node-Arc Formulation for the K-Edge-Disjoint Hop-Constrained Network Design Problem

Extended Node-Arc Formulation for the K-Edge-Disjoint Hop-Constrained Network Design Problem Extended Node-Arc Formuation for the K-Edge-Disjoint Hop-Constrained Network Design Probem Quentin Botton Université cathoique de Louvain, Louvain Schoo of Management, (Begique) botton@poms.uc.ac.be Bernard

More information

Joint Optimization of Intra- and Inter-Autonomous System Traffic Engineering

Joint Optimization of Intra- and Inter-Autonomous System Traffic Engineering Joint Optimization of Intra- and Inter-Autonomous System Traffic Engineering Kin-Hon Ho, Michae Howarth, Ning Wang, George Pavou and Styianos Georgouas Centre for Communication Systems Research, University

More information

A study of comparative evaluation of methods for image processing using color features

A study of comparative evaluation of methods for image processing using color features A study of comparative evauation of methods for image processing using coor features FLORENTINA MAGDA ENESCU,CAZACU DUMITRU Department Eectronics, Computers and Eectrica Engineering University Pitești

More information

Performance of data networks with random links

Performance of data networks with random links Performance of data networks with random inks arxiv:adap-org/9909006 v2 4 Jan 2001 Henryk Fukś and Anna T. Lawniczak Department of Mathematics and Statistics, University of Gueph, Gueph, Ontario N1G 2W1,

More information

Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles

Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles Hoistic Aggregates in a Networked Word: Distributed Tracking of Approximate Quanties Graham Cormode Be Laboratories cormode@be-abs.com Minos Garofaakis Be Laboratories minos@be-abs.com S. Muthukrishnan

More information

Alpha labelings of straight simple polyominal caterpillars

Alpha labelings of straight simple polyominal caterpillars Apha abeings of straight simpe poyomina caterpiars Daibor Froncek, O Nei Kingston, Kye Vezina Department of Mathematics and Statistics University of Minnesota Duuth University Drive Duuth, MN 82-3, U.S.A.

More information

Automatic Hidden Web Database Classification

Automatic Hidden Web Database Classification Automatic idden Web atabase Cassification Zhiguo Gong, Jingbai Zhang, and Qian Liu Facuty of Science and Technoogy niversity of Macau Macao, PRC {fstzgg,ma46597,ma46620}@umac.mo Abstract. In this paper,

More information

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System Sef-Contro Cycic Access with Time Division - A MAC Proposa for The HFC System S.M. Jiang, Danny H.K. Tsang, Samue T. Chanson Hong Kong University of Science & Technoogy Cear Water Bay, Kowoon, Hong Kong

More information

University of Bristol - Explore Bristol Research. Link to published version (if available): /ICIP

University of Bristol - Explore Bristol Research. Link to published version (if available): /ICIP Anantrasirichai, N., & Canagarajah, C. N. (2010). Spatiotempora superresoution for ow bitrate H.264 video. In IEEE Internationa Conference on Image Processing. (pp. 2809-2812). 10.1109/ICIP.2010.5651088

More information

International Journal of Electronics and Communications (AEÜ)

International Journal of Electronics and Communications (AEÜ) Int. J. Eectron. Commun. (AEÜ) 67 (2013) 470 478 Contents ists avaiabe at SciVerse ScienceDirect Internationa Journa of Eectronics and Communications (AEÜ) journa h o me pa ge: www.esevier.com/ocate/aeue

More information

A Two-Step Approach for Interactive Pre-Integrated Volume Rendering of Unstructured Grids

A Two-Step Approach for Interactive Pre-Integrated Volume Rendering of Unstructured Grids A Two-Step Approach for Interactive Pre-Integrated Voume Rendering of Unstructured Grids Stefan Roettger and Thomas Ert Visuaization and Interactive Systems Group University of Stuttgart Abstract In the

More information

A Local Optimal Method on DSA Guiding Template Assignment with Redundant/Dummy Via Insertion

A Local Optimal Method on DSA Guiding Template Assignment with Redundant/Dummy Via Insertion A Loca Optima Method on DSA Guiding Tempate Assignment with Redundant/Dummy Via Insertion Xingquan Li 1, Bei Yu 2, Jiani Chen 1, Wenxing Zhu 1, 24th Asia and South Pacific Design T h e p i c Automation

More information

17.3 Surface Area of Pyramids and Cones

17.3 Surface Area of Pyramids and Cones Name Cass Date 17.3 Surface Area of Pyramids and Cones Essentia Question: How is the formua for the atera area of a reguar pyramid simiar to the formua for the atera area of a right cone? Expore G.11.C

More information

Design of IP Networks with End-to. to- End Performance Guarantees

Design of IP Networks with End-to. to- End Performance Guarantees Design of IP Networks with End-to to- End Performance Guarantees Irena Atov and Richard J. Harris* ( Swinburne University of Technoogy & *Massey University) Presentation Outine Introduction Mutiservice

More information

Privacy Preserving Subgraph Matching on Large Graphs in Cloud

Privacy Preserving Subgraph Matching on Large Graphs in Cloud Privacy Preserving Subgraph Matching on Large Graphs in Coud Zhao Chang,#, Lei Zou, Feifei Li # Peing University, China; # University of Utah, USA; {changzhao,zouei}@pu.edu.cn; {zchang,ifeifei}@cs.utah.edu

More information