Wavefront Cache-friendly Algorithm for Compact Numerical Schemes
NASA/CR ICASE Report No.
Wavefront Cache-friendly Algorithm for Compact Numerical Schemes
Alex Povitsky, ICASE, Hampton, Virginia
Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, Hampton, VA. Operated by Universities Space Research Association.
National Aeronautics and Space Administration, Langley Research Center, Hampton, Virginia
Prepared for Langley Research Center under Contract NAS
October 1999
WAVEFRONT CACHE-FRIENDLY ALGORITHM FOR COMPACT NUMERICAL SCHEMES
ALEX POVITSKY*

Abstract. Compact numerical schemes provide high-order solution of PDEs with low dissipation and dispersion. Computer implementation of these schemes requires numerous passes of data through cache memory, which considerably reduces their performance. To reduce this difficulty, a novel algorithm is proposed here. This algorithm is based on a wavefront approach and sweeps through cache only twice.

Key words. cache locality, compact scheme, wavefront algorithm, banded linear systems

Subject classification. Computer Science, Applied and Numerical Mathematics

1. Introduction. Compact numerical schemes are widely used for challenging problems of computational physics [1]. Compact finite difference formulas are defined as expressions where derivatives at different mesh points appear simultaneously. These schemes mimic spectral schemes with low dissipation and dispersion. Although the number of arithmetic operations per grid node is approximately equal for explicit and compact formulations of the same approximation order [2], the computational time is considerably larger for the compact schemes. Tam and Webb ([3], p. 278) reported an order-of-magnitude difference in computational time between explicit and compact formulations.

The poor performance of compact schemes is explained by architectural features of modern computers. The gap between formal CPU performance and actual performance is likely to increase because CPU speed tends to increase much faster than the speed of memory access. To increase the computational efficiency of modern computers, the interface between a processor and its memory includes a number of cache memories that are placed (logically) between the processor and the physical main memory, giving the processor fast access to data stored in the cache [4].
Access to main memory typically takes the time of dozens or hundreds of flops, and reduction of the exchange between main memory and cache represents a challenge for scientific computing.

Compact schemes require the solution of banded linear systems, with each spatial partial derivative treated separately. Solution of banded linear systems by Gaussian elimination requires forward and backward sweeps through data. These sweeps are repeated in all three spatial directions, followed by a Runge-Kutta (RK) temporal update and the computation of the right-hand sides of the compact formulations. Thus, compact 3-D solvers pass data through the memory cache eight times. In contrast, an explicit central-difference algorithm may easily be written in such a way that data pass through the cache only once.

This study proposes a new formulation of a compact-scheme based numerical algorithm which sweeps through data only twice per stage of RK. The algorithm is based on a wavefront approach, where the nodes of the current front are computed using only the values at previous fronts. This algorithm is then expanded to any number of levels of cache memory. The numerical solution at each time step is exactly the same as for a standard compact algorithm.

* Staff Scientist, ICASE, NASA Langley Research Center, Hampton, VA (aeralpo@icase.edu). This research was supported by the National Aeronautics and Space Administration under NASA Contract No. NAS while the author was in residence at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley Research Center, Hampton, VA
2. High-order Numerical Methods. Consider a multi-dimensional partial differential equation (PDE):

$$\frac{dU}{dt} = \ldots \tag{1}$$

where $t$ is the time and $x_k$, $k = 1, 2, 3$, are the spatial coordinates. The right-hand side terms are approximated using compact finite difference schemes [1]:

$$\beta U'_{i-2} + \alpha U'_{i-1} + U'_i + \alpha U'_{i+1} + \beta U'_{i+2} = \frac{a}{2\Delta x}\,(U_{i+1} - U_{i-1}) + \frac{b}{2\Delta x}\,(U_{i+2} - U_{i-2}), \tag{2}$$

where $\Delta x$ is the grid step and primes denote derivatives with respect to $x$. Expansion to systems with second spatial derivatives (Navier-Stokes type) is straightforward, as the compact formulation for second derivatives and the method of their computation are similar to those for the first derivatives.

Equation (1) is discretized in time with an explicit Runge-Kutta scheme. The solution is advanced from time level $n$ to time level $n + 1$ in several sub-stages [6]:

$$H^{M} = S + (\ldots)^{M} + (\ldots)^{M} + a^{M} H^{M-1}, \qquad U^{M+1} = U^{M} + b^{M+1}\,\Delta t\,H^{M}, \tag{3}$$

where $M$ is the particular stage number, and the coefficients $a^{M}$ and $b^{M}$ depend upon the order of the RK scheme.

To compute spatial derivatives, we solve sets of independent linear banded systems of equations, where each system corresponds to one line of the numerical grid. For example, a system corresponding to a line in the $x$ direction has a scalar tridiagonal matrix of size $N_x \times N_x$:

$$a_{k,l} Z_{k-1,l} + b_{k,l} Z_{k,l} + c_{k,l} Z_{k+1,l} = f_{k,l}, \tag{4}$$

where $k = 1, \ldots, N_x$; $l = 1, \ldots, N_y \times N_z$; $a_{k,l}$, $b_{k,l}$ and $c_{k,l}$ are the coefficients, $Z_{k,l}$ are the unknown variables, and $N_x$, $N_y$ and $N_z$ are the numbers of grid nodes in the $x$, $y$ and $z$ directions, respectively.
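To make the link between scheme (2) and system (4) concrete, the following Python sketch treats the tridiagonal special case β = 0, b = 0 on a single grid line. Several things here are assumptions made for illustration and are not taken from the report: the coefficient values α = 1/4, a = 3/2 (the classical fourth-order Padé choice), the boundary closure (end-point derivatives are simply prescribed), the test function sin(x), and the helper names `thomas` and `compact_derivative`. The tridiagonal solve is the usual forward and backward elimination.

```python
import math

def thomas(a, b, c, f):
    """Solve a[k]*z[k-1] + b[k]*z[k] + c[k]*z[k+1] = f[k]; a[0] and c[-1] are ignored."""
    n = len(f)
    d = [b[0]] + [0.0] * (n - 1)
    g = [f[0] / b[0]] + [0.0] * (n - 1)
    for k in range(1, n):                      # forward sweep
        d[k] = b[k] - a[k] * c[k - 1] / d[k - 1]
        g[k] = (f[k] - a[k] * g[k - 1]) / d[k]
    z = g[:]
    for k in range(n - 2, -1, -1):             # backward sweep
        z[k] = g[k] - c[k] * z[k + 1] / d[k]
    return z

def compact_derivative(u, dx, du_left, du_right, alpha=0.25, a_coef=1.5):
    """First derivative from scheme (2) with beta = b = 0 (tridiagonal case)."""
    n = len(u)
    lower = [alpha] * n
    diag = [1.0] * n
    upper = [alpha] * n
    f = [0.0] * n
    for i in range(1, n - 1):
        f[i] = a_coef * (u[i + 1] - u[i - 1]) / (2.0 * dx)
    # Assumed boundary closure: the end-point derivatives are prescribed.
    lower[0] = upper[0] = 0.0
    lower[-1] = upper[-1] = 0.0
    f[0], f[-1] = du_left, du_right
    return thomas(lower, diag, upper, f)

n = 33
dx = 1.0 / (n - 1)
x = [i * dx for i in range(n)]
u = [math.sin(xi) for xi in x]
du = compact_derivative(u, dx, math.cos(x[0]), math.cos(x[-1]))
err_compact = max(abs(du[i] - math.cos(x[i])) for i in range(n))
err_explicit = max(abs((u[i + 1] - u[i - 1]) / (2.0 * dx) - math.cos(x[i]))
                   for i in range(1, n - 1))
```

On this grid the compact derivative should be several orders of magnitude more accurate than the explicit central difference that uses the same three-point stencil on the right-hand side, at the price of one tridiagonal solve per grid line.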
The first step of the Thomas algorithm is LU factorization

$$d_{1,l} = b_{1,l}, \qquad d_{k,l} = b_{k,l} - a_{k,l}\,\frac{c_{k-1,l}}{d_{k-1,l}}, \qquad k = 2, \ldots, N_x, \tag{5}$$

and forward substitution (FS)

$$g_{1,l} = \frac{f_{1,l}}{d_{1,l}}, \qquad g_{k,l} = \frac{-a_{k,l}\, g_{k-1,l} + f_{k,l}}{d_{k,l}}, \qquad k = 2, \ldots, N_x. \tag{6}$$

The second step of the Thomas algorithm is backward substitution (BS)

$$Z_{N_x,l} = g_{N_x,l}, \qquad Z_{k,l} = g_{k,l} - Z_{k+1,l}\,\frac{c_{k,l}}{d_{k,l}}, \qquad k = N_x - 1, \ldots, 1. \tag{7}$$

The coefficients $a_k$, $b_k$ and $c_k$ are constant for compact schemes; therefore, LU factorization is performed only once and the first-step computations include only forward substitution (6).

The standard algorithm for compact numerical solution of the system (1) is performed as follows:

Algorithm A
Step 1. Compute the right-hand side of equation (2) using values of the governing variable U from the previous time step.
Step 2. Compute the spatial derivatives by solving tridiagonal systems in all spatial directions.
Step 3. Compute the right-hand side of equation (1) using the spatial derivatives computed in Step 2 and update the governing variables by the Runge-Kutta scheme.
Step 4. Repeat computational Steps 1-3 for all Q stages of the Runge-Kutta scheme.
Step 5. Repeat computational Steps 1-4 for all time steps.

This algorithm passes data through the cache to perform Step 1, then it passes data through the cache twice per direction to compute the spatial derivatives (Step 2), and finally the algorithm touches each grid point to compute the temporal update (Step 3). Therefore, the data pass through the cache 2 + 2D times, where D is the number of directions. For explicit schemes, the coefficients $\alpha$ and $\beta$ are equal to zero; therefore, Step 2 in the above algorithm is reduced to local computations of spatial derivatives. Hence, explicit algorithms can easily be written in such a way that the data pass through the cache once.

3. Proposed Cache-friendly Algorithm. In this section we develop a cache-aware compact numerical algorithm where the data pass through the cache only twice. Let us first consider the two-dimensional case. The wavefronts are defined as subsets of grid nodes (I, J) with I + J = const within a wavefront. If a grid node (I, J) belongs to the front W, its neighbors (I-1, J) and (I, J-1) belong to the previous front WM, and the two other neighbors (I, J+1) and (I+1, J) belong to the next front WP. The following Algorithm B sweeps through grid nodes only twice and performs exactly the same computations as Algorithm A (see the previous section).

Algorithm B
  for iw = IF, ..., IL
    for in = 1, ..., IG(iw)
      Compute the right-hand side of equation (2).
      Compute the forward step of the Thomas algorithm (6) in the x and y directions.
  for iw = IL, ..., IF
    for in = 1, ..., IG(iw)
      Perform the backward step of the Thomas algorithm (7) in the x and y directions.
      Compute the Runge-Kutta temporal update.
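The two sweeps of Algorithm B can be sketched in Python for the 2-D tridiagonal case. This is a minimal illustration under stated assumptions, not the report's implementation: the stencil (α, 1, α) with α = 1/4 and a = 3/2, the clamped-index boundary closure in `rhs`, the single-stage update `u - dt*(ux + uy)` standing in for the Runge-Kutta update, and all function names are hypothetical. The sketch checks that the wavefront ordering reproduces the derivatives obtained by solving each grid line separately.

```python
import math

def pivots(n, alpha):
    """LU pivots d_k of equation (5) for the constant stencil (alpha, 1, alpha)."""
    d = [1.0]
    for _ in range(n - 1):
        d.append(1.0 - alpha * alpha / d[-1])
    return d

def rhs(u, i, j, axis, h, a_coef=1.5):
    """Right-hand side of (2) with b = 0; indices are clamped at the edges (assumed closure)."""
    nx, ny = len(u), len(u[0])
    if axis == 0:
        hi, lo = u[min(i + 1, nx - 1)][j], u[max(i - 1, 0)][j]
    else:
        hi, lo = u[i][min(j + 1, ny - 1)], u[i][max(j - 1, 0)]
    return a_coef * (hi - lo) / (2.0 * h)

def front(w, nx, ny):
    """Nodes (i, j) of the wavefront i + j = w (0-based indices)."""
    return [(i, w - i) for i in range(max(0, w - ny + 1), min(nx, w + 1))]

def wavefront_step(u, h, dt, alpha=0.25):
    nx, ny = len(u), len(u[0])
    dx_piv, dy_piv = pivots(nx, alpha), pivots(ny, alpha)
    gx = [[0.0] * ny for _ in range(nx)]
    gy = [[0.0] * ny for _ in range(nx)]
    ux = [[0.0] * ny for _ in range(nx)]
    uy = [[0.0] * ny for _ in range(nx)]
    u_new = [[0.0] * ny for _ in range(nx)]
    for w in range(nx + ny - 1):                        # first sweep: first to last front
        for i, j in front(w, nx, ny):
            prev_x = gx[i - 1][j] if i > 0 else 0.0     # node (I-1, J) on the previous front
            prev_y = gy[i][j - 1] if j > 0 else 0.0     # node (I, J-1) on the previous front
            gx[i][j] = (rhs(u, i, j, 0, h) - alpha * prev_x) / dx_piv[i]
            gy[i][j] = (rhs(u, i, j, 1, h) - alpha * prev_y) / dy_piv[j]
    for w in range(nx + ny - 2, -1, -1):                # second sweep: last to first front
        for i, j in front(w, nx, ny):
            next_x = ux[i + 1][j] if i < nx - 1 else 0.0    # node (I+1, J) on the next front
            next_y = uy[i][j + 1] if j < ny - 1 else 0.0    # node (I, J+1) on the next front
            ux[i][j] = gx[i][j] - alpha * next_x / dx_piv[i]
            uy[i][j] = gy[i][j] - alpha * next_y / dy_piv[j]
            u_new[i][j] = u[i][j] - dt * (ux[i][j] + uy[i][j])  # stand-in for the RK update
    return ux, uy, u_new

def solve_line(f, d, alpha):
    """Reference line solve: forward step (6) followed by backward step (7)."""
    g = [f[0] / d[0]]
    for k in range(1, len(f)):
        g.append((f[k] - alpha * g[k - 1]) / d[k])
    z = g[:]
    for k in range(len(f) - 2, -1, -1):
        z[k] = g[k] - alpha * z[k + 1] / d[k]
    return z

nx, ny, h, alpha = 6, 5, 0.1, 0.25
u = [[math.sin(0.3 * i) * math.cos(0.2 * j) for j in range(ny)] for i in range(nx)]
ux, uy, u_new = wavefront_step(u, h, dt=0.01, alpha=alpha)
ref_x = [solve_line([rhs(u, i, j, 0, h) for i in range(nx)], pivots(nx, alpha), alpha)
         for j in range(ny)]
ref_y = [solve_line([rhs(u, i, j, 1, h) for j in range(ny)], pivots(ny, alpha), alpha)
         for i in range(nx)]
max_diff = max(max(abs(ux[i][j] - ref_x[j][i]), abs(uy[i][j] - ref_y[i][j]))
               for i in range(nx) for j in range(ny))
```

The sketch only demonstrates the ordering of operations and the data dependencies between fronts; nested Python lists say nothing about cache behaviour, which would require a compiled language and a storage layout that follows the fronts.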
In Algorithm B, IF and IL are the first and the last wavefronts, and the indexed variable IG defines the number of grid nodes within a wavefront iw. The forward-step computations (6) in both spatial directions use already computed forward-step coefficients in grid nodes (I-1, J) and (I, J-1), whereas the backward-step computations (7) use already computed values in grid nodes (I+1, J) and (I, J+1) (see Figure 1).

This algorithm exploits the data-independence of the solution of banded linear systems in different spatial directions, i.e., the systems in the x and y directions are solved simultaneously. Additionally, the right-hand sides of compact formulations are computed simultaneously with the forward-step computations, and the RK computations are performed immediately after completion of the backward-step computations for a grid node. This algorithm can be easily expanded to penta-diagonal matrices, where two neighboring fronts from either side are used in computations.

This algorithm is expanded to the three-dimensional case, where wavefronts represent planes of grid nodes (I, J, K) with I + J + K = const. Similar to the 2-D case, the three previous neighbors belong to the plane WM (see Figure 2) and the next three neighbors belong to the plane WP. Still, the algorithm sweeps twice through the 3-D array (see Table 1).

Table 1. Number of data passes through cache for the basic compact scheme (Algorithm A), the wavefront compact scheme (Algorithm B) and the explicit scheme.

Dimension   Algorithm A   Explicit   Algorithm B
2-D         6             1          2
3-D         8             1          2

4. Extension to Multi-level Cache. Let us first consider two cache levels, primary level $L_1$ and secondary level $L_2$. We cover the computational domain with boxes (squares in the 2-D case) that fit the cache size $L_1$, and renumber the boxes as super-nodes $(I_b, J_b, K_b)$. Then, we define box wavefronts as subsets of boxes where $I_b + J_b + K_b = \mathrm{const}$. Each box is considered as a computational domain where the forward and the backward steps of Algorithm B are applied.

A computational domain covered with nine cache boxes is shown in Figure 3. These boxes form five wavefronts in the forward and backward directions. The first box wavefront includes the box (1, 1); the second box wavefront includes boxes (2, 1) and (1, 2), and so on. The computed wavefronts within each box are shown for the first two box wavefront levels in the forward and backward directions.
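The grouping of cache boxes into box wavefronts can be illustrated with a few lines of Python. The function name `box_wavefronts` is hypothetical, and the sketch covers only the 2-D numbering of boxes, not the computations performed inside each box.

```python
def box_wavefronts(nbx, nby):
    """Group 1-based box indices (Ib, Jb) into wavefronts Ib + Jb = const."""
    fronts = {}
    for ib in range(1, nbx + 1):
        for jb in range(1, nby + 1):
            fronts.setdefault(ib + jb, []).append((ib, jb))
    # Return the fronts ordered from the first (smallest Ib + Jb) to the last.
    return [fronts[s] for s in sorted(fronts)]

fronts = box_wavefronts(3, 3)
for level, boxes in enumerate(fronts, start=1):
    print(level, boxes)
```

For the nine-box domain this yields five box wavefronts, beginning with box (1, 1) alone and ending with box (3, 3) alone; a forward sweep visits the list in order and a backward sweep visits it in reverse.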
Let us define the two-level algorithm as follows:

Algorithm C
  for ib = IBF, ..., IBL
    for iw = IF(ib), ..., IL(ib)
      Perform forward-step computations of Algorithm B
  for ib = IBL, ..., IBF
    for iw = IL(ib), ..., IF(ib)
      Perform backward-step computations of Algorithm B

Algorithm C is consistent, i.e., the previous wavefront is completed by the time forward- or backward-step computations begin for the current wavefront. Algorithm C requires a different way of storing the governing array U than the traditional column-by-column placement of an array in memory. Instead, the array should be stored box-by-box.

This algorithm is easily expanded to cases with any number of cache levels. A cache box is covered with smaller boxes of the size of the next (smaller) level of cache. In this case the number of nested loops is equal to the number of cache levels. The inner loop sweeps through grid nodes that belong to a wavefront, whereas the outer loops sweep through box wavefronts corresponding to different levels of cache.

5. Conclusion. A cache-aware compact numerical algorithm has been developed. The data pass through cache only twice for 2-D and 3-D cases. The algorithm is expanded to any number of cache levels. Interaction of the proposed algorithm with compilers will be studied in our future research.

REFERENCES
[1] S. K. Lele, Compact Finite Difference Schemes with Spectral Like Resolution, Journal of Computational Physics, 103 (1992), pp.
[2] T. Colonius, Lectures on Computational Aeroacoustics, presented at the lecture series on Aeroacoustics and Active Noise Control, von Karman Institute of Fluid Dynamics, 1997.
[3] C. K. W. Tam and J. C. Webb, Dispersion-relation-preserving Finite Difference Schemes for Computational Acoustics, Journal of Computational Physics, 107 (1993), pp.
[4] R. Y. Kain, Advanced Computer Architecture (A System Design Approach), Prentice-Hall, Inc.
[5] Tien-Pao Shih, Goal-directed Performance Tuning for Scientific Applications, Ph.D. Thesis, University of Michigan.
[6] R. V. Wilson, A. O. Demuren and M. Carpenter, High-order Compact Schemes for Numerical Simulation of Incompressible Flows, ICASE Report No.
Fig. 1. Wavefront algorithm in a 2-D case, where WM, W and WP are three consecutive wavefronts. Solid arrows represent forward-step computations and dashed arrows represent backward-step computations. [Diagram: node (I,J) on front W, with neighbors (I-1,J) and (I,J-1) on WM and neighbors (I+1,J) and (I,J+1) on WP.]

Fig. 2. Wavefront algorithm in a 3-D case. Shaded plane is the previous wavefront. [Diagram: node (I,J,K) with previous neighbors (I-1,J,K), (I,J-1,K) and (I,J,K-1).]

Fig. 3. Domain partitioning for two-level cache. [Diagram: a 3 x 3 array of cache boxes numbered (1,1) to (3,3).]
Simulation in Computer Graphics Partial Differential Equations Matthias Teschner Computer Science Department University of Freiburg Motivation various dynamic effects and physical processes are described
More information2 CHAPTR 1. BOTTOM UP PARSING 1. S ::= 4. T ::= T* F 2. ::= +T 5. j F 3. j T 6. F ::= 7. j Fiure 1.1: Our Sample Grammar for Bottom Up Parsin Our beli
Chapter 1 Bottom Up Parsin The key diæculty with top-down parsin is the requirement that the rammar satisfy the LL1 property. You will recall that this entailed knowin, when you are facin the token that
More informationCache Oblivious Matrix Transpositions using Sequential Processing
IOSR Journal of Engineering (IOSRJEN) e-issn: 225-321, p-issn: 2278-8719 Vol. 3, Issue 11 (November. 213), V4 PP 5-55 Cache Oblivious Matrix s using Sequential Processing korde P.S., and Khanale P.B 1
More informationThree dimensional meshless point generation technique for complex geometry
Three dimensional meshless point generation technique for complex geometry *Jae-Sang Rhee 1), Jinyoung Huh 2), Kyu Hong Kim 3), Suk Young Jung 4) 1),2) Department of Mechanical & Aerospace Engineering,
More informationA SUBGRADIENT PROJECTION ALGORITHM FOR NON-DIFFERENTIABLE SIGNAL RECOVERY Jian Luo and Patrick L. Combettes Department of Electrical Engineering, City College and Graduate School, City University of New
More informationOn sufficient conditions of the injectivity : development of a numerical test algorithm via interval analysis
On sufficient conditions of the injectivity : development of a numerical test algorithm via interval analysis S. Lagrange (lagrange@istia.univ-angers.fr) LISA, UFR Angers, 62 avenue Notre Dame du Lac,
More informationDevelopment and Verification of an SP 3 Code Using Semi-Analytic Nodal Method for Pin-by-Pin Calculation
Journal of Physical Science and Application 7 () (07) 0-7 doi: 0.765/59-5348/07.0.00 D DAVID PUBLISHIN Development and Verification of an SP 3 Code Usin Semi-Analytic Chuntao Tan Shanhai Nuclear Enineerin
More informationREVIEW OF NUMERICAL SCHEMES AND BOUNDARY CONDITIONS APPLIED TO WAVE PROPAGATION PROBLEMS
REVIEW OF NUMERICAL SCHEMES AND BOUNDARY CONDITIONS APPLIED TO WAVE PROPAGATION PROBLEMS O. de Almeida Universidade Federal de Uberlândia Departamento de Engenharia Mecânica FEMEC Campus Santa Mônica CP.
More informationImage Processing - Lesson 10. Recognition. Correlation Features (geometric hashing) Moments Eigenfaces
Imae Processin - Lesson 0 Reconition Correlation Features (eometric hashin) Moments Eienaces Normalized Correlation - Example imae pattern Correlation Normalized Correlation Correspondence Problem match?
More informationIntroduction to Finite Element Method
Guest Lecture in Prodi Teknik Sipil Introduction to Finite Element Method Wong Foek Tjong, Ph.D. Petra Christian University Surabaya Lecture Outline 1. Overview of the FEM 2. Computational steps of the
More informationREDUCTION CUT INVERTED SUM
Irreducible Plane Curves Jason E. Durham æ Oregon State University Corvallis, Oregon durhamj@ucs.orst.edu August 4, 1999 Abstract Progress in the classiæcation of plane curves in the last æve years has
More informationAdarsh Krishnamurthy (cs184-bb) Bela Stepanova (cs184-bs)
OBJECTIVE FLUID SIMULATIONS Adarsh Krishnamurthy (cs184-bb) Bela Stepanova (cs184-bs) The basic objective of the project is the implementation of the paper Stable Fluids (Jos Stam, SIGGRAPH 99). The final
More informationOutline. COMSOL Multyphysics: Overview of software package and capabilities
COMSOL Multyphysics: Overview of software package and capabilities Lecture 5 Special Topics: Device Modeling Outline Basic concepts and modeling paradigm Overview of capabilities Steps in setting-up a
More informationA Rigorous Correctness Proof of a Tomasulo Scheduler Supporting Precise Interrupts
A Riorous Correctness Proo o a Tomasulo Scheduler Supportin Precise Interrupts Daniel Kroenin Λ, Silvia M. Mueller y, and Wolan J. Paul Dept. 14: Computer Science, University o Saarland, Post Box 151150,
More informationJournal of Universal Computer Science, vol. 3, no. 10 (1997), submitted: 11/3/97, accepted: 2/7/97, appeared: 28/10/97 Springer Pub. Co.
Journal of Universal Computer Science, vol. 3, no. 10 (1997), 1100-1113 submitted: 11/3/97, accepted: 2/7/97, appeared: 28/10/97 Springer Pub. Co. Compression of Silhouette-like Images based on WFA æ Karel
More informationDESIGN AND ANALYSIS OF UPDATE-BASED CACHE COHERENCE PROTOCOLS FOR SCALABLE SHARED-MEMORY MULTIPROCESSORS. David Brian Glasco
DESIGN AND ANALYSIS OF UPDATE-BASED CACHE COHERENCE PROTOCOLS FOR SCALABLE SHARED-MEMORY MULTIPROCESSORS David Brian Glasco Technical Report No. CSL-TR-95-670 June 1995 This research was Supported by Digital
More informationGPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis
GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis Abstract: Lower upper (LU) factorization for sparse matrices is the most important computing step for circuit simulation
More informationNetwork-Aware Resource Allocation in Distributed Clouds
Dissertation Research Summary Thesis Advisor: Asst. Prof. Dr. Tolga Ovatman Istanbul Technical University Department of Computer Engineering E-mail: aralat@itu.edu.tr April 4, 2016 Short Bio Research and
More information2 Consider the linear combinations, x w T x èx, x è and y w T y èy, y è, of the two variables respectively. The correlation between x and y is given b
Estimating Multiple Depths in Semi-transparent Stereo Images M. Borga, H. Knutsson Computer Vision Laboratory Department of Electrical Engineering Linkíoping University SE-581 83 Linkíoping, Sweden Abstract
More informationIndependent systems consist of x
5.1 Simultaneous Linear Equations In consistent equations, *Find the solution to each system by graphing. 1. y Independent systems consist of x Three Cases: A. consistent and independent 2. y B. inconsistent
More informationFinite Volume Discretization on Irregular Voronoi Grids
Finite Volume Discretization on Irregular Voronoi Grids C.Huettig 1, W. Moore 1 1 Hampton University / National Institute of Aerospace Folie 1 The earth and its terrestrial neighbors NASA Colin Rose, Dorling
More informationAbalone Age Prediction using Artificial Neural Network
IOSR Journal o Computer Engineering (IOSR-JCE) e-issn: 2278-066,p-ISSN: 2278-8727, Volume 8, Issue 5, Ver. II (Sept - Oct. 206), PP 34-38 www.iosrjournals.org Abalone Age Prediction using Artiicial Neural
More informationA NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS
Contemporary Mathematics Volume 157, 1994 A NEW MIXED PRECONDITIONING METHOD BASED ON THE CLUSTERED ELEMENT -BY -ELEMENT PRECONDITIONERS T.E. Tezduyar, M. Behr, S.K. Aliabadi, S. Mittal and S.E. Ray ABSTRACT.
More informationDISSERTATION. Presented to the Faculty of the Graduate School of. The University of Texas at Austin. in Partial Fulællment. of the Requirements
MAP LEARNING WITH UNINTERPRETED SENSORS AND EFFECTORS by DAVID MARK PIERCE, B.S., B.A., M.S.C.S. DISSERTATION Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial
More informationTHE FINANCIAL CALCULATOR
Starter Kit CHAPTER 3 Stalla Seminars THE FINANCIAL CALCULATOR In accordance with the AIMR calculator policy in eect at the time o this writing, CFA candidates are permitted to use one o two approved calculators
More informationReview for Exam I, EE552 2/2009
Gonale & Woods Review or Eam I, EE55 /009 Elements o Visual Perception Image Formation in the Ee and relation to a photographic camera). Brightness Adaption and Discrimination. Light and the Electromagnetic
More informationCache Memories. Topics. Next time. Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance
Cache Memories Topics Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance Next time Dynamic memory allocation and memory bugs Fabián E. Bustamante,
More informationCS 468 (Spring 2013) Discrete Differential Geometry
Lecturer: Adrian Butscher, Justin Solomon Scribe: Adrian Buganza-Tepole CS 468 (Spring 2013) Discrete Differential Geometry Lecture 19: Conformal Geometry Conformal maps In previous lectures we have explored
More informationInterface and Boundary Schemes for High-Order Methods
19th AIAA Computational Fluid Dynamics 22-25 June 29, San Antonio, Texas AIAA 29-3658 Interface and Boundary Schemes for High-Order Methods Xun Huan, Jason E. Hicken, and David W. Zingg Institute for Aerospace
More informationREGULAR GRAPHS OF GIVEN GIRTH. Contents
REGULAR GRAPHS OF GIVEN GIRTH BROOKE ULLERY Contents 1. Introduction This paper gives an introduction to the area of graph theory dealing with properties of regular graphs of given girth. A large portion
More information