Virtual EM Inc. Ann Arbor, Michigan, USA


1 Functional Description of the Architecture of a Special Purpose Processor for Orders of Magnitude Reduction in Run Time in Computational Electromagnetics
Tayfun Özdemir
Virtual EM Inc., Ann Arbor, Michigan, USA
tayfun@virtualem.com
SPI 2013, Paris, France

2 Organization of the Talk
1. Definition of the Problem
2. Algorithm
3. Processor Architecture
   - Chip and the User Interface
   - Scalable Run Time
   - Processor Nodes and Mapping of the Algorithm
4. Manufacturing of the Chip
SPI 2013, Paris, France, May 13

3 1. Definition of the Problem
- Use of Computational Electromagnetics (CEM)
  - Signal Integrity during design
  - EMI related to board design and packaging
  - RF Circuit
  - Antennas
- Slow rate of adoption of CEM tools due to large run times
  - Sequential Programming
  - Use of general purpose hardware
- Recent Progress
  - Massively parallel machines
  - Graphics Processing Units (GPUs)
  - Multi-core Central Processing Units (CPUs)
- Giga Floating Point Operations Per Second (Gflops)/$ is still too low
- MDGRAPE delivers small multiples of Gflops/$ but is expensive

4 What is Needed
- Orders of magnitude increase in Gflops/$ ratio: 100x or 1,000x
- What is the holdup?
  - Current algorithms require sequential programming
  - General purpose hardware is used
- The problem with the current algorithms:
  - Based on Galerkin's or Functional formulation
  - Results in a linear system of equations
  - Sequential programming for matrix fill and solver
  - Poor scaling on High Performance Computing (HPC) platforms
- Gflops/$ of GPUs and multi-core CPUs is increasing, but sequential programming is holding back the scaling
- Economy of scale for GPUs and multi-core CPUs alone will not suffice
- A new paradigm is needed: new algorithms implemented in the form of hardware designed specially for CEM
  - Example: FFT chips

5 2. Algorithm
- Hardware and the algorithm are inseparable: the algorithm is implemented in the form of hardware
- What needs to happen to realize the above?
  - A new numerical algorithm that can be implemented in hardware in a scalable fashion
  - A special purpose processor built to implement the above algorithm
- Rule: Electrical numerical science lags mechanical by 20 years
- Computational Fluid Dynamics (CFD) colleagues have already done it!
  - Abandoned the Navier-Stokes equations in favor of the Boltzmann equation
  - Simulate the flow of a Newtonian fluid with collision models
  - Simulate streaming and collision processes across a limited number of particles to realize viscous flow behavior across greater dimensions

6 Lattice Boltzmann Method (LBM)
- PowerFLOW software by Exa Corporation (Burlington, MA, USA)
- Fictitious particles performing consecutive propagation and collision processes over a discrete lattice
- Yields the Navier-Stokes equations in asymptotic expansion
- Algorithm is highly scalable on HPC platforms
(After Lattice Boltzmann Methods for Fluid Dynamics by Steven Orszag, Yale University)
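The propagation-and-collision cycle described above can be sketched in a few lines. The example below is a deliberately simplified one-dimensional, three-velocity (D1Q3) lattice with a BGK-style relaxation; it models plain diffusion of a density, not the Navier-Stokes flow solved by production LBM codes. The weights, the relaxation time `tau`, the periodic boundary, and all function names are illustrative assumptions, not details taken from the talk.

```python
# Minimal D1Q3 lattice Boltzmann sketch (diffusion only, periodic lattice).
# Assumption: weights, tau, and names are illustrative, not from the talk.

W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)  # lattice weights: rest, right-moving, left-moving
SHIFT = (0, 1, -1)                      # lattice sites moved per step by each population


def density(f):
    """Sum the three particle populations at every lattice site."""
    return [f[0][x] + f[1][x] + f[2][x] for x in range(len(f[0]))]


def lbm_step(f, tau=1.0):
    """One collision step followed by one propagation (streaming) step."""
    n = len(f[0])
    rho = density(f)
    # Collision: relax each population toward its local equilibrium W[i] * rho.
    post = [[f[i][x] - (f[i][x] - W[i] * rho[x]) / tau for x in range(n)]
            for i in range(3)]
    # Propagation: each population hops to the neighbouring site (periodic wrap).
    return [[post[i][(x - SHIFT[i]) % n] for x in range(n)] for i in range(3)]


def point_source(n, at):
    """Equilibrium populations for a unit of density placed at one site."""
    return [[W[i] if x == at else 0.0 for x in range(n)] for i in range(3)]


if __name__ == "__main__":
    f = point_source(21, 10)
    for _ in range(10):
        f = lbm_step(f)
    print([round(r, 4) for r in density(f)])
```

Each site only reads its own populations during collision and only exchanges data with its immediate neighbours during propagation, which is the locality property that makes the method scale well on parallel hardware.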

7 Conventional CFD vs LBM
(After Lattice Boltzmann Methods for Fluid Dynamics by Steven Orszag, Yale University)

8 New Algorithm for CEM
- We are not proposing to replace Maxwell's equations
- We are not proposing to abandon the current efforts on developing algorithms with a focus on sequential programming
  - Such efforts will continue to exist and are essential in improving algorithmic efficiency
- Rather, we propose replacing the Galerkin's and Functional methods that are used today to directly discretize Maxwell's equations
  - Perhaps the CEM analogy to LBM
  - Multi-pole expansion of the field?
  - New algorithm must be Maxwellian
- Two options:
  - New scalable algorithm for existing HPC platforms (as CFD folks did)
  - A new special purpose processor with an accompanying new algorithm

9 New Algorithm for CEM
- We are not proposing to replace Maxwell's equations
- We are not proposing to abandon the current efforts on developing algorithms with a focus on sequential programming
  - Such efforts will continue to exist and are essential in improving algorithmic efficiency
- Rather, we propose replacing the Galerkin's and Functional methods that are used today to directly discretize Maxwell's equations
  - Perhaps the CEM analogy to LBM of CFD
  - Multi-pole expansion of the field?
- Two options:
  - New scalable algorithm for existing HPC platforms (as CFD folks did)
  - A new special purpose processor with an accompanying new algorithm (what is presented here)

10 3. Processor Architecture
- Inspired by the Finite Element Method (FEM): FEM Chip
- Expandable to:
  - Finite Difference Time Domain (FDTD) and the Time Domain Finite Element Method (TDFEM)
  - Method of Moments (MoM) requires a bit more thinking.

11 FEM Chip and User Interface
[Block diagram. Inputs: Geometry, Excitation, Boundary Conditions. Processor: FEM Chip. Output: Unknowns (E or H Field Vector). Components: PC, GUI, Engine, API, FEM PCI Board.]

12 Scalable Run Time ~ O(N)
[Timing diagram: CLOCK (250 MHz) against t (sec), with iterations ITER 1, ITER 2, ITER 3, ..., ITER N]
- One clock cycle = (1/250) microseconds
- Solution Time = N Clock Cycles = $(N / 250) \times 10^{-6}$ seconds
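The run-time formula above is simple enough to evaluate directly. The sketch below assumes exactly one iteration per clock cycle and exactly N iterations, as the formula states; the function name is hypothetical, and the two sample values of N (1K and 100M) are the node counts quoted later in the talk for a resonant antenna and a small boat. The results are idealized figures from the formula alone.

```python
CLOCK_HZ = 250e6  # 250 MHz clock, as stated on the slide


def solution_time_seconds(n_unknowns, clock_hz=CLOCK_HZ):
    """Idealized solution time: N clock cycles at the given clock rate."""
    return n_unknowns / clock_hz


if __name__ == "__main__":
    for n in (1_000, 100_000_000):
        print(f"N = {n:>11,d}: {solution_time_seconds(n):.3e} s")
```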

13 Run Times
Type of Problem (*)                N        Run Time
A resonant antenna                 ~10^3    1 msec
Five wavelength long RF circuit    ~        msec
Small Boat                         ~        sec
F16 aircraft (**)                  ~        min
Large Ship                         ~        hrs
(*) One frequency point and/or one look angle at 10 GHz
(**) U.S. Air Force Challenge since 1970s

14 Processor Nodes & Scalable Algorithm
- Multi-Pole Representation of the EM Field (ongoing research)
- Map Mesh Nodes to Computing Nodes (ongoing research)
[Diagram: mesh nodes P1 to P6 (MESH) mapped to processors P1 to P6 (COMPUTING NODES)]

15 New Paradigm
- Nodes must perform simple computations
- Data sharing must be local
- Must converge in O(N) iterations, i.e., O(N) clock cycles
- At each iteration, P1 = a * P1 + b * P2 + c * P3 + d * P5 (ongoing research)
[Diagram: node P1 in a lattice of nodes P1 to P6. The node holds a Computational Unit (P1 = a * P1 + b * P2 + ...) and RAM storing P1, both driven by CLOCK. It receives (b, P2), (c, P3), and (d, P5) from its neighbors and sends P1 back to them.]
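The per-node update rule above can be mimicked in software to see how purely local, synchronous updates settle to a global solution. The talk marks the actual coefficients and node connectivity as ongoing research, so the sketch below substitutes a stand-in: a Jacobi-style iteration on a hypothetical one-dimensional chain whose end nodes are held fixed and whose interior nodes average their two neighbors. The `weights` layout and all function names are assumptions made for illustration.

```python
# Synchronous local-update sketch. Assumption: the chain topology and the
# 0.5/0.5 coefficients are a stand-in; the talk leaves the real ones open.

def step(values, weights):
    """One iteration: every node combines the previous values of the nodes it reads.

    weights[i] maps a node index j to the coefficient applied to node j's value,
    so node i computes sum(coefficient * values[j]), like P1 = a*P1 + b*P2 + ...
    """
    return [sum(c * values[j] for j, c in w.items()) for w in weights]


def iterate(values, weights, n_iter):
    """Apply n_iter synchronous iterations (one per clock cycle in hardware)."""
    for _ in range(n_iter):
        values = step(values, weights)
    return values


def chain_weights(n):
    """Hypothetical 1D chain: end nodes keep their value, interior nodes average neighbors."""
    w = [{0: 1.0}]
    for i in range(1, n - 1):
        w.append({i - 1: 0.5, i + 1: 0.5})
    w.append({n - 1: 1.0})
    return w


if __name__ == "__main__":
    n = 5
    start = [0.0] * (n - 1) + [1.0]  # left end fixed at 0, right end fixed at 1
    print([round(v, 3) for v in iterate(start, chain_weights(n), 200)])
```

Because `step` builds the new list entirely from the old one, every node could compute its update at the same time, which is the property the proposed hardware nodes rely on.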

19 4. Manufacturing of the Chip
- Chip Design & Manufacturing
  - Functionality (ongoing research)
  - HDL via Verilog (~$300K)
  - Low volume prototype chip ($20K/unit via Asian foundries)
- PCI Card Design & Manufacturing
- Application Programming Interface (API)
- PC System and Benchmarking
- High Volume Production

20 Manufacturing Challenges
- Chip has to have as many nodes as the number of unknowns
  - 1K for a resonant antenna but 100M for a small boat at 10 GHz
- 3D chip not possible today
  - Interconnects between the nodes form a 3D lattice
  - Has to be 2D with today's chip manufacturing technology
- Parallel nature of the above paradigm (and therefore the run time scaling with N) must be compromised
  - Introduce a level of sequential computational steps
  - Sub-divide the three-dimensional solution space into sections, each of which could be mapped to a two-dimensional grid
- A reasonably high Gflops/$ ratio could still be achieved.
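The cost of folding a 3D lattice onto a 2D chip can be illustrated with a rough count. The sketch below assumes, purely for illustration, that the lattice is cut into 2D layers along one axis and that each layer is tiled by a hypothetical chip of `chip_x` by `chip_y` nodes, loaded one tile at a time. The talk does not specify a partitioning scheme, and the count ignores the cost of exchanging data between tiles and layers.

```python
import math


def passes_per_iteration(nx, ny, nz, chip_x, chip_y):
    """Sequential chip loads needed to update an nx * ny * nz lattice once.

    Assumption: one 2D layer at a time, each layer tiled by a chip_x * chip_y grid.
    """
    tiles_per_layer = math.ceil(nx / chip_x) * math.ceil(ny / chip_y)
    return tiles_per_layer * nz


if __name__ == "__main__":
    # A 100 x 100 x 100 lattice on a chip that covers one full layer.
    print(passes_per_iteration(100, 100, 100, 100, 100))
    # The same lattice on a chip that covers a quarter of a layer.
    print(passes_per_iteration(100, 100, 100, 50, 50))
```

Under these assumptions each iteration takes as many sequential steps as there are tiles, which is the loss of parallelism the slide refers to.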

21 Gflops/$ Wars
Acceleware Corp: GPUs                  ? Gflops/$1,000
Impulse Technologies: FPGAs            ? Gflops/$1,000
Appro International: Blade Clusters    4 Gflops/$1,000 (*)
IBM: BlueGene L                        0.1 Gflops/$1,000
Virtual EM: MDGRAPE machine            21 Gflops/$1,000
Proposed Scheme (estimated)            >100 Gflops/$1,000
(*) 2008 numbers

22 Next Steps
1. Confirm that the proposed architecture will provide
   a) Order of magnitude increase in Gflops/$
   b) O(N) scaling of run time
   via
   a) Simulations
   b) Limited prototyping using simple micro-controllers serving as simple computational nodes
2. Research on Algorithms
   a) Scalable
   b) Can be implemented in hardware
3. Manufacturing of the Processor
   a) 3D chip (not possible in the near future)
   b) 2D chip for 3D problems with compromised scaling (most likely)
   c) 2D chip for 2D problems
      - 2D problems
      - Body of Revolution (BoR) problems

23 Next Steps
4. Improve scaling of current algorithms on today's hardware
   - GPUs
   - Multi-core CPUs
   - FPGAs
   - ARM-based micro-controllers

Lecture 7: Introduction to HFSS-IE

Lecture 7: Introduction to HFSS-IE Lecture 7: Introduction to HFSS-IE 2015.0 Release ANSYS HFSS for Antenna Design 1 2015 ANSYS, Inc. HFSS-IE: Integral Equation Solver Introduction HFSS-IE: Technology An Integral Equation solver technology

More information

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Electrical Interconnect and Packaging Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Jason Morsey Barry Rubin, Lijun Jiang, Lon Eisenberg, Alina Deutsch Introduction Fast

More information

Simulation Advances for RF, Microwave and Antenna Applications

Simulation Advances for RF, Microwave and Antenna Applications Simulation Advances for RF, Microwave and Antenna Applications Bill McGinn Application Engineer 1 Overview Advanced Integrated Solver Technologies Finite Arrays with Domain Decomposition Hybrid solving:

More information

Simulation Advances. Antenna Applications

Simulation Advances. Antenna Applications Simulation Advances for RF, Microwave and Antenna Applications Presented by Martin Vogel, PhD Application Engineer 1 Overview Advanced Integrated Solver Technologies Finite Arrays with Domain Decomposition

More information

High Performance Computing

High Performance Computing High Performance Computing ADVANCED SCIENTIFIC COMPUTING Dr. Ing. Morris Riedel Adjunct Associated Professor School of Engineering and Natural Sciences, University of Iceland Research Group Leader, Juelich

More information

1.2 Numerical Solutions of Flow Problems

1.2 Numerical Solutions of Flow Problems 1.2 Numerical Solutions of Flow Problems DIFFERENTIAL EQUATIONS OF MOTION FOR A SIMPLIFIED FLOW PROBLEM Continuity equation for incompressible flow: 0 Momentum (Navier-Stokes) equations for a Newtonian

More information

A Graphical User Interface (GUI) for Two-Dimensional Electromagnetic Scattering Problems

A Graphical User Interface (GUI) for Two-Dimensional Electromagnetic Scattering Problems A Graphical User Interface (GUI) for Two-Dimensional Electromagnetic Scattering Problems Veysel Demir vdemir@olemiss.edu Mohamed Al Sharkawy malshark@olemiss.edu Atef Z. Elsherbeni atef@olemiss.edu Abstract

More information

HFSS Ansys ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary

HFSS Ansys ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary HFSS 12.0 Ansys 2009 ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary Comparison of HFSS 11 and HFSS 12 for JSF Antenna Model UHF blade antenna on Joint Strike Fighter Inherent improvements in

More information

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm

More information

HFSS Hybrid Finite Element and Integral Equation Solver for Large Scale Electromagnetic Design and Simulation

HFSS Hybrid Finite Element and Integral Equation Solver for Large Scale Electromagnetic Design and Simulation HFSS Hybrid Finite Element and Integral Equation Solver for Large Scale Electromagnetic Design and Simulation Laila Salman, PhD Technical Services Specialist laila.salman@ansys.com 1 Agenda Overview of

More information

HFSS 14 Update for SI and RF Applications Markus Kopp Product Manager, Electronics ANSYS, Inc.

HFSS 14 Update for SI and RF Applications Markus Kopp Product Manager, Electronics ANSYS, Inc. HFSS 14 Update for SI and RF Applications Markus Kopp Product Manager, Electronics ANSYS, Inc. 1 ANSYS, Inc. September 21, Advanced Solvers: Finite Arrays with DDM 2 ANSYS, Inc. September 21, Finite Arrays

More information

Lattice Boltzmann with CUDA

Lattice Boltzmann with CUDA Lattice Boltzmann with CUDA Lan Shi, Li Yi & Liyuan Zhang Hauptseminar: Multicore Architectures and Programming Page 1 Outline Overview of LBM An usage of LBM Algorithm Implementation in CUDA and Optimization

More information

Fast Multipole Method on the GPU

Fast Multipole Method on the GPU Fast Multipole Method on the GPU with application to the Adaptive Vortex Method University of Bristol, Bristol, United Kingdom. 1 Introduction Particle methods Highly parallel Computational intensive Numerical

More information

Introducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method

Introducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method Introducing a Cache-Oblivious Blocking Approach for the Lattice Boltzmann Method G. Wellein, T. Zeiser, G. Hager HPC Services Regional Computing Center A. Nitsure, K. Iglberger, U. Rüde Chair for System

More information

Numerical Algorithms on Multi-GPU Architectures

Numerical Algorithms on Multi-GPU Architectures Numerical Algorithms on Multi-GPU Architectures Dr.-Ing. Harald Köstler 2 nd International Workshops on Advances in Computational Mechanics Yokohama, Japan 30.3.2010 2 3 Contents Motivation: Applications

More information

Aspects of RF Simulation and Analysis Software Methods. David Carpenter. Remcom. B = t. D t. Remcom (Europe)

Aspects of RF Simulation and Analysis Software Methods. David Carpenter. Remcom. B = t. D t. Remcom (Europe) Remcom (Europe) Central Boulevard Blythe Valley Park Solihull West Midlands England, B90 8AG www.remcom.com +44 870 351 7640 +44 870 351 7641 (fax) Aspects of RF Simulation and Analysis Software Methods

More information

Driven Cavity Example

Driven Cavity Example BMAppendixI.qxd 11/14/12 6:55 PM Page I-1 I CFD Driven Cavity Example I.1 Problem One of the classic benchmarks in CFD is the driven cavity problem. Consider steady, incompressible, viscous flow in a square

More information

computational Fluid Dynamics - Prof. V. Esfahanian

computational Fluid Dynamics - Prof. V. Esfahanian Three boards categories: Experimental Theoretical Computational Crucial to know all three: Each has their advantages and disadvantages. Require validation and verification. School of Mechanical Engineering

More information

Studies of the Continuous and Discrete Adjoint Approaches to Viscous Automatic Aerodynamic Shape Optimization

Studies of the Continuous and Discrete Adjoint Approaches to Viscous Automatic Aerodynamic Shape Optimization Studies of the Continuous and Discrete Adjoint Approaches to Viscous Automatic Aerodynamic Shape Optimization Siva Nadarajah Antony Jameson Stanford University 15th AIAA Computational Fluid Dynamics Conference

More information

High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters

High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters SIAM PP 2014 High Scalability of Lattice Boltzmann Simulations with Turbulence Models using Heterogeneous Clusters C. Riesinger, A. Bakhtiari, M. Schreiber Technische Universität München February 20, 2014

More information

LATTICE-BOLTZMANN AND COMPUTATIONAL FLUID DYNAMICS

LATTICE-BOLTZMANN AND COMPUTATIONAL FLUID DYNAMICS LATTICE-BOLTZMANN AND COMPUTATIONAL FLUID DYNAMICS NAVIER-STOKES EQUATIONS u t + u u + 1 ρ p = Ԧg + ν u u=0 WHAT IS COMPUTATIONAL FLUID DYNAMICS? Branch of Fluid Dynamics which uses computer power to approximate

More information

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK

Multigrid Solvers in CFD. David Emerson. Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK Multigrid Solvers in CFD David Emerson Scientific Computing Department STFC Daresbury Laboratory Daresbury, Warrington, WA4 4AD, UK david.emerson@stfc.ac.uk 1 Outline Multigrid: general comments Incompressible

More information

Turbostream: A CFD solver for manycore

Turbostream: A CFD solver for manycore Turbostream: A CFD solver for manycore processors Tobias Brandvik Whittle Laboratory University of Cambridge Aim To produce an order of magnitude reduction in the run-time of CFD solvers for the same hardware

More information

Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation

Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation Nikolai Sakharnykh - NVIDIA San Jose Convention Center, San Jose, CA September 21, 2010 Introduction Tridiagonal solvers very popular

More information

Lecture 2: Introduction

Lecture 2: Introduction Lecture 2: Introduction v2015.0 Release ANSYS HFSS for Antenna Design 1 2015 ANSYS, Inc. Multiple Advanced Techniques Allow HFSS to Excel at a Wide Variety of Applications Platform Integration and RCS

More information

High-Frequency Algorithmic Advances in EM Tools for Signal Integrity Part 1. electromagnetic. (EM) simulation. tool of the practic-

High-Frequency Algorithmic Advances in EM Tools for Signal Integrity Part 1. electromagnetic. (EM) simulation. tool of the practic- From January 2011 High Frequency Electronics Copyright 2011 Summit Technical Media, LLC High-Frequency Algorithmic Advances in EM Tools for Signal Integrity Part 1 By John Dunn AWR Corporation Only 30

More information

Two-Phase flows on massively parallel multi-gpu clusters

Two-Phase flows on massively parallel multi-gpu clusters Two-Phase flows on massively parallel multi-gpu clusters Peter Zaspel Michael Griebel Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn Workshop Programming of Heterogeneous

More information

1 Past Research and Achievements

1 Past Research and Achievements Parallel Mesh Generation and Adaptation using MAdLib T. K. Sheel MEMA, Universite Catholique de Louvain Batiment Euler, Louvain-La-Neuve, BELGIUM Email: tarun.sheel@uclouvain.be 1 Past Research and Achievements

More information

Towards real-time prediction of Tsunami impact effects on nearshore infrastructure

Towards real-time prediction of Tsunami impact effects on nearshore infrastructure Towards real-time prediction of Tsunami impact effects on nearshore infrastructure Manfred Krafczyk & Jonas Tölke Inst. for Computational Modeling in Civil Engineering http://www.cab.bau.tu-bs.de 24.04.2007

More information

CUDA Experiences: Over-Optimization and Future HPC

CUDA Experiences: Over-Optimization and Future HPC CUDA Experiences: Over-Optimization and Future HPC Carl Pearson 1, Simon Garcia De Gonzalo 2 Ph.D. candidates, Electrical and Computer Engineering 1 / Computer Science 2, University of Illinois Urbana-Champaign

More information

International Supercomputing Conference 2009

International Supercomputing Conference 2009 International Supercomputing Conference 2009 Implementation of a Lattice-Boltzmann-Method for Numerical Fluid Mechanics Using the nvidia CUDA Technology E. Riegel, T. Indinger, N.A. Adams Technische Universität

More information

LATTICE-BOLTZMANN METHOD FOR THE SIMULATION OF LAMINAR MIXERS

LATTICE-BOLTZMANN METHOD FOR THE SIMULATION OF LAMINAR MIXERS 14 th European Conference on Mixing Warszawa, 10-13 September 2012 LATTICE-BOLTZMANN METHOD FOR THE SIMULATION OF LAMINAR MIXERS Felix Muggli a, Laurent Chatagny a, Jonas Lätt b a Sulzer Markets & Technology

More information

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs

High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs High-Order Finite-Element Earthquake Modeling on very Large Clusters of CPUs or GPUs Gordon Erlebacher Department of Scientific Computing Sept. 28, 2012 with Dimitri Komatitsch (Pau,France) David Michea

More information

Optimization of HOM Couplers using Time Domain Schemes

Optimization of HOM Couplers using Time Domain Schemes Optimization of HOM Couplers using Time Domain Schemes Workshop on HOM Damping in Superconducting RF Cavities Carsten Potratz Universität Rostock October 11, 2010 10/11/2010 2009 UNIVERSITÄT ROSTOCK FAKULTÄT

More information

Recent Approaches of CAD / CAE Product Development. Tools, Innovations, Collaborative Engineering.

Recent Approaches of CAD / CAE Product Development. Tools, Innovations, Collaborative Engineering. Recent Approaches of CAD / CAE Product Development. Tools, Innovations, Collaborative Engineering. Author: Dr.-Ing. Peter Binde Abstract: In this paper, the latest approaches in the field of CAD-CAE product

More information

Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures

Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures Dirk Ribbrock, Markus Geveler, Dominik Göddeke, Stefan Turek Angewandte Mathematik, Technische Universität Dortmund

More information

CUDA. Fluid simulation Lattice Boltzmann Models Cellular Automata

CUDA. Fluid simulation Lattice Boltzmann Models Cellular Automata CUDA Fluid simulation Lattice Boltzmann Models Cellular Automata Please excuse my layout of slides for the remaining part of the talk! Fluid Simulation Navier Stokes equations for incompressible fluids

More information

Performance Analysis of the Lattice Boltzmann Method on x86-64 Architectures

Performance Analysis of the Lattice Boltzmann Method on x86-64 Architectures Performance Analysis of the Lattice Boltzmann Method on x86-64 Architectures Jan Treibig, Simon Hausmann, Ulrich Ruede Zusammenfassung The Lattice Boltzmann method (LBM) is a well established algorithm

More information

Software and Performance Engineering for numerical codes on GPU clusters

Software and Performance Engineering for numerical codes on GPU clusters Software and Performance Engineering for numerical codes on GPU clusters H. Köstler International Workshop of GPU Solutions to Multiscale Problems in Science and Engineering Harbin, China 28.7.2010 2 3

More information

Optimization of FEM solver for heterogeneous multicore processor Cell. Noriyuki Kushida 1

Optimization of FEM solver for heterogeneous multicore processor Cell. Noriyuki Kushida 1 Optimization of FEM solver for heterogeneous multicore processor Cell Noriyuki Kushida 1 1 Center for Computational Science and e-system Japan Atomic Energy Research Agency 6-9-3 Higashi-Ueno, Taito-ku,

More information

Hardware Acceleration for CST MICROWAVE STUDIO. Amy Dewis Channel Manager

Hardware Acceleration for CST MICROWAVE STUDIO. Amy Dewis Channel Manager Hardware Acceleration for CST MICROWAVE STUDIO Amy Dewis Channel Manager Agenda 1. Acceleware Overview 2. Why use Hardware Acceleration? 3. Current Performance, Features and Hardware 4. Upcoming Features

More information

Reconfigurable Computing - (RC)

Reconfigurable Computing - (RC) Reconfigurable Computing - (RC) Yogindra S Abhyankar Hardware Technology Development Group, C-DAC Outline Motivation Architecture Applications Performance Summary HPC Fastest Growing Sector HPC, the massive

More information

Exploring the features of OpenCL 2.0

Exploring the features of OpenCL 2.0 Exploring the features of OpenCL 2.0 Saoni Mukherjee, Xiang Gong, Leiming Yu, Carter McCardwell, Yash Ukidave, Tuan Dao, Fanny Paravecino, David Kaeli Northeastern University Outline Introduction and evolution

More information

FEKO Mesh Optimization Study of the EDGES Antenna Panels with Side Lips using a Wire Port and an Infinite Ground Plane

FEKO Mesh Optimization Study of the EDGES Antenna Panels with Side Lips using a Wire Port and an Infinite Ground Plane FEKO Mesh Optimization Study of the EDGES Antenna Panels with Side Lips using a Wire Port and an Infinite Ground Plane Tom Mozdzen 12/08/2013 Summary This study evaluated adaptive mesh refinement in the

More information

Introduction to Parallel Programming in OpenMp Dr. Yogish Sabharwal Department of Computer Science & Engineering Indian Institute of Technology, Delhi

Introduction to Parallel Programming in OpenMp Dr. Yogish Sabharwal Department of Computer Science & Engineering Indian Institute of Technology, Delhi Introduction to Parallel Programming in OpenMp Dr. Yogish Sabharwal Department of Computer Science & Engineering Indian Institute of Technology, Delhi Lecture - 01 Introduction to Parallel Computing Architectures

More information

Large-scale Gas Turbine Simulations on GPU clusters

Large-scale Gas Turbine Simulations on GPU clusters Large-scale Gas Turbine Simulations on GPU clusters Tobias Brandvik and Graham Pullan Whittle Laboratory University of Cambridge A large-scale simulation Overview PART I: Turbomachinery PART II: Stencil-based

More information

Parallelization of a Electromagnetic Analysis Tool

Parallelization of a Electromagnetic Analysis Tool Parallelization of a Electromagnetic Analysis Tool Milissa Benincasa Black River Systems Co. 162 Genesee Street Utica, NY 13502 (315) 732-7385 phone (315) 732-5837 fax benincas@brsc.com United States Chris

More information

Unstructured Grid Numbering Schemes for GPU Coalescing Requirements

Unstructured Grid Numbering Schemes for GPU Coalescing Requirements Unstructured Grid Numbering Schemes for GPU Coalescing Requirements Andrew Corrigan 1 and Johann Dahm 2 Laboratories for Computational Physics and Fluid Dynamics Naval Research Laboratory 1 Department

More information

Introducing Virtuoso RF Designer (RFD) For RFIC Designs

Introducing Virtuoso RF Designer (RFD) For RFIC Designs A seminar on Cadence Virtuoso RF Designer is scheduled for March 5, 2008. To know more, write to Brajesh Heda at brajesh@cadence.com Introducing Virtuoso RF Designer (RFD) For RFIC Designs Introduction

More information

A laboratory-dualsphysics modelling approach to support landslide-tsunami hazard assessment

A laboratory-dualsphysics modelling approach to support landslide-tsunami hazard assessment A laboratory-dualsphysics modelling approach to support landslide-tsunami hazard assessment Lake Lucerne case, Switzerland, 2007 Dr. Valentin Heller (www.drvalentinheller.com) Geohazards and Earth Processes

More information

Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation

Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation Session S0069: GPU Computing Advances in 3D Electromagnetic Simulation Andreas Buhr, Alexander Langwost, Fabrizio Zanella CST (Computer Simulation Technology) Abstract Computer Simulation Technology (CST)

More information

The future is parallel but it may not be easy

The future is parallel but it may not be easy The future is parallel but it may not be easy Michael J. Flynn Maxeler and Stanford University M. J. Flynn 1 HiPC Dec 07 Outline I The big technology tradeoffs: area, time, power HPC: What s new at the

More information

Accelerating Double Precision FEM Simulations with GPUs

Accelerating Double Precision FEM Simulations with GPUs Accelerating Double Precision FEM Simulations with GPUs Dominik Göddeke 1 3 Robert Strzodka 2 Stefan Turek 1 dominik.goeddeke@math.uni-dortmund.de 1 Mathematics III: Applied Mathematics and Numerics, University

More information

Shape Optimization (activities ) Raino A. E. Mäkinen

Shape Optimization (activities ) Raino A. E. Mäkinen Shape Optimization (activities 1983-2010) Raino A. E. Mäkinen What is (mathematical) shape optimization? In general, any optimization problem in which parameter to be optimized has some geometric interpretation

More information

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016

ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID Chris Gottbrath, Nov 2016 Challenges What is Algebraic Multi-Grid (AMG)? AGENDA Why use AMG? When to use AMG? NVIDIA AmgX Results 2

More information

CGT 581 G Fluids. Overview. Some terms. Some terms

CGT 581 G Fluids. Overview. Some terms. Some terms CGT 581 G Fluids Bedřich Beneš, Ph.D. Purdue University Department of Computer Graphics Technology Overview Some terms Incompressible Navier-Stokes Boundary conditions Lagrange vs. Euler Eulerian approaches

More information

High Performance Computing for PDE Some numerical aspects of Petascale Computing

High Performance Computing for PDE Some numerical aspects of Petascale Computing High Performance Computing for PDE Some numerical aspects of Petascale Computing S. Turek, D. Göddeke with support by: Chr. Becker, S. Buijssen, M. Grajewski, H. Wobker Institut für Angewandte Mathematik,

More information

Numerical methods in plasmonics. The method of finite elements

Numerical methods in plasmonics. The method of finite elements Numerical methods in plasmonics The method of finite elements Outline Why numerical methods The method of finite elements FDTD Method Examples How do we understand the world? We understand the world through

More information

Shape optimisation using breakthrough technologies

Shape optimisation using breakthrough technologies Shape optimisation using breakthrough technologies Compiled by Mike Slack Ansys Technical Services 2010 ANSYS, Inc. All rights reserved. 1 ANSYS, Inc. Proprietary Introduction Shape optimisation technologies

More information
