Parallel Multigrid Preconditioning on Graphics Processing Units (GPUs) for Robust Power Grid Analysis
- Augusta Bates
1 Parallel Multigrid Preconditioning on Graphics Processing Units (GPUs) for Robust Power Grid Analysis
Design Automation Group
Zhuo Feng, Michigan Technological University
Zhiyu Zeng, Texas A&M University
200 ACM/EDAC/IEEE Design Automation Conference
2 Motivation
- On-chip power distribution network verification challenge
  - Tens of millions of grid nodes (a recent IBM design reaches ~400M)
  - Transient power grid verification needs long simulation times
- Parallel circuit simulation algorithms on GPUs
  - Pros: very cost efficient; a 240-core GPU costs $400
  - Hardware resource usage limitations: shared memory size, number of registers, etc.
  - Algorithm and data structure design preferences: multilevel iterative algorithms for SIMD computing platforms, GPU-friendly device memory access patterns, simple control flow
- Our contribution: a robust power grid simulation method for GPU
  - Multigrid preconditioning assures fast convergence (< 20 iterations)
  - GPU-specific data structures guarantee coalesced memory access
3 IR Drop in Power Distribution Network
- IR drop: voltage drop due to non-ideal resistive wires
[Figure: power distribution network with VDD and GND connections; image credit: Cadence]
4 Power Grid Modeling & Analysis
- Multi-layer interconnects are modeled as a 3D RC network
- Switching gate effects are modeled by time-varying current loadings
- DC analysis solves the linear system $G v = b$
- Transient analysis solves $G v(t) + C \frac{dv(t)}{dt} = b(t)$
- Tens of millions of unknowns!
- Notation:
  - $G \in \mathbb{R}^{n \times n}$: conductance matrix
  - $C \in \mathbb{R}^{n \times n}$: capacitance matrix
  - $v \in \mathbb{R}^{n \times 1}$: node voltage vector
  - $b \in \mathbb{R}^{n \times 1}$: current loading vector
[Figure: power grid with Vdd connections]
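The transient equation only becomes a sequence of linear solves once a time-integration rule is chosen. The slides do not name one, so the sketch below assumes backward Euler with step h, which gives (G + C/h) v_new = b_new + (C/h) v_old. The 2-node G, C, and b values and the helper names are made up for illustration.

```python
def solve_2x2(a, rhs):
    """Solve a 2x2 linear system by Cramer's rule (enough for this toy case)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det
    return [x0, x1]


def backward_euler_step(g, c_diag, v_old, b_new, h):
    """One step of (G + C/h) v_new = b_new + (C/h) v_old with diagonal C.

    Backward Euler is an assumption here; the slides do not state the rule.
    """
    lhs = [[g[i][j] + (c_diag[i] / h if i == j else 0.0) for j in range(2)]
           for i in range(2)]
    rhs = [b_new[i] + (c_diag[i] / h) * v_old[i] for i in range(2)]
    return solve_2x2(lhs, rhs)


# Hypothetical 2-node network: conductance matrix, grounded capacitors, loads.
G = [[2.0, -1.0], [-1.0, 2.0]]
C_DIAG = [1.0, 1.0]
B = [1.0, 0.0]

V_DC = solve_2x2(G, B)          # DC analysis: G v = b

v = [0.0, 0.0]                  # transient analysis from a zero initial state
for _ in range(500):
    v = backward_euler_step(G, C_DIAG, v, B, h=0.1)
```

With a constant load the transient solution settles to the DC solution, which is a convenient sanity check. Each time step is a linear solve of the same size as the DC problem, which is why a fast solver matters for transient analysis.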
5 Prior Work
- Prior power grid analysis approaches
  - Direct methods (LU factorization, Cholesky decomposition)
    - Cholmod uses 7GB memory and >,000 s for a 9-million grid
  - Iterative methods
    - Preconditioned conjugate gradient (T. Chen et al., DAC 0)
    - Multigrid methods (S. Nassif et al., DAC 00)
  - Stochastic method
    - Random walk (H. Qian et al., DAC 05)
[Figure: direct method, multigrid, and random walk illustrated on grids with VDD connections]
6 Prior Work (Cont.)
- Recent GPU-based power grid analysis methods
  - Hybrid multigrid method on GPU (Z. Feng et al., ICCAD 08)
    - Pros: very fast (solves four million nodes per second)
    - Cons: convergence rate depends on the 2D grid approximation
  - Poisson solver (J. Shi et al., DAC 09)
    - Pros: public CUFFT library -> easier implementation
    - Cons: only suitable for 2D regular grids
- Robust preconditioned Krylov subspace iterative methods on GPU
  - Preconditioners using incomplete LU or Cholesky matrix factors
    - Matrix factors are hard to store and process on GPU
  - Multigrid-based preconditioning methods
    - SIMD multigrid solver + sparse matrix-vector operations on GPU
7 NVIDIA GPU Architecture
- Streaming Multiprocessor (SM)
  - 8 streaming processors (SP)
  - 2 special function units (SFU)
  - Multithreaded instruction fetch/dispatch unit
  - Multi-threaded instruction dispatch to 52 active threads
  - 32 threads (a warp) share one instruction fetch
  - Cover memory load latency
- Some facts about an SM
  - 6 KB shared memory
  - 8,96 registers
  - >30 Gflops peak performance
[Figure: GPU block diagram (thread execution manager, parallel data caches, texture caches, read/write paths to global memory) and SM block diagram (instruction and data caches, instruction fetch/dispatch, shared memory, SPs, SFUs)]
8 GPU Memory Space (CUDA Memory Model)
- Each thread can:
  - R/W per-thread local memory
  - R/W per-block shared memory
  - R/W per-grid global memory
  - Read only per-grid texture/constant memory
- Device memory comparison:

            Local      Shared     Global     Texture
  Read      Yes        Yes        Yes        Yes
  Write     Yes        Yes        Yes        No
  Size      Large      Small      Large      Large
  BW        High       High       High       High
  Cached?   No         Yes        No         Yes
  Latency   500 cyc.   20 cyc.    500 cyc.   300 cyc.

[Figure: CUDA hierarchy of threads, blocks, and grids with local, shared, and global memory]
9 Contribution of This Work
- Multigrid preconditioned Krylov subspace iterative solver on GPU
- Problems solved:
  - DC: $G x = b$
  - TR: $G x(t) + C \frac{dx(t)}{dt} = b(t)$
- Data placement:
  - Host (CPU) memory: the 3D multi-layer irregular power grid
  - GPU global memory (DRAM): original grid matrix + geometrical representation
- Multigrid preconditioner: Jacobi smoother using the sparse matrix + geometrical multigrid solver (GMD)
- MGPCG algorithm on GPU (a GPU-friendly multi-level iterative algorithm):
  1. Set initial solution
  2. Get initial residual and search direction
  3. Update solution and residual
  4. Check convergence; if converged, return the final solution
  5. If not converged: apply multigrid preconditioning, update the search direction, and go to step 3
[Figure: 3D multi-layer irregular power grid with VDD connections, the sparse matrix of the original grid, and the MGPCG flow chart]
10 Multigrid Methods
- Among the fastest numerical algorithms for PDE-like problems
  - Linear complexity in the number of unknowns
  - A hierarchy of exact to coarse replicas of the problem
  - High (low) frequency errors are damped on fine (coarse) grids
  - Direct/iterative solvers for the coarsest grid
- Multigrid operations
  - Smoothing, restriction, prolongation and correction, etc.
- Algebraic MG (AMG) and Geometric MG (GMD)
  - GMD: suitable for the GPU's SIMD computation
  - AMG: robust for handling irregular grids, but needs irregular memory access and complex control flow
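The four multigrid operations listed above can be seen working together in a two-grid cycle. The sketch below is a generic textbook example on a 1D Poisson model problem, not the power grid solver from the slides: the model problem, the damping factor 2/3, full-weighting restriction, linear interpolation, and all function names are assumptions chosen for illustration.

```python
def residual(u, f, h):
    """r = f - A u for A = tridiag(-1, 2, -1) / h^2 with zero boundary values."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return out


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi: damps the high-frequency error on the fine grid."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * (h * h / 2.0) * ri for ui, ri in zip(u, r)]
    return u


def restrict(r):
    """Full weighting; coarse point j coincides with fine point 2j + 1."""
    n_coarse = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(n_coarse)]


def prolong(e_coarse, n_fine):
    """Linear interpolation of the coarse correction back to the fine grid."""
    e = [0.0] * n_fine
    for j, value in enumerate(e_coarse):
        e[2 * j + 1] = value
    for i in range(0, n_fine, 2):
        left = e[i - 1] if i > 0 else 0.0
        right = e[i + 1] if i < n_fine - 1 else 0.0
        e[i] = 0.5 * (left + right)
    return e


def coarse_solve(f, h):
    """Direct tridiagonal (Thomas) solve on the coarsest grid."""
    n = len(f)
    off, diag = -1.0 / (h * h), 2.0 / (h * h)
    c_prime, d_prime = [0.0] * n, [0.0] * n
    c_prime[0], d_prime[0] = off / diag, f[0] / diag
    for i in range(1, n):
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (f[i] - off * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def two_grid_cycle(u, f, h):
    u = smooth(u, f, h)                                   # pre-smoothing
    e_coarse = coarse_solve(restrict(residual(u, f, h)), 2.0 * h)
    u = [ui + ei for ui, ei in zip(u, prolong(e_coarse, len(u)))]  # correction
    return smooth(u, f, h)                                # post-smoothing


N_FINE = 31
H = 1.0 / (N_FINE + 1)
F = [1.0] * N_FINE                 # -u'' = 1 on (0, 1), u(0) = u(1) = 0
u = [0.0] * N_FINE
norms = [max(abs(r) for r in residual(u, F, H))]
for _ in range(10):
    u = two_grid_cycle(u, F, H)
    norms.append(max(abs(r) for r in residual(u, F, H)))
```

Replacing the direct coarse solve with a recursive call to the same cycle gives a multi-level V-cycle. On a regular grid every step is a fixed stencil operation, which is the property that makes geometric multigrid a good fit for SIMD hardware.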
11 Power Grid Topology Regularization
- Location-based mapping (Z. Feng et al., ICCAD 08)
[Figure: metal layers (Metal 5~6, Metal 3~4, Metal ~2) mapped onto a 2D regular grid]
12 Parallel Multigrid Preconditioning
- 3D grid smoother + 2D grid GMD solver
  - The 3D finest grid is stored using an ELL-like sparse matrix format
  - The 2D coarser to coarsest grids are processed geometrically
  - Coalesced memory accesses are guaranteed on GPU
[Figure: preconditioning pipeline from RHS to solution. Jacobi smoothing (iterative matrix solver) alternates with GMD solves, each of which is a cycle of smooth, restrict, and prolong & correct steps]
13 GMD Smoother
- Mixed block-wise relaxation on GPU
  - Weighted Jacobi iterations within each block
  - Gauss-Seidel iterations among blocks
[Figure: blocks mapped over execution time onto multiprocessors SM1 to SM3, each with streaming processors SP1 to SP8 and its own shared memory, all connected to global memory]
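One way to read the mixed relaxation above is sketched below as a sequential emulation. Inside a block every node is updated from the same snapshot of the solution (weighted Jacobi); a block's results are published before the next block starts, so later blocks see fresh values (Gauss-Seidel among blocks). The slide gives no further detail, so the dense test matrix, the block size, the damping factor 0.8, and the function names are all assumptions.

```python
def block_hybrid_sweep(a, b, x, block_size, omega=0.8):
    """One sweep: weighted Jacobi inside each block, Gauss-Seidel across blocks."""
    n = len(b)
    for start in range(0, n, block_size):
        block = range(start, min(start + block_size, n))
        # Jacobi part: all nodes of the block read the same snapshot of x.
        updates = []
        for i in block:
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            jacobi_value = (b[i] - sigma) / a[i][i]
            updates.append((1.0 - omega) * x[i] + omega * jacobi_value)
        # Gauss-Seidel part: publish the block before the next one starts.
        for i, value in zip(block, updates):
            x[i] = value
    return x


def max_residual(a, b, x):
    n = len(b)
    return max(abs(b[i] - sum(a[i][j] * x[j] for j in range(n)))
               for i in range(n))


# Hypothetical diagonally dominant test system: tridiag(-1, 4, -1).
N = 12
A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(N)]
     for i in range(N)]
B = [1.0] * N

x = [0.0] * N
for _ in range(60):
    x = block_hybrid_sweep(A, B, x, block_size=4)
```

On a GPU the blocks would run concurrently on different multiprocessors rather than in a fixed order, so this loop only illustrates which values each update is allowed to read.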
14 Memory Layout on GPU
- Mixed data structures
  - Original grid (finest grid level, Level 0): ELL-like sparse matrix
  - Regularized coarse to coarsest grids (Level 1 to Level 3, the coarsest grid): stored as graphics pixels on GPU
  - Restriction and prolongation move data between levels
[Figure: grid hierarchy from Level 0 to Level 3 in the X-Y plane, next to the sparse matrix of the original grid]
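15 Nodal Analysis Matrix
- ELL-like sparse matrix storage
- Example matrix:

$$
A = \begin{bmatrix}
a_{1,1} & & & a_{1,4} & a_{1,5} & & & \\
 & a_{2,2} & a_{2,3} & & & a_{2,6} & & \\
 & a_{3,2} & a_{3,3} & & & & & a_{3,8} \\
a_{4,1} & & & a_{4,4} & & & & \\
a_{5,1} & & & & a_{5,5} & & a_{5,7} & \\
 & a_{6,2} & & & & a_{6,6} & & a_{6,8} \\
 & & & & a_{7,5} & & a_{7,7} & \\
 & & a_{8,3} & & & a_{8,6} & & a_{8,8}
\end{bmatrix}
$$

- Splitting: $A = D + M$
  - $D$: diagonal elements of $A$
  - $M$: off-diagonal elements of $A$
- Storage:
  - Off-diagonal elements $M$: an element value vector and an element index vector, organized by column (Col 1, Col 2), with zero padding for rows that have fewer off-diagonal elements
  - Inversed diagonal elements $D^{-1}$: one entry per row
[Figure: element value vector, element index vector, and inversed diagonal vector laid out across processors P]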
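The storage scheme above can be sketched as follows. The sparsity pattern is the 8x8 example from the slide (0-based here); the numeric values (3 on the diagonal, -1 off the diagonal), the choice to pad with the row's own index, and the function names are assumptions for illustration. Arrays are laid out column-major, so slot k of row i lives at index k * n + i.

```python
# Off-diagonal positions of the slide's 8x8 example (0-based, upper triangle);
# the matrix is symmetric, so each pair is mirrored when building the rows.
PATTERN = [(0, 3), (0, 4), (1, 2), (1, 5), (2, 7), (4, 6), (5, 7)]


def build_rows(n, pattern, diag=3.0, off=-1.0):
    """Build a list of {column: value} rows. Values are placeholders."""
    rows = [{i: diag} for i in range(n)]
    for i, j in pattern:
        rows[i][j] = off
        rows[j][i] = off
    return rows


def to_ell(rows):
    """Split A = D + M and store M in padded, column-major ELL arrays."""
    n = len(rows)
    width = max(len(row) - 1 for row in rows)   # max off-diagonals per row
    inv_diag = [1.0 / rows[i][i] for i in range(n)]
    values = [0.0] * (width * n)
    columns = [0] * (width * n)
    for i, row in enumerate(rows):
        off_diag = sorted((j, v) for j, v in row.items() if j != i)
        for k in range(width):
            # Pad short rows with a zero value and a harmless valid index.
            col, val = off_diag[k] if k < len(off_diag) else (i, 0.0)
            values[k * n + i] = val
            columns[k * n + i] = col
    return inv_diag, values, columns, width


ROWS = build_rows(8, PATTERN)
INV_DIAG, VALUES, COLUMNS, WIDTH = to_ell(ROWS)
```

With this layout, the threads handling rows 0 to 7 read eight consecutive entries for slot 0 and then eight consecutive entries for slot 1, which is the access pattern that allows coalescing on the GPU.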
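16 GPU Device Memory Access Pattern
- GPU-based Jacobi iteration (smoother):

$$
x^{(k+1)} = D^{-1}\left(b - M x^{(k)}\right)
$$

- Computed in two steps:

$$
S = b - M x^{(k)}, \qquad x^{(k+1)} = D^{-1} S
$$

[Figure: processors P read consecutive entries of the off-diagonal element vectors and the inversed diagonal vector at execution times T1 to T4]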
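The two-step Jacobi update can be sketched on ELL-style arrays as below, with one loop iteration standing in for one GPU thread. The arrays encode the slide's 8x8 sparsity pattern with placeholder values (3 on the diagonal, -1 off the diagonal); those values, the right-hand side, and the function names are assumptions.

```python
N = 8          # number of rows (one emulated thread per row)
WIDTH = 2      # off-diagonal slots per row after padding

# Column-major ELL arrays: slot k of row i lives at index k * N + i.
ELL_COL = [3, 2, 1, 0, 0, 1, 4, 2,      # slot 0 for rows 0..7
           4, 5, 7, 3, 6, 7, 6, 5]      # slot 1 (rows 3 and 6 are padding)
ELL_VAL = [-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0,
           -1.0, -1.0, -1.0, 0.0, -1.0, -1.0, 0.0, -1.0]
INV_DIAG = [1.0 / 3.0] * N              # D^{-1}, placeholder values
B = [1.0] * N                           # placeholder right-hand side


def off_diagonal_product(x):
    """Return M x; row i reads indices i, N + i, ... like GPU thread i would."""
    out = [0.0] * N
    for row in range(N):
        acc = 0.0
        for k in range(WIDTH):
            idx = k * N + row
            acc += ELL_VAL[idx] * x[ELL_COL[idx]]
        out[row] = acc
    return out


def jacobi_step(x):
    """S = b - M x, then x_new = D^{-1} S."""
    mx = off_diagonal_product(x)
    s = [B[i] - mx[i] for i in range(N)]
    return [INV_DIAG[i] * s[i] for i in range(N)]


def residual_max(x):
    """Largest entry of |b - (D + M) x|."""
    mx = off_diagonal_product(x)
    return max(abs(B[i] - (x[i] / INV_DIAG[i] + mx[i])) for i in range(N))


x = [0.0] * N
for _ in range(100):
    x = jacobi_step(x)
```

Because neighbouring rows read neighbouring array entries at every slot, the inner loop maps directly to coalesced loads when each row is given to one thread. Plain Jacobi converges here only because the placeholder matrix is diagonally dominant.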
17 Algorithm Flow
- Multigrid preconditioner: Jacobi smoother using the sparse matrix + geometrical multigrid solver (GMD)
- Flow:
  1. Set initial solution
  2. Get initial residual and search direction
  3. Update solution and residual
  4. Check convergence; if converged, return the final solution
  5. If not converged: apply multigrid preconditioning, update the search direction, and go to step 3
[Figure: flow chart of the steps above next to the sparse matrix of the original grid]
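The flow above follows the standard preconditioned conjugate gradient loop, sketched below with the preconditioner passed in as a function. A simple diagonal preconditioner stands in for the multigrid cycle used in MGPCG; that substitution, the dense test matrix, the tolerance, and the function names are assumptions for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(a, x):
    return [dot(row, x) for row in a]


def pcg(a, b, precondition, tol=1e-10, max_iter=200):
    """Preconditioned CG for a symmetric positive definite matrix a."""
    x = [0.0] * len(b)                  # set initial solution
    r = list(b)                         # initial residual (x = 0)
    z = precondition(r)
    p = list(z)                         # initial search direction
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for iteration in range(1, max_iter + 1):
        ap = mat_vec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]       # update solution
        r = [ri - alpha * api for ri, api in zip(r, ap)]    # update residual
        if math.sqrt(dot(r, r)) <= tol * b_norm:            # check convergence
            return x, iteration
        z = precondition(r)             # MGPCG would run a multigrid cycle here
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]  # new direction
        rz = rz_new
    return x, max_iter


# Hypothetical SPD test system: tridiag(-1, 2, -1).
N = 20
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(N)]
     for i in range(N)]
B = [1.0] * N


def diagonal_preconditioner(r):
    """Stand-in preconditioner: z = D^{-1} r."""
    return [ri / A[i][i] for i, ri in enumerate(r)]


X, ITERATIONS = pcg(A, B, diagonal_preconditioner)
```

Only the `precondition` call changes between DPCG and MGPCG in this structure; a better preconditioner lowers the iteration count without touching the rest of the loop.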
18 Experiment Results
- Linux computing system: C++ & CUDA
  - CPU: Core 2 Quad 2.66GHz + 6GB DRAM
  - GPU: NVIDIA GTX 285 .5GHz with 240 SPs ($400)
- Power grid test cases
  - IBM power grid benchmark circuits CKT1~5 (0.3M ~ 7M).7
  - Larger industrial power grid designs CKT6~8 (4.5 M ~ 0M)
- Direct solver on the host
  - Cholmod with Supernodal and Metis functions
- Iterative solvers on GPU
  - MGPCG: multigrid preconditioned CG
  - DPCG: diagonally preconditioned CG
  - HMD: hybrid multigrid (Z. Feng et al., ICCAD 08)
19 Power Grid Design Information
Columns: CKT, N_node, N_layer, N_nnz, N_res, N_cur
  CKT1  27.0K K 209.7K 37.9K
  CKT2  85.6K 5 3.7M .4M 20.K
  CKT3  K 6 4.M .5M 277.0K
  CKT4  .0M 3 4.3M .7M 540.8K
  CKT5  .7M 3 6.6M 2.5M 76.5K
  CKT6  4.7M 8 8.8M 6.8M 85.5K
  CKT7  6.7M M 9.7M 267.3K
  CKT8  0.5M M 4.8M 49.3K
- N_layer: the number of metal layers
- N_res: the number of resistors
- N_cur: the number of current sources
20 Convergence Comparison
[Figure: residual versus iteration number for MGPCG and HMD]
- Max errors: HMD: e-3 Volt; MGPCG: e-5 Volt
- HMD: hybrid multigrid method
- MGPCG: multigrid preconditioned conjugate gradient method
- MGPCG converges much faster
21 Results: Power Grid DC Analysis
Columns: CKT, NCG, NDPCG, NMGPCG, NHMD, TCG, TDPCG
  CKT1  ,
  CKT2  4,834 3,
  CKT3  2,
  CKT4  4, >
  CKT5  6, >
- NCG: the number of CG iterations
- NDPCG: the number of diagonally preconditioned CG iterations
- NMGPCG: the number of multigrid preconditioned CG iterations
- NHMD: the number of hybrid multigrid iterations
- TCG: the runtime of CG
- TDPCG: the runtime of diagonally preconditioned CG iterations
22 Results (Cont.): Power Grid DC Analysis (Cont.)
Columns: CKT, TMGPCG, THMD, TCHOL, Eavg, Emax, Speedup
  CKT1  e-4 4e-4 34X
  CKT2  e-6 2e-5 40X
  CKT3  e-5 e-4 54X
  CKT4  0.9 > e-5 7e-4 22X
  CKT5  . > e-4 5e-4 25X
- TMGPCG: the runtime of multigrid preconditioned CG iterations
- THMD: the runtime of hybrid multigrid iterations
- TCHOL: the runtime of the direct matrix solver (Cholmod)
- Eavg: average error
- Emax: maximum error
- Speedup: TCHOL/TMGPCG
23 Results (Cont.): DC Analysis of Large Circuits
Columns: CKT, N_node, N_MGPCG, T_MGPCG, T_CHOL, Speedup
  CKT6  4.7M X
  CKT7  6.7M X
  CKT8  0.5M .6 N/A N/A
- N_MGPCG: the number of MGPCG iterations
- T_MGPCG: the runtime of the MGPCG solver
- T_CHOL: the runtime of the Cholmod solver
24 Results (Cont.): Transient Analysis Results
Columns: CKT, Tcpu, Tgpu, Ngpu, Eavg, Emax, Speedup
  CKT  e-6 8e-4 20X
  CKT  e-5 3e-4 2X
  CKT  e-6 e-4 23X
  CKT  e-5 2e-4 8X
  CKT  e-5 e-4 2X
- Tcpu: Cholmod solve time
- Tgpu: MGPCG time
- Ngpu: the number of MGPCG iterations
25 Transient Analysis: CKT
[Figure: two plots of voltage (V) versus time (seconds) comparing Cholmod and GPU waveforms]
- CKT with 27K nodes
- 500 time steps
- 509 MGPCG iterations
- Cholmod: 22s; GPU: 9.2s
- 23X speedup
26 Transient Analysis: CKT5
[Figure: two plots of voltage (V) versus time (seconds) comparing Cholmod and GPU waveforms]
- CKT5 with 7M nodes, .7M time steps
- 693 MGPCG iterations
- Cholmod: 2,700s; GPU: 28s
- 22X speedup
27 Conclusion and Future Work
- Robust circuit simulation on GPU is challenging
  - How to accelerate simulations for irregular problems?
  - How to guarantee accuracy and robustness?
- Parallel multigrid preconditioning method for power grid analysis
  - Multigrid preconditioning (geometrical + matrix representations)
    - Geometrical multigrid solver on GPU
    - ELL-like sparse matrix-vector operations for original grids on GPU
  - Applicable to more general power grids with strong irregularities
  - Much faster convergence & higher accuracy than ever before
- Future work
  - Node ordering and grid partitioning for Multi-Core-Multi-GPUs
  - GPU performance modeling for further improving the solver efficiency
  - Heterogeneous computing to adaptively balance the workloads
More informationMotion Level-of-Detail: A Simplification Method on Crowd Scene
Moion Level-of-Deail: A Simplificaion Mehod on Crowd Scene Absrac Junghyun Ahn VR lab, EECS, KAIST ChocChoggi@vr.kais.ac.kr hp://vr.kais.ac.kr/~zhaoyue Recen echnological improvemen in characer animaion
More informationMIC2569. Features. General Description. Applications. Typical Application. CableCARD Power Switch
CableCARD Power Swich General Descripion is designed o supply power o OpenCable sysems and CableCARD hoss. These CableCARDs are also known as Poin of Disribuion (POD) cards. suppors boh Single and Muliple
More informationEfficient AMG on Hybrid GPU Clusters. ScicomP Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann. Fraunhofer SCAI
Efficient AMG on Hybrid GPU Clusters ScicomP 2012 Jiri Kraus, Malte Förster, Thomas Brandes, Thomas Soddemann Fraunhofer SCAI Illustration: Darin McInnis Motivation Sparse iterative solvers benefit from
More informationRobust Visual Tracking for Multiple Targets
Robus Visual Tracking for Muliple Targes Yizheng Cai, Nando de Freias, and James J. Lile Universiy of Briish Columbia, Vancouver, B.C., Canada, V6T 1Z4 {yizhengc, nando, lile}@cs.ubc.ca Absrac. We address
More informationShortest Path Algorithms. Lecture I: Shortest Path Algorithms. Example. Graphs and Matrices. Setting: Dr Kieran T. Herley.
Shores Pah Algorihms Background Seing: Lecure I: Shores Pah Algorihms Dr Kieran T. Herle Deparmen of Compuer Science Universi College Cork Ocober 201 direced graph, real edge weighs Le he lengh of a pah
More informationVisual Perception as Bayesian Inference. David J Fleet. University of Toronto
Visual Percepion as Bayesian Inference David J Flee Universiy of Torono Basic rules of probabiliy sum rule (for muually exclusive a ): produc rule (condiioning): independence (def n ): Bayes rule: marginalizaion:
More informationInteractive Cuts through 3-Dimensional Soft Tissue
Insiue of Scienific Compuing Eidgenossische Technische Hochschule Zurich Ecole polyechnique federale de Zurich Poliecnico federale di Zurigo Swiss Federal Insiue of Technology Zurich Compuer Graphics Research
More informationConstant-Work-Space Algorithms for Shortest Paths in Trees and Simple Polygons
Journal of Graph Algorihms and Applicaions hp://jgaa.info/ vol. 15, no. 5, pp. 569 586 (2011) Consan-Work-Space Algorihms for Shores Pahs in Trees and Simple Polygons Tesuo Asano 1 Wolfgang Mulzer 2 Yajun
More informationSENSING using 3D technologies, structured light cameras
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 39, NO. 10, OCTOBER 2017 2045 Real-Time Enhancemen of Dynamic Deph Videos wih Non-Rigid Deformaions Kassem Al Ismaeil, Suden Member,
More informationNVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield
NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host
More informationToday. Curves & Surfaces. Can We Disguise the Facets? Limitations of Polygonal Meshes. Better, but not always good enough
Today Curves & Surfaces Moivaion Limiaions of Polygonal Models Some Modeling Tools & Definiions Curves Surfaces / Paches Subdivision Surfaces Limiaions of Polygonal Meshes Can We Disguise he Faces? Planar
More informationOpen Access Research on an Improved Medical Image Enhancement Algorithm Based on P-M Model. Luo Aijing 1 and Yin Jin 2,* u = div( c u ) u
Send Orders for Reprins o reprins@benhamscience.ae The Open Biomedical Engineering Journal, 5, 9, 9-3 9 Open Access Research on an Improved Medical Image Enhancemen Algorihm Based on P-M Model Luo Aijing
More informationResearch Article Particle Filtering: The Need for Speed
Hindawi Publishing Corporaion EURASIP Journal on Advances in Signal Processing Volume 2010, Aricle ID 181403, 9 pages doi:10.1155/2010/181403 Research Aricle Paricle Filering: The Need for Speed Gusaf
More informationA Face Detection Method Based on Skin Color Model
A Face Deecion Mehod Based on Skin Color Model Dazhi Zhang Boying Wu Jiebao Sun Qinglei Liao Deparmen of Mahemaics Harbin Insiue of Technology Harbin China 150000 Zhang_dz@163.com mahwby@hi.edu.cn sunjiebao@om.com
More informationPage 1. Key Points from Last Lecture Frame format. EEC173B/ECS152C, Winter Wireless LANs
EEC173/ECS152C, Winer 2006 Key Poins from Las Lecure Wireless LANs 802.11 Frame forma 802.11 MAC managemen Synchronizaion, Handoffs, Power MAC mehods: DCF & PCF CSMA/CA wih posiive ACK Exponenial backoff
More informationWeb System for the Remote Control and Execution of an IEC Application
Web Sysem for he Remoe Conrol and Execuion of an IEC 61499 Applicaion Oana ROHAT, Dan POPESCU Faculy of Auomaion and Compuer Science, Poliehnica Universiy, Splaiul Independenței 313, Bucureși, 060042,
More informationFast Tridiagonal Solvers on GPU
Fast Tridiagonal Solvers on GPU Yao Zhang John Owens UC Davis Jonathan Cohen NVIDIA GPU Technology Conference 2009 Outline Introduction Algorithms Design algorithms for GPU architecture Performance Bottleneck-based
More informationStudy and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou
Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm
More informationGTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013
GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»
More informationMATH Differential Equations September 15, 2008 Project 1, Fall 2008 Due: September 24, 2008
MATH 5 - Differenial Equaions Sepember 15, 8 Projec 1, Fall 8 Due: Sepember 4, 8 Lab 1.3 - Logisics Populaion Models wih Harvesing For his projec we consider lab 1.3 of Differenial Equaions pages 146 o
More informationAn Iterative Scheme for Motion-Based Scene Segmentation
An Ieraive Scheme for Moion-Based Scene Segmenaion Alexander Bachmann and Hildegard Kuehne Deparmen for Measuremen and Conrol Insiue for Anhropomaics Universiy of Karlsruhe (H), 76 131 Karlsruhe, Germany
More information