Loop Tiling for Parallelism
THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

LOOP TILING FOR PARALLELISM

JINGLING XUE
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia

Springer Science+Business Media, LLC
Library of Congress Cataloging-in-Publication

Xue, Jingling. Loop tiling for parallelism / Jingling Xue.
p. cm. -- (Kluwer international series in engineering and computer science ; SECS 575)
Includes bibliographical references and index.
ISBN ISBN (ebook) DOI /
1. Parallel processing (Electronic computers) 2. Electronic data processing--Distributed processing. 3. Loop tiling (Computer science) I. Title. II. Series.
QA76.58.X '.35--dc

Copyright 2000 Springer Science+Business Media New York. Originally published by Kluwer Academic Publishers, New York in 2000. Softcover reprint of the hardcover 1st edition 2000.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photo-copying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.
Contents

List of Figures
List of Tables
Preface
Acknowledgments

Part I Mathematic Background and Loop Transformation

1. MATHEMATICAL BACKGROUND
   1.1 Logic
   1.2 Sets
   1.3 Arithmetic
   1.4 Vectors and Matrices
   1.5 Roots of Cubic and Quartic Equations
   1.6 Integer Matrices
   1.7 Convex Analysis
   1.8 Convex Polyhedra
   1.9 Convex Cones
   1.10 Fourier-Motzkin Elimination
   1.11 Further Reading
2. NONSINGULAR TRANSFORMATIONS AND PERMUTABILITY
   Perfectly Nested Loops
   Dependence Vectors and Their Polyhedra
   Iteration-Reordering Transformations
   Fully Permutable Loop Nests
   Further Reading

Part II Tiling as a Loop Transformation

3. RECTANGULAR TILING
   3.1 Modeling Rectangular Tiling
   Legality Test
   Tile Dependences and Tile Space Graph
   Tiled Code
   Tiling-Related Transformations
   Yet Another Tiling Model
   Further Reading
4. PARALLELEPIPED TILING
   Why Parallelepiped Tiling?
   Legality Test
   Tile Dependences and Tile Space Graph
   Tiled Code
   Decomposition of Parallelepiped Tiling
   Yet Another Tiling Model
   Loop Partitioning v.s. Loop Tiling
   Further Reading

Part III Tiling for Distributed-Memory Machines

5. SPMD CODE GENERATION
   Background
   Machine Model
   Computation Distribution
   Data Distribution
   Code Generation
   Message-Passing Code Generation
   Memory Management
   SPMD Code in Local Address Space
   SPMD Code for Parallelepiped Tiling
   Experiments
   Further Reading
6. COMMUNICATION-MINIMAL TILING
   Computation & Communication Volumes
   Problem Formulation
   Closed-Form Optimal Tilings
   All Extremal-Ray Optimal Tilings
   Making H^{-1} Integral
   Further Reading
7. TIME-MINIMAL TILING
   Parallelogram Tiling
   Executing Tiles in the SPMD Paradigm
   7.3 Computation and Communication Models
   7.4 Rise
   7.5 Optimal Tile Size
   7.6 Experiments
   7.7 Further Reading

Bibliography
Index
List of Figures

1.1 Plots of two cubic functions
Plot of a quartic function
Integer lattices
Convex and affine hulls
The unique minimum of a strictly convex function
Convex polyhedra
Sets that are not convex cones
Convex cones
The dual cones of the convex cones in Figure
An algorithm for finding the lines and rays of a cone
Fourier-Motzkin elimination and its inexactness in ℤ^n
Using Fourier-Motzkin elimination to scan a polytope
The iteration space of Example
The iteration space graphs for Example
An algorithm for constructing the dependence polyhedron
The dependence polyhedra for four dependence vectors
Transformation of a distance vector by an injective mapping
One-to-many transformation of distance vectors
Transformation of a distance vector by a linear transformation
The transformed iteration spaces of Example
Transformations of dependence polyhedra
Approximating the dependence polyhedra in Figure
An algorithm for splitting a dependence vector
Finding a unimodular canonical transformation
Examples of rectangular tilings
Tile origins
x 2 rectangular tiling of a double loop
A non-convex tile space
3.5 Tile offset v.s. loop normalisation
Transformations of distance vectors
Legal tiling for a non-permutable double loop
Two examples tiled by the same 2 x 2 tiling
Tile space graph
Sequential tiled code in the 'Y model
Examples of 2-D loop nests
Tile codes for example loop nests in the 'Y model
Strip-mining and tiling
Loop skewing and rectangular tiling
Two rectangular tiling models compared
Sequential tiled code in the 'Y model
Tile codes for example loop nests in the 'Y model
A parallelogram tiling for Example
Non-identical tiles under a ''n~tegral'' tiling transformation
Approximation of pp(d) by Pp(d)
Sequential tiled code in the p model
Decomposition of the parallelogram tiling in Figure
Sequential tiled code in the p model
Transformed loop nest for loop partitioning
SPMD code generation
A running example
Sequential tiled code
SPMD code after computation distribution
Local data spaces for Figure
The data owned by the host for Figure
I/O code for receiving read-only data
Communication sets for Example
Two message-passing code sections for Figure
Simplified message-passing code
SPMD program with communication code
Local memory allocation for the running example
Translation of global and local loop indices
Local array indices for the running example
SPMD code in local address space
Normalised SPMD code in local address space
SPMD program for running example in local address space
Iteration spaces for SOR
Computation distribution for SOR
Impact of tile size
Computation and communication overlap
Tiling, communication and validity test overheads
5.23 Impact of tile shape
The ISG and its 4 x 4 tiling for Example
Dependence and time cones for Example
Approximation of the communication volume of a tile
nonlocal data accessed by a tile for Example
Dependence and time cones for example
Optimal tiling when D = I 2x
Optimal tiling when D = (d~, d-;) E 7l. 2X
Optimal tiling when D ∈ ℤ^{2×m}
The dependence cone for Example
The dependence cone for Example
A procedure for finding all extremal-ray optimal tilings
Average run time of OptComTiling
A geometric interpretation of Hl in Example
Tiling of a parallelogram-shaped iteration space
The legality of a parallelogram tiling
Cyclic tile distribution over P = 4 processors
Pipelining of non-constant dependences
Communication cost model
Rise for a tiled parallelogram iteration space
Three types of rise values
Derivation of execution time when rise r <
Solution spaces and two separating constraints
B1(w) is above the right boundary of F
B1(w) intersects the right boundary of F
A sketch of the proof for Lemma
B1(w) is below the right boundary of F
A sketch of the proof for Lemma
Derivation of execution time when rise r =
A sketch of the proof for Lemma
Derivation of execution time when rise r >
Zigzag path when swp > H
An algorithm for finding optimal tile size when r >
Fidle. Ffree and C2 (w, h) =
Communication parameters for AP
x 2 rectangular tiling of 5-point SOR
Performance of 5-point SOR on 10 processors
Performance of 5-point SOR on 50 processors
Performance results for several values of P
x 2 parallelogram-shaped tiling of 3-point SOR
Performance of 3-point SOR on 10 processors
Performance of 3-point SOR on 50 processors
7.29 Performance of 3-point SOR on 100 processors
7.30 Performance of 3-point SOR on 61 processors
7.31 Performance of 3-point SOR on 50 processors
7.32 Plots of hi = h(wi) and hf = Fh(Wf)
List of Tables

Shorthand notations for direction values
A list of representative loop transformations
Preface

Techniques for constructing restructuring compilers for parallel machines have been developed over the past three decades. Some of these techniques are introduced in Hans Zima's book on Supercompilers for Parallel and Vector Computers, Utpal Banerjee's book series on Loop Transformations for Restructuring Compilers, Michael Wolfe's book on High Performance Compilers for Parallel Computing and, recently, the book on Scheduling and Automatic Parallelization co-authored by Darte, Robert and Vivien.

When optimising the performance of scientific and engineering programs, the greatest gains come from optimising nested loops or recursive procedures, where major chunks of computation are performed repeatedly. A large number of loop transformations have accumulated over the years, and some of these can be found in research and production compilers. Loop tiling, originally promoted by Francois Irigoin and Michael Wolfe, is one of the most important iteration-reordering loop transformations. Loop tiling is beneficial for both parallel machines and uniprocessors with multiple levels of cache. Together with other transformations such as loop distribution and loop fusion, loop tiling can reduce communication and synchronisation cost, maximise parallelism and improve memory hierarchy performance.

Over the last few years, much research effort has been focussed on exploring the use of loop tiling to maximise parallelism for parallel machines or to improve cache locality. Optimising for cache locality has become critically important for performance, and several research groups around the world are actively working on this problem. Although progress has been made, much remains to be done. Therefore, the use of loop tiling for locality optimisations is not covered. As a consequence, some related publications are not cited in the reference list. However, the first two parts of the book provide the basic foundation useful for the general loop tiling technique.

This book explores the use of loop tiling for minimising synchronisation and communication cost and maximising parallelism for parallel machines. The book is organised into three parts.

The first part, consisting of Chapters 1 and 2, provides the general mathematical background and introduces a theory of nonsingular loop transformations. Chapter 1 describes the basic mathematical concepts and tools necessary for an understanding of the subject, with a particular emphasis on convex cones. Convex cones will be used throughout the book for addressing a number of important problems, including data dependence abstraction, loop permutability, legality test, and tile size and shape selection. Our treatment of nonsingular loop transformations in Chapter 2 serves to set up the context in which other iteration-reordering transformations such as loop tiling can be further developed. In particular, this chapter discusses data dependences, introduces the legality test and code generation required for a nonsingular transformation, and relates the full permutability of a loop nest to the degree of parallelism and locality inherent in the loop nest.

The second part, consisting of Chapters 3 and 4, deals with both rectangular and parallelepiped tiling. Tiling is discussed in terms of its effects on the data dependences and the required dependence test for legality. Unlike nonsingular loop transformations, the exact test for the legality of a tiling requires knowledge of both the data dependences and the extent and shape of the iteration space, and can be solved, in principle, by integer programming. For realistic tiling cases, efficient legality tests based on the data dependence information alone are described. This part also discusses the generation of tiled code and exposes the duality between loop tiling and loop partitioning.

The last part, consisting of Chapters 5-7, focuses on minimising the execution time of a loop nest on a distributed-memory machine. Chapter 5 describes a suite of compiler techniques for generating an SPMD program to execute a tiled iteration space. Chapter 6 addresses the problem of determining the best tile shape to minimise inter-tile communication once the tile size is given. The solution to this problem provides insights for understanding various tiling-related problems. Chapter 7 deals with the problem of finding the best tile size for a double loop once the tile shape is more or less given.

The techniques presented in the last part can be adapted to work for a cluster of workstations, except that tiles of varying sizes and a more sophisticated cost model may be needed to cope with the heterogeneity present at all levels of network, processor and program. These techniques are also directly applicable to shared-memory machines once the machines are modeled as BSP (Bulk Synchronous Parallel) machines. In the case of SPMD code generation, the send and receive calls can be replaced with an appropriate synchronisation mechanism.

Each chapter includes a "Further Reading" section that contains citations to the original material in the reference list.

JINGLING XUE
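To make the transformation described in the preface concrete, the following is a minimal sketch of rectangular tiling applied to a double loop. It is an illustration only and is not code from the book: the function names `original_order`, `tiled_order` and `rectangular_tiling_is_safe`, the tile sizes `bi` and `bj`, and the loop bounds are all hypothetical. The safety check uses a commonly stated sufficient condition (every component of every dependence distance vector is non-negative, that is, the loop nest is fully permutable); it is assumed here for illustration and is not the exact legality test developed in Chapters 3 and 4.

```python
def original_order(n, m):
    """Iterations of `for i in range(n): for j in range(m)` in execution order."""
    return [(i, j) for i in range(n) for j in range(m)]


def tiled_order(n, m, bi, bj):
    """Iterations of the same double loop after bi x bj rectangular tiling."""
    order = []
    for ii in range(0, n, bi):                        # tile loops: step over tile origins
        for jj in range(0, m, bj):
            for i in range(ii, min(ii + bi, n)):      # element loops: scan one tile,
                for j in range(jj, min(jj + bj, m)):  # clipped at the iteration-space edge
                    order.append((i, j))
    return order


def rectangular_tiling_is_safe(distance_vectors):
    """Sufficient (not exact) check, assumed for illustration:
    every component of every distance vector is non-negative."""
    return all(c >= 0 for d in distance_vectors for c in d)


# The first 2 x 2 tile of a 4 x 4 iteration space is executed first.
print(tiled_order(4, 4, 2, 2)[:4])   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Tiling reorders the iterations without adding or dropping any, so the tiled order is always a permutation of the original order; whether that permutation preserves the program's meaning depends on the data dependences.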
To my wife, Lili
Acknowledgments

The author would like to thank all those who gave their own time and effort in the making of this book. Francois Irigoin of Ecole des Mines de Paris found time in his busy schedule to provide insightful and critical comments in response to my questions. The other reviewers of the book include Peizong Lee of Academia Sinica, Taiwan, Zhiyuan Li of Purdue University, Yves Robert of Laboratoire de l'informatique du Parallelisme at Lyon and Peiyi Tang of University of Southern Queensland. I am very grateful to all these reviewers for encouraging me to write this book and for giving a number of suggestions. Alain Darte of Laboratoire de l'informatique du Parallelisme at Lyon read Chapter 1 very carefully, found errors and gave several suggestions.
More informationLinear and Integer Programming (ADM II) Script. Rolf Möhring WS 2010/11
Linear and Integer Programming (ADM II) Script Rolf Möhring WS 200/ Contents -. Algorithmic Discrete Mathematics (ADM)... 3... 4.3 Winter term 200/... 5 2. Optimization problems 2. Examples... 7 2.2 Neighborhoods
More informationDigital Functions and Data Reconstruction
Digital Functions and Data Reconstruction Li M. Chen Digital Functions and Data Reconstruction Digital-Discrete Methods 123 Li M. Chen University of the District of Columbia Washington, DC, USA ISBN 978-1-4614-5637-7
More informationDavid G. Luenberger Yinyu Ye. Linear and Nonlinear. Programming. Fourth Edition. ö Springer
David G. Luenberger Yinyu Ye Linear and Nonlinear Programming Fourth Edition ö Springer Contents 1 Introduction 1 1.1 Optimization 1 1.2 Types of Problems 2 1.3 Size of Problems 5 1.4 Iterative Algorithms
More informationCPSC / Sonny Chan - University of Calgary. Collision Detection II
CPSC 599.86 / 601.86 Sonny Chan - University of Calgary Collision Detection II Outline Broad phase collision detection: - Problem definition and motivation - Bounding volume hierarchies - Spatial partitioning
More informationTheory of Automatic Robot Assembly and Programming
Theory of Automatic Robot Assembly and Programming Theory of Automatic Robot Assembly and Programming Bartholomew o. Nnaji Professor and Director Automation and Robotics Laboratory Department of Industrial
More informationPARALLEL ALGORITHMS FOR LINEAR MODELS
PARALLEL ALGORITHMS FOR LINEAR MODELS Advances in Computational Economics VOLUME 15 SERIES EDITORS Hans Amman, University of Amsterdam, Amsterdam, The Netherlands Anna Nagurney, University of Massachusetts
More informationFundamentals of Discrete Mathematical Structures
Fundamentals of Discrete Mathematical Structures THIRD EDITION K.R. Chowdhary Campus Director JIET School of Engineering and Technology for Girls Jodhpur Delhi-110092 2015 FUNDAMENTALS OF DISCRETE MATHEMATICAL
More informationModule 13: INTRODUCTION TO COMPILERS FOR HIGH PERFORMANCE COMPUTERS Lecture 25: Supercomputing Applications. The Lecture Contains: Loop Unswitching
The Lecture Contains: Loop Unswitching Supercomputing Applications Programming Paradigms Important Problems Scheduling Sources and Types of Parallelism Model of Compiler Code Optimization Data Dependence
More informationSimplex Algorithm in 1 Slide
Administrivia 1 Canonical form: Simplex Algorithm in 1 Slide If we do pivot in A r,s >0, where c s
More informationModeling and Simulation in Scilab/Scicos with ScicosLab 4.4
Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4 Stephen L. Campbell, Jean-Philippe Chancelier and Ramine Nikoukhah Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4 Second Edition
More informationOptimality certificates for convex minimization and Helly numbers
Optimality certificates for convex minimization and Helly numbers Amitabh Basu Michele Conforti Gérard Cornuéjols Robert Weismantel Stefan Weltge October 20, 2016 Abstract We consider the problem of minimizing
More informationTiling: A Data Locality Optimizing Algorithm
Tiling: A Data Locality Optimizing Algorithm Previously Unroll and Jam Homework PA3 is due Monday November 2nd Today Unroll and Jam is tiling Code generation for fixed-sized tiles Paper writing and critique
More informationFunctional Programming in R
Functional Programming in R Advanced Statistical Programming for Data Science, Analysis and Finance Thomas Mailund Functional Programming in R: Advanced Statistical Programming for Data Science, Analysis
More informationFrom acute sets to centrally symmetric 2-neighborly polytopes
From acute sets to centrally symmetric -neighborly polytopes Isabella Novik Department of Mathematics University of Washington Seattle, WA 98195-4350, USA novik@math.washington.edu May 1, 018 Abstract
More informationGroupware and the World Wide Web
Groupware and the World Wide Web Edited by Richard Bentley, Uwe Busbach, David Kerr & Klaas Sikkel German National Research Center for Information Technology, Institutefor Applied Information Technology
More informationSoftware Development for SAP R/3
Software Development for SAP R/3 Springer-Verlag Berlin Heidelberg GmbH Ulrich Mende Software Development for SAP R/3 Data Dictionary, ABAP/4, Interfaces With Diskette With 124 Figures and Many Example
More informationPolyèdres et compilation
Polyèdres et compilation François Irigoin & Mehdi Amini & Corinne Ancourt & Fabien Coelho & Béatrice Creusillet & Ronan Keryell MINES ParisTech - Centre de Recherche en Informatique 12 May 2011 François
More informationINFORMATION RETRIEVAL SYSTEMS: Theory and Implementation
INFORMATION RETRIEVAL SYSTEMS: Theory and Implementation THE KLUWER INTERNATIONAL SERIES ON INFORMATION RETRIEVAL Series Editor W. Bruce Croft University of Massachusetts Amherst, MA 01003 Also in the
More informationLinear programming and duality theory
Linear programming and duality theory Complements of Operations Research Giovanni Righini Linear Programming (LP) A linear program is defined by linear constraints, a linear objective function. Its variables
More informationCURRICULUM CATALOG. CCR Mathematics Grade 8 (270720) MS
2018-19 CURRICULUM CATALOG Table of Contents COURSE OVERVIEW... 1 UNIT 1: THE REAL NUMBER SYSTEM... 2 UNIT 2: MODELING PROBLEMS IN INTEGERS... 2 UNIT 3: MODELING PROBLEMS WITH RATIONAL NUMBERS... 2 UNIT
More informationTriangulations of hyperbolic 3-manifolds admitting strict angle structures
Triangulations of hyperbolic 3-manifolds admitting strict angle structures Craig D. Hodgson, J. Hyam Rubinstein and Henry Segerman segerman@unimelb.edu.au University of Melbourne January 4 th 2012 Ideal
More informationDigital Signal Processing System Design: LabVIEW-Based Hybrid Programming Nasser Kehtarnavaz
Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming Nasser Kehtarnavaz Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming by Nasser Kehtarnavaz University
More informationIntroductory Combinatorics
Introductory Combinatorics Third Edition KENNETH P. BOGART Dartmouth College,. " A Harcourt Science and Technology Company San Diego San Francisco New York Boston London Toronto Sydney Tokyo xm CONTENTS
More informationDETERMINISTIC OPERATIONS RESEARCH
DETERMINISTIC OPERATIONS RESEARCH Models and Methods in Optimization Linear DAVID J. RADER, JR. Rose-Hulman Institute of Technology Department of Mathematics Terre Haute, IN WILEY A JOHN WILEY & SONS,
More informationThe Polytope Model: Past, Present, Future
The Polytope Model: Past, Present, Future Paul Feautrier ENS de Lyon Paul.Feautrier@ens-lyon.fr 8 octobre 2009 1 / 39 What is a Model? What is a Polytope? Basis of the Polytope Model Fundamental Algorithms
More informationLinear Programming in Small Dimensions
Linear Programming in Small Dimensions Lekcija 7 sergio.cabello@fmf.uni-lj.si FMF Univerza v Ljubljani Edited from slides by Antoine Vigneron Outline linear programming, motivation and definition one dimensional
More information