Loop Tiling for Parallelism


THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

LOOP TILING FOR PARALLELISM

JINGLING XUE
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia

Springer Science+Business Media, LLC

Library of Congress Cataloging-in-Publication

Xue, Jingling,
Loop tiling for parallelism / Jingling Xue.
p. cm. -- (Kluwer international series in engineering and computer science ; SECS 575)
Includes bibliographical references and index.
ISBN ISBN (ebook) DOI /
1. Parallel processing (Electronic computers) 2. Electronic data processing--Distributed processing. 3. Loop tiling (Computer science) I. Title. II. Series.
QA76.58.X '.35--dc

Copyright 2000 Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers, New York in 2000
Softcover reprint of the hardcover 1st edition 2000

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photo-copying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC

Printed on acid-free paper.

Contents

List of Figures
List of Tables
Preface
Acknowledgments

Part I Mathematic Background and Loop Transformation

1. MATHEMATICAL BACKGROUND
1.1 Logic
1.2 Sets
1.3 Arithmetic
1.4 Vectors and Matrices
1.5 Roots of Cubic and Quartic Equations
1.6 Integer Matrices
1.7 Convex Analysis
1.8 Convex Polyhedra
1.9 Convex Cones
1.10 Fourier-Motzkin Elimination
1.11 Further Reading

2. NONSINGULAR TRANSFORMATIONS AND PERMUTABILITY
Perfectly Nested Loops
Dependence Vectors and Their Polyhedra
Iteration-Reordering Transformations
Fully Permutable Loop Nests
Further Reading

Part II Tiling as a Loop Transformation

3. RECTANGULAR TILING
3.1 Modeling Rectangular Tiling
Legality Test
Tile Dependences and Tile Space Graph
Tiled Code
Tiling-Related Transformations
Yet Another Tiling Model
Further Reading

4. PARALLELEPIPED TILING
Why Parallelepiped Tiling?
Legality Test
Tile Dependences and Tile Space Graph
Tiled Code
Decomposition of Parallelepiped Tiling
Yet Another Tiling Model
Loop Partitioning v.s. Loop Tiling
Further Reading

Part III Tiling for Distributed-Memory Machines

5. SPMD CODE GENERATION
Background
Machine Model
Computation Distribution
Data Distribution
Code Generation
Message-Passing Code Generation
Memory Management
SPMD Code in Local Address Space
SPMD Code for Parallelepiped Tiling
Experiments
Further Reading

6. COMMUNICATION-MINIMAL TILING
Computation & Communication Volumes
Problem Formulation
Closed-Form Optimal Tilings
All Extremal-Ray Optimal Tilings
Making $H^{-1}$ Integral
Further Reading

7. TIME-MINIMAL TILING
Parallelogram Tiling
Executing Tiles in the SPMD Paradigm
7.3 Computation and Communication Models
7.4 Rise
7.5 Optimal Tile Size
7.6 Experiments
7.7 Further Reading

Bibliography
Index

List of Figures

1.1 Plots of two cubic functions
Plot of a quartic function
Integer lattices
Convex and affine hulls
The unique minimum of a strictly convex function
Convex polyhedra
Sets that are not convex cones
Convex cones
The dual cones of the convex cones in Figure
An algorithm for finding the lines and rays of a cone
Fourier-Motzkin elimination and its inexactness in $\mathbb{Z}^n$
Using Fourier-Motzkin elimination to scan a polytope
The iteration space of Example
The iteration space graphs for Example
An algorithm for constructing the dependence polyhedron
The dependence polyhedra for four dependence vectors
Transformation of a distance vector by an injective mapping
One-to-many transformation of distance vectors
Transformation of a distance vector by a linear transformation
The transformed iteration spaces of Example
Transformations of dependence polyhedra
Approximating the dependence polyhedra in Figure
An algorithm for splitting a dependence vector J
Finding a unimodular canonical transformation
Examples of rectangular tilings
Tile origins
x 2 rectangular tiling of a double loop
A non-convex tile space
3.5 Tile offset v.s. loop normalisation
Transformations of distance vectors
Legal tiling for a non-permutable double loop
Two examples tiled by the same 2 x 2 tiling
Tile space graph
Sequential tiled code in the γ model
Examples of 2-D loop nests
Tile codes for example loop nests in the γ model
Strip-mining and tiling
Loop skewing and rectangular tiling
Two rectangular tiling models compared
Sequential tiled code in the γ model
Tile codes for example loop nests in the γ model
A parallelogram tiling for Example
Non-identical tiles under a "n~tegral" tiling transformation
Approximation of pp(d) by Pp(d)
Sequential tiled code in the p model
Decomposition of the parallelogram tiling in Figure
Sequential tiled code in the p model
Transformed loop nest for loop partitioning
SPMD code generation
A running example
Sequential tiled code
SPMD code after computation distribution
Local data spaces for Figure
The data owned by the host for Figure
I/O code for receiving read-only data
Communication sets for Example
Two message-passing code sections for Figure
Simplified message-passing code
SPMD program with communication code
Local memory allocation for the running example
Translation of global and local loop indices
Local array indices for the running example
SPMD code in local address space
Normalised SPMD code in local address space
SPMD program for running example in local address space
Iteration spaces for SOR
Computation distribution for SOR
Impact of tile size
Computation and communication overlap
Tiling, communication and validity test overheads
5.23 Impact of tile shape
The ISG and its 4 x 4 tiling for Example
Dependence and time cones for Example
Approximation of the communication volume of a tile
Nonlocal data accessed by a tile for Example
Dependence and time cones for Example
Optimal tiling when D = I 2x
Optimal tiling when D = (d~, d-;) E 7l. 2X
Optimal tiling when $D \in \mathbb{Z}^{2 \times m}$
The dependence cone for Example
The dependence cone for Example
A procedure for finding all extremal-ray optimal tilings
Average run time of OptComTiling
A geometric interpretation of Hl in Example
Tiling of a parallelogram-shaped iteration space
The legality of a parallelogram tiling
Cyclic tile distribution over P = 4 processors
Pipelining of non-constant dependences
Communication cost model
Rise for a tiled parallelogram iteration space
Three types of rise values
Derivation of execution time when rise r <
Solution spaces and two separating constraints
B1(w) is above the right boundary of F
B1(w) intersects the right boundary of F
A sketch of the proof for Lemma
B1(w) is below the right boundary of F
A sketch of the proof for Lemma
Derivation of execution time when rise r =
A sketch of the proof for Lemma
Derivation of execution time when rise r >
Zigzag path when swp > H
An algorithm for finding optimal tile size when r >
Fidle. Ffree and C2(w, h) =
Communication parameters for AP
x 2 rectangular tiling of 5-point SOR
Performance of 5-point SOR on 10 processors
Performance of 5-point SOR on 50 processors
Performance results for several values of P
x 2 parallelogram-shaped tiling of 3-point SOR
Performance of 3-point SOR on 10 processors
Performance of 3-point SOR on 50 processors
7.29 Performance of 3-point SOR on 100 processors
7.30 Performance of 3-point SOR on 61 processors
7.31 Performance of 3-point SOR on 50 processors
7.32 Plots of hi = h(wi) and hf = Fh(Wf)

List of Tables

Shorthand notations for direction values
A list of representative loop transformations

Preface

Techniques for constructing restructuring compilers for parallel machines have been developed over the past three decades. Some of these techniques are introduced in Hans Zima's book on Supercompilers for Parallel and Vector Computers, Utpal Banerjee's book series on Loop Transformations for Restructuring Compilers, Michael Wolfe's book on High Performance Compilers for Parallel Computing, and recently, the book on Scheduling and Automatic Parallelization co-authored by Darte, Robert and Vivien.

When optimising the performance of scientific and engineering programs, the most gains come from optimising nested loops or recursive procedures, where major chunks of computation are performed repeatedly. A large number of loop transformations have been accumulated over the years, and some of these can be found in research and production compilers. Loop tiling, originally promoted by Francois Irigoin and Michael Wolfe, is one of the most important iteration-reordering loop transformations. Loop tiling is beneficial for both parallel machines and uniprocessors with multiple levels of cache. Together with other transformations such as loop distribution and loop fusion, loop tiling can reduce communication and synchronisation cost, maximise parallelism and improve memory hierarchy performance.

Over the last few years, much research effort has been focused on exploring the use of loop tiling to maximise parallelism for parallel machines or otherwise improve cache locality. Optimising for cache locality has become critically important for performance. Several research groups around the world are actively working on tackling this problem. Although progress has been made, much remains to be done. Therefore, the use of loop tiling for locality optimisations is not covered. As a consequence, some related publications are not cited in the reference list. However, the first two parts of the book provide the basic foundation useful for the general loop tiling technique.

This book explores the use of loop tiling for minimising synchronisation and communication cost and maximising parallelism for parallel machines. The book is organised into three parts.

The first part, consisting of Chapters 1 and 2, provides the general mathematical background and introduces a theory of nonsingular loop transformations. Chapter 1 describes the basic mathematical concepts and tools necessary for an understanding of the subject, with a particular emphasis on convex cones. Convex cones will be used throughout the book for addressing a number of important problems, including data dependence abstraction, loop permutability, legality test, and tile size and shape selection. Our treatment of nonsingular loop transformations in Chapter 2 serves to set up the context in which other iteration-reordering transformations such as loop tiling can be further developed. In particular, this chapter discusses data dependences, introduces the legality test and code generation required for a nonsingular transformation, and relates the full permutability of a loop nest to the degree of parallelism and locality inherent in the loop nest.

The second part, consisting of Chapters 3 and 4, deals with both rectangular and parallelepiped tiling. Tiling is discussed in terms of its effects on the data dependences and the required dependence test for legality. Unlike nonsingular loop transformations, the exact test for the legality of a tiling requires the knowledge of both the data dependences and the extent and shape of the iteration space and can be solved, in principle, by integer programming. For realistic tiling cases, efficient legality tests based on the data dependence information alone are described. This part also discusses the generation of tiled code and exposes the duality between loop tiling and loop partitioning.

The last part, consisting of Chapters 5-7, focuses on minimising the execution time of a loop nest on a distributed memory machine. Chapter 5 describes a suite of compiler techniques for generating an SPMD program to execute a tiled iteration space. Chapter 6 addresses an interesting problem of determining the best tile shape to minimise inter-tile communication once the tile size is given. The solution to this problem provides insights for understanding various tiling-related problems. Chapter 7 deals with the problem of finding the best tile size for a double loop once the tile shape is more or less given.

The techniques presented in the last part can be adapted to work for a cluster of workstations, except that tiles of varying sizes and a more sophisticated cost model may be needed to cope with heterogeneity present at all levels of network, processor and program. The techniques presented in the last part are also directly applicable to shared-memory machines once the machines are modeled as BSP (Bulk Synchronous Parallel) machines. In the case of the SPMD code generation, the send and receive calls can be replaced with an appropriate synchronisation mechanism.

Each chapter includes a "Further Reading" section that contains citations to the original material in the reference list.

JINGLING XUE

To my wife, Lili

Acknowledgments

The author would like to thank all those who gave their own time and effort in the making of this book. Francois Irigoin of Ecole des Mines de Paris found time in his busy schedule to provide insightful and critical comments to my questions. The other reviewers of the book include Peizong Lee of Academia Sinica, Taiwan, Zhiyuan Li of Purdue University, Yves Robert of Laboratoire de l'informatique du Parallelisme at Lyon and Peiyi Tang of University of Southern Queensland. I am very grateful to all these reviewers for encouraging me to write this book and for giving a number of suggestions. Alain Darte of Laboratoire de l'informatique du Parallelisme at Lyon read Chapter 1 very carefully, found errors and gave several suggestions.

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE Yau-Tsun Steven Li Monterey Design Systems, Inc. Sharad Malik Princeton University ~. " SPRINGER

More information

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs Vaughn Betz Jonathan Rose Alexander Marquardt

More information

Communication-Minimal Tiling of Uniform Dependence Loops

Communication-Minimal Tiling of Uniform Dependence Loops Communication-Minimal Tiling of Uniform Dependence Loops Jingling Xue Department of Mathematics, Statistics and Computing Science University of New England, Armidale 2351, Australia Abstract. Tiling is

More information

Topological Structure and Analysis of Interconnection Networks

Topological Structure and Analysis of Interconnection Networks Topological Structure and Analysis of Interconnection Networks Network Theory and Applications Volume 7 Managing Editors: Ding-Zhu Du, University of Minnesota, U.S.A. and Cauligi Raghavendra, University

More information

THEORY OF LINEAR AND INTEGER PROGRAMMING

THEORY OF LINEAR AND INTEGER PROGRAMMING THEORY OF LINEAR AND INTEGER PROGRAMMING ALEXANDER SCHRIJVER Centrum voor Wiskunde en Informatica, Amsterdam A Wiley-Inter science Publication JOHN WILEY & SONS^ Chichester New York Weinheim Brisbane Singapore

More information

LARGE SCALE LINEAR AND INTEGER OPTIMIZATION: A UNIFIED APPROACH

LARGE SCALE LINEAR AND INTEGER OPTIMIZATION: A UNIFIED APPROACH LARGE SCALE LINEAR AND INTEGER OPTIMIZATION: A UNIFIED APPROACH Richard Kipp Martin Graduate School of Business University of Chicago % Kluwer Academic Publishers Boston/Dordrecht/London CONTENTS Preface

More information

THE VERILOG? HARDWARE DESCRIPTION LANGUAGE

THE VERILOG? HARDWARE DESCRIPTION LANGUAGE THE VERILOG? HARDWARE DESCRIPTION LANGUAGE THE VERILOGf HARDWARE DESCRIPTION LANGUAGE by Donald E. Thomas Carnegie Mellon University and Philip R. Moorby Cadence Design Systems, Inc. SPRINGER SCIENCE+BUSINESS

More information

MULTIMEDIA DATABASE MANAGEMENT SYSTEMS

MULTIMEDIA DATABASE MANAGEMENT SYSTEMS MULTIMEDIA DATABASE MANAGEMENT SYSTEMS THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE MULTIMEDIA SYSTEMS AND APPLICATIONS Recently Published Titles: Consulting Editor Borko Furht Florida

More information

Structured Parallel Programming Patterns for Efficient Computation

Structured Parallel Programming Patterns for Efficient Computation Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Symbolic Evaluation of Sums for Parallelising Compilers

Symbolic Evaluation of Sums for Parallelising Compilers Symbolic Evaluation of Sums for Parallelising Compilers Rizos Sakellariou Department of Computer Science University of Manchester Oxford Road Manchester M13 9PL United Kingdom e-mail: rizos@csmanacuk Keywords:

More information

Structured Parallel Programming

Structured Parallel Programming Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Loop Transformations, Dependences, and Parallelization

Loop Transformations, Dependences, and Parallelization Loop Transformations, Dependences, and Parallelization Announcements HW3 is due Wednesday February 15th Today HW3 intro Unimodular framework rehash with edits Skewing Smith-Waterman (the fix is in!), composing

More information

SPECC: SPECIFICATION LANGUAGE AND METHODOLOGY

SPECC: SPECIFICATION LANGUAGE AND METHODOLOGY SPECC: SPECIFICATION LANGUAGE AND METHODOLOGY SPECC: SPECIFICATION LANGUAGE AND METHODOLOGY Daniel D. Gajski Jianwen Zhu Rainer Dömer Andreas Gerstlauer Shuqing Zhao University of California, Irvine SPRINGER

More information

ASSIGNMENT PROBLEMS IN PARALLEL AND DISTRIBUTED COMPUTING

ASSIGNMENT PROBLEMS IN PARALLEL AND DISTRIBUTED COMPUTING ASSIGNMENT PROBLEMS IN PARALLEL AND DISTRIBUTED COMPUTING THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE PARALLEL PROCESSING AND FIFTH GENERATION COMPUTING Consulting Editor Doug DeGroot

More information

Yves Nievergelt. Wavelets Made Easy. Springer Science+Business Media, LLC

Yves Nievergelt. Wavelets Made Easy. Springer Science+Business Media, LLC Wavelets Made Easy Yves Nievergelt Wavelets Made Easy Springer Science+Business Media, LLC Yves Nievergelt Department of Mathematics Eastem Washington University Cheney, WA 99004-2431 USA Library of Congress

More information

A Course in Convexity

A Course in Convexity A Course in Convexity Alexander Barvinok Graduate Studies in Mathematics Volume 54 American Mathematical Society Providence, Rhode Island Preface vii Chapter I. Convex Sets at Large 1 1. Convex Sets. Main

More information

Linear Programming: Mathematics, Theory and Algorithms

Linear Programming: Mathematics, Theory and Algorithms Linear Programming: Mathematics, Theory and Algorithms Applied Optimization Volume 2 The titles published in this series are listed at the end of this volume. Linear Programming: Mathematics, Theory and

More information

MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING

MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING The Kluwer International Series on ADVANCES IN DATABASE SYSTEMS Series Editor Ahmed K. Elmagarmid Purdue University West Lafayette, IN 47907 Other books

More information

Fundamentals of Operating Systems. Fifth Edition

Fundamentals of Operating Systems. Fifth Edition Fundamentals of Operating Systems Fifth Edition Fundamentals of Operating Systems A.M. Lister University of Queensland R. D. Eager University of Kent at Canterbury Fifth Edition Springer Science+Business

More information

High-Performance Parallel Database Processing and Grid Databases

High-Performance Parallel Database Processing and Grid Databases High-Performance Parallel Database Processing and Grid Databases David Taniar Monash University, Australia Clement H.C. Leung Hong Kong Baptist University and Victoria University, Australia Wenny Rahayu

More information

WIRELESS ATM AND AD-HOC NETWORKS. Protocols and Architectures

WIRELESS ATM AND AD-HOC NETWORKS. Protocols and Architectures WIRELESS ATM AND AD-HOC NETWORKS Protocols and Architectures WIRELESS ATM AND AD-HOC NETWORKS Protocols and Architectures C-K Toh, Ph.D. University of Cambridge Cambridge, United Kingdom SPRINGER-SCIENCE+BUSINESS

More information

TASK SCHEDULING FOR PARALLEL SYSTEMS

TASK SCHEDULING FOR PARALLEL SYSTEMS TASK SCHEDULING FOR PARALLEL SYSTEMS Oliver Sinnen Department of Electrical and Computer Engineering The University of Aukland New Zealand TASK SCHEDULING FOR PARALLEL SYSTEMS TASK SCHEDULING FOR PARALLEL

More information

A Structured Programming Approach to Data

A Structured Programming Approach to Data A Structured Programming Approach to Data Derek Coleman A Structured Programming Approach to Data Springer-Verlag New York Derek Coleman Department of Computation Institute of Science Technology University

More information

Convex Analysis and Minimization Algorithms I

Convex Analysis and Minimization Algorithms I Jean-Baptiste Hiriart-Urruty Claude Lemarechal Convex Analysis and Minimization Algorithms I Fundamentals With 113 Figures Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona

More information

Graphics Programming in c++

Graphics Programming in c++ Graphics Programming in c++ Springer London Berlin Heidelberg New York Barcelona Budapest Hong Kong Milan Paris Santa Clara Singapore Tokyo Mark Walmsley Graphics Programming in c++ Writing Graphics Applications

More information

Computing and Informatics, Vol. 36, 2017, , doi: /cai

Computing and Informatics, Vol. 36, 2017, , doi: /cai Computing and Informatics, Vol. 36, 2017, 566 596, doi: 10.4149/cai 2017 3 566 NESTED-LOOPS TILING FOR PARALLELIZATION AND LOCALITY OPTIMIZATION Saeed Parsa, Mohammad Hamzei Department of Computer Engineering

More information

Data Mining for Association Rules and Sequential Patterns

Data Mining for Association Rules and Sequential Patterns Data Mining for Association Rules and Sequential Patterns Springer-Science+Business Media, LLC Jean-Marc Adamo Data Mining for Association Rules and Sequential Patterns Sequential and Parallel Algorithms

More information

PARALLEL, OBJECT -ORIENTED, AND ACTIVE KNOWLEDGE BASE SYSTEMS

PARALLEL, OBJECT -ORIENTED, AND ACTIVE KNOWLEDGE BASE SYSTEMS PARALLEL, OBJECT -ORIENTED, AND ACTIVE KNOWLEDGE BASE SYSTEMS The Kluwer International Series on ADVANCES IN DATABASE SYSTEMS Series Editor Ahmed K. Elmagarmid Purdue University West Lafayette, IN 47907

More information

GEOMETRIC TOOLS FOR COMPUTER GRAPHICS

GEOMETRIC TOOLS FOR COMPUTER GRAPHICS GEOMETRIC TOOLS FOR COMPUTER GRAPHICS PHILIP J. SCHNEIDER DAVID H. EBERLY MORGAN KAUFMANN PUBLISHERS A N I M P R I N T O F E L S E V I E R S C I E N C E A M S T E R D A M B O S T O N L O N D O N N E W

More information

PARALLEL ARCHITECTURES AND PARALLEL ALGORITHMS FOR INTEGRATED VISION SYSTEMS

PARALLEL ARCHITECTURES AND PARALLEL ALGORITHMS FOR INTEGRATED VISION SYSTEMS PARALLEL ARCHITECTURES AND PARALLEL ALGORITHMS FOR INTEGRATED VISION SYSTEMS THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ROBOTICS: VISION, MANIPULATION AND SENSORS Consulting Editor:

More information

RETARGETABLE CODE GENERATION FOR DIGITAL SIGNAL PROCESSORS

RETARGETABLE CODE GENERATION FOR DIGITAL SIGNAL PROCESSORS RETARGETABLE CODE GENERATION FOR DIGITAL SIGNAL PROCESSORS RETARGETABLE CODE GENERATION FOR DIGITAL SIGNAL PROCESSORS Rainer LEUPERS University of Dortmund Department of Computer Science Dortmund, Germany

More information

Legal and impossible dependences

Legal and impossible dependences Transformations and Dependences 1 operations, column Fourier-Motzkin elimination us use these tools to determine (i) legality of permutation and Let generation of transformed code. (ii) Recall: Polyhedral

More information

Computer Architecture

Computer Architecture Computer Architecture Springer-Verlag Berlin Heidelberg GmbH Silvia M. Mueller Wolfgang J. Paul Computer Architecture Complexity and Correctness With 214 Figures and 185 Tables Springer Silvia Melitta

More information

Automatic Parallel Code Generation for Tiled Nested Loops

Automatic Parallel Code Generation for Tiled Nested Loops 2004 ACM Symposium on Applied Computing Automatic Parallel Code Generation for Tiled Nested Loops Georgios Goumas, Nikolaos Drosinos, Maria Athanasaki, Nectarios Koziris National Technical University of

More information

VERILOG QUICKSTART. James M. Lee Cadence Design Systems, Inc. SPRINGER SCIENCE+BUSINESS MEDIA, LLC

VERILOG QUICKSTART. James M. Lee Cadence Design Systems, Inc. SPRINGER SCIENCE+BUSINESS MEDIA, LLC VERILOG QUICKSTART VERILOG QUICKSTART by James M. Lee Cadence Design Systems, Inc. ~. " SPRINGER SCIENCE+BUSINESS MEDIA, LLC ISBN 978-1-4613-7801-3 ISBN 978-1-4615-6113-2 (ebook) DOI 10.1007/978-1-4615-6113-2

More information

Algorithms and Parallel Computing

Algorithms and Parallel Computing Algorithms and Parallel Computing Algorithms and Parallel Computing Fayez Gebali University of Victoria, Victoria, BC A John Wiley & Sons, Inc., Publication Copyright 2011 by John Wiley & Sons, Inc. All

More information

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited. page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5

More information

The Automatic Design of Batch Processing Systems

The Automatic Design of Batch Processing Systems The Automatic Design of Batch Processing Systems by Barry Dwyer, M.A., D.A.E., Grad.Dip. A thesis submitted for the degree of Doctor of Philosophy in the Department of Computer Science University of Adelaide

More information

Polyhedral Compilation Foundations

Polyhedral Compilation Foundations Polyhedral Compilation Foundations Louis-Noël Pouchet pouchet@cse.ohio-state.edu Dept. of Computer Science and Engineering, the Ohio State University Feb 22, 2010 888.11, Class #5 Introduction: Polyhedral

More information

Affine and Unimodular Transformations for Non-Uniform Nested Loops

Affine and Unimodular Transformations for Non-Uniform Nested Loops th WSEAS International Conference on COMPUTERS, Heraklion, Greece, July 3-, 008 Affine and Unimodular Transformations for Non-Uniform Nested Loops FAWZY A. TORKEY, AFAF A. SALAH, NAHED M. EL DESOUKY and

More information

Linear Loop Transformations for Locality Enhancement

Linear Loop Transformations for Locality Enhancement Linear Loop Transformations for Locality Enhancement 1 Story so far Cache performance can be improved by tiling and permutation Permutation of perfectly nested loop can be modeled as a linear transformation

More information

LooPo: Automatic Loop Parallelization

LooPo: Automatic Loop Parallelization LooPo: Automatic Loop Parallelization Michael Claßen Fakultät für Informatik und Mathematik Düsseldorf, November 27 th 2008 Model-Based Loop Transformations model-based approach: map source code to an

More information

I = 4+I, I = 1, I 4

I = 4+I, I = 1, I 4 On Reducing Overhead in Loops Peter M.W. Knijnenburg Aart J.C. Bik High Performance Computing Division, Dept. of Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, the Netherlands. E-mail:

More information

HIGH-LEVEL SYNTHESIS FOR REAL-TIME DIGITAL SIGNAL PROCESSING

HIGH-LEVEL SYNTHESIS FOR REAL-TIME DIGITAL SIGNAL PROCESSING HIGH-LEVEL SYNTHESIS FOR REAL-TIME DIGITAL SIGNAL PROCESSING THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE VLSI, COMPUTER ARCHITECfURE AND DIGITAL SIGNAL PROCESSING Latest Titles

More information

of Convex Analysis Fundamentals Jean-Baptiste Hiriart-Urruty Claude Lemarechal Springer With 66 Figures

of Convex Analysis Fundamentals Jean-Baptiste Hiriart-Urruty Claude Lemarechal Springer With 66 Figures 2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. Jean-Baptiste Hiriart-Urruty Claude Lemarechal Fundamentals of Convex

More information

Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time

Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen and Nicolas Vasilache ALCHEMY, INRIA Futurs / University of Paris-Sud XI March

More information

Compiling for Advanced Architectures

Compiling for Advanced Architectures Compiling for Advanced Architectures In this lecture, we will concentrate on compilation issues for compiling scientific codes Typically, scientific codes Use arrays as their main data structures Have

More information

A Bibliography of Publications of Jingling Xue

A Bibliography of Publications of Jingling Xue A Bibliography of Publications of Jingling Xue Jingling Xue Department of Mathematics, Statistics and Computing Science Armidale, NSW 2351 Australia Tel: +61 67 73 3149 FAX: +61 67 73 3312 E-mail: xue@neumann.une.edu.au

More information

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ONTOLOGY LEARNING FOR THE SEMANTIC WEB ONTOLOGY LEARNING FOR THE SEMANTIC WEB by Alexander Maedche University of Karlsruhe, Germany SPRINGER

More information

INTRUSION DETECTION IN DISTRIBUTED SYSTEMS An Abstraction-Based Approach
