F. THOMSON LEIGHTON INTRODUCTION TO PARALLEL ALGORITHMS AND ARCHITECTURES: ARRAYS TREES HYPERCUBES

MORGAN KAUFMANN PUBLISHERS, SAN MATEO, CALIFORNIA

Contents

Preface
  Organization of the Material
  Teaching from the Text
  Exercises and Bibliographic Notes
  Errors
  Preview of Volume II
Acknowledgments
Notation

1 ARRAYS AND TREES
  1.1 Elementary Sorting and Counting
    1.1.1 Sorting on a Linear Array
      Assessing the Performance of the Algorithm
      Sorting N Numbers with Fewer Than N Processors
    1.1.2 Sorting in the Bit Model
    1.1.3 Lower Bounds
    1.1.4 A Counterexample: Counting
    1.1.5 Properties of the Fixed-Connection Network Model
  1.2 Integer Arithmetic
    1.2.1 Carry-Lookahead Addition
    1.2.2 Prefix Computations
      Segmented Prefix Computations
    1.2.3 Carry-Save Addition
    1.2.4 Multiplication and Convolution
    1.2.5 Division and Newton Iteration *
  1.3 Matrix Algorithms
    1.3.1 Elementary Matrix Products
    1.3.2 Algorithms for Triangular Matrices
    1.3.3 Algorithms for Tridiagonal Matrices *
      Odd-Even Reduction
      Parallel Prefix Algorithms
    1.3.4 Gaussian Elimination
    1.3.5 Iterative Methods *
      Jacobi Relaxation
      Gauss-Seidel Relaxation
      Finite Difference Methods
      Multigrid Methods
  1.4 Retiming and Systolic Conversion
    1.4.1 A Motivating Example: Palindrome Recognition
    1.4.2 The Systolic and Semisystolic Models of Computation
    1.4.3 Retiming Semisystolic Networks
    1.4.4 Conversion of a Semisystolic Network into a Systolic Network
    1.4.5 The Special Case of Broadcasting
    1.4.6 Retiming the Host
    1.4.7 Design by Systolic Conversion: A Summary
  1.5 Graph Algorithms
    1.5.1 Transitive Closure
    1.5.2 Connected Components
    1.5.3 Shortest Paths
    1.5.4 Breadth-First Spanning Trees
    1.5.5 Minimum-Weight Spanning Trees
  1.6 Sorting Revisited
    1.6.1 Odd-Even Transposition Sort on a Linear Array
    1.6.2 A Simple $\sqrt{N}(\log N + 1)$-Step Sorting Algorithm
    1.6.3 A $(3\sqrt{N} + o(\sqrt{N}))$-Step Sorting Algorithm *
    1.6.4 A Matching Lower Bound
  1.7 Packet Routing
    1.7.1 Greedy Algorithms
    1.7.2 Average-Case Analysis of Greedy Algorithms
      Routing N Packets to Random Destinations
      Analysis of Dynamic Routing Problems
    1.7.3 Randomized Routing Algorithms
    1.7.4 Deterministic Algorithms with Small Queues
    1.7.5 An Off-line Algorithm
    1.7.6 Other Routing Models and Algorithms
  1.8 Image Analysis and Computational Geometry
    1.8.1 Component-Labelling Algorithms
      Levialdi's Algorithm *
      An $O(\sqrt{N})$-Step Recursive Algorithm
    1.8.2 Computing Hough Transforms
    1.8.3 Nearest-Neighbor Algorithms
    1.8.4 Finding Convex Hulls *
  1.9 Higher-Dimensional Arrays
    1.9.1 Definitions and Properties
    1.9.2 Matrix Multiplication
    1.9.3 Sorting
    1.9.4 Packet Routing
    1.9.5 Simulating High-Dimensional Arrays on Low-Dimensional Arrays
  1.10 Problems
  1.11 Bibliographic Notes

2 MESHES OF TREES
  2.1 The Two-Dimensional Mesh of Trees
    2.1.1 Definition and Properties
    2.1.2 Recursive Decomposition
    2.1.3 Derivation from $K_{N,N}$
    2.1.4 Variations
    2.1.5 Comparison With the Pyramid and Multigrid
  2.2 Elementary $O(\log N)$-Step Algorithms
    2.2.1 Routing
    2.2.2 Sorting
    2.2.3 Matrix-Vector Multiplication
    2.2.4 Jacobi Relaxation
    2.2.5 Pivoting
    2.2.6 Convolution
    2.2.7 Convex Hull *
  2.3 Integer Arithmetic
    2.3.1 Multiplication
    2.3.2 Division and Chinese Remaindering
    2.3.3 Related Problems
      Iterated Products
      Root Finding
  2.4 Matrix Algorithms
    2.4.1 The Three-Dimensional Mesh of Trees
    2.4.2 Matrix Multiplication
    2.4.3 Inverting Lower Triangular Matrices
    2.4.4 Inverting Arbitrary Matrices *
      Csanky's Algorithm
      Inversion by Newton Iteration
    2.4.5 Related Problems
  2.5 Graph Algorithms
    2.5.1 Minimum-Weight Spanning Trees *
    2.5.2 Connected Components
    2.5.3 Transitive Closure
    2.5.4 Shortest Paths
    2.5.5 Matching Problems *
  2.6 Fast Evaluation of Straight-Line Code
    2.6.1 Addition and Multiplication Over a Semiring
    2.6.2 Extension to Codes with Subtraction and Division
    2.6.3 Applications
  2.7 Higher-Dimensional Meshes of Trees
    2.7.1 Definitions and Properties
    2.7.2 The Shuffle-Tree Graph
  2.8 Problems
  2.9 Bibliographic Notes

3 HYPERCUBES AND RELATED NETWORKS
  3.1 The Hypercube
    3.1.1 Definitions and Properties
    3.1.2 Containment of Arrays
      Higher-Dimensional Arrays
      Non-Power-of-2 Arrays
    3.1.3 Containment of Complete Binary Trees
    3.1.4 Embeddings of Arbitrary Binary Trees *
      Embeddings with Dilation 1 and Load $O(\frac{M}{N} + \log N)$
      Embeddings with Dilation $O(1)$ and Load $O(\frac{M}{N} + 1)$
      A Review of One-Error-Correcting Codes *
      Embedding $P_{\log N}$ into $H_{\log N}$
    3.1.5 Containment of Meshes of Trees
    3.1.6 Other Containment Results
  3.2 The Butterfly, Cube-Connected-Cycles, and Benes Network
    3.2.1 Definitions and Properties
    3.2.2 Simulation of Arbitrary Networks *
    3.2.3 Simulation of Normal Hypercube Algorithms *
    3.2.4 Some Containment and Simulation Results
  3.3 The Shuffle-Exchange and de Bruijn Graphs
    3.3.1 Definitions and Properties
    3.3.2 The Diaconis Card Tricks
    3.3.3 Simulation of Normal Hypercube Algorithms
    3.3.4 Similarities with the Butterfly * *
    3.3.5 Some Containment and Simulation Results
  3.4 Packet-Routing Algorithms
    3.4.1 Definitions and Routing Models
    3.4.2 Greedy Routing Algorithms and Worst-Case Problems
      A General Lower Bound for Oblivious Routing *
    3.4.3 Packing, Spreading, and Monotone Routing Problems
      Reducing a Many-to-Many Routing Problem to a Many-to-One Routing Problem
      Reducing a Routing Problem to a Sorting Problem
    3.4.4 The Average-Case Behavior of the Greedy Algorithm
      Bounds on Congestion
      Bounds on Running Time
      Analyzing Non-Predictive Contention-Resolution Protocols
    3.4.5 Converting Worst-Case Routing Problems into Average-Case Routing Problems
      Hashing
      Randomized Routing
    3.4.6 Bounding Queue Sizes
      Routing on Arbitrary Levelled Networks
    3.4.7 Routing with Combining
    3.4.8 The Information Dispersal Approach to Routing
      Using Information Dispersal to Attain Fault-Tolerance
      Finite Fields and Coding Theory
    3.4.9 Circuit-Switching Algorithms
  3.5 Sorting
    3.5.1 Odd-Even Merge Sort
      Constructing a Sorting Circuit with Depth $\log N(\log N + 1)/2$
    3.5.2 Sorting Small Sets *
    3.5.3 A Deterministic $O(\log N \log\log N)$-Step Sorting Algorithm
    3.5.4 Randomized $O(\log N)$-Step Sorting Algorithms *
      A Circuit with Depth $7.45 \log N$ that Usually Sorts
  3.6 Simulating a Parallel Random Access Machine
    3.6.1 PRAM Models and Shared Memories
    3.6.2 Randomized Simulations Based on Hashing
    3.6.3 Deterministic Simulations Using Replicated Data
    3.6.4 Using Information Dispersal to Improve Performance
  3.7 The Fast Fourier Transform
    3.7.1 The Algorithm
    3.7.2 Implementation on the Butterfly and Shuffle-Exchange Graph
    3.7.3 Application to Convolution and Polynomial Arithmetic
    3.7.4 Application to Integer Multiplication
  3.8 Other Hypercubic Networks
    3.8.1 Butterflylike Networks
      The Omega Network
      The Flip Network
      The Baseline and Reverse Baseline Networks
      Banyan and Delta Networks
      k-ary Butterflies
    3.8.2 De Bruijn-Type Networks
      The k-ary de Bruijn Graph
      The Generalized Shuffle-Exchange Graph
  3.9 Problems
  3.10 Bibliographic Notes

BIBLIOGRAPHY

INDEX
  Lemmas, Theorems, and Corollaries
  Author Index
  Subject Index