Hierarchical Scheduling in Parallel and Cluster Systems



SERIES IN COMPUTER SCIENCE

Series Editor: Rami G. Melhem
University of Pittsburgh
Pittsburgh, Pennsylvania

ENGINEERING ELECTRONIC NEGOTIATIONS
A Guide to Electronic Negotiation Technologies for the Design and Implementation of Next-Generation Electronic Markets: Future Silkroads of eCommerce
Michael Strobel

HIERARCHICAL SCHEDULING IN PARALLEL AND CLUSTER SYSTEMS
Sivarama Dandamudi

INTRODUCTION TO PARALLEL PROCESSING
Algorithms and Architectures
Behrooz Parhami

OBJECT-ORIENTED DISCRETE-EVENT SIMULATION WITH JAVA
A Practical Introduction
José M. Garrido

A PARALLEL ALGORITHM SYNTHESIS PROCEDURE FOR HIGH PERFORMANCE COMPUTER ARCHITECTURES
Ian N. Dunn and Gerard G. L. Meyer

PERFORMANCE MODELING OF OPERATING SYSTEMS USING OBJECT-ORIENTED SIMULATION
A Practical Introduction
José M. Garrido

POWER AWARE COMPUTING
Edited by Robert Graybill and Rami Melhem

THE STRUCTURAL THEORY OF PROBABILITY
New Ideas from Computer Science on the Ancient Problem of Probability Interpretation
Paolo Rocchi

Hierarchical Scheduling in Parallel and Cluster Systems

Sivarama Dandamudi
Carleton University
Ottawa, Ontario, Canada

Springer Science+Business Media, LLC

Library of Congress Cataloging-in-Publication Data

Dandamudi, Sivarama P.
Hierarchical scheduling in parallel and cluster systems / Sivarama Dandamudi.
p. cm. - (Series in computer science)
Includes bibliographical references and index.
ISBN    ISBN (ebook)    DOI
1. Parallel processing (Electronic computers). 2. Computer architecture. 3. Electronic data processing - Distributed processing. I. Title. II. Series in computer science (Springer Science+Business Media, LLC)
QA76.58.D '.35-dc

© Springer Science+Business Media New York
Originally published by Kluwer Academic / Plenum Publishers in 2003
Softcover reprint of the hardcover 1st edition

A C.I.P. record for this book is available from the Library of Congress.

All rights reserved

No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.

Permissions for books published in Europe: permissions@wkap.nl
Permissions for books published in the United States of America: permissions@wkap.com

To my parents, Subba Rao and Prameela Rani,
my wife, Sobha,
and my daughter, Veda

Preface

Multiple processor systems are an important class of parallel systems. Over the years, several architectures have been proposed to build such systems to satisfy the requirements of high-performance computing. These architectures span a wide variety of system types. At the low end of the spectrum, we can build a small, shared-memory parallel system with tens of processors. These systems typically use a bus to interconnect the processors and memory. Such systems, for example, are becoming commonplace in high-performance graphics workstations. They are called uniform memory access (UMA) multiprocessors because they provide all processors with uniform access to memory. These systems provide a single address space, which is preferred by programmers. This architecture, however, cannot be extended even to medium-sized systems with hundreds of processors because of bus bandwidth limitations.

To scale systems to the medium range, i.e., to hundreds of processors, non-bus interconnection networks have been proposed. These systems, for example, use a multistage dynamic interconnection network. Such systems also provide global, shared memory like the UMA systems. However, they introduce local and remote memories, which lead to the non-uniform memory access (NUMA) architecture.

The distributed-memory architecture is used for systems with thousands of processors. These systems differ from the shared-memory architectures in that there is no globally accessible shared memory. Instead, they use message passing to facilitate communication among the processors. As a result, they do not provide a single address space. The architecture of a distributed-memory system is remarkably close to that of a network of workstations, or a workstation cluster. There are some significant differences between the two in the kind of hardware used. For example, distributed-memory systems such as the Cray T3E use a high-bandwidth, low-latency interconnect. However, cluster systems offer a significant cost advantage. As a result, they are becoming increasingly popular for high-performance computing.
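Since message passing is central to distributed-memory and cluster systems, a minimal sketch may help fix the idea. The example below uses MPI, one of the communication libraries surveyed in Chapter 2; it is illustrative only and assumes an MPI installation running at least two processes.

/* Minimal message-passing sketch: process 0 sends an integer to process 1.
 * Compile with mpicc and run with, e.g., mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;   /* no shared memory: data must be sent explicitly */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("process 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Even this tiny example shows the essential contrast with the shared-memory model: data moves only through explicit send and receive operations, never through a common address space.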

In this book, we are interested in parallel systems as well as cluster systems. From the hardware point of view, it is relatively straightforward to build large parallel systems with thousands of processors, and such systems are becoming economically viable as well. However, managing system resources in such large systems is very complex. In this book, we look at the job scheduling problem in parallel and cluster systems. Parallel job scheduling has been studied extensively over the last two decades. Initial studies focused on small UMA architectures; more recent interest is in cluster systems. A job scheduling policy that works effectively for small UMA systems might not work for large distributed-memory systems with thousands of processors. Thus, scalability is an important characteristic of a scheduling policy if it is to be used in large distributed-memory systems. In this book we present a hierarchical scheduling policy that scales well with system size. This policy is based on the hierarchical task queue organization we introduced to organize the system run queue.

The book is divided into four parts. Part I consists of the first three chapters. This part gives an introduction to parallel and cluster systems and surveys the parallel job scheduling policies proposed in the literature. Part II, comprising Chapters 4 to 6, gives details about our hierarchical task queue organization and its performance. We demonstrate that this organization scales well, which makes it suitable for systems with hundreds to thousands of processors. In Part III we use this task queue organization as the basis to devise hierarchical scheduling policies for parallel and cluster systems. Chapter 7 gives details on the hierarchical policy for shared-memory systems. The next two chapters describe how the hierarchical policy can be adapted to distributed-memory systems and cluster systems. These three chapters show that the hierarchical policy provides substantial performance advantages over other policies proposed in the literature. Finally, Part IV concludes the book with a brief summary and concluding remarks.
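To give a flavor of the hierarchical task queue organization studied in Part II, the sketch below models it as a tree of task queues: tasks enter at the root, and an idle processor takes work from its leaf queue, which, when empty, refills itself by pulling a batch of tasks (the transfer factor Tr) from its parent. This is an illustrative toy, not the implementation evaluated in the book; task counts stand in for real task lists, each internal node would have B children (the branching factor), and the sketch follows a single root-to-leaf path.

/* Toy hierarchical task queue: a chain of queues from root to leaf.
 * An empty queue refills by transferring up to TR tasks from its parent. */
#include <stdio.h>

#define TR 2                        /* transfer factor: tasks moved per refill */

typedef struct Queue {
    int tasks;                      /* a count stands in for a real task list */
    struct Queue *parent;           /* NULL at the root */
} Queue;

/* Pull up to TR tasks from the parent; cascade the request toward the root. */
static void refill(Queue *q)
{
    if (q->parent == NULL)
        return;                     /* root: nothing above to pull from */
    if (q->parent->tasks == 0)
        refill(q->parent);
    while (q->parent->tasks > 0 && q->tasks < TR) {
        q->parent->tasks--;
        q->tasks++;
    }
}

/* Called by an idle processor on its leaf queue; returns 1 if a task was run. */
static int take_task(Queue *leaf)
{
    if (leaf->tasks == 0)
        refill(leaf);
    if (leaf->tasks == 0)
        return 0;                   /* all queues on this path are empty */
    leaf->tasks--;
    return 1;
}

int main(void)
{
    Queue root = { 8, NULL };       /* a job with 8 tasks arrives at the root */
    Queue mid  = { 0, &root };
    Queue leaf = { 0, &mid };

    while (take_task(&leaf))
        printf("processor runs one task\n");
    return 0;
}

Because tasks move down the tree in batches, most accesses hit a processor's own leaf queue, so contention on the shared root queue grows far more slowly with system size than in a centralized organization; this is the intuition behind the scalability results of Part II.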

Acknowledgments

First and foremost, I would like to thank my wife Sobha and my daughter Veda for enduring my preoccupation with this project during the evenings and weekends. This book draws upon the research we did as part of our parallel scheduling project. Over the past eight years, several students have worked on this project for their theses. I would like to thank the following students for their contributions to some of the results presented in this book: Jemal Abawajy, Terrence Au, Samir Ayachi, Philip Cheng, Thyagaraj Thanalapati, Hai Yu, and Zhengao Zhou.

Thanks are also due to Prof. Rami Melhem of the University of Pittsburgh for inviting me to write this monograph. I also thank Ana Bozicevic, Editor at Kluwer Academic Publishers, for following up the proposal with enthusiasm. My sincere appreciation goes to the School of Computer Science and Carleton University for supporting our parallel scheduling project. I gratefully acknowledge the financial support the project received from the Natural Sciences and Engineering Research Council of Canada.

SIVARAMA DANDAMUDI

Contents

List of Figures
List of Tables

PART I: Background

1. INTRODUCTION
   Why Parallel Processing?
   Parallel Architectures
      SIMD Systems
      MIMD Systems
   Job Scheduling
   Software Architectures
   Overview of the Monograph

2. PARALLEL AND CLUSTER SYSTEMS
   Introduction
   Parallel Architectures
      UMA Systems
      NUMA Systems
      Distributed-Memory Systems
      Distributed Shared Memory
   Example Parallel Systems
      IBM SP2 System
      Stanford DASH System
      ASCI Systems
   Interconnection Networks
      Dynamic Interconnection Networks
      Static Interconnection Networks
   Interprocess Communication
      PVM
      MPI
      TreadMarks
   Cluster Systems
      Beowulf
   Summary

3. PARALLEL JOB SCHEDULING
   Introduction
   Parallel Program Structures
      Fork-and-Join Programs
      Divide-and-Conquer Programs
      Matrix Factorization Programs
   Task Queue Organizations
      Basic Task Queue Organizations
      Improving Centralized Organization
      Improving Distributed Organization
   Scheduling Policies
      Space-Sharing Policies
         Static Policies
         Dynamic Policies
      An Example Space-Sharing Policy
         Adaptive Space-Sharing Policy
         A Modification
         An Improvement
         Performance Comparison
         Performance Comparison
         Handling Heterogeneity
      Time-Sharing Policies
      Hybrid Policies
   Example Policies
      IBM SP2
      ASCI Blue-Pacific
      Portable Batch System
   Summary

PART II: Hierarchical Task Queue Organization

4. HIERARCHICAL TASK QUEUE ORGANIZATION
   Motivation
   Hierarchical Organization
   Workload and System Models
   Performance Analysis
      Queue Access Overhead
      Utilization Analysis
         Centralized Organization
         Distributed Organization
         Hierarchical Organization
      Contention Analysis
         Centralized Organization
         Distributed Organization
         Hierarchical Organization
   Performance Comparison
      Impact of Access Contention
      Effect of Number of Tasks
      Sensitivity to Service Time Variance
      Impact of System Size
      Influence of Branching and Transfer Factors
   Performance of Dynamic Task Removal Policies
   Summary

5. PERFORMANCE OF SCHEDULING POLICIES
   Introduction
   Performance of Job Scheduling Policies
      Policies
      Results
         Performance Sensitivity to System Load
         Sensitivity to Task Service Time Variance
         Sensitivity to Variance in Task Distribution
   Performance of Task Scheduling Policies
      Task Scheduling Policies
      Results and Discussion
         Principal Comparison
         Impact of Variance in Task Service Time
         Impact of Variance in Task Distribution
         Effect of Window Size
         Sensitivity to Other Parameters
   Conclusions

6. PERFORMANCE WITH SYNCHRONIZATION WORKLOADS
   Introduction
   Related Work
   System and Workload Models
   Spinning and Blocking Policies
      Spinning Policy
      Blocking Policies
   Lock Accessing Workload Results
      Workload Model
      Simulation Results
         Principal Comparison
         Sensitivity to Service Time Variance
         Impact of Granularity
         Impact of Queue Access Time
   Barrier Synchronization Workload Results
      Workload Model
      Simulation Results
         Impact of System Load
         Sensitivity to Service Time Variance
         Impact of Granularity
         Impact of Queue Access Time
   Cache Effects
   Summary

PART III: Hierarchical Scheduling Policies

7. SCHEDULING IN SHARED-MEMORY MULTIPROCESSORS
   Introduction
   Space-Sharing and Time-Sharing Policies
      Equipartitioning
      Modified RRJob
   Hierarchical Scheduling Policy
   Performance Evaluation
      System and Workload Models
         System Model
         Workload Model
      Performance Analysis
         Effect of Scheduling Overhead
         Impact of Variance in Service Demand
         Effect of Task Granularity
         Effect of the ERF Factor
         Effect of Quantum Size
         Sensitivity to Other Parameters
   Performance with Lock Accessing Workload
      Lock Accessing Workload
      Results
   Conclusions

8. SCHEDULING IN DISTRIBUTED-MEMORY MULTICOMPUTERS
   Introduction
   Hierarchical Scheduling Policy
   Scheduling Policies for Performance Comparison
      Space Partitioning
      Time-Sharing Policy
   Workload Model
   Performance Comparison
      Performance with Ideal Workload
      Performance with Non-Uniform Workload
      Performance with distribution
      Sensitivity to Variance in Job Service Demand
      Performance under distribution
      Performance under distribution
   Discussion
   Conclusions

9. SCHEDULING IN CLUSTER SYSTEMS
   Introduction
   Hierarchical Scheduling Policy
      Job Placement Policy
      Dynamic Load Balancing Algorithm
   Space-Sharing and Time-Sharing Policies
      Space-Sharing Policy
      Time-Sharing Policy
   Performance Comparison
      Workload Model
      Ideal Workload Results
      Non-Uniform Workload Results
   Summary

PART IV: Epilog

10. CONCLUSIONS
   Summary
   Concluding Remarks

REFERENCES

INDEX

List of Figures

1.1 A SIMD system with N processing elements.
1.2 A shared-memory multiprocessor system with N processors and k memory modules.
1.3 A distributed-memory multicomputer system with N processors and N memory modules.
2.1 UMA shared-memory system architecture.
2.2 NUMA shared-memory system architecture.
2.3 Architecture of a distributed-memory system.
2.4 Distributed shared-memory system.
2.5 The SP2 switch board uses 4 x 4 crossbar switching elements.
2.6 The DASH system organization.
2.7 A high-level view of the ASCI Blue-Pacific system.
2.8 Crossbar network (the small squares represent switches).
2.9 Four possible settings of a 2 x 2 switching box.
2.10 The perfect shuffle.
2.11 A multistage shuffle-exchange network.
2.12 A multistage shuffle-exchange network.
2.13 A ring network.
2.14 A chordal ring network.
2.15 A complete connection network.
2.16 A binary tree network.
2.17 X-tree and hypertree networks.
2.18 Two-dimensional mesh and torus networks.
2.19 Hypercube networks: (a) 1-dimensional hypercube, (b) 2-dimensional hypercube, (c) 3-dimensional hypercube.
2.20 A two-level hierarchical network with four different types of networks.
3.1 The fork-and-join job structure.
3.2 The divide-and-conquer job structure.
3.3 The matrix factorization job structure.
3.4 Two basic task queue organizations: (a) centralized organization, (b) distributed organization.
3.5 Performance of the centralized organization as a function of system utilization.
3.6 Performance of the distributed organization as a function of system utilization.
3.7 Performance sensitivity of the distributed organization to variance in task service times.
3.8 Performance of the four placement strategies as a function of system utilization.
3.9 Impact of service time variance on the performance of the four placement strategies (utilization = 80%).
3.10 Performance sensitivity of the shortest queue and SRT queue policies to the number of probes (utilization = 70%).
3.11 The effect of task size estimation error on the performance of the SRT policy (utilization = 80%). The ESRT queue represents performance of the SRT policy when the task size estimation error is ±30%. For comparison, performance of the shortest and SRT policies is included.
3.12 Relative performance of the AP and MAP policies as a function of system utilization and job structure.
3.13 Performance comparison of the AP and MAP policies as a function of variance in interarrival times for the GE job structure.
3.14 Performance comparison of the AP and MAP policies as a function of variance in service times for the GE job structure.
3.15 Performance sensitivity of the MAP policy to parameter f.
3.16 Impact of the Eager Release policy on the performance of the MAP policy. The y-axis gives the response time improvement over the MAP policy. The Eager Release policy does not have any significant impact on the FJ application.
3.17 Performance sensitivity of the MAP and HAP policies to interarrival time variance.
3.18 Performance sensitivity of the MAP and HAP policies to service time variance.
3.19 Organization of the GangLL scheduler.
4.1 Hierarchical task queue organization for N = 8 processors with a branching factor B.
4.2 Task transfer process in the hierarchical organization for N = 64 processors with a branching factor B = 4 and transfer factor Tr = 1.
4.3 Task transfer process in the hierarchical organization for N = 64 processors with a branching factor B = 4 and transfer factor Tr = 2. Compare this figure with Figure 4.2 to see the impact of increasing the transfer factor from 1 to 2.
4.4 Performance of the three task queue organizations as a function of utilization: (a) centralized organization, (b) distributed and hierarchical organizations.
4.5 Performance of the three task queue organizations as a function of average number of tasks per job for the fixed task size workload: (a) centralized organization, (b) distributed and hierarchical organizations (f = 3%).
4.6 Performance of the distributed and hierarchical task queue organizations as a function of average number of tasks per job for the fixed job size workload.
4.7 Performance sensitivity to the task service time variance (N = 64, T = 64, μ = 1, B = 4, Tr = 1, λ = 0.75, and f = 0%). Note that the lines for the centralized and hierarchical organizations are very close together.
4.8 Performance sensitivity of the distributed and hierarchical organizations to the task service time variance (N = 64, T = 64, μ = 1, B = 4, Tr = 1, and f = 4%).
4.9 Performance sensitivity to the system size when the number of tasks per job is doubled (B = 4, Tr = 1, f = 4%, T = N, μ = 1).
4.10 Performance sensitivity to the system size when the task service time is doubled (B = 4, Tr = 1, f = 4%, T = 64, μ = 64/N).
4.11 Impact of branching factor on the performance of the hierarchical organization (N = 64, T = 64, μ = 1, Tr = 1).
4.12 Impact of transfer factor on the performance of the hierarchical organization (N = 64, T = 64, μ = 1, B = 4).
4.13 Task transfer behavior of Policy 1.
4.14 Task transfer behavior of Policy 2.
4.15 Performance of the two dynamic task transfer policies in the hierarchical organization (N = 64, T = 64, μ = 1, B = 4, f = 2%).
5.1 Performance of the three job scheduling policies as a function of system load.
5.2 Performance sensitivity to service time variance at system utilization of 85%.
5.3 Performance sensitivity to task distribution variance at system utilization of 85%.
5.4 Behavior of the RR1 policy (N = 64, B = 4, Tr = 1, W = 2).
5.5 Behavior of the RR2 policy (N = 64, B = 4, Tr = 1, W = 2).
5.6 Behavior of the RR3 policy (N = 64, B = 4, Tr = 1).
5.7 Performance of task scheduling policies as a function of system load (task service time CV = 1).
5.8 Performance of task scheduling policies as a function of system load (task service time CV = 7).
5.9 Sensitivity of task scheduling policies to the service time variance (system utilization = 85%).
5.10 Performance sensitivity to task distribution variance.
5.11 Performance sensitivity of the round robin policies to the window size.
5.12 Performance sensitivity of the round robin policies to the quantum size.
5.13 Performance sensitivity of the round robin policies to the context switch overhead.
6.1 Generic lock access workload task structure for task Ti.
6.2 Generic barrier synchronization workload task structure for task Ti.
6.3 Performance of the spinning and blocking policies as a function of useful utilization.
6.4 Impact of the lock holding ratio (useful utilization = 70% and Be + BI = 0.25).
6.5 Performance impact of service time variability in the lock accessing model.
6.6 Performance as a function of the number of iterations Maxi in the lock accessing model (useful utilization = 70%).
6.7 Performance sensitivity to queue access time f in the lock accessing model (useful utilization = 70%).
6.8 Performance of the spinning and blocking policies as a function of useful utilization under the barrier synchronization workload.
6.9 Performance impact of service time variability (useful utilization = 50%).
6.10 Performance sensitivity to the maximum number of iterations Maxi (useful utilization = 50%).
6.11 Performance sensitivity to queue access time f.
7.1 Hierarchical task queue organization for N = 8 processors with a branching factor B.
7.2 Example ERF curves as a function of average parallelism (Avg).
7.3 Response time versus utilization for low overhead.
7.4 Response time versus utilization for high overhead.
7.5 Response time versus utilization with service demand CV (CVd).
7.6 Response time versus utilization with service demand CV (CVd).
7.7 Response time versus service demand CV (CVd) at 72% utilization.
7.8 Performance sensitivity to average parallelism (Avg) at 50% utilization.
7.9 Performance sensitivity to average parallelism (Avg) at 75% utilization.
7.10 Sensitivity to the ERF factor at 50% utilization.
7.11 Sensitivity to the ERF factor at 75% utilization.
7.12 Sensitivity of hierarchical and RRJob policies to quantum size at 50% utilization.
7.13 Sensitivity of hierarchical and RRJob policies to quantum size at 75% utilization.
7.14 Response time versus utilization for low overhead.
7.15 Response time versus utilization for high overhead.
7.16 Response time versus utilization for service demand CV.
7.17 Response time versus CVd at 72% utilization.
8.1 Job and task transfer modes in the hierarchical policy (number of processors N = 64 and branching factor B = 4).
8.2 Job and task transfer modes in the hierarchical policy (number of processors N = 32 and branching factor B = 2).
8.3 Algorithm used by the space-sharing policy.
8.4 Performance of the three policies under the ideal workload.
8.5 Performance of the three policies under distribution (service CV = 10).
8.6 Performance of the three policies under distribution (service CV = 1).
8.7 Performance of the three policies under distribution (service CV = 15).
8.8 Performance of the three policies under distribution (service CV = 10).
8.9 Performance of the three policies under distribution (service CV = 10).
9.1 A cluster tree example (SS: system scheduler, CS: cluster scheduler, LS: local scheduler, Wi: workstation i).
9.2 An overview of the job placement policy.
9.3 An overview of the dynamic load balancing algorithm.
9.4 An illustration of the load balancing activity in the hierarchical policy.
9.5 Performance of the three scheduling policies for the ideal workload (dedicated-heterogeneous system).
9.6 Performance of the three scheduling policies for the ideal workload (shared-homogeneous system).
9.7 Performance of the three scheduling policies for the non-uniform workload (dedicated-heterogeneous configuration).
9.8 Performance of the three scheduling policies for the non-uniform workload (shared-homogeneous configuration).

List of Tables

4.1 Average number of queue accesses required to schedule a task in the hierarchical organization (from Eq. 4.1).
6.1 Default parameter values used in the lock accessing workload experiments.
6.2 Default parameter values used in the barrier synchronization workload experiments.
7.1 Default parameter values used in the simulation experiments.
7.2 Additional parameters for the lock accessing workload.
8.1 A summary of work distribution in the four workloads.
9.1 Node types used in the simulation and their ratings.
9.2 Default parameter values used in the experiments.
