Structured Parallel Programming

Size: px
Start display at page:

Download "Structured Parallel Programming"

Transcription

1 Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann Publishers is an imprint of Elsevier M< rtorc.lui "AU I MANN

2 Contents \ " J\..,.:...,, ~! Listings ':.,.' xv Preface... ~.". : xx Preliminaries xxiii CHAPTER Introduction Think Parallel Performance Motivation: Pervasive Parallelism Hardware Trends Encouraging Parallelism Observed Historical Trends in Parallelism Need for Explicit Parallel Programming Structured Pattern-Based Programming Parallel Programming Models Desired Properties..... Abstractions Instead of Mechanisms Expression of Regular Data Parallelism Composability Portability of Functionality Performance Portability Safety, Determinism, and Maintainability Overview of Programming Models Used When to Use Which Model? Organization of this Book Summary CHAPTER Background Vocabulary and Notation Strategies Mechanisms Machine Models Machine Model Key Features for Performance Flynn's Characterization Evolution Performance Theory Latency and Throughput Speedup, Efficiency, and Scalability v

3 vi Contents..3 Power Amdahl's Law Gustafson-Barsis' Law Work-Span Model Asymptotic Complexity Asymptotic Speedup and Efficiency Little's Formula Pitfalls Race Conditions Mutual Exclusion and Locks Deadlock Strangled Scaling Lack of Locality Load Imbalance Overhead Summary PART I PATTERNS CHAPTER 3 Patterns Nesting Pattern Structured Serial Control Flow Patterns Sequence Selection Iteration Recursion Parallel Control Patterns Fork-Join Map Stencil Reduction Scan Recurrence Serial Data Management Patterns Random Read and Write Stack Allocation Heap Allocation Closures Objects

4 Contents vii Parallel Data Management Patterns Pack Pipeline Geometric Decomposition Gather Scatter Other Parallel Patterns Superscalar Sequences Futures Speculative Selection Workpile Search Segmentation Expand Category Reduction Term Graph Rewriting Non-Deterministic Patterns Branch and Bound Transactions Programming Model Support for Patterns CilkPlus Threading Building Blocks OpenMP Array Building Blocks OpenCL Summary CHAPTER 4 Map Map Scaled Vector Addition (SAXPY) Description of the Problem Serial Implementation TBB Cilk Plus Cilk Plus with Array Notation OpenMP ArBB Using Vector Operations ArBB Using Elemental Functions OpenCL

5 ~----- viii Contents 4.3 Mandelbrot Description of the Problem Serial Implementation TBB Cilk Plus Cilk Plus with Array Notations OpenMP ArBB OpenCL Sequence of Maps versus Map of Sequence Comparison of Parallel Models Related Patterns Stencil Workpile Divide-and-conquer Summary CHAPTER Collectives Reduce Reordering Computations Vectorization Tiling Precision Implementation Fusing Map and Reduce Explicit Fusion in TBB Explicit Fusion in Cilk Plus Automatic Fusion in ArBB Dot Product Description of the Problem Serial Implementation SSE Intrinsics TBB Cilk Plus OpenMP c.3. ArBB Scan Cilk Plus TBB ArBB OpenMP Fusing Map and Scan

6 Contents ix Integration Description of the Problem Serial Implementation Cilk Plus OpenMP TBB ArBB Summary CHAPTER Data Reorganization......, Gather General Gather Shift Zip Scatter Atomic Scatter Permutation Scatter Merge Scatter Priority Scatter Converting Scatter to Gather Pack Fusing Map and Pack Geometric Decomposition and Partition Array of Structures vs. Structures of Arrays Summary CHAPTER Stencil and Recurrence Stencil Implementing Stencil with Shift Tiling Stencils for Cache : Optimizing Stencils for Communication Recurrence Summary CHAPTER Fork-Join Definition Programming Model Support for Fork-Join Cilk Plus Support for Fork-Join TBB Support for Fork-Join OpenMP Support for Fork-Join Recursive Implementation of Map Choosing Base Cases

7 p x Contents. Load Balancing Complexity of Parallel Divide-and-Conquer Karatsuba Multiplication of Polynomials Note on Allocating Scratch Space Cache Locality and Cache-Oblivious Algorithms Quicksort Cilk Quicksort TBB Quicksort Work and Span for Quicksort Reductions and Hyperobjects Implementing Scan with Fork-Join Applying Fork-Join to Recurrences Analysis Flat Fork-Join Summary CHAPTER 9 Pipeline Basic Pipeline Pipeline with Parallel Stages Implementation of a Pipeline Programming Model Support for Pipelines Pipeline in TBB Pipeline in Cilk Plus More General Topologies Mandatory versus Optional Parallelism Summary PART II EXAMPLES CHAPTER 0 Forward Seismic Simulation Background Stencil Computation Impact of Caches on Arithmetic Intensity Raising Arithmetic Intensity with Space-Time Tiling Cilk Plus Code ArBB Implementation Summary

8 Contents xi 0 3 o CHAPTER K-Means Clustering Algorithm K-Means with Cilk Plus Hyperobjects K-Means with TBB Summary CHAPTER Bzip Data Compression The Bzip Algorithm : Three-Stage Pipeline Using TBB Four-Stage Pipeline Using TBB Three-Stage Pipeline Using Cilk Plus Summary CHAPTER 3 Merge Sort Parallel Merge TBB Parallel Merge Work and Span of Parallel Merge Parallel Merge Sort Work and Span of Merge Sort Summary CHAPTER 4 Sample Sort Overall Structure Choosing the Number of Bins Binning TBB Implementation Repacking and Subsorting Performance Analysis of Sample Sort For C++ Experts Summary CHAPTER Cholesky Factorization Fortran Rules! Recursive Cholesky Decomposition Triangular Solve Symmetric Rank Update Where Is the Time Spent? Summary

9 xii Contents APPENDICES APPENDIX A Further Reading A. Parallel Algorithms and Patterns A. Computer Architecture Including Parallel Systems A.3 Parallel Programming Models APPENDIX B Cilk Plus Shared Design Principles with TBB Unique Characteristics Borrowing Components from TBB Keyword Spelling ci l k_for ci l k_s pa wn and ci l Lsync Reducers (Hyperobjects) B.. C++ Syntax B.. CSyntax ' Array Notation B.. Specifying Array Sections B.. Operations on Array Sections B..3 Reductions on Array Sections B..4 Implicit Index B.. Avoid Partial Overlap of Array Sections #p r a gma s i md Elemental Functions B.0. Attribute Syntax Note on C Notes on Setup History Summary APPENDIX c T C. Unique Characteristics C. Using TBB C.3 parall el_for C.3. bl oc ked_r ange C.3. Partitioners C.4 pa rall el_redu ce C. parall eldeterministi c_r educe C.G parall el pi pel i ne C. parall el_i nvok e

10 p Contents xiii C. task_group C.9 task C.9. empty_task C.0 atomic C. enumerabl e_thread_speci fi c C. NotesonC++ll C.3 History C.4 Summary APPENDIX D C++ll Declaring with auto : Lambda Expressions std:: move APPENDIX E Glossary Bibliography Index

Structured Parallel Programming Patterns for Efficient Computation

Structured Parallel Programming Patterns for Efficient Computation Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

An Introduction to Parallel Programming

An Introduction to Parallel Programming F 'C 3 R'"'C,_,. HO!.-IJJ () An Introduction to Parallel Programming Peter S. Pacheco University of San Francisco ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11

Contents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11 Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed

More information

Application Programming

Application Programming Multicore Application Programming For Windows, Linux, and Oracle Solaris Darryl Gove AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris

More information

Algorithmic Graph Theory and Perfect Graphs

Algorithmic Graph Theory and Perfect Graphs Algorithmic Graph Theory and Perfect Graphs Second Edition Martin Charles Golumbic Caesarea Rothschild Institute University of Haifa Haifa, Israel 2004 ELSEVIER.. Amsterdam - Boston - Heidelberg - London

More information

Computers as Components Principles of Embedded Computing System Design

Computers as Components Principles of Embedded Computing System Design Computers as Components Principles of Embedded Computing System Design Third Edition Marilyn Wolf ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY

More information

Computer Architecture A Quantitative Approach

Computer Architecture A Quantitative Approach Computer Architecture A Quantitative Approach Third Edition John L. Hennessy Stanford University David A. Patterson University of California at Berkeley With Contributions by David Goldberg Xerox Palo

More information

Information Modeling and Relational Databases

Information Modeling and Relational Databases Information Modeling and Relational Databases Second Edition Terry Halpin Neumont University Tony Morgan Neumont University AMSTERDAM» BOSTON. HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

CLASSIC DATA STRUCTURES IN JAVA

CLASSIC DATA STRUCTURES IN JAVA CLASSIC DATA STRUCTURES IN JAVA Timothy Budd Oregon State University Boston San Francisco New York London Toronto Sydney Tokyo Singapore Madrid Mexico City Munich Paris Cape Town Hong Kong Montreal CONTENTS

More information

"Charting the Course to Your Success!" MOC A Developing High-performance Applications using Microsoft Windows HPC Server 2008

Charting the Course to Your Success! MOC A Developing High-performance Applications using Microsoft Windows HPC Server 2008 Description Course Summary This course provides students with the knowledge and skills to develop high-performance computing (HPC) applications for Microsoft. Students learn about the product Microsoft,

More information

Parallel Programming Patterns Overview CS 472 Concurrent & Parallel Programming University of Evansville

Parallel Programming Patterns Overview CS 472 Concurrent & Parallel Programming University of Evansville Parallel Programming Patterns Overview CS 472 Concurrent & Parallel Programming of Evansville Selection of slides from CIS 410/510 Introduction to Parallel Computing Department of Computer and Information

More information

Heuristic Search. Theory and Applications. Stefan Edelkamp. Stefan Schrodl ELSEVIER. Morgan Kaufmann is an imprint of Elsevier HEIDELBERG LONDON

Heuristic Search. Theory and Applications. Stefan Edelkamp. Stefan Schrodl ELSEVIER. Morgan Kaufmann is an imprint of Elsevier HEIDELBERG LONDON Heuristic Search Theory and Applications Stefan Edelkamp Stefan Schrodl AMSTERDAM BOSTON HEIDELBERG LONDON ELSEVIER NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY» TOKYO Morgan Kaufmann

More information

Computer Architecture

Computer Architecture Computer Architecture Pipelined and Parallel Processor Design Michael J. Flynn Stanford University Technische Universrtat Darmstadt FACHBEREICH INFORMATIK BIBLIOTHEK lnventar-nr.: Sachgebiete: Standort:

More information

Engineering Real- Time Applications with Wild Magic

Engineering Real- Time Applications with Wild Magic 3D GAME ENGINE ARCHITECTURE Engineering Real- Time Applications with Wild Magic DAVID H. EBERLY Geometric Tools, Inc. AMSTERDAM BOSTON HEIDELRERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

Embedded Systems Architecture

Embedded Systems Architecture Embedded Systems Architecture A Comprehensive Guide for Engineers and Programmers By Tammy Noergaard ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez SQL Queries for Mere Mortals Third Edition A Hands-On Guide to Data Manipulation in SQL John L. Viescas Michael J. Hernandez r A TT TAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco

More information

Programming. In Ada JOHN BARNES TT ADDISON-WESLEY

Programming. In Ada JOHN BARNES TT ADDISON-WESLEY Programming In Ada 2005 JOHN BARNES... TT ADDISON-WESLEY An imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong Kong Seoul Taipei New Delhi

More information

M (~ Computer Organization and Design ELSEVIER. David A. Patterson. John L. Hennessy. University of California, Berkeley. Stanford University

M (~ Computer Organization and Design ELSEVIER. David A. Patterson. John L. Hennessy. University of California, Berkeley. Stanford University T H I R D EDITION REVISED Computer Organization and Design THE HARDWARE/SOFTWARE INTERFACE David A. Patterson University of California, Berkeley John L. Hennessy Stanford University With contributions

More information

PROBLEM SOLVING WITH FORTRAN 90

PROBLEM SOLVING WITH FORTRAN 90 David R. Brooks PROBLEM SOLVING WITH FORTRAN 90 FOR SCIENTISTS AND ENGINEERS Springer Contents Preface v 1.1 Overview for Instructors v 1.1.1 The Case for Fortran 90 vi 1.1.2 Structure of the Text vii

More information

An Introduction to Programming with IDL

An Introduction to Programming with IDL An Introduction to Programming with IDL Interactive Data Language Kenneth P. Bowman Department of Atmospheric Sciences Texas A&M University AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN

More information

Barbara Chapman, Gabriele Jost, Ruud van der Pas

Barbara Chapman, Gabriele Jost, Ruud van der Pas Using OpenMP Portable Shared Memory Parallel Programming Barbara Chapman, Gabriele Jost, Ruud van der Pas The MIT Press Cambridge, Massachusetts London, England c 2008 Massachusetts Institute of Technology

More information

Foundations of Multidimensional and Metric Data Structures

Foundations of Multidimensional and Metric Data Structures Foundations of Multidimensional and Metric Data Structures Hanan Samet University of Maryland, College Park ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

MPI: A Message-Passing Interface Standard

MPI: A Message-Passing Interface Standard MPI: A Message-Passing Interface Standard Version 2.1 Message Passing Interface Forum June 23, 2008 Contents Acknowledgments xvl1 1 Introduction to MPI 1 1.1 Overview and Goals 1 1.2 Background of MPI-1.0

More information

Maya Python. for Games and Film. and the Maya Python API. A Complete Reference for Maya Python. Ryan Trowbridge. Adam Mechtley ELSEVIER

Maya Python. for Games and Film. and the Maya Python API. A Complete Reference for Maya Python. Ryan Trowbridge. Adam Mechtley ELSEVIER Maya Python for Games and Film A Complete Reference for Maya Python and the Maya Python API Adam Mechtley Ryan Trowbridge AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Curriculum 2013 Knowledge Units Pertaining to PDC

Curriculum 2013 Knowledge Units Pertaining to PDC Curriculum 2013 Knowledge Units Pertaining to C KA KU Tier Level NumC Learning Outcome Assembly level machine Describe how an instruction is executed in a classical von Neumann machine, with organization

More information

The Unified Modeling Language User Guide

The Unified Modeling Language User Guide The Unified Modeling Language User Guide Grady Booch James Rumbaugh Ivar Jacobson Rational Software Corporation TT ADDISON-WESLEY Boston San Francisco New York Toronto Montreal London Munich Paris Madrid

More information

DB2 SQL Tuning Tips for z/os Developers

DB2 SQL Tuning Tips for z/os Developers DB2 SQL Tuning Tips for z/os Developers Tony Andrews IBM Press, Pearson pic Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Cape Town Sydney

More information

15-853:Algorithms in the Real World. Outline. Parallelism: Lecture 1 Nested parallelism Cost model Parallel techniques and algorithms

15-853:Algorithms in the Real World. Outline. Parallelism: Lecture 1 Nested parallelism Cost model Parallel techniques and algorithms :Algorithms in the Real World Parallelism: Lecture 1 Nested parallelism Cost model Parallel techniques and algorithms Page1 Andrew Chien, 2008 2 Outline Concurrency vs. Parallelism Quicksort example Nested

More information

Moving to the Cloud. Developing Apps in. the New World of Cloud Computing. Dinkar Sitaram. Geetha Manjunath. David R. Deily ELSEVIER.

Moving to the Cloud. Developing Apps in. the New World of Cloud Computing. Dinkar Sitaram. Geetha Manjunath. David R. Deily ELSEVIER. Moving to the Cloud Developing Apps in the New World of Cloud Computing Dinkar Sitaram Geetha Manjunath Technical Editor David R. Deily AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO

More information

The Definitive Guide to the ARM Cortex-M3

The Definitive Guide to the ARM Cortex-M3 The Definitive Guide to the ARM Cortex-M3 Joseph Yiu AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Newnes is an imprint of Elsevier Newnes Forewopd

More information

The Essential Guide to Video Processing

The Essential Guide to Video Processing The Essential Guide to Video Processing Second Edition EDITOR Al Bovik Department of Electrical and Computer Engineering The University of Texas at Austin Austin, Texas AMSTERDAM BOSTON HEIDELBERG LONDON

More information

Real World Multicore Embedded Systems

Real World Multicore Embedded Systems Real World Multicore Embedded Systems A Practical Approach Expert Guide Bryon Moyer AMSTERDAM BOSTON HEIDELBERG LONDON I J^# J NEW YORK OXFORD PARIS SAN DIEGO S V J SAN FRANCISCO SINGAPORE SYDNEY TOKYO

More information

Parallel Programming. Exploring local computational resources OpenMP Parallel programming for multiprocessors for loops

Parallel Programming. Exploring local computational resources OpenMP Parallel programming for multiprocessors for loops Parallel Programming Exploring local computational resources OpenMP Parallel programming for multiprocessors for loops Single computers nowadays Several CPUs (cores) 4 to 8 cores on a single chip Hyper-threading

More information

Programming with POSIX Threads

Programming with POSIX Threads Programming with POSIX Threads David R. Butenhof :vaddison-wesley Boston San Francisco New York Toronto Montreal London Munich Paris Madrid Capetown Sidney Tokyo Singapore Mexico City Contents List of

More information

LOGIC AND DISCRETE MATHEMATICS

LOGIC AND DISCRETE MATHEMATICS LOGIC AND DISCRETE MATHEMATICS A Computer Science Perspective WINFRIED KARL GRASSMANN Department of Computer Science University of Saskatchewan JEAN-PAUL TREMBLAY Department of Computer Science University

More information

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K.

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K. Fundamentals of Parallel Computing Sanjay Razdan Alpha Science International Ltd. Oxford, U.K. CONTENTS Preface Acknowledgements vii ix 1. Introduction to Parallel Computing 1.1-1.37 1.1 Parallel Computing

More information

Multi-Core Programming

Multi-Core Programming Multi-Core Programming Increasing Performance through Software Multi-threading Shameem Akhter Jason Roberts Intel PRESS Copyright 2006 Intel Corporation. All rights reserved. ISBN 0-9764832-4-6 No part

More information

Computer Architecture and Structured Parallel Programming James Reinders, Intel

Computer Architecture and Structured Parallel Programming James Reinders, Intel Computer Architecture and Structured Parallel Programming James Reinders, Intel Parallel Computing CIS 410/510 Department of Computer and Information Science Lecture 17 Manycore Computing and GPUs Computer

More information

Coding for Penetration

Coding for Penetration Coding for Penetration Testers Building Better Tools Jason Andress Ryan Linn ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Syngress is

More information

Cilk Plus GETTING STARTED

Cilk Plus GETTING STARTED Cilk Plus GETTING STARTED Overview Fundamentals of Cilk Plus Hyperobjects Compiler Support Case Study 3/17/2015 CHRIS SZALWINSKI 2 Fundamentals of Cilk Plus Terminology Execution Model Language Extensions

More information

Parallel Computing. November 20, W.Homberg

Parallel Computing. November 20, W.Homberg Mitglied der Helmholtz-Gemeinschaft Parallel Computing November 20, 2017 W.Homberg Why go parallel? Problem too large for single node Job requires more memory Shorter time to solution essential Better

More information

Programming 8-bit PIC Microcontrollers in С

Programming 8-bit PIC Microcontrollers in С Programming 8-bit PIC Microcontrollers in С with Interactive Hardware Simulation Martin P. Bates älllllltlilisft &Щ*лЛ AMSTERDAM BOSTON HEIDELBERG LONDON ^^Ш NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Introduction to Algorithms Third Edition

Introduction to Algorithms Third Edition Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest Clifford Stein Introduction to Algorithms Third Edition The MIT Press Cambridge, Massachusetts London, England Preface xiü I Foundations Introduction

More information

MULTIDIMENSIONAL SIGNAL, IMAGE, AND VIDEO PROCESSING AND CODING

MULTIDIMENSIONAL SIGNAL, IMAGE, AND VIDEO PROCESSING AND CODING MULTIDIMENSIONAL SIGNAL, IMAGE, AND VIDEO PROCESSING AND CODING JOHN W. WOODS Rensselaer Polytechnic Institute Troy, New York»iBllfllfiii.. i. ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD

More information

DATABASE SYSTEM CONCEPTS

DATABASE SYSTEM CONCEPTS DATABASE SYSTEM CONCEPTS HENRY F. KORTH ABRAHAM SILBERSCHATZ University of Texas at Austin McGraw-Hill, Inc. New York St. Louis San Francisco Auckland Bogota Caracas Lisbon London Madrid Mexico Milan Montreal

More information

Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming Nasser Kehtarnavaz

Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming Nasser Kehtarnavaz Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming Nasser Kehtarnavaz Digital Signal Processing System Design: LabVIEW-Based Hybrid Programming by Nasser Kehtarnavaz University

More information

ARCHITECTURE DESIGN FOR SOFT ERRORS

ARCHITECTURE DESIGN FOR SOFT ERRORS ARCHITECTURE DESIGN FOR SOFT ERRORS Shubu Mukherjee ^ШВпШшр"* AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO T^"ТГПШГ SAN FRANCISCO SINGAPORE SYDNEY TOKYO ^ P f ^ ^ ELSEVIER Morgan

More information

Oracle Real Application Clusters Handbook

Oracle Real Application Clusters Handbook ORACLE Oracle Press Oracle Database 11 g Oracle Real Application Clusters Handbook Second Edition K Copalakrishnan Mc Gnaw Hill McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City

More information

Coding for Penetration Testers Building Better Tools

Coding for Penetration Testers Building Better Tools Coding for Penetration Testers Building Better Tools Second Edition Jason Andress Ryan Linn Clara Hartwell, Technical Editor ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO

More information

Lecture 13: Memory Consistency. + a Course-So-Far Review. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 13: Memory Consistency. + a Course-So-Far Review. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 13: Memory Consistency + a Course-So-Far Review Parallel Computer Architecture and Programming Today: what you should know Understand the motivation for relaxed consistency models Understand the

More information

FPGAs: Instant Access

FPGAs: Instant Access FPGAs: Instant Access Clive"Max"Maxfield AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO % ELSEVIER Newnes is an imprint of Elsevier Newnes Contents

More information

A Primer on Scheduling Fork-Join Parallelism with Work Stealing

A Primer on Scheduling Fork-Join Parallelism with Work Stealing Doc. No.: N3872 Date: 2014-01-15 Reply to: Arch Robison A Primer on Scheduling Fork-Join Parallelism with Work Stealing This paper is a primer, not a proposal, on some issues related to implementing fork-join

More information

Computer Animation. Algorithms and Techniques. z< MORGAN KAUFMANN PUBLISHERS. Rick Parent Ohio State University AN IMPRINT OF ELSEVIER SCIENCE

Computer Animation. Algorithms and Techniques. z< MORGAN KAUFMANN PUBLISHERS. Rick Parent Ohio State University AN IMPRINT OF ELSEVIER SCIENCE Computer Animation Algorithms and Techniques Rick Parent Ohio State University z< MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF ELSEVIER SCIENCE AMSTERDAM BOSTON LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

PTC Mathcad Prime 3.0

PTC Mathcad Prime 3.0 Essential PTC Mathcad Prime 3.0 A Guide for New and Current Users Brent Maxfield, P.E. AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO @ Academic

More information

Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems

Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems Peter Barry Patrick Crowley ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

An Introduction to Object-Oriented Programming

An Introduction to Object-Oriented Programming An Introduction to Object-Oriented Programming Timothy Budd Oregon State University TT Addison-Wesley Publishing Company Reading, Massachusetts Menlo Park, California New York Don Mills, Ontario Wokingham,

More information

Anany Levitin 3RD EDITION. Arup Kumar Bhattacharjee. mmmmm Analysis of Algorithms. Soumen Mukherjee. Introduction to TllG DCSISFI &

Anany Levitin 3RD EDITION. Arup Kumar Bhattacharjee. mmmmm Analysis of Algorithms. Soumen Mukherjee. Introduction to TllG DCSISFI & Introduction to TllG DCSISFI & mmmmm Analysis of Algorithms 3RD EDITION Anany Levitin Villa nova University International Edition contributions by Soumen Mukherjee RCC Institute of Information Technology

More information

Modern Information Retrieval

Modern Information Retrieval Modern Information Retrieval Ricardo Baeza-Yates Berthier Ribeiro-Neto ACM Press NewYork Harlow, England London New York Boston. San Francisco. Toronto. Sydney Singapore Hong Kong Tokyo Seoul Taipei. New

More information

Understand and Implement Effective PCI Data Security Standard Compliance

Understand and Implement Effective PCI Data Security Standard Compliance PCI Compliance Understand and Implement Effective PCI Data Security Standard Compliance Second Edition Dr. Anton A. Chuvakin Branden R. Williams Technical Editor Ward Spangenberg ELSEVIER AMSTERDAM BOSTON

More information

Parallelization on Multi-Core CPUs

Parallelization on Multi-Core CPUs 1 / 30 Amdahl s Law suppose we parallelize an algorithm using n cores and p is the proportion of the task that can be parallelized (1 p cannot be parallelized) the speedup of the algorithm is assuming

More information

Contents. Preface. About the Authors BASIC TECHNIQUES CHAPTER 1 PARALLEL COMPUTERS. l. 1 The Demand for Computational Speed 3

Contents. Preface. About the Authors BASIC TECHNIQUES CHAPTER 1 PARALLEL COMPUTERS. l. 1 The Demand for Computational Speed 3 Preface About the Authors PARTI BASIC TECHNIQUES CHAPTER 1 PARALLEL COMPUTERS l. 1 The Demand for Computational Speed 3 1.2 Potential for Increased Computational Speed 6 Speedup Factor 6 What Is the Maximum

More information

System Assurance. Beyond Detecting. Vulnerabilities. Djenana Campara. Nikolai Mansourov

System Assurance. Beyond Detecting. Vulnerabilities. Djenana Campara. Nikolai Mansourov System Assurance Beyond Detecting Vulnerabilities Nikolai Mansourov Djenana Campara ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SYDNEY TOKYO Morgan Kaufmann

More information

Intel Thread Building Blocks, Part II

Intel Thread Building Blocks, Part II Intel Thread Building Blocks, Part II SPD course 2013-14 Massimo Coppola 25/03, 16/05/2014 1 TBB Recap Portable environment Based on C++11 standard compilers Extensive use of templates No vectorization

More information

Algorithms and Parallel Computing

Algorithms and Parallel Computing Algorithms and Parallel Computing Algorithms and Parallel Computing Fayez Gebali University of Victoria, Victoria, BC A John Wiley & Sons, Inc., Publication Copyright 2011 by John Wiley & Sons, Inc. All

More information

F. THOMSON LEIGHTON INTRODUCTION TO PARALLEL ALGORITHMS AND ARCHITECTURES: ARRAYS TREES HYPERCUBES

F. THOMSON LEIGHTON INTRODUCTION TO PARALLEL ALGORITHMS AND ARCHITECTURES: ARRAYS TREES HYPERCUBES F. THOMSON LEIGHTON INTRODUCTION TO PARALLEL ALGORITHMS AND ARCHITECTURES: ARRAYS TREES HYPERCUBES MORGAN KAUFMANN PUBLISHERS SAN MATEO, CALIFORNIA Contents Preface Organization of the Material Teaching

More information

Managed. Code Rootkits. Hooking. into Runtime. Environments. Erez Metula ELSEVIER. Syngress is an imprint of Elsevier SYNGRESS

Managed. Code Rootkits. Hooking. into Runtime. Environments. Erez Metula ELSEVIER. Syngress is an imprint of Elsevier SYNGRESS Managed Code Rootkits Hooking into Runtime Environments Erez Metula ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEWYORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Syngress is an imprint

More information

Programming in Python 3

Programming in Python 3 Programming in Python 3 A Complete Introduction to the Python Language Mark Summerfield.4.Addison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich

More information

Essential MATLAB for Engineers and Scientists

Essential MATLAB for Engineers and Scientists Essential MATLAB for Engineers and Scientists Third edition Brian D. Hahn and Daniel T. Valentine ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY

More information

The Designer's Guide to VHDL Second Edition

The Designer's Guide to VHDL Second Edition The Designer's Guide to VHDL Second Edition Peter J. Ashenden EDA CONSULTANT, ASHENDEN DESIGNS PTY. VISITING RESEARCH FELLOW, ADELAIDE UNIVERSITY Cl MORGAN KAUFMANN PUBLISHERS An Imprint of Elsevier SAN

More information

Parallel Programming. OpenMP Parallel programming for multiprocessors for loops

Parallel Programming. OpenMP Parallel programming for multiprocessors for loops Parallel Programming OpenMP Parallel programming for multiprocessors for loops OpenMP OpenMP An application programming interface (API) for parallel programming on multiprocessors Assumes shared memory

More information

Intel Array Building Blocks

Intel Array Building Blocks Intel Array Building Blocks Productivity, Performance, and Portability with Intel Parallel Building Blocks Intel SW Products Workshop 2010 CERN openlab 11/29/2010 1 Agenda Legal Information Vision Call

More information

CSE 613: Parallel Programming

CSE 613: Parallel Programming CSE 613: Parallel Programming Lecture 3 ( The Cilk++ Concurrency Platform ) ( inspiration for many slides comes from talks given by Charles Leiserson and Matteo Frigo ) Rezaul A. Chowdhury Department of

More information

List of Figures. About the Authors. Acknowledgments

List of Figures. About the Authors. Acknowledgments List of Figures Preface About the Authors Acknowledgments xiii xvii xxiii xxv 1 Compilation 1 1.1 Compilers..................................... 1 1.1.1 Programming Languages......................... 1

More information

DATA ABSTRACTION AND PROBLEM SOLVING WITH JAVA

DATA ABSTRACTION AND PROBLEM SOLVING WITH JAVA DATA ABSTRACTION AND PROBLEM SOLVING WITH JAVA WALLS AND MIRRORS First Edition Frank M. Carrano University of Rhode Island Janet J. Prichard Bryant College Boston San Francisco New York London Toronto

More information

Trends and Challenges in Multicore Programming

Trends and Challenges in Multicore Programming Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores

More information

Real-Time Systems and Programming Languages

Real-Time Systems and Programming Languages Real-Time Systems and Programming Languages Ada, Real-Time Java and C/Real-Time POSIX Fourth Edition Alan Burns and Andy Wellings University of York * ADDISON-WESLEY An imprint of Pearson Education Harlow,

More information

4.1.2 Merge Sort Sorting Lower Bound Counting Sort Sorting in Practice Solving Problems by Sorting...

4.1.2 Merge Sort Sorting Lower Bound Counting Sort Sorting in Practice Solving Problems by Sorting... Contents 1 Introduction... 1 1.1 What is Competitive Programming?... 1 1.1.1 Programming Contests.... 2 1.1.2 Tips for Practicing.... 3 1.2 About This Book... 3 1.3 CSES Problem Set... 5 1.4 Other Resources...

More information

Chapter 1 Introduction

Chapter 1 Introduction Preface xv Chapter 1 Introduction 1.1 What's the Book About? 1 1.2 Mathematics Review 2 1.2.1 Exponents 3 1.2.2 Logarithms 3 1.2.3 Series 4 1.2.4 Modular Arithmetic 5 1.2.5 The P Word 6 1.3 A Brief Introduction

More information

Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest. Introduction to Algorithms

Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest. Introduction to Algorithms Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest Introduction to Algorithms Preface xiii 1 Introduction 1 1.1 Algorithms 1 1.2 Analyzing algorithms 6 1.3 Designing algorithms 1 1 1.4 Summary 1 6

More information

Data Structures and Algorithm Analysis in C++

Data Structures and Algorithm Analysis in C++ INTERNATIONAL EDITION Data Structures and Algorithm Analysis in C++ FOURTH EDITION Mark A. Weiss Data Structures and Algorithm Analysis in C++, International Edition Table of Contents Cover Title Contents

More information

Contents. 1 Introduction. 2 Searching and Traversal Techniques. Preface... (vii) Acknowledgements... (ix)

Contents. 1 Introduction. 2 Searching and Traversal Techniques. Preface... (vii) Acknowledgements... (ix) Contents Preface... (vii) Acknowledgements... (ix) 1 Introduction 1.1 Algorithm 1 1.2 Life Cycle of Design and Analysis of Algorithm 2 1.3 Pseudo-Code for Expressing Algorithms 5 1.4 Recursive Algorithms

More information

The Automatic Design of Batch Processing Systems

The Automatic Design of Batch Processing Systems The Automatic Design of Batch Processing Systems by Barry Dwyer, M.A., D.A.E., Grad.Dip. A thesis submitted for the degree of Doctor of Philosophy in the Department of Computer Science University of Adelaide

More information

CS 445: Data Structures Final Examination: Study Guide

CS 445: Data Structures Final Examination: Study Guide CS 445: Data Structures Final Examination: Study Guide Java prerequisites Classes, objects, and references Access modifiers Arguments and parameters Garbage collection Self-test questions: Appendix C Designing

More information

DATA STRUCTURES AND PROBLEM SOLVING USING JAVA

DATA STRUCTURES AND PROBLEM SOLVING USING JAVA DATA STRUCTURES AND PROBLEM SOLVING USING JAVA Second Edition MARK ALLEN WEISS Florida International University Addison Wesley Boston San Francisco New York London Toronto Sydney Tokyo Singapore Madrid

More information

Parallel Programming. Presentation to Linux Users of Victoria, Inc. November 4th, 2015

Parallel Programming. Presentation to Linux Users of Victoria, Inc. November 4th, 2015 Parallel Programming Presentation to Linux Users of Victoria, Inc. November 4th, 2015 http://levlafayette.com 1.0 What Is Parallel Programming? 1.1 Historically, software has been written for serial computation

More information

Networked Graphics 01_P374423_PRELIMS.indd i 10/27/2009 6:57:42 AM

Networked Graphics 01_P374423_PRELIMS.indd i 10/27/2009 6:57:42 AM Networked Graphics Networked Graphics Building Networked Games and Virtual Environments Anthony Steed Manuel Fradinho Oliveira AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO

More information

Summary of Contents LIST OF FIGURES LIST OF TABLES

Summary of Contents LIST OF FIGURES LIST OF TABLES Summary of Contents LIST OF FIGURES LIST OF TABLES PREFACE xvii xix xxi PART 1 BACKGROUND Chapter 1. Introduction 3 Chapter 2. Standards-Makers 21 Chapter 3. Principles of the S2ESC Collection 45 Chapter

More information

Parallel and Distributed Computing (PD)

Parallel and Distributed Computing (PD) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 Parallel and Distributed Computing (PD) The past decade has brought explosive growth in multiprocessor computing, including multi-core

More information

FISMAand the Risk Management Framework

FISMAand the Risk Management Framework FISMAand the Risk Management Framework The New Practice of Federal Cyber Security Stephen D. Gantz Daniel R. Phi I pott Darren Windham, Technical Editor ^jm* ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON

More information

JAVA CONCEPTS Early Objects

JAVA CONCEPTS Early Objects INTERNATIONAL STUDENT VERSION JAVA CONCEPTS Early Objects Seventh Edition CAY HORSTMANN San Jose State University Wiley CONTENTS PREFACE v chapter i INTRODUCTION 1 1.1 Computer Programs 2 1.2 The Anatomy

More information

Acknowledgments. Amdahl s Law. Contents. Programming with MPI Parallel programming. 1 speedup = (1 P )+ P N. Type to enter text

Acknowledgments. Amdahl s Law. Contents. Programming with MPI Parallel programming. 1 speedup = (1 P )+ P N. Type to enter text Acknowledgments Programming with MPI Parallel ming Jan Thorbecke Type to enter text This course is partly based on the MPI courses developed by Rolf Rabenseifner at the High-Performance Computing-Center

More information

Analytical Modeling of Parallel Programs

Analytical Modeling of Parallel Programs 2014 IJEDR Volume 2, Issue 1 ISSN: 2321-9939 Analytical Modeling of Parallel Programs Hardik K. Molia Master of Computer Engineering, Department of Computer Engineering Atmiya Institute of Technology &

More information

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples Multicore Jukka Julku 19.2.2009 1 2 3 4 5 6 Disclaimer There are several low-level, languages and directive based approaches But no silver bullets This presentation only covers some examples of them is

More information

Parallel Programming Principle and Practice. Lecture 7 Threads programming with TBB. Jin, Hai

Parallel Programming Principle and Practice. Lecture 7 Threads programming with TBB. Jin, Hai Parallel Programming Principle and Practice Lecture 7 Threads programming with TBB Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Outline Intel Threading

More information

CSCE 321/3201 Analysis and Design of Algorithms. Prof. Amr Goneid. Fall 2016

CSCE 321/3201 Analysis and Design of Algorithms. Prof. Amr Goneid. Fall 2016 CSCE 321/3201 Analysis and Design of Algorithms Prof. Amr Goneid Fall 2016 CSCE 321/3201 Analysis and Design of Algorithms Prof. Amr Goneid Course Resources Instructor: Prof. Amr Goneid E-mail: goneid@aucegypt.edu

More information

Microsoft Windows HPC Server 2008 R2 for the Cluster Developer

Microsoft Windows HPC Server 2008 R2 for the Cluster Developer 50291B - Version: 1 02 May 2018 Microsoft Windows HPC Server 2008 R2 for the Cluster Developer Microsoft Windows HPC Server 2008 R2 for the Cluster Developer 50291B - Version: 1 5 days Course Description:

More information

Computer Organization and Design

Computer Organization and Design Computer Organization and Design THE H A R D W A R E / S O F T W A R E I N T E R F A C E John L. Hennessy Stanford University David A. Patterson University of California at Berkeley With a contribution

More information

CS4961 Parallel Programming. Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms 9/5/12. Administrative. Mary Hall September 4, 2012

CS4961 Parallel Programming. Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms 9/5/12. Administrative. Mary Hall September 4, 2012 CS4961 Parallel Programming Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms Administrative Mailing list set up, everyone should be on it - You should have received a test mail last night

More information

High Performance Computing. Introduction to Parallel Computing

High Performance Computing. Introduction to Parallel Computing High Performance Computing Introduction to Parallel Computing Acknowledgements Content of the following presentation is borrowed from The Lawrence Livermore National Laboratory https://hpc.llnl.gov/training/tutorials

More information

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON.

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON. Fundamentals of Database Systems 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute

More information