Towards a Domain-Specific Language for Patterns-Oriented Parallel Programming
Dalvan Griebler, Luiz Gustavo Fernandes
Pontifícia Universidade Católica do Rio Grande do Sul - PUCRS
Programa de Pós-Graduação em Ciência da Computação - PPGCC
Grupo de Modelagem de Aplicações Paralelas - GMAP
Brazilian Symposium on Programming Languages - SBLP, October 2013
Summary
1 Introduction
2 Patterns-Oriented Parallel Programming (POPP)
3 DSL-POPP
  - Compilation Process
  - Programming Interface and Implementation
  - Levels of parallelism
4 Results
  - Implementation Example of the DSL-POPP
  - Tests Scenario
  - Performance of DSL-POPP
5 Conclusions
6 References
Introduction
- Skeletons/Patterns ([1], [2], [3])
- Programming interfaces: FastFlow [4], Muesli [5], SkeTo [6], Skandium [7], eSkel [8], P3L [9], Lithium [10], Muskel [11] and Skil [12]
- Main goals of DSL-POPP [13]:
  - Reduce programming effort without compromising performance
  - Patterns-Oriented Parallel Programming
  - Abstract the details of pattern implementation
  - Offer different levels of parallelism
- Paper contributions:
  - We propose the POPP model
  - We introduce DSL-POPP
  - We present a case study based on an image processing algorithm
Patterns-Oriented Parallel Programming (POPP)
[Figure: POPP model. A main routine and subroutines 1 to n, each composed of code blocks 1 to n.]
[Figure: Master/Slave - Pipeline. Master/Slave pattern code blocks and Pipeline pattern code blocks, shown for the main routine and for subroutines 1 to n.]
Legend: M, S: master/slave (main routine); m, s: master/slave (subroutine); P: pipeline stage (main routine); p: pipeline stage (subroutine).
[Figure: Combination of Patterns. Hybrid patterns: a pipeline main routine (stages P1, P2, ..., Pn) in which subroutines 1 and 2 are master/slave and subroutine n is a pipeline.]
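Compilation Process
[Figure: Compilation process. DSL-POPP source code (for example a $PipelinePattern routine with @Stage(){ blocks) is the input of the DSL-POPP precompiler system. The elements shown inside the precompiler are the pattern tree, syntactic/semantic analysis and source-to-source transformation. The generated code includes pthread.h and smmpi.h and calls SMMPI_send(), SMMPI_recv(), pthread_create() and pthread_join(). The GCC compiler then produces the binary code.]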
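The slide names the primitives the generated code relies on but does not show the generated code itself. As a rough sketch of what a three-stage pipeline lowered to POSIX threads could look like, the example below runs each stage in its own thread and connects stages with a small bounded channel. Everything here is an assumption made for illustration: `channel_t`, `channel_send`, `channel_recv`, the stage functions and `run_pipeline` are hypothetical names, and the channel merely stands in for the send/receive layer; it is not the SMMPI API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 4
#define END_OF_STREAM (-1)   /* sentinel; real items must be non-negative */

/* Hypothetical bounded channel between two stages (not the SMMPI API). */
typedef struct {
    int items[QUEUE_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} channel_t;

static void channel_init(channel_t *c) {
    c->head = c->tail = c->count = 0;
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->not_empty, NULL);
    pthread_cond_init(&c->not_full, NULL);
}

static void channel_destroy(channel_t *c) {
    pthread_cond_destroy(&c->not_full);
    pthread_cond_destroy(&c->not_empty);
    pthread_mutex_destroy(&c->lock);
}

/* Blocks while the channel is full. */
static void channel_send(channel_t *c, int v) {
    pthread_mutex_lock(&c->lock);
    while (c->count == QUEUE_CAP)
        pthread_cond_wait(&c->not_full, &c->lock);
    c->items[c->tail] = v;
    c->tail = (c->tail + 1) % QUEUE_CAP;
    c->count++;
    pthread_cond_signal(&c->not_empty);
    pthread_mutex_unlock(&c->lock);
}

/* Blocks while the channel is empty. */
static int channel_recv(channel_t *c) {
    pthread_mutex_lock(&c->lock);
    while (c->count == 0)
        pthread_cond_wait(&c->not_empty, &c->lock);
    int v = c->items[c->head];
    c->head = (c->head + 1) % QUEUE_CAP;
    c->count--;
    pthread_cond_signal(&c->not_full);
    pthread_mutex_unlock(&c->lock);
    return v;
}

typedef struct { channel_t *in, *out; int n; long sum; } stage_arg_t;

/* Stage 0: emits 1..n, then the end-of-stream marker. */
static void *stage_produce(void *p) {
    stage_arg_t *a = p;
    for (int i = 1; i <= a->n; i++) channel_send(a->out, i);
    channel_send(a->out, END_OF_STREAM);
    return NULL;
}

/* Stage 1: doubles each item and forwards it. */
static void *stage_double(void *p) {
    stage_arg_t *a = p;
    int v;
    while ((v = channel_recv(a->in)) != END_OF_STREAM)
        channel_send(a->out, 2 * v);
    channel_send(a->out, END_OF_STREAM);
    return NULL;
}

/* Stage 2: accumulates everything it receives. */
static void *stage_sum(void *p) {
    stage_arg_t *a = p;
    int v;
    a->sum = 0;
    while ((v = channel_recv(a->in)) != END_OF_STREAM) a->sum += v;
    return NULL;
}

/* Creates one thread per stage, joins them, returns the last stage's result. */
long run_pipeline(int n) {
    channel_t c01, c12;
    channel_init(&c01);
    channel_init(&c12);
    stage_arg_t args[3] = { { NULL, &c01, n, 0 }, { &c01, &c12, 0, 0 }, { &c12, NULL, 0, 0 } };
    void *(*stages[3])(void *) = { stage_produce, stage_double, stage_sum };
    pthread_t th[3];
    for (int i = 0; i < 3; i++) {
        int rc = pthread_create(&th[i], NULL, stages[i], &args[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < 3; i++) pthread_join(th[i], NULL);
    channel_destroy(&c01);
    channel_destroy(&c12);
    return args[2].sum;
}
```

Calling `run_pipeline(10)` streams the values 1 to 10 through the three stages and returns the sum of their doubles. Compile with `gcc -pthread`. The point of a precompiler in this setting is that the channel and thread bookkeeping above would be generated rather than written by hand.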
Programming Interface and Implementation
[Figure: Syntax and logical structure of the DSL-POPP.
(a) Pipeline: a $PipelinePattern routine forms a pipeline block made of stage blocks; parameters shown: num_th, void* buffer, int buf_size. Logical structure: create threads, one work item per stage thread (thread 0, thread 1, thread 2), join threads.
(b) Master/Slave: a $MasterSlavePattern routine forms a master block that contains a slave block; parameters shown: num_th, void* buffer, int buf_size, const POPP_LB_Policy. Logical structure: create threads, threads 0 to n processing Work 0.0 to Work n.n, join threads.]
Policies for load balancing: POPP_LB_STATIC; POPP_LB_DYNAMIC; POPP_LB_COST.
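The master/slave structure in panel (b) follows a create, work, join cycle. A minimal sketch of that cycle with a static load-balancing policy is shown below, under stated assumptions: the slides do not define what POPP_LB_STATIC does internally, so this example simply assumes it means splitting the buffer into contiguous, near-equal chunks up front. The names `slave_task_t`, `slave_block` and `master_block` are hypothetical and are not DSL-POPP identifiers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_SLAVES 64

/* Work description handed to one slave thread (hypothetical type). */
typedef struct {
    const int *data;
    size_t begin, end;   /* half-open range [begin, end) */
    long partial;        /* result written by the slave */
} slave_task_t;

/* Slave block: sums its own chunk of the buffer. */
static void *slave_block(void *p) {
    slave_task_t *t = p;
    t->partial = 0;
    for (size_t i = t->begin; i < t->end; i++)
        t->partial += t->data[i];
    return NULL;
}

/* Master block: static split, create threads, join, merge partial results. */
long master_block(const int *data, size_t n, int num_th) {
    assert(num_th >= 1 && num_th <= MAX_SLAVES);
    pthread_t th[MAX_SLAVES];
    slave_task_t tasks[MAX_SLAVES];
    size_t chunk = n / (size_t)num_th;
    size_t rest = n % (size_t)num_th;
    size_t begin = 0;

    for (int i = 0; i < num_th; i++) {
        /* The first `rest` slaves take one extra element each. */
        size_t len = chunk + ((size_t)i < rest ? 1 : 0);
        tasks[i] = (slave_task_t){ data, begin, begin + len, 0 };
        begin += len;
        int rc = pthread_create(&th[i], NULL, slave_block, &tasks[i]);
        assert(rc == 0);
    }

    long total = 0;
    for (int i = 0; i < num_th; i++) {
        pthread_join(th[i], NULL);
        total += tasks[i].partial;
    }
    return total;
}
```

A dynamic policy would instead let idle slaves pull the next chunk from a shared counter or queue, which helps when chunks have uneven cost. The static split above has no synchronization during the work phase, which is why it suits uniform workloads.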
Levels of parallelism
[Figure: Overview of thread graph in DSL-POPP. Panels: a) Pipeline - Pipeline; b) Pipeline - Master/Slave; c) Master/Slave - Master/Slave; d) Master/Slave - Pipeline. Legend: control threads (master), first-level active threads, second-level active threads.]
Results: Implementation Example of the DSL-POPP
[Figure: Overview of DSL-POPP Image Processing Algorithm Implementation. A list of images (IM1 to IM40) with 3000x2550 resolution is processed by the Prewitt, Sobel and Roberts filters. In the second view of the figure, each filter splits the image (IM) into parts 1 to n.]
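The slides name the three edge-detection filters but show no kernel code. As an illustration of the per-image work and of how a split into parts could map onto row ranges, here is a minimal Sobel sketch for an 8-bit grayscale image. The function name `sobel_rows`, the row-range interface, the |gx| + |gy| magnitude approximation and the zeroed border are all assumptions of this example, not details taken from the paper.

```c
#include <assert.h>
#include <stdlib.h>

/* Applies the Sobel operator to rows [row_begin, row_end) of a w x h
 * 8-bit grayscale image. Border pixels are written as 0. A slave that
 * receives one part of a split image would call this on its own rows. */
void sobel_rows(const unsigned char *in, unsigned char *out,
                int w, int h, int row_begin, int row_end) {
    assert(w >= 1 && h >= 1 && row_begin >= 0 && row_end <= h);
    for (int y = row_begin; y < row_end; y++) {
        for (int x = 0; x < w; x++) {
            if (y == 0 || y == h - 1 || x == 0 || x == w - 1) {
                out[y * w + x] = 0;
                continue;
            }
            const unsigned char *p = in + y * w + x;
            /* Horizontal and vertical 3x3 Sobel responses. */
            int gx = -p[-w - 1] + p[-w + 1]
                     - 2 * p[-1] + 2 * p[1]
                     - p[w - 1] + p[w + 1];
            int gy = -p[-w - 1] - 2 * p[-w] - p[-w + 1]
                     + p[w - 1] + 2 * p[w] + p[w + 1];
            int mag = abs(gx) + abs(gy);   /* cheap magnitude approximation */
            out[y * w + x] = (unsigned char)(mag > 255 ? 255 : mag);
        }
    }
}
```

Because each output pixel depends only on the input image, disjoint row ranges can be processed by different threads without locking; a part only needs read access to one input row above and below its range.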
Results: Tests Scenario
All scenarios use the list of images IM1 to IM40 with 3000x2550 resolution and the Prewitt, Sobel and Roberts filters.
- Test-1: Master/Slave; each of the three filters splits the image (IM) into parts 1 to n.
- Test-2: Pipeline; the images flow through the three filters in sequence.
- Test-3.1 and Test-3.2: Master/Slave combined with Master/Slave splits in the filters.
- Test-4: Pipeline in which each filter stage uses a Master/Slave split.
- Test-5: Master/Slave split into parts 1 to n.
Results: Performance of DSL-POPP
[Figure: six plots, one per test. Each plot shows speedup and efficiency against the number of threads, with the ideal curve for reference.]
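The plots report speedup and efficiency, but the slides do not spell out the definitions. Assuming the usual ones, with $T_1$ the execution time of the reference run on one thread and $T_p$ the execution time with $p$ threads (which baseline the authors used for $T_1$ is not stated here):

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p} = \frac{T_1}{p \, T_p}
```

Under these definitions the ideal curve corresponds to $S(p) = p$, that is $E(p) = 1$.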
Conclusions
About this paper:
- DSL-POPP hides low-level parallel programming primitives
- Patterns may be easily nested or combined
- Good performance for the image processing application
- Different parallel implementation tests were performed
Future work:
- Include other parallel patterns
- Investigate optimized techniques for code generation
- Effort evaluation
References
[1] Mattson G. T., Sanders A. B., and Massingill L. B. Patterns for Parallel Programming. Addison-Wesley, Boston, USA.
[2] Intel and McCool D. M. Structured Parallel Programming with Deterministic Patterns. In HotPar: 2nd USENIX Workshop on Hot Topics in Parallelism, pages 1-6, Berkeley, CA, June 2010.
[3] Catanzaro R. and Keutzer K. Parallel Computing with Patterns and Frameworks. XRDS: Crossroads, The ACM Magazine for Students, 17(1):22-27, 2010.
[4] Aldinucci M., Danelutto M., Kilpatrick P., and Torquati M. FastFlow: High-Level and Efficient Streaming on Multi-core. In Programming Multi-core and Many-core Computing Systems, Parallel and Distributed Computing, chapter 3. Wiley, Boston, USA, 2013.
[5] Ciechanowicz P. and Kuchen H. Enhancing Muesli's Data Parallel Skeletons for Multi-core Computer Architectures. In High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on, pages 108-113, Melbourne, Australia, September 2010.
[6] Karasawa Y. and Iwasaki H. A Parallel Skeleton Library for Multi-core Clusters. In Parallel Processing, ICPP 09. International Conference on, pages 84-91, Vienna, Austria, September.
[7] Leyton M. and Piquer J. M. Skandium: Multi-core Programming with Algorithmic Skeletons. In Parallel, Distributed and Network-Based Processing (PDP), 2010 18th Euromicro International Conference on, Pisa, Italy, February 2010.
[8] Benoit A., Cole M., Gilmore S., and Hillston J. Flexible Skeletal Programming with eSkel. In Proceedings of the 11th International Euro-Par Conference on Parallel Processing, Lisboa, Portugal, September.
[9] Bacci B., Danelutto M., Orlando S., Pelagatti S., and Vanneschi M. P3L: A Structured High-Level Parallel Language, and its Structured Support. Concurrency: Practice and Experience, 7(3), 1995.
[10] Aldinucci M., Danelutto M., and Teti P. An Advanced Environment Supporting Structured Parallel Programming in Java. Future Gener. Comput. Syst., 19(5):611-626.
[11] Aldinucci M., Danelutto M., and Kilpatrick P. Skeletons for Multi/Many-core Systems. In Parallel Computing: From Multicores and GPU's to Petascale (Proc. of PARCO 2009), Lyon, France, September.
[12] Botorog G. H. and Kuchen H. Skil: An Imperative Language with Algorithmic Skeletons for Efficient Distributed Programming. In High Performance Distributed Computing, 1996, Proceedings of 5th IEEE International Symposium on, Syracuse, NY, USA, August.
[13] Griebler D. J. Proposta de uma Linguagem Específica de Domínio de Programação Paralela Orientada a Padrões Paralelos: um Estudo de Caso Baseado no Padrão Mestre/Escravo para Arquiteturas Multi-Core. Master's thesis, PUCRS, 2012.
More informationContention-Aware Scheduling of Parallel Code for Heterogeneous Systems
Contention-Aware Scheduling of Parallel Code for Heterogeneous Systems Chris Gregg Jeff S. Brantley Kim Hazelwood Department of Computer Science, University of Virginia Abstract A typical consumer desktop
More informationA brief introduction to OpenMP
A brief introduction to OpenMP Alejandro Duran Barcelona Supercomputing Center Outline 1 Introduction 2 Writing OpenMP programs 3 Data-sharing attributes 4 Synchronization 5 Worksharings 6 Task parallelism
More informationThe Art of Parallel Processing
The Art of Parallel Processing Ahmad Siavashi April 2017 The Software Crisis As long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a
More informationDryadLINQ. by Yuan Yu et al., OSDI 08. Ilias Giechaskiel. January 28, Cambridge University, R212
DryadLINQ by Yuan Yu et al., OSDI 08 Ilias Giechaskiel Cambridge University, R212 ig305@cam.ac.uk January 28, 2014 Conclusions Takeaway Messages SQL cannot express iteration Unsuitable for machine learning,
More informationA Skeletal Parallel Framework with Fusion Optimizer for GPGPU Programming
A Skeletal Parallel Framework with Fusion Optimizer for GPGPU Programming Shigeyuki Sato and Hideya Iwasaki Department of Computer Science The University of Electro-Communications sato@ipl.cs.uec.ac.jp,
More informationProgramming with Shared Memory. Nguyễn Quang Hùng
Programming with Shared Memory Nguyễn Quang Hùng Outline Introduction Shared memory multiprocessors Constructs for specifying parallelism Creating concurrent processes Threads Sharing data Creating shared
More informationMultiprocessors 2007/2008
Multiprocessors 2007/2008 Abstractions of parallel machines Johan Lukkien 1 Overview Problem context Abstraction Operating system support Language / middleware support 2 Parallel processing Scope: several
More informationPrinciples of Parallel Algorithm Design: Concurrency and Mapping
Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 28 August 2018 Last Thursday Introduction
More informationBlack-Box Program Specialization
Published in Technical Report 17/99, Department of Software Engineering and Computer Science, University of Karlskrona/Ronneby: Proceedings of WCOP 99 Black-Box Program Specialization Ulrik Pagh Schultz
More informationFormalizing OO Frameworks and Framework Instantiation
Formalizing OO Frameworks and Framework Instantiation Christiano de O. Braga, Marcus Felipe M. C. da Fontoura, Edward H. Hæusler, and Carlos José P. de Lucena Departamento de Informática, Pontifícia Universidade
More informationManagement in Distributed Systems: A Semi-formal Approach
Management in Distributed Systems: A Semi-formal Approach Marco Aldinucci 1, Marco Danelutto 1, and Peter Kilpatrick 2 1 Department of Computer Science, University of Pisa {aldinuc,marcod}@di.unipi.it
More informationOptimization of thread affinity and memory affinity for remote core locking synchronization in multithreaded programs for multicore computer systems
Optimization of thread affinity and memory affinity for remote core locking synchronization in multithreaded programs for multicore computer systems Alexey Paznikov Saint Petersburg Electrotechnical University
More informationSkeletor: A DSL for Describing Type-based Specifications of Parallel Skeletons
Skeletor: A DSL for Describing Type-based Specifications of Parallel Skeletons David Castro Kevin Hammond School of Computer Science, University of St Andrews, St Andrews, UK. dc84@st-andrews.ac.uk, kh@cs.st-andrews.ac.uk
More informationOrdered Read Write Locks for Multicores and Accelarators
Ordered Read Write Locks for Multicores and Accelarators INRIA & ICube Strasbourg, France mariem.saied@inria.fr ORWL, Ordered Read-Write Locks An inter-task synchronization model for data-oriented parallel
More information