
Language, Compiler and Parallel Database Support for I/O Intensive Applications*

Peter Brezany (a), Thomas A. Mueck (b) and Erich Schikuta (b)

University of Vienna
(a) Institute for Software Technology and Parallel Systems, Liechtensteinstr. 22, A-1092 Vienna
(b) Department of Data Engineering, Rathausstrasse 19/4, A-1010 Vienna, Austria

Abstract. Automatic mapping of I/O intensive applications onto massively parallel systems is a challenging problem of great importance. This paper proposes a novel solution to the I/O problem. First, Fortran language extensions are introduced that support highly efficient I/O processing. Second, we specify the appropriate compilation method, which utilizes an advanced runtime system called VIPIOS that is designed on the basis of parallel database technology. We present this proposal in the context of Vienna Fortran and its compiler.

1 Introduction

This paper proposes a language, compiler and runtime software solution to the problem of I/O in distributed-memory systems (DMSs). We present this proposal in the context of Vienna Fortran [8] and its compilation system.

In typical supercomputing applications six types of I/O can be identified [5]: (1) input, (2) debugging, (3) scratch files, (4) checkpoint/restart, (5) output, and (6) accessing out-of-core structures. Types (3) and (4), and in some phases also type (6), do not contribute to the communication with the environment of the processing system. The data they involve may therefore be stored on devices of the parallel I/O subsystem as parallel files. The layout of such files may be optimized to achieve the highest I/O data transfer rate. In our approach, this I/O functionality is implemented by the VIenna Parallel I/O System (VIPIOS). In general, the logical view of a VIPIOS file corresponds to the conventional sequential FORTRAN file model. All other types of I/O operations involve files that have to conform to the standard sequential FORTRAN file format; we call them standard files. The flow of I/O data in a typical application processing cycle is sketched in Figure 1.

* The work described in this paper was carried out as part of the CEI PACT Project funded by the Austrian Ministry for Science and Research (BMWF).

[Figure 1: diagram with the labels DBMS Environment, Application Running in DMS, Source of data, Input File, COPYIN, READ, Parallel I/O Subsystem (VIPIOS), READ/WRITE, Computational nodes, Parallel files, Standard files, WRITE, COPYOUT, Output File, Utilization of data.]

Fig. 1. I/O in a Typical Application Processing Cycle

2 Language Support for I/O Operations

2.1 Opening a Parallel File

Specification of the File Location. The standard Fortran OPEN statement is extended by a new optional specifier MODE. The meaning of the MODE specifier follows from the examples in Figure 2: units 8 and 9 refer to standard files (MODE = 'ST', which line O2 shows to be the default when MODE is omitted), and unit 10 is connected to a parallel file (MODE = 'PF').

    D1: IO_DTYPE PAT1(M,N1,N2,K1,K2)
    D2:   PROCESSORS P2D(M,M); ELM_TYPE REAL; ARRAY_SHAPE (N1,N2)
    D3:   ARRAY_DIST (CYCLIC(K1), CYCLIC(K2)) TO P2D
    D4: END IO_DTYPE PAT1

    O1: OPEN (u1 = 8, FILE = '/usr/exa1', MODE = 'ST', STATUS = 'NEW')
    O2: OPEN (u2 = 9, FILE = '/usr/exa2', STATUS = 'OLD')
    O3: OPEN (u3 = 10, FILE = '/usr/exa3', MODE = 'PF', STATUS = 'NEW', &
    O4:       IO_DIST = PAT1(8,400,100,4,2))

    C1: COPYIN (u1) ONTO (u3); COPYOUT (u3) ONTO (u2)

Fig. 2. Opening and Copying Files - Examples
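To make the distribution named in lines D1-D4 of Figure 2 concrete, the following Python sketch computes which processor of the P2D grid owns a given array element under a (CYCLIC(K1), CYCLIC(K2)) distribution. It assumes the usual block-cyclic meaning of CYCLIC(K) (blocks of K consecutive indices dealt out round-robin) and, unlike Fortran, uses 0-based indices. The function names are hypothetical and are not part of Vienna Fortran or VIPIOS.

```python
def cyclic_owner(index, block, nprocs):
    """Processor owning a 0-based index under CYCLIC(block) on nprocs processors."""
    return (index // block) % nprocs


def owner_2d(i, j, k1, k2, m):
    """Grid coordinates of the processor owning element (i, j) on an m x m grid."""
    return cyclic_owner(i, k1, m), cyclic_owner(j, k2, m)


# Parameters of PAT1(8,400,100,4,2) in Figure 2: M=8, shape (400,100), K1=4, K2=2.
M, K1, K2 = 8, 4, 2
print(owner_2d(0, 0, K1, K2, M))     # (0, 0)
print(owner_2d(4, 2, K1, K2, M))     # (1, 1)
print(owner_2d(399, 99, K1, K2, M))  # (3, 1)
```

A runtime system that knows this mapping can tell, for every element in the file, which processor will eventually request it.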

I/O Distribution Hints. Using a new optional specifier IO_DIST, the application programmer may pass to the compiler and runtime system a hint that data in the file will be written from, or read into, an array of the specified distribution. According to lines D1-D4 and O3-O4 in Figure 2, elements of all arrays will by default be written to the file '/usr/exa3' so as to optimize reading them into real arrays which have the shape (400,100) and are distributed as (CYCLIC(4), CYCLIC(2)) onto a grid of processors having the shape (8,8). This predefined global I/O distribution specification can be temporarily overridden by a WRITE statement.

2.2 I/O Operations on Parallel Files

(i) In the simplest form, the individual distributions of the arrays determine the sequence of array elements written out to the file, as in the statement

    WRITE (f) A1, A2, ..., Ar

where Ai, 1 ≤ i ≤ r, are array identifiers. This form should be used when the data is going to be read into arrays with the same distribution as Ai.

(ii) The IO_DIST specifier of the WRITE statement enables the application programmer to specify the distribution of the target array in a similar way as outlined in subsection 2.1.

    WRITE (f, IO_DIST = PAT1(4,100,100,5,5)) A

(iii) If the data in a file is to be subsequently read into arrays with different distributions, or if no information is available about the distribution of the target arrays, the application programmer may allow the compiler to choose the sequence of the elements to be written out.

    WRITE (f, IO_DIST = SYSTEM) A1, ..., Ar

(iv) A read operation into one or more distributed arrays is specified by

    READ (f) B1, B2, ..., Br

(v) The REORGANIZE statement enables the application programmer to restructure a file. The statements COPYIN and COPYOUT copy files (see line C1 in Figure 2).

All I/O statements may include an EVENT specifier to request asynchronous mode.
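The effect of a distribution hint on the element sequence in a file can be illustrated with a small Python sketch. It assumes, purely for illustration, a layout in which all elements destined for the same processor are stored contiguously, so that each processor can later read its part in one sequential pass. The paper does not specify the layout VIPIOS actually chooses, and the function names are hypothetical.

```python
def cyclic_owner(index, block, nprocs):
    """Processor owning a 0-based index under CYCLIC(block) on nprocs processors."""
    return (index // block) % nprocs


def file_order(n, block, nprocs):
    """Global indices in an assumed file order: grouped by the owning
    processor of the target distribution, ascending within each group."""
    return sorted(range(n), key=lambda i: (cyclic_owner(i, block, nprocs), i))


# Eight elements, target distribution CYCLIC(2) on two processors.
print(file_order(n=8, block=2, nprocs=2))  # [0, 1, 4, 5, 2, 3, 6, 7]
```

With this ordering, processor 0 reads the first four file positions and processor 1 the last four, which is the kind of access pattern a hint such as IO_DIST is meant to enable.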
3 Specification of the Compilation Method

The implementation of the parallel I/O operations introduced in the last section comprises both runtime and compile-time elements. At compile time, processing of parallel I/O operations conceptually consists of two phases: basic compilation and advanced optimizations.

The basic compilation phase extracts parameters about data distributions and file access patterns from the VF programs and passes them in a normalized form to the VIPIOS runtime primitives, without performing any sophisticated program analysis or optimization. As an example, a possible translation of the OPEN and WRITE statements is shown in Figure 3. These statements are translated to calls of the functions VIPIOS_open and VIPIOS_write, respectively. The latter synchronously writes the distributed array referenced in the statement to the open VIPIOS file. The structures dd_source and dd_target store the data distribution descriptors associated with the source and target arrays, respectively. The file descriptor fd contains all information about the associated file in a compact form. This information is needed during subsequent file operations. File descriptors are stored in a one-dimensional array FdArray, using the unit number as an index. The value of the logical variable result indicates whether the operation succeeded or failed.

    Original code

    PROCESSORS P1D(16); REAL A(10000) DIST (BLOCK) TO P1D
    OPEN (13, FILE = '/usr/exa6', MODE = 'PF', STATUS = 'NEW')
    WRITE (13, IO_DIST = '(CYCLIC)') A

    Transformed form (generated by VFCS automatically)

    PARAMETER :: Max_Numb_of_Units = ...; LOGICAL :: result
    TYPE (Distr_Descriptor) :: dd_source, dd_target; TYPE (File_Descriptor) :: fd
    TYPE (File_Descriptor), DIMENSION(Max_Numb_of_Units) :: FdArray
    ... initialization of dd_source and dd_target ...
    fd = VIPIOS_open(name='/usr/exa6', status='new'); FdArray(13) = fd
    result = VIPIOS_write (file_descr=FdArray(f), data_address = A, &
                           dist_source=dd_source, dist_target=dd_target)

Fig. 3. Translation of the OPEN and WRITE statements.

The optimization phase utilizes the results of program analysis, which are provided by the Analysis Subsystem of VFCS. Program analysis that supports I/O optimizations comprises data flow analysis, reaching distribution analysis and cost estimation. The goal of I/O optimizations is to derive an efficient parallel program with:

- Low I/O overhead. The compiler derives hints for data organization in the files and inserts the appropriate data reorganization statements.

- A high amount of computation that can be performed concurrently with I/O. These optimizations restructure the computation and I/O to increase the amount of useful computation that may be performed in parallel with I/O. The VIPIOS calls offer a choice between synchronous and asynchronous I/O.

- I/O performed concurrently with computation and other I/O. The program analysis is capable of determining whether or not the parallelism between I/O, computation and other I/O is safe (according to the data dependence analysis) and useful (according to the performance analysis). If both preconditions are fulfilled, the compiler allows I/O to run in parallel with other computations or other I/O statements.
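The overlap of I/O with independent computation that these optimizations aim for can be sketched in Python with a worker thread. This is only an analogy under stated assumptions: write_block stands in for an asynchronous write call, the returned future plays the role of an event that is waited on, and none of the names below belong to VIPIOS or VFCS. The sketch is safe only because compute does not touch the buffer being written, which is the condition the data dependence analysis has to establish.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_block(path, data):
    """Stand-in for an asynchronous write; returns the number of bytes written."""
    with open(path, "wb") as f:
        return f.write(data)


def compute(values):
    """Independent computation that does not use the buffer being written."""
    return sum(v * v for v in values)


def overlapped(path, data, values):
    """Start the write, compute while it is in flight, then wait for completion."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(write_block, path, data)  # initiate the write
        result = compute(values)                        # overlaps with the write
        written = pending.result()                      # wait for the write to finish
    return result, written


with tempfile.TemporaryDirectory() as tmp:
    print(overlapped(os.path.join(tmp, "a.dat"), bytes(1024), range(10)))  # (285, 1024)
```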

4 VIPIOS

In contrast to the parallelized computation supported by HPF languages (like VF), files are read and written sequentially in current implementations. I/O requests are processed by a single centralized host process, and data is transferred over the network interconnect to the node processes. Parallel file I/O is therefore not yet supported by the current system architecture.

4.1 Design Characteristics

To exploit parallelism, the physical file reads and writes have to be shifted from the host process to the node processes. The proposed solution is the VIPIOS (Vienna Parallel Input Output System), a separate I/O subsystem which resolves read/write requests locally on the node. The VIPIOS is realized by cooperating parallel data server modules running on the nodes. The data requests of the processes of each node are received and handled directly by the I/O subsystem on that node. The VIPIOS guarantees that each processor has access to its requested data. Based on the information about the data and the access profile provided by an HPF language system, the VIPIOS organizes the data and tries to ensure high-performance access to it. To reach this aim, the design and development of the VIPIOS is determined by the following characteristics.

Parallelism. The foremost design principle is the use of parallelism to achieve the highest possible performance. This is achieved by parallel accesses of processors to multiple disks. To avoid unnecessary communication and synchronization overhead, the physical data distribution has to reflect the problem distribution of the SPMD processes. This ensures that each processor mostly accesses the data of its local disk ("data locality principle").

Abstract I/O model. The notion of a data type is supported by the VIPIOS. Stored information is not seen as byte sequences only, but as topologically ordered, typed data values bearing semantics. This can be exploited by the runtime system and allows data administration on a higher level, which in turn results in a smarter data organization and higher performance.

Efficient data administration. Finding specific information in a sequential file is a costly process, which can result in a scan of the whole file. Index structures support accesses to the stored data set. This can improve performance dramatically, because the amount of data to be scanned is reduced drastically.

Scalability. The size of the I/O system, i.e. the number of I/O nodes, is independent of any implementation or system dependent characteristics. The only defining attribute is the problem size. Furthermore, the number of I/O nodes can be changed dynamically in the course of the problem solution process. This requires the ability to redistribute the data among the changed set of participating nodes.
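What redistribution after a change in the number of I/O nodes involves can be sketched as follows. The Python example assumes, for illustration only, that file blocks are placed round-robin on the I/O nodes; the paper does not state which placement or redistribution algorithm VIPIOS uses, and all names are hypothetical.

```python
def placement(num_blocks, num_nodes):
    """Assumed round-robin placement: block b is stored on node b mod num_nodes."""
    return [b % num_nodes for b in range(num_blocks)]


def blocks_to_move(num_blocks, old_nodes, new_nodes):
    """List (block, old node, new node) for every block whose node changes."""
    old = placement(num_blocks, old_nodes)
    new = placement(num_blocks, new_nodes)
    return [(b, old[b], new[b]) for b in range(num_blocks) if old[b] != new[b]]


# Growing from two to four I/O nodes with eight blocks.
print(blocks_to_move(num_blocks=8, old_nodes=2, new_nodes=4))
# [(2, 0, 2), (3, 1, 3), (6, 0, 2), (7, 1, 3)]
```

Even in this simple case half of the blocks change their node, which indicates why redistribution has to be an explicit feature of a scalable I/O system.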

4.2 Declustering

Declustering of a data set is the distribution of the blocks, the records, the data objects or the bytes of a file among two or more disk drives according to a defined declustering scheme. Declustering allows the I/O system to increase the bandwidth of I/O operations by reading and writing multiple disks in parallel. It is the common technique used by parallel database systems to speed up data accesses. Generally, three different declustering types can be distinguished.

- Key declustering. The declustering is performed according to the key values of one or more attributes of the records.

- Data independent declustering. A general data-independent declustering algorithm is employed, for example round-robin declustering.

- Problem declustering. The location of the data records is defined by the distribution criteria of the superimposed problem solution approach. In the context of a data parallel language like VF, this means the array distribution among the processes. This approach seems extremely promising for increasing system performance because of the data locality principle. It is one of the key elements of the VIPIOS project.

The problem specific distribution information given by the VF example

    PROCESSORS P1D(5)
    REAL A(100, 10000, 200) DIST (:, BLOCK, :) TO P1D
    WRITE (f) A

can be reflected by the data declustering shown in Figure 4.

4.3 Implementation

The basis of the VIPIOS implementation is the existing DiNG file system [6]. It is a prototype based on Distributed and Nested Grid files (DiNG files), which supports the efficient parallel execution of exact match, partial match and range queries directly through its inherent data structure. All necessary operations are provided at the system call level. Distributed and nested grid files are multikey index structures designed for mass-storage subsystems on shared-nothing architectures.

5 Conclusion

In this paper a novel solution to the I/O problem of HPF languages is presented. The necessary language constructs, compilation methods and runtime support are discussed. The language constructs proposed in this paper and the VIPIOS are described and discussed in more detail in [4]. The proposed system is planned to become part of the VFCS. Further interesting topics are checkpoint/restart and out-of-core programs. These issues are beyond the current state of the research project, but will be tackled in the future.
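The problem declustering of Section 4.2 can be made concrete with a short Python sketch for the example array A(100, 10000, 200) distributed as (:, BLOCK, :) onto five processors. It assumes that BLOCK assigns equally sized contiguous index ranges of the second dimension to the processors and that the partition of processor p is stored on disk p, as the labels of Figure 4 suggest. The function names are hypothetical, and the sketch only handles extents that divide evenly.

```python
def block_size(extent, nparts):
    """Length of each block; this sketch requires an even division."""
    if extent % nparts != 0:
        raise ValueError("sketch assumes nparts divides extent")
    return extent // nparts


def block_partition(extent, nparts):
    """1-based inclusive index ranges of a BLOCK distribution."""
    size = block_size(extent, nparts)
    return [(p * size + 1, (p + 1) * size) for p in range(nparts)]


def disk_for_element(j, extent, nparts):
    """Disk (1-based) assumed to hold every A(:, j, :) under problem declustering."""
    return (j - 1) // block_size(extent, nparts) + 1


# Second dimension of A(100, 10000, 200), DIST (:, BLOCK, :) TO P1D(5).
print(block_partition(10000, 5))
# [(1, 2000), (2001, 4000), (4001, 6000), (6001, 8000), (8001, 10000)]
print(disk_for_element(2000, 10000, 5), disk_for_element(2001, 10000, 5))  # 1 2
```

Because the disk is determined by the same index ranges as the processor, each processor finds its part of A on its local disk, which is the data locality principle stated in Section 4.1.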

[Figure 4: diagram showing array A, the declustering method, Partitions 1-5 and Disks 1-5.]

Fig. 4. Problem specific data declustering

References

1. Bordawekar R.R., Choudhary A.N., Language and Compiler Support for Parallel I/O. IFIP Working Conf. Prog. Env. for Massively Parallel Distributed Systems, Swiss, April
2. Bordawekar R., Rosario J.M., Choudhary A., Design and Evaluation of Primitives for Parallel I/O, in Proc. Supercomputing '93, Nov
3. Brezany P., Gerndt M., Mehrotra P., Zima H., Concurrent File Operations in a High Performance Fortran. In Proceedings of Supercomputing '92 (November 1992), 230-
4. Brezany P., Mueck T.A., Schikuta E., Language, Compiler and Database Support for Parallel I/O Operations, Int. Rep. Inst. for Softw. Techn. and Par. Sys., Dept. of Data Eng., Nov
5. Galbreath N., Gropp W., Levine D., Applications-Driven Parallel I/O. Supercomputing '93, Portland, USA, 462-
6. Mueck T.A., The DiNG - A Parallel Multiattribute File System for Deductive Database Machines, 3rd Int. Symp. on Database Systems for Adv. Appl., World Scientific, Taejon,
7. Snir M., Proposal for IO. Posted to HPFF I/O Forum by Marc Snir, July
8. Zima H., Brezany P., Chapman B., Mehrotra P., and Schwald A., Vienna Fortran - a language specification. ACPC Technical Report Series, University of Vienna, Vienna, Austria. Also available as ICASE INTERIM REPORT 21, MS 132c, NASA, Hampton VA

This article was processed using the LaTeX macro package with LLNCS style.

Application Programmer. Vienna Fortran Out-of-Core Program

Application Programmer. Vienna Fortran Out-of-Core Program Mass Storage Support for a Parallelizing Compilation System b a Peter Brezany a, Thomas A. Mueck b, Erich Schikuta c Institute for Software Technology and Parallel Systems, University of Vienna, Liechtensteinstrasse

More information

Technische Universitat Munchen. Institut fur Informatik. D Munchen.

Technische Universitat Munchen. Institut fur Informatik. D Munchen. Developing Applications for Multicomputer Systems on Workstation Clusters Georg Stellner, Arndt Bode, Stefan Lamberts and Thomas Ludwig? Technische Universitat Munchen Institut fur Informatik Lehrstuhl

More information

Language and Compiler Support for Out-of-Core Irregular Applications on Distributed-Memory Multiprocessors

Language and Compiler Support for Out-of-Core Irregular Applications on Distributed-Memory Multiprocessors Language and Compiler Support for Out-of-Core Irregular Applications on Distributed-Memory Multiprocessors Peter Brezany 1, Alok Choudhary 2, and Minh Dang 1 1 Institute for Software Technology and Parallel

More information

Khoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety

Khoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety Data Parallel Programming with the Khoros Data Services Library Steve Kubica, Thomas Robey, Chris Moorman Khoral Research, Inc. 6200 Indian School Rd. NE Suite 200 Albuquerque, NM 87110 USA E-mail: info@khoral.com

More information

Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System

Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System Meenakshi Arunachalam Alok Choudhary Brad Rullman y ECE and CIS Link Hall Syracuse University Syracuse, NY 344 E-mail:

More information

Dierencegraph - A ProM Plugin for Calculating and Visualizing Dierences between Processes

Dierencegraph - A ProM Plugin for Calculating and Visualizing Dierences between Processes Dierencegraph - A ProM Plugin for Calculating and Visualizing Dierences between Processes Manuel Gall 1, Günter Wallner 2, Simone Kriglstein 3, Stefanie Rinderle-Ma 1 1 University of Vienna, Faculty of

More information

ViPIOS VIenna Parallel Input Output System

ViPIOS VIenna Parallel Input Output System arxiv:1808.166v1 [cs.dc] 3 Aug 28 ViPIOS VIenna Parallel Input Output System Language, Compiler and Advanced Data Structure Support for Parallel I/O Operations Project Deliverable Partially funded by FWF

More information

DYNAMIC DATA DISTRIBUTIONS IN VIENNA FORTRAN. Hans Zima a. Institute for Software Technology and Parallel Systems,

DYNAMIC DATA DISTRIBUTIONS IN VIENNA FORTRAN. Hans Zima a. Institute for Software Technology and Parallel Systems, DYNAMIC DATA DISTRIBUTIONS IN VIENNA FORTRAN Barbara Chapman a Piyush Mehrotra b Hans Moritsch a Hans Zima a a Institute for Software Technology and Parallel Systems, University of Vienna, Brunner Strasse

More information

Compiling FORTRAN for Massively Parallel Architectures. Peter Brezany. University of Vienna

Compiling FORTRAN for Massively Parallel Architectures. Peter Brezany. University of Vienna Compiling FORTRAN for Massively Parallel Architectures Peter Brezany University of Vienna Institute for Software Technology and Parallel Systems Brunnerstrasse 72, A-1210 Vienna, Austria 1 Introduction

More information

PASSION Runtime Library for Parallel I/O. Rajeev Thakur Rajesh Bordawekar Alok Choudhary. Ravi Ponnusamy Tarvinder Singh

PASSION Runtime Library for Parallel I/O. Rajeev Thakur Rajesh Bordawekar Alok Choudhary. Ravi Ponnusamy Tarvinder Singh Scalable Parallel Libraries Conference, Oct. 1994 PASSION Runtime Library for Parallel I/O Rajeev Thakur Rajesh Bordawekar Alok Choudhary Ravi Ponnusamy Tarvinder Singh Dept. of Electrical and Computer

More information

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2 Introduction :- Today single CPU based architecture is not capable enough for the modern database that are required to handle more demanding and complex requirements of the users, for example, high performance,

More information

David Kotz. Abstract. papers focus on the performance advantages and capabilities of disk-directed I/O, but say little

David Kotz. Abstract. papers focus on the performance advantages and capabilities of disk-directed I/O, but say little Interfaces for Disk-Directed I/O David Kotz Department of Computer Science Dartmouth College Hanover, NH 03755-3510 dfk@cs.dartmouth.edu Technical Report PCS-TR95-270 September 13, 1995 Abstract In other

More information

Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines

Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines 1063 7133/97 $10 1997 IEEE Proceedings of the 11th International Parallel Processing Symposium (IPPS '97) 1063-7133/97 $10 1997 IEEE Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs

More information

Optimizing Irregular HPF Applications Using Halos Siegfried Benkner C&C Research Laboratories NEC Europe Ltd. Rathausallee 10, D St. Augustin, G

Optimizing Irregular HPF Applications Using Halos Siegfried Benkner C&C Research Laboratories NEC Europe Ltd. Rathausallee 10, D St. Augustin, G Optimizing Irregular HPF Applications Using Halos Siegfried Benkner C&C Research Laboratories NEC Europe Ltd. Rathausallee 10, D-53757 St. Augustin, Germany Abstract. This paper presents language features

More information

clients (compute nodes) servers (I/O nodes)

clients (compute nodes) servers (I/O nodes) Parallel I/O on Networks of Workstations: Performance Improvement by Careful Placement of I/O Servers Yong Cho 1, Marianne Winslett 1, Szu-wen Kuo 1, Ying Chen, Jonghyun Lee 1, Krishna Motukuri 1 1 Department

More information

COMPUTE PARTITIONS Partition n. Partition 1. Compute Nodes HIGH SPEED NETWORK. I/O Node k Disk Cache k. I/O Node 1 Disk Cache 1.

COMPUTE PARTITIONS Partition n. Partition 1. Compute Nodes HIGH SPEED NETWORK. I/O Node k Disk Cache k. I/O Node 1 Disk Cache 1. Parallel I/O from the User's Perspective Jacob Gotwals Suresh Srinivas Shelby Yang Department of r Science Lindley Hall 215, Indiana University Bloomington, IN, 4745 fjgotwals,ssriniva,yangg@cs.indiana.edu

More information

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP (extended abstract) Mitsuhisa Sato 1, Motonari Hirano 2, Yoshio Tanaka 2 and Satoshi Sekiguchi 2 1 Real World Computing Partnership,

More information

proposed. In Sect. 3, the environment used for the automatic generation of data parallel programs is briey described, together with the proler tool pr

proposed. In Sect. 3, the environment used for the automatic generation of data parallel programs is briey described, together with the proler tool pr Performance Evaluation of Automatically Generated Data-Parallel Programs L. Massari Y. Maheo DIS IRISA Universita dipavia Campus de Beaulieu via Ferrata 1 Avenue du General Leclerc 27100 Pavia, ITALIA

More information

FORSCHUNGSZENTRUM J ULICH GmbH Zentralinstitut f ur Angewandte Mathematik D J ulich, Tel. (02461)

FORSCHUNGSZENTRUM J ULICH GmbH Zentralinstitut f ur Angewandte Mathematik D J ulich, Tel. (02461) FORSCHUNGSZENTRUM J ULICH GmbH Zentralinstitut f ur Angewandte Mathematik D-52425 J ulich, Tel. (02461) 61-6402 Interner Bericht SVM Support in the Vienna Fortran Compiling System Peter Brezany*, Michael

More information

Compilation Issues for High Performance Computers: A Comparative. Overview of a General Model and the Unied Model. Brian J.

Compilation Issues for High Performance Computers: A Comparative. Overview of a General Model and the Unied Model. Brian J. Compilation Issues for High Performance Computers: A Comparative Overview of a General Model and the Unied Model Abstract This paper presents a comparison of two models suitable for use in a compiler for

More information

Physical Schemas for Large Multidimensional Arrays in. Scientic Computing Applications. University of Illinois. Urbana, Illinois 61801

Physical Schemas for Large Multidimensional Arrays in. Scientic Computing Applications. University of Illinois. Urbana, Illinois 61801 Physical Schemas for Large Multidimensional Arrays in Scientic Computing Applications Kent E. Seamons and Marianne Winslett Computer Science Department University of Illinois Urbana, Illinois 61801 fseamons,winslettg@cs.uiuc.edu

More information

Tarek S. Abdelrahman and Thomas N. Wong. University oftoronto. Toronto, Ontario, M5S 1A4. Canada

Tarek S. Abdelrahman and Thomas N. Wong. University oftoronto. Toronto, Ontario, M5S 1A4. Canada Distributed Array Data Management on NUMA Multiprocessors Tarek S. Abdelrahman and Thomas N. Wong Department of Electrical and Computer Engineering University oftoronto Toronto, Ontario, M5S 1A Canada

More information

Northeast Parallel Architectures Center. Syracuse University. May 17, Abstract

Northeast Parallel Architectures Center. Syracuse University. May 17, Abstract The Design of VIP-FS: A Virtual, Parallel File System for High Performance Parallel and Distributed Computing NPAC Technical Report SCCS-628 Juan Miguel del Rosario, Michael Harry y and Alok Choudhary

More information

On Object Orientation as a Paradigm for General Purpose. Distributed Operating Systems

On Object Orientation as a Paradigm for General Purpose. Distributed Operating Systems On Object Orientation as a Paradigm for General Purpose Distributed Operating Systems Vinny Cahill, Sean Baker, Brendan Tangney, Chris Horn and Neville Harris Distributed Systems Group, Dept. of Computer

More information

SVM Support in the Vienna Fortran Compilation System. Michael Gerndt. Research Centre Julich(KFA)

SVM Support in the Vienna Fortran Compilation System. Michael Gerndt. Research Centre Julich(KFA) SVM Support in the Vienna Fortran Compilation System Peter Brezany University of Vienna brezany@par.univie.ac.at Michael Gerndt Research Centre Julich(KFA) m.gerndt@kfa-juelich.de Viera Sipkova University

More information

Language-Based Parallel Program Interaction: The Breezy Approach. Darryl I. Brown Allen D. Malony. Bernd Mohr. University of Oregon

Language-Based Parallel Program Interaction: The Breezy Approach. Darryl I. Brown Allen D. Malony. Bernd Mohr. University of Oregon Language-Based Parallel Program Interaction: The Breezy Approach Darryl I. Brown Allen D. Malony Bernd Mohr Department of Computer And Information Science University of Oregon Eugene, Oregon 97403 fdarrylb,

More information

Accelerated Library Framework for Hybrid-x86

Accelerated Library Framework for Hybrid-x86 Software Development Kit for Multicore Acceleration Version 3.0 Accelerated Library Framework for Hybrid-x86 Programmer s Guide and API Reference Version 1.0 DRAFT SC33-8406-00 Software Development Kit

More information

Abstract HPF was originally created to simplify high-level programming of parallel computers. The inventors of HPF strove for an easy-to-use language

Abstract HPF was originally created to simplify high-level programming of parallel computers. The inventors of HPF strove for an easy-to-use language Ecient HPF Programs Harald J. Ehold 1 Wilfried N. Gansterer 2 Dieter F. Kvasnicka 3 Christoph W. Ueberhuber 2 1 VCPC, European Centre for Parallel Computing at Vienna E-Mail: ehold@vcpc.univie.ac.at 2

More information

Automatic Array Alignment for. Mitsuru Ikei. Hitachi Chemical Company Ltd. Michael Wolfe. Oregon Graduate Institute of Science & Technology

Automatic Array Alignment for. Mitsuru Ikei. Hitachi Chemical Company Ltd. Michael Wolfe. Oregon Graduate Institute of Science & Technology Automatic Array Alignment for Distributed Memory Multicomputers Mitsuru Ikei Hitachi Chemical Company Ltd. Michael Wolfe Oregon Graduate Institute of Science & Technology P.O. Box 91000 Portland OR 97291

More information

Network. Department of Statistics. University of California, Berkeley. January, Abstract

Network. Department of Statistics. University of California, Berkeley. January, Abstract Parallelizing CART Using a Workstation Network Phil Spector Leo Breiman Department of Statistics University of California, Berkeley January, 1995 Abstract The CART (Classication and Regression Trees) program,

More information

Transparent Access to Legacy Data in Java. Olivier Gruber. IBM Almaden Research Center. San Jose, CA Abstract

Transparent Access to Legacy Data in Java. Olivier Gruber. IBM Almaden Research Center. San Jose, CA Abstract Transparent Access to Legacy Data in Java Olivier Gruber IBM Almaden Research Center San Jose, CA 95120 Abstract We propose in this paper an extension to PJava in order to provide a transparent access

More information

1e+07 10^5 Node Mesh Step Number

1e+07 10^5 Node Mesh Step Number Implicit Finite Element Applications: A Case for Matching the Number of Processors to the Dynamics of the Program Execution Meenakshi A.Kandaswamy y Valerie E. Taylor z Rudolf Eigenmann x Jose' A. B. Fortes

More information

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples Multicore Jukka Julku 19.2.2009 1 2 3 4 5 6 Disclaimer There are several low-level, languages and directive based approaches But no silver bullets This presentation only covers some examples of them is

More information

OpenMP on Distributed Memory via Global Arrays

OpenMP on Distributed Memory via Global Arrays 1 OpenMP on Distributed Memory via Global Arrays Lei Huang a, Barbara Chapman a, and Ricky A. Kall b a Dept. of Computer Science, University of Houston, Texas. {leihuang,chapman}@cs.uh.edu b Scalable Computing

More information

A Component-based Programming Model for Composite, Distributed Applications

A Component-based Programming Model for Composite, Distributed Applications NASA/CR-2001-210873 ICASE Report No. 2001-15 A Component-based Programming Model for Composite, Distributed Applications Thomas M. Eidson ICASE, Hampton, Virginia ICASE NASA Langley Research Center Hampton,
