DataCutter Joel Saltz Alan Sussman Tahsin Kurc University of Maryland, College Park and Johns Hopkins Medical Institutions

Size: px
Start display at page:

Download "DataCutter Joel Saltz Alan Sussman Tahsin Kurc University of Maryland, College Park and Johns Hopkins Medical Institutions"

Transcription

1 DataCutter Joel Saltz Alan Sussman Tahsin Kurc University of Maryland, College Park and Johns Hopkins Medical Institutions

2 DataCutter A suite of Middleware for subsetting and filtering multi-dimensional datasets stored on archival storage systems Subsetting through Range Queries a hyperbox defined in the multi-dimensional space underlying the dataset items whose multi-dimensional coordinates fall into the box are retrieved.

3 DataCutter Restricted processing (filtering/aggregations) through Filters to reduce the amount of data transferred to the client filters can run anywhere, but intended to run near (i.e., over local area network) storage system based on filter-stream programming model -- to optimize use of limited resources, such as memory and disk space

4 Client DataCutter Client Range Query Segment Data Client Interface Service Filtering Service Filter Filter Segment Info. DataCutter Indexing Service Data Access Service Archival Storage System Segments: (File,Offset,Size) Archival Storage System (File,Offset,Size)

5 DataCutter Architecture Client Interface Service Manages client connections and client requests Manages data and information flow between different services Indexing Service Two-level hierarchical indexing -- summary and detailed index files Customizable -- Default R-tree index User can add new indexing methods

6 DataCutter Architecture Filtering Service Manages filters (registered in the system) Users can add/run new filters Data Access Service Manages storage/retrieval of data from the tertiary storage Low level system dependent I/O operations

7 DataCutter -- Subsetting Datasets are partitioned into segments used to index the dataset, unit of retrieval Indexing very large datasets Multi-level hierarchical indexing scheme Summary index files -- to index a group of segments or detailed index files Detailed index files -- to index the segments

8 DataCutter -- Filters Filters Specialized user program to process data (segments) before returning them to the client Filter-stream programming model Originally developed for Active Disks environment (Acharya, Uysal, and Saltz) Based on stream abstraction A stream denotes a supply of data Streams deliver data in fixed size buffers Communication of a filter with its environment is restricted to its input and output streams init, process, finalize interface

9 A Motivating Scenario Sample Application: generate 3D reconstructed view from new set of sensor readings compare features with reference db Grid Configuration: remote data server - reference db sensor host - large raw readings parallel computation farm available 3D reconstruction computationally intensive Reference DB feature list? Data Server Sensor? Raw Dataset sensor readings WAN? Computation Farm? Client PC

10 A Motivating Scenario (2) Application : // process relevant raw readings // generate 3D view // compute features of 3D view // find similar features in reference db // display new view and similar cases Reference DB feature list Extract ref Data Server 3D reconstruction Computation Farm WAN View result Extract raw 3D reconstruction Extract ref Sensor View result Extract raw Reference DB Raw Dataset sensor readings Client PC Raw Dataset

11 Filters Filters communicate with other filters only using streams cannot change stream endpoints are allowed to pre-disclose dynamic allocation of memory/scratch space in init phase, before processing phase Advantages location independence easier scheduling of resources filter stop and restart is defined explicitly in model

12 Placement The dynamic assignment of filters to particular hosts for execution is placement (mapping) Optimization criteria: Communication leverage filter affinity to dataset minimize communication volume on slower connections co-locate filters with large communication volume Computation expensive computation on faster, less loaded hosts

13 Restructuring Process Application Decompose Target Configuration Some set of filters f 1 f 2 f 3 f4 f 5 Placement / Schedule Execute Application

14 Software Infrastructure Prototype implementation of filter framework C++ language binding manual placement wide-area execution service one thread for each instantiated filter

15 Filter Framework class MyFilter : public AS_Filter_Base { public: int init(int argc, char *argv[ ]) { }; int process(stream_t st) { }; int finalize(void) { }; }

16 Filter Connectivity / Placement [filter.a] outs = stream1 stream3 [filter.b] ins = stream1 outs = stream2 [filter.c] ins = stream2 stream3 [placement] A = host1.cs.umd.edu B = host2.cs.umd.edu C = host3.cs.umd.edu A stream1 stream3 B C stream2

17 Execution Service 1. Read Specs Filter/Stream Placement Filter lib Application Console???.???.???.??? 3. Exec 2. Query Directory name host port **** **** **** **** **** **** Directory Daemon dir.cs.umd.edu:6000 AppExec Daemon EXEC AppExec Daemon EXEC AppExec Daemon EXEC Filter lib Filter lib Filter lib filter A Application host1.cs.umd.edu filter B Application host2.cs.umd.edu filter C Application host3.cs.umd.edu

18 Related Work Application Level AppLeS Client/Server Sockets Programming Models Infrastructure Services Legion DataCutter SRB Harmony NWS DPSS DSM MPI RPC Globus HPC++ JavaRMI, DCOM, CORBA NetSolve, Ninf Condor Pool Resource Level Grid available Resources User specified Resources Idle Resources

19 Integrating DataCutter with the Storage Resouce Broker

20 Storage Resource Broker (SRB) Middleware between clients and storage resources Remote Access to storage resources. Various types : File Systems - UNIX, HPSS, UniTree, DPSS (LBL). DB large objects - Oracle, DB2, Illustra. Uniform client interface (API).

21 Storage Resource Broker (SRB) MCAT - MetaData Catalog Datasets (files) and Collections (directories) - inodes and more. Storage resources User information - authentication, access privileges, etc. Software package Server, client library, UNIX-like utilities, Java GUI Platforms - Solaris, Sun OS, Digital Unix, SGI Irix, Cray T90.

22 SRB/DataCutter - Prototype Implementation Support for Range Queries Creation of indices over data sets (composed set of data files) Subsetting of data sets Search for files or portions of files that intersect a given range query Restricted filter operations on portions of files (data segments) before returning them to the client (to perform filtering or aggregation to reduce data volume)

23 SRB/DataCutter System Application (SRB client) Resource User File SID DBLobjID ObjSID Range Query Storage Resource Broker (SRB) Indexing Service DataCutter MCAT SRB I/O and MCAT API Filtering Service Filter Filter Application Meta-data DB2, Oracle, Illustra, ObjectStore HPSS, UniTree UNIX, ftp Distributed Storage Resources

24 SRB/DataCutter Client Interface Creating and Deleting Index int sfocreateindex(srbconn *conn, sfoclass class, int cattype, char *inindexname, char *outindexname, char *resourcename) int sfodeleteindex(srbconn *conn, sfoclass class, int cattype, char *indexname)

25 SRB/DataCutter Client Interface Searching Index -- R-tree index typedef struct { int dim; /* bounding box dimensions */ double *min; /* minimum in each dimension */ double *max; /* maximum in each dimension */ } sfombr; /* Bounding box structure */ typedef struct { sfombr segmentmbr; /* bounding box of the segment */ char *objid; /* object in SRB that contains the segment */ char *collectionname; /* collection where object is stored */ unsigned int offset; /* offset of the segment in the object */ unsigned int size; /* size of segment */ } segmentinfo; /* segment meta-data information */ typedef struct { int segmentcount; /* number of segments returned */ segmentinfo *segments; /* segment meta-data information */ int continueindex; /* continuation flag */ } indexsearchresult; /* search result structure */

26 SRB/DataCutter Client Interface Searching Index -- R-tree index int sfosearchindex(srbconn *conn, sfoclass class, char *indexname, void *query, indexsearchresult *myresult, int maxsegcount) typedef struct { int dim; double *min, *max; } rangequery; int sfogetmoresearchresult(srbconn *conn, int continueindex, indexsearchresult *myresult, int maxsegcount)

27 Applying Filters typedef struct { segmentinfo seginfo; /* info on segment data buffer after filter oper. */ char *segment; /* segment data buffer after filter is applied */ } segmentdata; typedef struct { int segmentdatacount; /* #segments in segmentdata array */ segmentdata *segments; /* segmentdata array */ int continueindex; /* continuation flag */ } filterdataresult;

28 Applying Filters int sfoapplyfilter(srbconn *conn, sfoclass class, char *hostname, int filterid, char *filterarg, int numofinputsegments, segmentinfo *inputsegments, filterdataresult *myresult, int maxsegcount) int sfogetmorefilterresult(srbconn *conn, int continueindex, filterdataresult *myresult, int maxsegcount)

29 Application: Virtual Microscope Interactive software emulation of high power light microscope for processing/visualizing image datasets 3-D Image Dataset (100MB to 5GB per focal plane) Client-server system organization Rectangular region queries, multiple data chunk reply pipeline style processing read_data decompress clip zoom view

30 Virtual Microscope Client

31 VM Application using SRB/DataCutter read Indexing Wide Area Network SRB/DataCutter Distributed Collection of Workstations decompress clip zoom Distributed Storage Resources Local Area Network read read image chunks decompress clip zoom convert jpeg image chunks into RGB pixels clip image to query boundaries sub-sample to the required magnification view Client view Client view stitch image pieces together and display image

DataCutter and A Client Interface for the Storage Resource Broker. with DataCutter Services. Johns Hopkins Medical Institutions

DataCutter and A Client Interface for the Storage Resource Broker. with DataCutter Services. Johns Hopkins Medical Institutions DataCutter and A Client Interface for the Storage Resource Broker with DataCutter Services Tahsin Kurc y, Michael Beynon y, Alan Sussman y, Joel Saltz y+ y Institute for Advanced Computer Studies + Dept.

More information

A Simple Mass Storage System for the SRB Data Grid

A Simple Mass Storage System for the SRB Data Grid A Simple Mass Storage System for the SRB Data Grid Michael Wan, Arcot Rajasekar, Reagan Moore, Phil Andrews San Diego Supercomputer Center SDSC/UCSD/NPACI Outline Motivations for implementing a Mass Storage

More information

ADR and DataCutter. Sergey Koren CMSC818S. Thursday March 4 th, 2004

ADR and DataCutter. Sergey Koren CMSC818S. Thursday March 4 th, 2004 ADR and DataCutter Sergey Koren CMSC818S Thursday March 4 th, 2004 Active Data Repository Used for building parallel databases from multidimensional data sets Integrates storage, retrieval, and processing

More information

Very Large Dataset Access and Manipulation: Active Data Repository (ADR) and DataCutter

Very Large Dataset Access and Manipulation: Active Data Repository (ADR) and DataCutter Very Large Dataset Access and Manipulation: Active Data Repository (ADR) and DataCutter Joel Saltz Alan Sussman Tahsin Kurc University of Maryland, College Park and Johns Hopkins Medical Institutions http://www.cs.umd.edu/projects/adr

More information

Multiple Query Optimization for Data Analysis Applications on Clusters of SMPs

Multiple Query Optimization for Data Analysis Applications on Clusters of SMPs Multiple Query Optimization for Data Analysis Applications on Clusters of SMPs Henrique Andrade y, Tahsin Kurc z, Alan Sussman y, Joel Saltz yz y Dept. of Computer Science University of Maryland College

More information

The Virtual Microscope

The Virtual Microscope The Virtual Microscope Renato Ferreira y, Bongki Moon, PhD y, Jim Humphries y,alansussman,phd y, Joel Saltz, MD, PhD y z, Robert Miller, MD z, Angelo Demarzo, MD z y UMIACS and Dept of Computer Science

More information

Knowledge-based Grids

Knowledge-based Grids Knowledge-based Grids Reagan Moore San Diego Supercomputer Center (http://www.npaci.edu/dice/) Data Intensive Computing Environment Chaitan Baru Walter Crescenzi Amarnath Gupta Bertram Ludaescher Richard

More information

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT. Chapter 4:- Introduction to Grid and its Evolution Prepared By:- Assistant Professor SVBIT. Overview Background: What is the Grid? Related technologies Grid applications Communities Grid Tools Case Studies

More information

Developing Data-Intensive Applications in the Grid

Developing Data-Intensive Applications in the Grid Developing Data-Intensive Applications in the Grid Grid Forum Advanced Programming Models Group White Paper Tahsin Kurc y, Michael Beynon y,alansussman y,joelsaltz yz y Dept. of Computer z Dept. of Pathology

More information

Query Planning for Range Queries with User-defined Aggregation on Multi-dimensional Scientific Datasets

Query Planning for Range Queries with User-defined Aggregation on Multi-dimensional Scientific Datasets Query Planning for Range Queries with User-defined Aggregation on Multi-dimensional Scientific Datasets Chialin Chang y, Tahsin Kurc y, Alan Sussman y, Joel Saltz y+ y UMIACS and Dept. of Computer Science

More information

The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures

The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures Samuel D. Gasster, Craig A. Lee, Brooks Davis, Matt Clark, Mike AuYeung, John R. Wilson Computer Systems

More information

Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine

Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine Henrique Andrade y, Tahsin Kurc z, Alan Sussman y, Joel Saltz z y Dept. of Computer Science University of Maryland College

More information

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid THE GLOBUS PROJECT White Paper GridFTP Universal Data Transfer for the Grid WHITE PAPER GridFTP Universal Data Transfer for the Grid September 5, 2000 Copyright 2000, The University of Chicago and The

More information

Large Irregular Datasets and the Computational Grid

Large Irregular Datasets and the Computational Grid Large Irregular Datasets and the Computational Grid Joel Saltz University of Maryland College Park Computer Science Department Johns Hopkins Medical Institutions Pathology Department Computational grids

More information

Data Grid Services: The Storage Resource Broker. Andrew A. Chien CSE 225, Spring 2004 May 26, Administrivia

Data Grid Services: The Storage Resource Broker. Andrew A. Chien CSE 225, Spring 2004 May 26, Administrivia Data Grid Services: The Storage Resource Broker Andrew A. Chien CSE 225, Spring 2004 May 26, 2004 Administrivia This week:» 5/28 meet ½ hour early (430pm) Project Reports Due, 6/10, to Andrew s Office

More information

Systems Design and Implementation I.4 Naming in a Multiserver OS

Systems Design and Implementation I.4 Naming in a Multiserver OS Systems Design and Implementation I.4 Naming in a Multiserver OS System, SS 2009 University of Karlsruhe 06.5.2009 Jan Stoess University of Karlsruhe The Issue 2 The Issue In system construction we combine

More information

Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments. White Paper

Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments. White Paper Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments White Paper 2 SRB: Enabling Collaboration in Complex Distributed Environments Table of Contents Introduction...3

More information

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System File Systems Kartik Gopalan Chapter 4 From Tanenbaum s Modern Operating System 1 What is a File System? File system is the OS component that organizes data on the raw storage device. Data, by itself, is

More information

Communication. Distributed Systems Santa Clara University 2016

Communication. Distributed Systems Santa Clara University 2016 Communication Distributed Systems Santa Clara University 2016 Protocol Stack Each layer has its own protocol Can make changes at one layer without changing layers above or below Use well defined interfaces

More information

Layered Architecture

Layered Architecture The Globus Toolkit : Introdution Dr Simon See Sun APSTC 09 June 2003 Jie Song, Grid Computing Specialist, Sun APSTC 2 Globus Toolkit TM An open source software toolkit addressing key technical problems

More information

CS 550 Operating Systems Spring File System

CS 550 Operating Systems Spring File System 1 CS 550 Operating Systems Spring 2018 File System 2 OS Abstractions Process: virtualization of CPU Address space: virtualization of memory The above to allow a program to run as if it is in its own private,

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

Distributed Systems 8. Remote Procedure Calls

Distributed Systems 8. Remote Procedure Calls Distributed Systems 8. Remote Procedure Calls Paul Krzyzanowski pxk@cs.rutgers.edu 10/1/2012 1 Problems with the sockets API The sockets interface forces a read/write mechanism Programming is often easier

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency

More information

Active Proxy-G: Optimizing the Query Execution Process in the Grid

Active Proxy-G: Optimizing the Query Execution Process in the Grid Active Proxy-G: Optimizing the Query Execution Process in the Grid Henrique Andrade y, Tahsin Kurc z,alansussman y,joelsaltz yz y Dept. of Computer Science University of Maryland College Park, MD 20742

More information

Functional Requirements for Grid Oriented Optical Networks

Functional Requirements for Grid Oriented Optical Networks Functional Requirements for Grid Oriented Optical s Luca Valcarenghi Internal Workshop 4 on Photonic s and Technologies Scuola Superiore Sant Anna Pisa June 3-4, 2003 1 Motivations Grid networking connection

More information

A Component-based Implementation of Iso-surface Rendering for Visualizing Large Datasets (Extended Abstract)

A Component-based Implementation of Iso-surface Rendering for Visualizing Large Datasets (Extended Abstract) A Component-based Implementation of Iso-surface Rendering for Visualizing Large Datasets (Extended Abstract) Michael D. Beynon y, Tahsin Kurc z, Umit Catalyurek z, Alan Sussman y, Joel Saltz yz y Institute

More information

Log Manager. Introduction

Log Manager. Introduction Introduction Log may be considered as the temporal database the log knows everything. The log contains the complete history of all durable objects of the system (tables, queues, data items). It is possible

More information

OS structure. Process management. Major OS components. CSE 451: Operating Systems Spring Module 3 Operating System Components and Structure

OS structure. Process management. Major OS components. CSE 451: Operating Systems Spring Module 3 Operating System Components and Structure CSE 451: Operating Systems Spring 2012 Module 3 Operating System Components and Structure Ed Lazowska lazowska@cs.washington.edu Allen Center 570 The OS sits between application programs and the it mediates

More information

Caching and Buffering in HDF5

Caching and Buffering in HDF5 Caching and Buffering in HDF5 September 9, 2008 SPEEDUP Workshop - HDF5 Tutorial 1 Software stack Life cycle: What happens to data when it is transferred from application buffer to HDF5 file and from HDF5

More information

Lecture 2 Process Management

Lecture 2 Process Management Lecture 2 Process Management Process Concept An operating system executes a variety of programs: Batch system jobs Time-shared systems user programs or tasks The terms job and process may be interchangeable

More information

Cost Models for Query Processing Strategies in the Active Data Repository

Cost Models for Query Processing Strategies in the Active Data Repository Cost Models for Query rocessing Strategies in the Active Data Repository Chialin Chang Institute for Advanced Computer Studies and Department of Computer Science University of Maryland, College ark 272

More information

Operating Systems 2 nd semester 2016/2017. Chapter 4: Threads

Operating Systems 2 nd semester 2016/2017. Chapter 4: Threads Operating Systems 2 nd semester 2016/2017 Chapter 4: Threads Mohamed B. Abubaker Palestine Technical College Deir El-Balah Note: Adapted from the resources of textbox Operating System Concepts, 9 th edition

More information

NUMA-aware OpenMP Programming

NUMA-aware OpenMP Programming NUMA-aware OpenMP Programming Dirk Schmidl IT Center, RWTH Aachen University Member of the HPC Group schmidl@itc.rwth-aachen.de Christian Terboven IT Center, RWTH Aachen University Deputy lead of the HPC

More information

Collection-Based Persistent Digital Archives - Part 1

Collection-Based Persistent Digital Archives - Part 1 Página 1 de 16 D-Lib Magazine March 2000 Volume 6 Number 3 ISSN 1082-9873 Collection-Based Persistent Digital Archives - Part 1 Reagan Moore, Chaitan Baru, Arcot Rajasekar, Bertram Ludaescher, Richard

More information

OPERATING SYSTEM. Chapter 4: Threads

OPERATING SYSTEM. Chapter 4: Threads OPERATING SYSTEM Chapter 4: Threads Chapter 4: Threads Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues Operating System Examples Objectives To

More information

SDS: A Scalable Data Services System in Data Grid

SDS: A Scalable Data Services System in Data Grid SDS: A Scalable Data s System in Data Grid Xiaoning Peng School of Information Science & Engineering, Central South University Changsha 410083, China Department of Computer Science and Technology, Huaihua

More information

Chapter 11: Implementing File Systems

Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems Operating System Concepts 99h Edition DM510-14 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation

More information

Chapter 3: Client-Server Paradigm and Middleware

Chapter 3: Client-Server Paradigm and Middleware 1 Chapter 3: Client-Server Paradigm and Middleware In order to overcome the heterogeneity of hardware and software in distributed systems, we need a software layer on top of them, so that heterogeneity

More information

Accessibility Features in the SAS Intelligence Platform Products

Accessibility Features in the SAS Intelligence Platform Products 1 CHAPTER 1 Overview of Common Data Sources Overview 1 Accessibility Features in the SAS Intelligence Platform Products 1 SAS Data Sets 1 Shared Access to SAS Data Sets 2 External Files 3 XML Data 4 Relational

More information

Grid Computing. MCSN - N. Tonellotto - Distributed Enabling Platforms

Grid Computing. MCSN - N. Tonellotto - Distributed Enabling Platforms Grid Computing 1 Resource sharing Elements of Grid Computing - Computers, data, storage, sensors, networks, - Sharing always conditional: issues of trust, policy, negotiation, payment, Coordinated problem

More information

Distributing BaBar Data using the Storage Resource Broker (SRB)

Distributing BaBar Data using the Storage Resource Broker (SRB) Distributing BaBar Data using the Storage Resource Broker (SRB) W. Kröger (SLAC), L. Martin (Univ. Paris VI et VII), D. Boutigny (LAPP - CNRS/IN2P3), A. Hanushevsky (SLAC), A. Hasan (SLAC) For the BaBar

More information

QCOM Reference Guide

QCOM Reference Guide QCOM Reference Guide Lars Wirfelt 2002 06 10 Copyright 2005 2016 SSAB EMEA AB Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License,

More information

4.8 Summary. Practice Exercises

4.8 Summary. Practice Exercises Practice Exercises 191 structures of the parent process. A new task is also created when the clone() system call is made. However, rather than copying all data structures, the new task points to the data

More information

Data Management 1. Grid data management. Different sources of data. Sensors Analytic equipment Measurement tools and devices

Data Management 1. Grid data management. Different sources of data. Sensors Analytic equipment Measurement tools and devices Data Management 1 Grid data management Different sources of data Sensors Analytic equipment Measurement tools and devices Need to discover patterns in data to create information Need mechanisms to deal

More information

Virtual Machine Systems

Virtual Machine Systems Virtual Machine Systems Question Can a small operating system simulate the hardware of some machine so that Another operating system can run in that simulated hardware? More than one instance of that operating

More information

OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD.

OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD. OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD. File System Implementation FILES. DIRECTORIES (FOLDERS). FILE SYSTEM PROTECTION. B I B L I O G R A P H Y 1. S I L B E R S C H AT Z, G A L V I N, A N

More information

Chapter 10: File System Implementation

Chapter 10: File System Implementation Chapter 10: File System Implementation Chapter 10: File System Implementation File-System Structure" File-System Implementation " Directory Implementation" Allocation Methods" Free-Space Management " Efficiency

More information

Element: Relations: Topology: no constraints.

Element: Relations: Topology: no constraints. The Module Viewtype The Module Viewtype Element: Elements, Relations and Properties for the Module Viewtype Simple Styles Call-and-Return Systems Decomposition Style Uses Style Generalization Style Object-Oriented

More information

GRID Stream Database Managent for Scientific Applications

GRID Stream Database Managent for Scientific Applications GRID Stream Database Managent for Scientific Applications Milena Ivanova (Koparanova) and Tore Risch IT Department, Uppsala University, Sweden Outline Motivation Stream Data Management Computational GRIDs

More information

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods

More information

SRB Logical Structure

SRB Logical Structure SDSC Storage Resource Broker () Introduction and Applications based on material by Arcot Rajasekar, Reagan Moore et al San Diego Supercomputer Center, UC San Diego A distributed file system (Data Grid),

More information

OPERATING SYSTEM. Chapter 12: File System Implementation

OPERATING SYSTEM. Chapter 12: File System Implementation OPERATING SYSTEM Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management

More information

Insights into TSM/HSM for UNIX and Windows

Insights into TSM/HSM for UNIX and Windows IBM Software Group Insights into TSM/HSM for UNIX and Windows Oxford University TSM Symposium 2005 Jens-Peter Akelbein (akelbein@de.ibm.com) IBM Tivoli Storage SW Development 1 IBM Software Group Tivoli

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods

More information

CS 5460/6460 Operating Systems

CS 5460/6460 Operating Systems CS 5460/6460 Operating Systems Fall 2009 Instructor: Matthew Flatt Lecturer: Kevin Tew TAs: Bigyan Mukherjee, Amrish Kapoor 1 Join the Mailing List! Reminders Make sure you can log into the CADE machines

More information

Overview and Introduction to Scientific Visualization. Texas Advanced Computing Center The University of Texas at Austin

Overview and Introduction to Scientific Visualization. Texas Advanced Computing Center The University of Texas at Austin Overview and Introduction to Scientific Visualization Texas Advanced Computing Center The University of Texas at Austin Scientific Visualization The purpose of computing is insight not numbers. -- R. W.

More information

CS420: Operating Systems

CS420: Operating Systems Threads James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz, Galvin, Gagne Threads A thread is a basic unit of processing

More information

Chapter 4: Threads. Chapter 4: Threads

Chapter 4: Threads. Chapter 4: Threads Chapter 4: Threads Silberschatz, Galvin and Gagne 2013 Chapter 4: Threads Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues Operating System Examples

More information

Buffer Management for XFS in Linux. William J. Earl SGI

Buffer Management for XFS in Linux. William J. Earl SGI Buffer Management for XFS in Linux William J. Earl SGI XFS Requirements for a Buffer Cache Delayed allocation of disk space for cached writes supports high write performance Delayed allocation main memory

More information

Memory Hierarchies && The New Bottleneck == Cache Conscious Data Access. Martin Grund

Memory Hierarchies && The New Bottleneck == Cache Conscious Data Access. Martin Grund Memory Hierarchies && The New Bottleneck == Cache Conscious Data Access Martin Grund Agenda Key Question: What is the memory hierarchy and how to exploit it? What to take home How is computer memory organized.

More information

Distributed Objects. Object-Oriented Application Development

Distributed Objects. Object-Oriented Application Development Distributed s -Oriented Application Development Procedural (non-object oriented) development Data: variables Behavior: procedures, subroutines, functions Languages: C, COBOL, Pascal Structured Programming

More information

Chapter 11: Implementing File

Chapter 11: Implementing File Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency

More information

CS2506 Quick Revision

CS2506 Quick Revision CS2506 Quick Revision OS Structure / Layer Kernel Structure Enter Kernel / Trap Instruction Classification of OS Process Definition Process Context Operations Process Management Child Process Thread Process

More information

MATE-EC2: A Middleware for Processing Data with Amazon Web Services

MATE-EC2: A Middleware for Processing Data with Amazon Web Services MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer David Chiu* and Gagan Agrawal Department of Compute Science and Engineering Ohio State University * School of Engineering

More information

Harnessing Metadata Characteristics for Efficient Deduplication in Distributed Storage Systems. Matthew Goldstein

Harnessing Metadata Characteristics for Efficient Deduplication in Distributed Storage Systems. Matthew Goldstein Harnessing Metadata Characteristics for Efficient Deduplication in Distributed Storage Systems by Matthew Goldstein Submitted to the Department of Electrical Engineering and Computer Science in partial

More information

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition Chapter 11: Implementing File Systems Operating System Concepts 9 9h Edition Silberschatz, Galvin and Gagne 2013 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory

More information

Distributed Data Management with Storage Resource Broker in the UK

Distributed Data Management with Storage Resource Broker in the UK Distributed Data Management with Storage Resource Broker in the UK Michael Doherty, Lisa Blanshard, Ananta Manandhar, Rik Tyer, Kerstin Kleese @ CCLRC, UK Abstract The Storage Resource Broker (SRB) is

More information

1.264 Lecture 16. Legacy Middleware

1.264 Lecture 16. Legacy Middleware 1.264 Lecture 16 Legacy Middleware What is legacy middleware? Client (user interface, local application) Client (user interface, local application) How do we connect clients and servers? Middleware Network

More information

RPC and RMI. 2501ICT Nathan

RPC and RMI. 2501ICT Nathan RPC and RMI 2501ICT Nathan Contents Client/Server revisited RPC Architecture XDR RMI Principles and Operation Case Studies Copyright 2002- René Hexel. 2 Client/Server Revisited Server Accepts commands

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Allocation Methods Free-Space Management

More information

Oracle Secure Backup 12.1 Technical Overview

Oracle Secure Backup 12.1 Technical Overview Oracle Secure Backup 12.1 Technical Overview February 12, 2015 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and

More information

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) Dept. of Computer Science & Engineering Chentao Wu wuct@cs.sjtu.edu.cn Download lectures ftp://public.sjtu.edu.cn User:

More information

Part Two - Process Management. Chapter 3: Processes

Part Two - Process Management. Chapter 3: Processes Part Two - Process Management Chapter 3: Processes Chapter 3: Processes 3.1 Process Concept 3.2 Process Scheduling 3.3 Operations on Processes 3.4 Interprocess Communication 3.5 Examples of IPC Systems

More information

FILE SYSTEMS. Tanzir Ahmed CSCE 313 Fall 2018

FILE SYSTEMS. Tanzir Ahmed CSCE 313 Fall 2018 FILE SYSTEMS Tanzir Ahmed CSCE 313 Fall 2018 References Previous offerings of the same course by Prof Tyagi and Bettati Textbook: Operating System Principles and Practice 2 The UNIX File System File Systems

More information

Threads. What is a thread? Motivation. Single and Multithreaded Processes. Benefits

Threads. What is a thread? Motivation. Single and Multithreaded Processes. Benefits CS307 What is a thread? Threads A thread is a basic unit of CPU utilization contains a thread ID, a program counter, a register set, and a stack shares with other threads belonging to the same process

More information

Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine

Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine Henrique Andrade y,tahsinkurc z, Alan Sussman y, Joel Saltz z y Dept. of Computer Science University of Maryland College

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Virtual File Systems. Allocation Methods. Folder Implementation. Free-Space Management. Directory Block Placement. Recovery. Virtual File Systems An object-oriented

More information

Using Virtual Slides in Medical Education with the Virtual Slice System. Jack Glaser, President MicroBrightField, Inc.

Using Virtual Slides in Medical Education with the Virtual Slice System. Jack Glaser, President MicroBrightField, Inc. Using Virtual Slides in Medical Education with the Virtual Slice System Jack Glaser, President MicroBrightField, Inc. Advantages of Virtual Slides Overview 0.08x Single section 0.63x Entire 2 x3 inch

More information

Parameterized specification, configuration and execution of data-intensive scientific workflows

Parameterized specification, configuration and execution of data-intensive scientific workflows Cluster Comput (2010) 13: 315 333 DOI 10.1007/s10586-010-0133-8 Parameterized specification, configuration and execution of data-intensive scientific workflows Vijay S. Kumar Tahsin Kurc Varun Ratnakar

More information

Distributed Systems. How do regular procedure calls work in programming languages? Problems with sockets RPC. Regular procedure calls

Distributed Systems. How do regular procedure calls work in programming languages? Problems with sockets RPC. Regular procedure calls Problems with sockets Distributed Systems Sockets interface is straightforward [connect] read/write [disconnect] Remote Procedure Calls BUT it forces read/write mechanism We usually use a procedure call

More information

Replica Selection in the Globus Data Grid

Replica Selection in the Globus Data Grid Replica Selection in the Globus Data Grid Sudharshan Vazhkudai 1, Steven Tuecke 2, and Ian Foster 2 1 Department of Computer and Information Science The University of Mississippi chucha@john.cs.olemiss.edu

More information

Chapter 3: Processes. Operating System Concepts 8th Edition

Chapter 3: Processes. Operating System Concepts 8th Edition Chapter 3: Processes Chapter 3: Processes Process Concept Process Scheduling Operations on Processes Interprocess Communication Examples of IPC Systems Communication in Client-Server Systems 3.2 Objectives

More information

Presentation Services. Presentation Services: Motivation

Presentation Services. Presentation Services: Motivation Presentation Services need for a presentation services ASN.1 declaring data type encoding data types implementation issues reading: Tannenbaum 7.3.2 Presentation Services: Motivation Question: suppose

More information

Lecture 15: Network File Systems

Lecture 15: Network File Systems Lab 3 due 12/1 Lecture 15: Network File Systems CSE 120: Principles of Operating Systems Alex C. Snoeren Network File System Simple idea: access disks attached to other computers Share the disk with many

More information

Chapter 2. DB2 concepts

Chapter 2. DB2 concepts 4960ch02qxd 10/6/2000 7:20 AM Page 37 DB2 concepts Chapter 2 Structured query language 38 DB2 data structures 40 Enforcing business rules 49 DB2 system structures 52 Application processes and transactions

More information

Optimizing Multiple Queries on Scientific Datasets with Partial Replicas

Optimizing Multiple Queries on Scientific Datasets with Partial Replicas Optimizing Multiple Queries on Scientific Datasets with Partial Replicas Li Weng Umit Catalyurek Tahsin Kurc Gagan Agrawal Joel Saltz Department of Computer Science and Engineering Department of Biomedical

More information

Nested Virtualization and Server Consolidation

Nested Virtualization and Server Consolidation Nested Virtualization and Server Consolidation Vara Varavithya Department of Electrical Engineering, KMUTNB varavithya@gmail.com 1 Outline Virtualization & Background Nested Virtualization Hybrid-Nested

More information

Communication. Overview

Communication. Overview Communication Chapter 2 1 Overview Layered protocols Remote procedure call Remote object invocation Message-oriented communication Stream-oriented communication 2 Layered protocols Low-level layers Transport

More information

Chapter 4: Multithreaded Programming

Chapter 4: Multithreaded Programming Chapter 4: Multithreaded Programming Silberschatz, Galvin and Gagne 2013! Chapter 4: Multithreaded Programming Overview Multicore Programming Multithreading Models Threading Issues Operating System Examples

More information

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Storage Resource Broker Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Reagan W. Moore moore@sdsc.edu http://www.sdsc.edu/srb Background NARA research prototype persistent

More information

Introduction and Overview Socket Programming Higher-level interfaces Final thoughts. Network Programming. Samuli Sorvakko/Nixu Oy

Introduction and Overview Socket Programming Higher-level interfaces Final thoughts. Network Programming. Samuli Sorvakko/Nixu Oy Network Programming Samuli Sorvakko/Nixu Oy Telecommunications software and Multimedia Laboratory T-110.4100 Computer Networks October 9, 2006 Agenda 1 Introduction and Overview Introduction 2 Socket Programming

More information

Grid Computing Middleware. Definitions & functions Middleware components Globus glite

Grid Computing Middleware. Definitions & functions Middleware components Globus glite Seminar Review 1 Topics Grid Computing Middleware Grid Resource Management Grid Computing Security Applications of SOA and Web Services Semantic Grid Grid & E-Science Grid Economics Cloud Computing 2 Grid

More information

CIS Operating Systems File Systems. Professor Qiang Zeng Fall 2017

CIS Operating Systems File Systems. Professor Qiang Zeng Fall 2017 CIS 5512 - Operating Systems File Systems Professor Qiang Zeng Fall 2017 Previous class I/O subsystem: hardware aspect Terms: controller, bus, port Addressing: port-mapped IO and memory-mapped IO I/O subsystem:

More information

Technical White Paper

Technical White Paper Technical White Paper On Implementing IBM InfoSphere Change Data Capture for Sybase with a Remote Database Server Awajeet Kumar Arya(awajarya@in.ibm.com) CONTENTS Trademarks...03 Introduction...04 Overview...04

More information

COSC 6397 Big Data Analytics. Distributed File Systems (II) Edgar Gabriel Fall HDFS Basics

COSC 6397 Big Data Analytics. Distributed File Systems (II) Edgar Gabriel Fall HDFS Basics COSC 6397 Big Data Analytics Distributed File Systems (II) Edgar Gabriel Fall 2018 HDFS Basics An open-source implementation of Google File System Assume that node failure rate is high Assumes a small

More information

A Metadata Catalog Service for Data Intensive Applications

A Metadata Catalog Service for Data Intensive Applications Metadata Catalog Service Draft August 5, 2002 A Metadata Catalog Service for Data Intensive Applications Ann Chervenak, Ewa Deelman, Carl Kesselman, Laura Pearlman, Gurmeet Singh Version 1.0 1 Introduction

More information

UNIT 4 CORBA 4/2/2013 Middleware 59

UNIT 4 CORBA 4/2/2013 Middleware 59 UNIT 4 CORBA 4/2/2013 Middleware 59 CORBA AN OBJECT ORIENTED RPC MECHANISM HELPS TO DEVELOP DISTRIBUTED SYTEMS IN DIFF. PLATFORMS OBJECTS WRITTEN IN DIFF., LANG, CAN BE CALLED BY OBJECTS WRITTEN IN ANOTHER

More information

Multithreaded Programming

Multithreaded Programming Multithreaded Programming The slides do not contain all the information and cannot be treated as a study material for Operating System. Please refer the text book for exams. September 4, 2014 Topics Overview

More information

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems Distributed Systems Outline Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems What Is A Distributed System? A collection of independent computers that appears

More information