Integration of Cloud Storage with Data Grids

M. WAN
University of California, San Diego, CA, USA
AND
R. MOORE AND A. RAJASEKAR
University of North Carolina, Chapel Hill, NC, USA

The integrated Rule Oriented Data System (iRODS) is a data grid that organizes distributed data into a sharable collection while enforcing management policies. The Amazon Simple Storage Service (S3) is an internet-based cloud storage service that allows users to store and retrieve data from anywhere on the web at any time. Whereas S3 provides robust storage, it does not offer any other functionality. The iRODS system provides a rich set of authentication, authorization and auditing facilities, a means to associate descriptive metadata with data stored in iRODS through which users can discover and share data, and a means to maintain the integrity and authenticity of data and to recover from corruption based on replication strategies. The iRODS system provides policy-based data management that allows each community of collaborating users to customize its complete data life-cycle management policies to meet its needs. We have integrated S3 storage with iRODS so that users have a rich set of functionality layered on top of the simple cloud storage offered by S3. The integration was accomplished using the "Compound Resource Framework", one of the integration methods in iRODS. The compound resource framework provides an intermediate cache between the systems that allows iRODS to manage the protocol mismatch between the put/get functionality exposed by S3 and the richer POSIX I/O of iRODS. Moreover, the framework makes data transfer between the client and S3 efficient by using the cache to manage the bandwidth/latency mismatch between the client system and the S3 host. The integration of iRODS with the S3 cloud storage system gives the user full-fledged data management functionality on top of the storage functionality offered by S3.
Moreover, since iRODS can manage distributed resources, the integration allows one to integrate, discover and access data stored in multiple and diverse cloud and non-cloud storage systems.

Categories and Subject Descriptors: H.3.2 [Information Storage]: File Organization.
General Terms: Design, Management
Additional Key Words and Phrases: Storage model, cloud storage, data grids and rule-oriented systems
ACM File Format: WAN, M., MOORE, R. AND RAJASEKAR, A. Integration of Cloud Storage with Data Grids. Proc. Third International Conference on the Virtual Computing Initiative (October 2009), 10 pages.

1. INTRODUCTION

The integrated Rule Oriented Data System (iRODS) [1,2,3] is a data grid that organizes distributed data into a sharable collection while enforcing management policies. The Amazon Simple Storage Service (S3) [4] is an internet-based cloud storage service that allows users to store and retrieve data from anywhere on the web at any time. Whereas S3 provides robust storage, it does not offer any other functionality. The iRODS system provides a rich set of authentication, authorization and auditing facilities, a means to associate descriptive metadata with data stored in iRODS through which users can discover and share data, and a means to maintain the integrity and authenticity of data and to recover from corruption based on replication strategies. The iRODS system provides policy-based data management that allows each community of collaborating users to customize its complete data life-cycle management to meet its needs. The integration of iRODS with the S3 cloud storage system gives the user rich data management functionality on top of the storage functionality offered by S3. Moreover, since iRODS can manage distributed resources, the integration allows one to integrate, discover and access data stored in multiple and diverse cloud and non-cloud storage systems.

Cloud computing [5,6] is a new model of on-demand computing that is emerging as an alternative to in-house computing. Cloud computing is also a business model in which virtualized computing resources are provided as a service, by a third-party provider, over the wide area network. Users buy time on these compute resources as needed without worrying about installing, maintaining or upgrading local infrastructure. Cloud computing is useful and efficient for meeting peak loads and short-term demands, for serving a changing user base, and for providing fault tolerance. Amazon's Elastic Compute Cloud (EC2) [7], Google [8], Sun Cloud [9], IBM CloudBurst [10], Microsoft Azure [11] and GoGrid [12] are some examples of vendor-provided cloud computing.

In association with cloud computing, development has also occurred in the area of provisioning demand-based storage, called cloud storage [13]. Cloud storage provides network-accessible storage capacity for storing files at a remote server site. Like cloud computing, cloud storage is a service provided by a third party, where users pay for the storage that they use and the bandwidth they consume when importing data into and exporting data from the cloud storage system. Cloud storage is useful for off-line storage of files (disaster recovery and fault tolerance), for caching of data for new cloud computing resources, for meeting temporary spikes in storage needs, and for providing better and more reliable web hosting services (load balancing). Amazon's Simple Storage Service (S3) [4], Nirvanix [14], Google Docs [15] and Rackspace Cloud [16] are examples of cloud storage services.

The Grid is the software infrastructure that links distributed computational resources such as people, computers, sensors and data [17]. The Data Grid links distributed storage resources, from archival systems, to caches, to databases. The data within the Data Grid are mapped to a uniform logical name space to create global, persistent collections.
It is possible to create and manage geographically distributed replicas of the digital entities that are registered into the collection. The naming convention for the digital entities can be global in scale, making it possible to use data grids to share access to data between continents. Data grids enable sharing by providing network-wide user identification, third-party access control, and a means to associate descriptive metadata with stored data, enabling community users to discover and access relevant data. In addition to their use as data sharing environments, data grids can also be used to support publication and preservation of data. Examples of data grids include the Storage Resource Broker [18], the integrated Rule-Oriented Data System [1,2,3] and the Globus Data Grid [19]. Examples of data grid usage can be found in [18].

We present an integration architecture that combines the benefits of cloud storage systems and data grids to provide a user with rich distributed data management capabilities on top of the reliability, ease of use and cost-effectiveness provided by the cloud storage paradigm. We demonstrate the feasibility of the approach by integrating the iRODS data grid with the Amazon S3 cloud environment.

2. AMAZON SIMPLE STORAGE SYSTEM (S3)

Amazon S3 [4] is a pioneering cloud storage system that enables users to extend their storage capacity without much capital outlay. S3 provides internet-based access to storage at its site using a web-service interface. Through this interface users can store and retrieve any amount of data, at any time, from anywhere on the web, which provides scalability, reliability and portability. The service provided by S3 relieves small and large organizations of the capital costs of installing large storage banks and of the recurring costs of maintaining and periodically upgrading them. Moreover, the organizations need not worry about meeting peak demand, recovery from

faults, and provision of large-bandwidth networks for data distribution. The shift to the per-use, service-based model of S3 allows for an agile development cycle and for experimenting with new ideas without incurring large upfront costs. The key characteristics of the S3 storage system are as follows [4]:

- Improved agility, allowing for changing strategies in the usage of storage.
- Reduced cost due to the reduction in capital and operational expenditures.
- Portability and access from any location, increasing the types of usage models that one can achieve.
- Sharing of resources with other users who may be co-located.
- Improved reliability, achieved through the redundancy in storage offered by S3.
- Extensibility and scalability, as the S3 model does not place any restrictions on the storage provided. Hence, one can increase or decrease the use of space as needed.
- Closer integration with cloud computing (Amazon EC2).

Amazon's S3 provides a web service interface (SOAP and REST interfaces) for ingesting and accessing files from its cloud storage system. The functionality allows a user to read, write and delete files that are up to 5 Gigabytes in size. The files are stored in logically named containers called buckets. The bucket names or keys are user defined, and a user account can have up to 100 buckets. The system also allows one to list the files in a bucket as well as query for system metadata about each file. One can store an unlimited number of files in any bucket by giving a unique name (key) to each file. There is no concept of a hierarchical directory structure in S3, but one is not precluded from using the "/" character in the file name. In S3, buckets can be designated to be either in Europe or in the United States, and the files that are stored in a bucket are stored in a storage system in that location. Internally, the files can be stored in any location within these areas, and Amazon does not specify the redundancy it maintains for disaster recovery.
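The flat bucket/key namespace described above can still be browsed like a directory tree if "/" is treated as a separator inside the key. The following is a minimal sketch of that idea, assuming an in-memory dictionary in place of a real bucket; the function name `list_level` and the sample keys are hypothetical and no S3 call is made.

```python
# In-memory stand-in for one bucket: a flat mapping from key to contents.
# No S3 client is used; the keys and the helper name are illustrative only.
bucket = {
    "images/sky/m31.fits": b"...",
    "images/sky/m42.fits": b"...",
    "images/readme.txt": b"...",
    "notes.txt": b"...",
}


def list_level(keys, prefix="", delimiter="/"):
    """Return (files, subfolders) that sit directly under `prefix`.

    The key space is flat; "folders" exist only because the delimiter
    happens to appear inside some keys.
    """
    files, folders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past a delimiter: report one folder level.
            folders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return files, sorted(folders)


if __name__ == "__main__":
    print(list_level(bucket))
    print(list_level(bucket, "images/"))
```

A data grid layered on top of such a store can keep its own hierarchical logical names and map each one to a single flat key, which is why the lack of directories in S3 is not an obstacle to the integration.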
Access to a file in S3 is independent of its storage location. S3 also provides access control mechanisms to ensure protection as well as to enable sharing among users.

S3 has a pricing policy that differentiates between storage costs and the data transfer costs for moving files in and out of S3. The cost of storage is proportional to the Gigabytes used, and the rate becomes cheaper as the number of Terabytes stored increases. The cost of data transfer is higher for data ingestion compared to data access. Storage and data transfer costs also differ between the European and US sites, and data movement between sites is also charged. With such a pricing policy, S3 lets small enterprises leverage compute and storage capabilities immediately without building an extensive IT department, and lets large enterprises beta test new projects and directions without additional capital equipment and IT staff time.

S3 provides a web-service interface using both the REST and SOAP protocols. S3 also supports other protocols, such as BitTorrent, for accessing files from multiple sites. The web services interface provides the following service end points (we give brief explanations; more detailed documentation can be found at [4]):

- ListAllMyBuckets: returns the names of all buckets owned by the user
- CreateBucket: creates a new bucket
- DeleteBucket: deletes an empty bucket
- ListBucket: lists the objects in a bucket that meet the given search criteria
- GetBucketAccessControlPolicy: shows the ACLs for a bucket
- SetBucketAccessControlPolicy: sets the ACLs for a bucket for a given user

- GetBucketLoggingStatus: shows the logging status of a bucket
- SetBucketLoggingStatus: sets the logging status, that is, which actions to log
- PutObjectInline: ingests an object that is part of a SOAP message
- PutObject: ingests an object that is given as a DIME attachment
- CopyObject: copies an object from one bucket to another, possibly with a different name for the file
- GetObject: downloads a complete object
- GetObjectExtended: can download a partial object meeting given criteria
- DeleteObject: removes an object from a bucket (there is no trash can facility)
- GetObjectAccessControlPolicy: shows the ACLs for an object
- SetObjectAccessControlPolicy: sets the ACLs for an object for a given user

Several third-party groups have built user interfaces for accessing S3 capabilities that hide the intricacies of Amazon's web service interface. Several commercial enterprises also offer value-added services for access to the services offered by S3. Reference [20] provides a list of tools available for storing files in S3.

2. INTEGRATED RULE-ORIENTED DATA SYSTEMS

The integrated Rule-Oriented Data System [1,2,3] (iRODS) is peer-to-peer data grid middleware that provides a facility for collection-building, managing, querying, accessing, and preserving data in a distributed data grid framework. The iRODS system applies policy-based control when performing these functions. In brief, the iRODS system provides the following capabilities:

- Global persistent identifiers for naming digital objects. A unique identifier is used for each object stored in iRODS. Replicas and versions of the same object share the same global identifier but differ in replication and version metadata.
- Support for metadata to identify system-level physical properties of the stored data object.
  The properties that are stored include physical resource location, path names (or canned SQL queries in the case of database resources), owner, creation time, modification time, expiration times, file types, access controls, file size, location of replicas, aggregation in a container, etc.
- Support for descriptive metadata to enable discovery through simple query mechanisms. iRODS supports metadata in terms of attribute-value-unit triplets. Any number of such associations can be added for each digital object.
- Standard access mechanisms. Interfaces include Web browsers, Unix shell commands, Windows browsers, Python load libraries, Java, C library calls, a FUSE-based file interface, WebDAV, Kepler and Taverna workflows, etc.
- Storage repository abstraction. Files may be stored in multiple types of storage systems including tape systems, disk systems, databases and now cloud storage.
- Inter-realm authentication system for secure access to remote storage systems, including secure passwords and certificate-based authentication such as GSI.
- Support for replication and synchronization of files between resource sites.
- Support for caching copies of files onto a local storage system and support for accessing files in an archive using the compound resource methodology. This includes the concept of multiple replicas of an object with distinct usage models. Archives are used as safe copies and caches are used for immediate access.
- Support for physically aggregating files into tar files to optimize management of large numbers of small files.

- Access controls and audit trails to control and track data usage.
- Support for execution of remote operations for data sub-setting, metadata extraction, indexing, remote data movement, etc., using micro-services and rules.
- Support for rich I/O models for file ingestion and access, including in-situ registration of files into the system, inline transfer of small files, and parallel transfer for large files.
- Support for federation of data grids. Two independently managed persistent archives can establish a defined level of trust for the exchange of materials and access in one or both directions. This concept is very useful for developing a full-fledged preservation environment with dark and light archives.

The iRODS data grid system consists of several components. It has a metadata catalog server, called the iCAT server, which provides the metadata and abstraction services for the whole data grid. There can be multiple resource servers that provide access to storage resources. A resource server (iRES) can provide access to more than one storage resource. The system can support any number of clients at a time. A client can connect to any server on the grid and request access to digital objects from the system. The request is parsed using the contextual and system information stored in the iCAT catalog, and a physical object is identified and transferred to the client. The request can be in terms of logical object names, or a conditional query based on descriptive and system metadata attributes. iRODS is a peer-to-peer server system. Hence, requests can be made to any server, which in turn acts as a broker on behalf of the client for transferring the file. The final file transfer takes the shortest path in terms of the number of hops.

An important aspect of iRODS is its built-in rule framework.
As part of each resource server, a distributed rule engine is implemented that provides extensibility and customizability by encoding server-side operations (including the main access APIs) into sequences of micro-services. The sequence of micro-services is controlled by user-defined and/or administrator-defined Event-Condition-Action rules similar to those found in active databases. The rules can be viewed as defining pipelines and/or workflows. An ingestion or access process can be encoded as a rule to provide customized functionality. Rules can also be defined by users and executed interactively. Hence, changes to a particular process or policy can be constructed, tested and deployed by the user without the aid of system and application developers. The user can also define the conditions under which a rule is triggered, thus controlling the application of different rules (or processing pipelines) based on current events and operating conditions.

The programming of rules in iRODS can be viewed as Lego-block programming. The building blocks for iRODS rules are micro-services: small, well-defined procedures or functions that each perform a certain task. For example, one may encode a rule stating that additional authorization checks must be made when a data object is accessed from a collection C. These authorization checks can be encoded as a set of micro-services with different triggers that fire based on current operating conditions. In this way, one can control access to sensitive data based on rules and can escalate or reduce authorization levels dynamically as the situation warrants. The iRODS rule engine design builds upon theories and concepts from a wide range of well-known paradigms, from fields such as active databases, transactional systems, logic programming, business rule systems, constraint-management systems, workflows, service-oriented architecture and program verification.
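The collection-C example above can be sketched as an Event-Condition-Action rule whose action is a chain of micro-services. This is an illustrative model only, not the iRODS rule language or its API: the `Rule` class, the `fire` function, the first-match dispatch order, the context keys and the path `/zone/C/` are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass
class Rule:
    event: str                                 # e.g. "get" or "put"
    condition: Callable[[Context], bool]       # must hold for the rule to fire
    actions: List[Callable[[Context], None]]   # chain of micro-services


def check_extra_authorization(ctx: Context) -> None:
    """Micro-service: refuse access unless the user is explicitly cleared."""
    if ctx["user"] not in ctx["cleared_users"]:
        raise PermissionError(f"{ctx['user']} may not read {ctx['path']}")


def log_access(ctx: Context) -> None:
    """Micro-service: append one line to an in-memory audit trail."""
    ctx.setdefault("audit", []).append(f"{ctx['user']} read {ctx['path']}")


# Rules are tried in order; the more specific rule for collection C is first.
RULES = [
    Rule("get", lambda c: str(c["path"]).startswith("/zone/C/"),
         [check_extra_authorization, log_access]),
    Rule("get", lambda c: True, [log_access]),
]


def fire(event: str, ctx: Context, rules: List[Rule] = RULES) -> bool:
    """Run the action chain of the first rule whose event and condition match."""
    for rule in rules:
        if rule.event == event and rule.condition(ctx):
            for micro_service in rule.actions:
                micro_service(ctx)
            return True
    return False


if __name__ == "__main__":
    ctx = {"user": "alice", "path": "/zone/C/scan.dat", "cleared_users": {"alice"}}
    fire("get", ctx)
    print(ctx["audit"])
```

Changing the policy in this model means editing the rule list, not the micro-services, which mirrors the separation between policy and procedure that the rule engine is meant to provide.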
Apart from iRES servers and an iCAT server, iRODS also has two other servers: iSEC for scheduling and executing queued rules, and iXMS for

providing a message-passing framework between micro-services. Figure 1 shows the various components of the iRODS system as well as some of its user interfaces.

Figure 1. iRODS architecture

The iRODS system is in production use in multiple projects including the US National Archives Transcontinental Persistent Archive Prototype (TPAP) [21], the NSF Science of Learning Centers [22], the Australian Research Collaboration Services [23] and the SHAMAN project in the UK [24].

3. INTEGRATING iRODS AND AMAZON'S S3

Amazon's S3 provides a powerful and easy-to-deploy internet-based file storage system. But it does not provide other capabilities that would make it user-friendly and easy to integrate with existing storage systems. It lacks many of the features that would enable it to be used as a long-term, highly available and sharable resource. As it stands, it is good for parking files for projects, for use as a backup web site with public access, and as a storage system integrated with Amazon's EC2 compute cloud. Some of the capabilities that can be added to S3 to make it more viable are:

Full-fledged File System Interface: S3 does not expose the familiar hierarchical directory/folder structure. For each user it provides a limited number of buckets, and files are placed in them with unique names. S3 also does not have full-fledged ACLs that can be used for controlling access by user groups and the public. Also, it is not user friendly, as the user name space for user accounts consists of system-defined strings. Also, the concept of a public/anonymous user is not supported. At the API level, its protocol does not support the POSIX API, which is widely used in block-level programmatic interfaces. S3 also does not support symbolic links, where a file can be accessed through more than one file path.

Data Grid Services: S3 is not suitable for integrating and federating with other resources.
Many tools are available to use S3 as a backup resource, but they do not provide a means to use S3 in a federation of resources. Such a federation will allow multiple resources to

be shared, including other cloud storage services. S3 does not provide any tools for keeping track of replicated files and versioned files. Also, data grids need data to be transferred at high speed and in parallel. They also need to deal with data sizes larger than the (current) 5 GB limit within S3.

Digital Library Services: S3 does not have metadata support. Descriptive and system metadata are needed for keeping additional information about an object (such as engineering, calibration and positional information for a sky image taken by a telescope). This is useful not only for processing the files but also for discovering them later through a multi-dimensional search. Metadata schemas exist for multiple domains, such as FITS metadata for astronomical data, DICOM metadata for medical images, Dublin Core elements for electronic documents and Darwin Core for ecological data. Supporting such schemas for managing the contextual information for the data and enabling discovery is an important aspect of digital libraries.

Persistent Archive Services: Even though S3 provides a robust storage platform with long-term viability, it does not provide the necessary tools for maintaining a persistent archive. These include keeping track of the integrity of the digital objects, including chain of custody; packaging information (data and metadata) into bundles and keeping them together for ease of access and archiving; and consistency checking for validation of bit-wise integrity.

Some of these capabilities are provided by third-party services, but many of these are still not sufficient to provide coherent and robust data sharing, or to implement a digital library or persistent archiving environment on top of S3. By integrating iRODS on top of S3, these capabilities are realized, making S3 an attractive option for network-based storage.
We have designed and implemented such an integrated system and have shown that it is viable and provides value-added services for S3.

For our integration, we used the libs3 [25] library provided by a third-party software developer. The software is available as source and as binaries, under a GPL license, and can be downloaded from Amazon's S3 web site. We opted to use the libs3 package because it is written in our language of preference (C) and provides a simple means of integrating S3 with iRODS. The developer of libs3 had the design goal of implementing a C API for S3 access that provides a simple and straightforward API for accessing all of S3's functionality using sequentialized blocking requests, does not require the application developer to know anything about the internal S3 interface or about WSDL, HTTP, XML and SSL, can be used in a multi-threaded environment, and can be used from applications that connect multiple times simultaneously to S3. These design goals suited our purpose because they were well aligned with the iRODS design goals: multithreading, and sequential blocking access to files with get/put functionality.

There was one major mismatch between the access functionality provided by S3 (and libs3) and that of iRODS. iRODS provides access similar to that of POSIX I/O, including file open, seek, and close functionality not supported by S3. Also, iRODS allows users to access files in blocks; even though S3 has similar functionality through its GetObjectExtended API, using it to access small buffer sizes would make the system very slow. To manage this gap in functionality, our integration used the "Compound Resource Framework", one of the integration methods in iRODS. The compound resource framework allows one to group multiple resources into a single resource pool. Each of the resources in the pool has a designated resource type

such as Archive, Permanent, Cache and Volatile. An archive resource is used as a deep resource (such as tapes), possibly with high latency, and is used mainly for archiving files. One is disallowed from performing buffer-level operations on such files, and access is mainly through whole-file retrieval. A cache resource, on the other hand, is considered to be a low-latency (disk-based) system with high bandwidth and support for parallel I/O. Objects in a cache resource can be purged by the iRODS administrator to recover space. A permanent resource is similar to a cache but is not amenable to purging, and a volatile resource is a temporary resource that can purge data without the knowledge of the iRODS system. Synchronization and back-up functionality between cache files and archived/permanent files is supported under the compound resource framework. One can have multiple resources of the same type in a compound resource group.

In this framework, whenever a user ingests a file into the archive resource, it is not directly ingested into that resource. Instead, the file is automatically diverted into a cache resource. Periodically, or once a full file has been transferred, the file in the cache is synchronized into the archive by making a copy in the archive resource. When a file within the archive is accessed, it is first staged onto the cache resource and then provided to the user. An advantage of this staging is that the bandwidth mismatch between the archive-to-server link and the server-to-client link is smoothed in such a way that the load on, and interactions with, the archive resource are kept to a minimum. For example, if a user is accessing a file in 10-kilobyte blocks, the iRODS system brings the whole file from S3 and caches it in the associated disk resource (say the file is 1 GB in length). Independently of S3, iRODS performs the small access operations on the staged file.
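The staging behavior described above can be modeled in a few lines. The sketch below is an assumption-laden illustration, not iRODS or libs3 code: `ArchiveStore` stands in for a put/get-only store such as S3, `CompoundResource` stands in for a cache resource paired with an archive resource, and every class and method name is hypothetical. The point it shows is that many small block reads cost only one whole-object fetch from the archive.

```python
class ArchiveStore:
    """Stand-in for a whole-object put/get store (no network calls are made)."""

    def __init__(self):
        self._objects = {}
        self.get_count = 0  # number of whole-object retrievals served

    def put(self, name, data):
        self._objects[name] = bytes(data)

    def get(self, name):
        self.get_count += 1
        return self._objects[name]


class CompoundResource:
    """A cache placed in front of an archive, in the spirit of the text above."""

    def __init__(self, archive):
        self.archive = archive
        self.cache = {}     # staged copies, keyed by object name
        self.dirty = set()  # names written to cache but not yet archived

    def write(self, name, data):
        # Ingestion is diverted to the cache instead of the archive.
        self.cache[name] = bytes(data)
        self.dirty.add(name)

    def sync(self):
        # Copy every not-yet-archived object from the cache into the archive.
        for name in sorted(self.dirty):
            self.archive.put(name, self.cache[name])
        self.dirty.clear()

    def purge(self, name):
        # Only objects that already have an archive copy may be purged.
        if name in self.dirty:
            raise RuntimeError(f"{name} has not been synchronized yet")
        self.cache.pop(name, None)

    def read(self, name, offset, length):
        # Stage the whole object once, then serve block reads from the cache.
        if name not in self.cache:
            self.cache[name] = self.archive.get(name)
        return self.cache[name][offset:offset + length]


if __name__ == "__main__":
    resource = CompoundResource(ArchiveStore())
    resource.write("survey.dat", b"0123456789" * 1000)
    resource.sync()
    resource.purge("survey.dat")
    blocks = [resource.read("survey.dat", i * 10, 10) for i in range(100)]
    print(len(blocks), resource.archive.get_count)
```

A real deployment would also need cache eviction and concurrency control; the sketch leaves these out so that the staging path stays visible.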
S3 does not see the load of serving small buffers, and the user sees a fast response to her requests.

The integration of S3 and iRODS using the libs3 library allowed us to apply all the functionality of iRODS to files stored in S3. This includes a rich access control paradigm, ingestion and access policies, replication, versioning, copying, moving and the other data management features provided by iRODS. More importantly, users can associate metadata with files stored in S3 and use the query interface provided by iRODS to discover and access files using domain-centric metadata.

The integration of the iRODS data grid on top of cloud storage makes it possible to build a shared collection that spans institutional repositories and cloud storage. A research project can establish a policy for when data will be migrated from the institutional repository into the cloud storage, implement criteria for minimizing data flow out of the cloud through management of the cache, and enable discovery of data stored in the cloud without having to access the cloud. This is accomplished through policy-based control of all operations that access the cloud storage. An iRODS policy is expressed as a rule of the form:

Event | Condition | Action-chain | Recovery-chain

Events are defined for each possible interaction with the cloud storage. Examples include:

- putting a file into the cloud storage
- getting a file from the cloud storage
- replicating a file into the cloud storage
- moving a file between storage systems
- copying a file that is in cloud storage
- aggregating files into a container before storage in the cloud

- generating an audit trail of all interactions with the cloud storage

Given an event, a condition can then be specified that must be satisfied before the associated action-chain is executed. Examples of conditions include:

- test for whether the user's group has permission to use the cloud storage
- test for which cloud resource is being accessed
- test for whether the cache in front of the cloud resource is full
- test for whether the files in the cloud resource have reached a data retention time limit
- test for whether the data stored in the cloud has reached an allowable maximum
- test for whether an integrity check on files in the cloud was done within a desired time period

Given satisfactory evaluation of the condition, an appropriate action-chain is then executed. Examples include:

- (put a file, valid user access, but quota exceeded) for the action chain, store the file in an alternate resource
- (put a file, but cache is full) for the action chain, identify the least recently used files and purge them from the cache
- (put a file, but size is below a minimum) aggregate the file into a container on a separate system before loading into the cache

The policies that control interaction with the cloud storage system can be quite sophisticated, and can invoke hierarchical rules to pre-process the file before storage or on access. Through use of the cache in front of the cloud storage, pre- and post-processing of the file is straightforward.

4. STATUS AND CONCLUSION

The integration of iRODS with cloud storage has been implemented and is part of the release of iRODS Version 2.2. Any user who has an account in an iRODS system that uses an S3 resource can use it for file storage (provided the user has appropriate resource-level access permission). We have a system running in our testbed at UCSD that uses S3 as a resource. Some users of iRODS (such as the Ocean Observatories Initiative [26]) have shown interest in using S3 through the iRODS framework.
The system is undergoing testing and has so far proved robust. Because of the success with the Amazon S3 integration, we are planning to interface with other cloud storage systems such as those offered by Google and Microsoft. The main advantage we see with this integration is the ability to perform large-scale data operations (using micro-services) on files stored in S3. In the near future, we propose to integrate Amazon's Elastic Compute Cloud [7] with the iRODS rule-based execution environment. With such an integrated system, one can perform operations on large file collections (functions such as format conversion, image processing, integrity validation, data mining, etc.) by storing files in S3 and launching the jobs in EC2.

ACKNOWLEDGMENT

The research results in this paper were funded by the NARA supplement to NSF SCI , Cyberinfrastructure; From Vision to Reality - Transcontinental Persistent Archive Prototype (TPAP) ( ) and by the NSF Office of Cyberinfrastructure OCI grant, NARA Transcontinental Persistent Archive Prototype, ( ). The iRODS technology development has been funded by NSF ITR , Constraint-based Knowledge Systems for Grids, Digital Libraries, and Persistent

Archives ( ) and NSF SDCI , "SDCI Data Improvement: Data Grids for Community Driven Applications" ( ).

REFERENCES

1. A. Rajasekar, M. Wan, R. Moore, and W. Schroeder, A Prototype Rule-based Distributed Data Management System, HPDC workshop on Next Generation Distributed Data Management, Paris, France.
2. iRODS: integrated Rule Oriented Data System.
3. R.W. Moore and A. Rajasekar, Rule-Based Distributed Data Management, Grid 2007: IEEE/ACM International Conference on Grid Computing.
4. Amazon Simple Storage Service (Amazon S3).
5. Weiss, Computing in the clouds, netWorker, v.11 n.4, p.16-25, December.
6. R. Buyya, Yeo, C., Venugopal, S., Broberg, J., Brandic, I., Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems, v.25 n.6, June.
7. Amazon elastic compute cloud (EC2).
8. Google app engine.
9. Sun network.com (Sun grid).
10. IBM Cloud Computing.
11. Microsoft Azure.
12. GoGrid Cloud Hosting.
13. J. Broberg, Buyya, R., and Tari, Z., Creating a Cloud Storage Mashup for High Performance, Low Cost Content Delivery. Proc. Service-Oriented Computing - ICSOC 2008 Workshops, LNCS.
14. Nirvanix storage delivery network (SDN).
15. GoogleDocs.
16. Rackspace Managed Hosting.
17. Foster, I., and Kesselman, C. (1999) The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann.
18. Rajasekar, A., Wan, M., Moore, R., Schroeder, W., Kremenek, G., Jagatheesan, A., Cowart, C., Zhu, B., Chen, S.-Y., and Olschanowsky, R. Storage Resource Broker - Managing Distributed Data in a Grid, Computer Society of India Journal, special issue on SAN.
19. Chervenak, A., Foster, I., Kesselman, C., Salisbury, C., and Tuecke, S. The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Data Sets, Journal of Network and Computer Applications: Special Issue on Network-Based Storage Services, vol. 23, no. 3, July.
20. A List of Amazon S3 Backup Tools.
21. NARA Transcontinental Persistent Archive Prototype.
22. Australian Research Collaboration Service; Davis: A Generic Interface for SRB and iRODS.
23. Science of Learning Centers.
24. SHAMAN: Sustaining Heritage Access through Multivalent ArchiviNg.
25. libs3: A C Library API for Amazon S3.
26. Ocean Observatories Initiative.

The International Journal of Digital Curation Issue 1, Volume

The International Journal of Digital Curation Issue 1, Volume Towards a Theory of Digital Preservation 63 Towards a Theory of Digital Preservation Reagan Moore, San Diego Supercomputer Center June 2008 Abstract A preservation environment manages communication from

More information

IRODS: the Integrated Rule- Oriented Data-Management System

IRODS: the Integrated Rule- Oriented Data-Management System IRODS: the Integrated Rule- Oriented Data-Management System Wayne Schroeder, Paul Tooby Data Intensive Cyber Environments Team (DICE) DICE Center, University of North Carolina at Chapel Hill; Institute

More information

A Simple Mass Storage System for the SRB Data Grid

A Simple Mass Storage System for the SRB Data Grid A Simple Mass Storage System for the SRB Data Grid Michael Wan, Arcot Rajasekar, Reagan Moore, Phil Andrews San Diego Supercomputer Center SDSC/UCSD/NPACI Outline Motivations for implementing a Mass Storage

More information

MetaData Management Control of Distributed Digital Objects using irods. Venkata Raviteja Vutukuri

MetaData Management Control of Distributed Digital Objects using irods. Venkata Raviteja Vutukuri Abstract: MetaData Management Control of Distributed Digital Objects using irods Venkata Raviteja Vutukuri irods is a middleware mechanism which accomplishes high level control on diverse distributed digital

More information

Implementing Trusted Digital Repositories

Implementing Trusted Digital Repositories Implementing Trusted Digital Repositories Reagan W. Moore, Arcot Rajasekar, Richard Marciano San Diego Supercomputer Center 9500 Gilman Drive, La Jolla, CA 92093-0505 {moore, sekar, marciano}@sdsc.edu

More information

Transcontinental Persistent Archive Prototype

Transcontinental Persistent Archive Prototype Transcontinental Persistent Archive Prototype Policy-Driven Data Preservation Reagan W. Moore University of North Carolina at Chapel Hill rwmoore@renci.org http://irods.diceresearch.org p// NSF OCI-0848296

More information

DSpace Fedora. Eprints Greenstone. Handle System

DSpace Fedora. Eprints Greenstone. Handle System Enabling Inter-repository repository Access Management between irods and Fedora Bing Zhu, Uni. of California: San Diego Richard Marciano Reagan Moore University of North Carolina at Chapel Hill May 18,

More information

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments *

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Joesph JaJa joseph@ Mike Smorul toaster@ Fritz McCall fmccall@ Yang Wang wpwy@ Institute

More information

Mitigating Risk of Data Loss in Preservation Environments

Mitigating Risk of Data Loss in Preservation Environments Storage Resource Broker Mitigating Risk of Data Loss in Preservation Environments Reagan W. Moore San Diego Supercomputer Center Joseph JaJa University of Maryland Robert Chadduck National Archives and

More information

Policy Based Distributed Data Management Systems

Policy Based Distributed Data Management Systems Policy Based Distributed Data Management Systems Reagan W. Moore Arcot Rajasekar Mike Wan {moore,sekar,mwan}@diceresearch.org http://irods.diceresearch.org Abstract Digital repositories can be defined

More information

DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS

DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS Reagan W. Moore San Diego Supercomputer Center San Diego, CA, USA Abstract Scientific applications now have data management requirements that extend

More information

Data Grid Services: The Storage Resource Broker. Andrew A. Chien CSE 225, Spring 2004 May 26, Administrivia

Data Grid Services: The Storage Resource Broker. Andrew A. Chien CSE 225, Spring 2004 May 26, Administrivia Data Grid Services: The Storage Resource Broker Andrew A. Chien CSE 225, Spring 2004 May 26, 2004 Administrivia This week:» 5/28 meet ½ hour early (430pm) Project Reports Due, 6/10, to Andrew s Office

More information

Knowledge-based Grids

Knowledge-based Grids Knowledge-based Grids Reagan Moore San Diego Supercomputer Center (http://www.npaci.edu/dice/) Data Intensive Computing Environment Chaitan Baru Walter Crescenzi Amarnath Gupta Bertram Ludaescher Richard

More information

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Storage Resource Broker Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Reagan W. Moore moore@sdsc.edu http://www.sdsc.edu/srb Background NARA research prototype persistent

More information

White Paper: National Data Infrastructure for Earth System Science

White Paper: National Data Infrastructure for Earth System Science White Paper: National Data Infrastructure for Earth System Science Reagan W. Moore Arcot Rajasekar Mike Conway University of North Carolina at Chapel Hill Wayne Schroeder Mike Wan University of California,

More information

Virtualization of Workflows for Data Intensive Computation

Virtualization of Workflows for Data Intensive Computation Virtualization of Workflows for Data Intensive Computation Sreekanth Pothanis (1,2), Arcot Rajasekar (3,4), Reagan Moore (3,4). 1 Center for Computation and Technology, Louisiana State University, Baton

More information

Simplifying Collaboration in the Cloud

Simplifying Collaboration in the Cloud Simplifying Collaboration in the Cloud WOS and IRODS Data Grid Dave Fellinger dfellinger@ddn.com Innovating in Storage DDN Firsts: Streaming ingest from satellite with guaranteed bandwidth Continuous service

More information

WHITE PAPER Cloud FastPath: A Highly Secure Data Transfer Solution

WHITE PAPER Cloud FastPath: A Highly Secure Data Transfer Solution WHITE PAPER Cloud FastPath: A Highly Secure Data Transfer Solution Tervela helps companies move large volumes of sensitive data safely and securely over network distances great and small. We have been

More information

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM IBM Tivoli Storage Manager Version 7.1.6 Introduction to Data Protection Solutions IBM IBM Tivoli Storage Manager Version 7.1.6 Introduction to Data Protection Solutions IBM Note: Before you use this

More information

GlobalSearch Security Definition Guide

GlobalSearch Security Definition Guide Prepared by: Marketing Square 9 Softworks 203-361-3471 127 Church Street, New Haven, CT 06510 O: (203) 789-0889 E: sales@square-9.com www.square-9.com Table of Contents GLOBALSEARCH SECURITY METHODS...

More information

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Paul Watry Univ. of Liverpool, NaCTeM pwatry@liverpool.ac.uk Ray Larson Univ. of California, Berkeley

More information

The Bits Are In The Clouds

The Bits Are In The Clouds The Bits Are In The Clouds Jeff Barr Senior Web Services Evangelist jbarr@amazon.com www.storage-developer.org Hello! I m Jeff Barr Amazon employee since 2002 Background: Microsoft, consulting, startups

More information

IBM Spectrum Protect Version Introduction to Data Protection Solutions IBM

IBM Spectrum Protect Version Introduction to Data Protection Solutions IBM IBM Spectrum Protect Version 8.1.2 Introduction to Data Protection Solutions IBM IBM Spectrum Protect Version 8.1.2 Introduction to Data Protection Solutions IBM Note: Before you use this information

More information

Richard Marciano Alexandra Chassanoff David Pcolar Bing Zhu Chien-Yi Hu. March 24, 2010

Richard Marciano Alexandra Chassanoff David Pcolar Bing Zhu Chien-Yi Hu. March 24, 2010 Richard Marciano Alexandra Chassanoff David Pcolar Bing Zhu Chien-Yi Hu March 24, 2010 What is the feasibility of repository interoperability at the policy level? Can a preservation environment be assembled

More information

irods for Data Management and Archiving UGM 2018 Masilamani Subramanyam

irods for Data Management and Archiving UGM 2018 Masilamani Subramanyam irods for Data Management and Archiving UGM 2018 Masilamani Subramanyam Agenda Introduction Challenges Data Transfer Solution irods use in Data Transfer Solution irods Proof-of-Concept Q&A Introduction

More information

Technical Overview. Access control lists define the users, groups, and roles that can access content as well as the operations that can be performed.

Technical Overview. Access control lists define the users, groups, and roles that can access content as well as the operations that can be performed. Technical Overview Technical Overview Standards based Architecture Scalable Secure Entirely Web Based Browser Independent Document Format independent LDAP integration Distributed Architecture Multiple

More information

Best Practices in Designing Cloud Storage based Archival solution Sreenidhi Iyangar & Jim Rice EMC Corporation

Best Practices in Designing Cloud Storage based Archival solution Sreenidhi Iyangar & Jim Rice EMC Corporation Best Practices in Designing Cloud Storage based Archival solution Sreenidhi Iyangar & Jim Rice EMC Corporation Abstract Cloud storage facilitates the use case of digital archiving for long periods of time

More information

SURVEY PAPER ON CLOUD COMPUTING

SURVEY PAPER ON CLOUD COMPUTING SURVEY PAPER ON CLOUD COMPUTING Kalpana Tiwari 1, Er. Sachin Chaudhary 2, Er. Kumar Shanu 3 1,2,3 Department of Computer Science and Engineering Bhagwant Institute of Technology, Muzaffarnagar, Uttar Pradesh

More information

Large Scale Computing Infrastructures

Large Scale Computing Infrastructures GC3: Grid Computing Competence Center Large Scale Computing Infrastructures Lecture 2: Cloud technologies Sergio Maffioletti GC3: Grid Computing Competence Center, University

More information

Microsoft SharePoint Server 2013 Plan, Configure & Manage

Microsoft SharePoint Server 2013 Plan, Configure & Manage Microsoft SharePoint Server 2013 Plan, Configure & Manage Course 20331-20332B 5 Days Instructor-led, Hands on Course Information This five day instructor-led course omits the overlap and redundancy that

More information

Grid Computing with Voyager

Grid Computing with Voyager Grid Computing with Voyager By Saikumar Dubugunta Recursion Software, Inc. September 28, 2005 TABLE OF CONTENTS Introduction... 1 Using Voyager for Grid Computing... 2 Voyager Core Components... 3 Code

More information

The International Journal of Digital Curation Issue 1, Volume

The International Journal of Digital Curation Issue 1, Volume 92 Digital Archive Policies Issue 1, Volume 2 2007 Digital Archive Policies and Trusted Digital Repositories MacKenzie Smith, MIT Libraries Reagan W. Moore, San Diego Supercomputer Center June 2007 Abstract

More information

Enable IoT Solutions using Azure

Enable IoT Solutions using Azure Internet Of Things A WHITE PAPER SERIES Enable IoT Solutions using Azure 1 2 TABLE OF CONTENTS EXECUTIVE SUMMARY INTERNET OF THINGS GATEWAY EVENT INGESTION EVENT PERSISTENCE EVENT ACTIONS 3 SYNTEL S IoT

More information

Overview SENTINET 3.1

Overview SENTINET 3.1 Overview SENTINET 3.1 Overview 1 Contents Introduction... 2 Customer Benefits... 3 Development and Test... 3 Production and Operations... 4 Architecture... 5 Technology Stack... 7 Features Summary... 7

More information

SDS: A Scalable Data Services System in Data Grid

SDS: A Scalable Data Services System in Data Grid SDS: A Scalable Data s System in Data Grid Xiaoning Peng School of Information Science & Engineering, Central South University Changsha 410083, China Department of Computer Science and Technology, Huaihua

More information

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Mario Cannataro ICAR-CNR cannataro@acm.org Domenico Talia DEIS University of Calabria talia@deis.unical.it Paolo Trunfio DEIS University

More information

Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES

Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES May, 2017 Contents Introduction... 2 Overview... 2 Architecture... 2 SDFS File System Service... 3 Data Writes... 3 Data Reads... 3 De-duplication

More information

Developing Microsoft Azure Solutions (70-532) Syllabus

Developing Microsoft Azure Solutions (70-532) Syllabus Developing Microsoft Azure Solutions (70-532) Syllabus Cloud Computing Introduction What is Cloud Computing Cloud Characteristics Cloud Computing Service Models Deployment Models in Cloud Computing Advantages

More information

Introduction to Cloud Computing and Virtual Resource Management. Jian Tang Syracuse University

Introduction to Cloud Computing and Virtual Resource Management. Jian Tang Syracuse University Introduction to Cloud Computing and Virtual Resource Management Jian Tang Syracuse University 1 Outline Definition Components Why Cloud Computing Cloud Services IaaS Cloud Providers Overview of Virtual

More information

Cloud FastPath: Highly Secure Data Transfer

Cloud FastPath: Highly Secure Data Transfer Cloud FastPath: Highly Secure Data Transfer Tervela helps companies move large volumes of sensitive data safely and securely over network distances great and small. Tervela has been creating high performance

More information

Developing Microsoft Azure Solutions (70-532) Syllabus

Developing Microsoft Azure Solutions (70-532) Syllabus Developing Microsoft Azure Solutions (70-532) Syllabus Cloud Computing Introduction What is Cloud Computing Cloud Characteristics Cloud Computing Service Models Deployment Models in Cloud Computing Advantages

More information

The Social Grid. Leveraging the Power of the Web and Focusing on Development Simplicity

The Social Grid. Leveraging the Power of the Web and Focusing on Development Simplicity The Social Grid Leveraging the Power of the Web and Focusing on Development Simplicity Tony Hey Corporate Vice President of Technical Computing at Microsoft TCP/IP versus ISO Protocols ISO Committees disconnected

More information

Solution Brief: Archiving with Harmonic Media Application Server and ProXplore

Solution Brief: Archiving with Harmonic Media Application Server and ProXplore Solution Brief: Archiving with Harmonic Media Application Server and ProXplore Summary Harmonic Media Application Server (MAS) provides management of content across the Harmonic server and storage infrastructure.

More information

Datacenter replication solution with quasardb

Datacenter replication solution with quasardb Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION

More information

Data Replication: Automated move and copy of data. PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013

Data Replication: Automated move and copy of data. PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013 Data Replication: Automated move and copy of data PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013 Claudio Cacciari c.cacciari@cineca.it Outline The issue

More information

Distributed Systems Principles and Paradigms. Chapter 01: Introduction

Distributed Systems Principles and Paradigms. Chapter 01: Introduction Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, steen@cs.vu.nl Chapter 01: Introduction Version: October 25, 2009 2 / 26 Contents Chapter

More information

CA464 Distributed Programming

CA464 Distributed Programming 1 / 25 CA464 Distributed Programming Lecturer: Martin Crane Office: L2.51 Phone: 8974 Email: martin.crane@computing.dcu.ie WWW: http://www.computing.dcu.ie/ mcrane Course Page: "/CA464NewUpdate Textbook

More information

REFERENCE ARCHITECTURE. Rubrik and Nutanix

REFERENCE ARCHITECTURE. Rubrik and Nutanix REFERENCE ARCHITECTURE Rubrik and Nutanix TABLE OF CONTENTS INTRODUCTION - RUBRIK...3 INTRODUCTION - NUTANIX...3 AUDIENCE... 4 INTEGRATION OVERVIEW... 4 ARCHITECTURE OVERVIEW...5 Nutanix Snapshots...6

More information

Sentinet for BizTalk Server SENTINET

Sentinet for BizTalk Server SENTINET Sentinet for BizTalk Server SENTINET Sentinet for BizTalk Server 1 Contents Introduction... 2 Sentinet Benefits... 3 SOA and API Repository... 4 Security... 4 Mediation and Virtualization... 5 Authentication

More information

Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments. White Paper

Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments. White Paper Data Sharing with Storage Resource Broker Enabling Collaboration in Complex Distributed Environments White Paper 2 SRB: Enabling Collaboration in Complex Distributed Environments Table of Contents Introduction...3

More information

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems Distributed Systems Outline Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems What Is A Distributed System? A collection of independent computers that appears

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

Leveraging High Performance Computing Infrastructure for Trusted Digital Preservation

Leveraging High Performance Computing Infrastructure for Trusted Digital Preservation Leveraging High Performance Computing Infrastructure for Trusted Digital Preservation 12 December 2007 Digital Curation Conference Washington D.C. Richard Moore Director of Production Systems San Diego

More information

Developing Microsoft Azure Solutions (70-532) Syllabus

Developing Microsoft Azure Solutions (70-532) Syllabus Developing Microsoft Azure Solutions (70-532) Syllabus Cloud Computing Introduction What is Cloud Computing Cloud Characteristics Cloud Computing Service Models Deployment Models in Cloud Computing Advantages

More information

Protect enterprise data, achieve long-term data retention

Protect enterprise data, achieve long-term data retention Technical white paper Protect enterprise data, achieve long-term data retention HP StoreOnce Catalyst and Symantec NetBackup OpenStorage Table of contents Introduction 2 Technology overview 3 HP StoreOnce

More information

Distributed Systems Principles and Paradigms. Chapter 01: Introduction. Contents. Distributed System: Definition.

Distributed Systems Principles and Paradigms. Chapter 01: Introduction. Contents. Distributed System: Definition. Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, steen@cs.vu.nl Chapter 01: Version: February 21, 2011 1 / 26 Contents Chapter 01: 02: Architectures

More information

UNICORE Globus: Interoperability of Grid Infrastructures

UNICORE Globus: Interoperability of Grid Infrastructures UNICORE : Interoperability of Grid Infrastructures Michael Rambadt Philipp Wieder Central Institute for Applied Mathematics (ZAM) Research Centre Juelich D 52425 Juelich, Germany Phone: +49 2461 612057

More information

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio)

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) Introduction to Distributed Systems INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) August 28, 2018 Outline Definition of a distributed system Goals of a distributed system Implications of distributed

More information

API TESTING TOOL IN CLOUD

API TESTING TOOL IN CLOUD API TESTING TOOL IN CLOUD Abhaysinh Sathe 1, Dr. Raj Kulkarni 2 1,2 Walchand Institute Of Technology, Solapur ABSTRACT Testing becomes an important process not only in term of exposure but also in terms

More information

Boundary control : Access Controls: An access control mechanism processes users request for resources in three steps: Identification:

Boundary control : Access Controls: An access control mechanism processes users request for resources in three steps: Identification: Application control : Boundary control : Access Controls: These controls restrict use of computer system resources to authorized users, limit the actions authorized users can taker with these resources,

More information

Raj Jain (Washington University in Saint Louis) Mohammed Samaka (Qatar University)

Raj Jain (Washington University in Saint Louis) Mohammed Samaka (Qatar University) APPLICATION DEPLOYMENT IN FUTURE GLOBAL MULTI-CLOUD ENVIRONMENT Raj Jain (Washington University in Saint Louis) Mohammed Samaka (Qatar University) GITMA 2015 Conference, St. Louis, June 23, 2015 These

More information

Distributed Data Management with Storage Resource Broker in the UK

Distributed Data Management with Storage Resource Broker in the UK Distributed Data Management with Storage Resource Broker in the UK Michael Doherty, Lisa Blanshard, Ananta Manandhar, Rik Tyer, Kerstin Kleese @ CCLRC, UK Abstract The Storage Resource Broker (SRB) is

More information

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT. Chapter 4:- Introduction to Grid and its Evolution Prepared By:- Assistant Professor SVBIT. Overview Background: What is the Grid? Related technologies Grid applications Communities Grid Tools Case Studies

More information

OnCommand Cloud Manager 3.2 Deploying and Managing ONTAP Cloud Systems

OnCommand Cloud Manager 3.2 Deploying and Managing ONTAP Cloud Systems OnCommand Cloud Manager 3.2 Deploying and Managing ONTAP Cloud Systems April 2017 215-12035_C0 doccomments@netapp.com Table of Contents 3 Contents Before you create ONTAP Cloud systems... 5 Logging in

More information

An Introduction to GPFS

An Introduction to GPFS IBM High Performance Computing July 2006 An Introduction to GPFS gpfsintro072506.doc Page 2 Contents Overview 2 What is GPFS? 3 The file system 3 Application interfaces 4 Performance and scalability 4

More information

Cloud Computing 4/17/2016. Outline. Cloud Computing. Centralized versus Distributed Computing Some people argue that Cloud Computing. Cloud Computing.

Cloud Computing 4/17/2016. Outline. Cloud Computing. Centralized versus Distributed Computing Some people argue that Cloud Computing. Cloud Computing. Cloud Computing By: Muhammad Naseem Assistant Professor Department of Computer Engineering, Sir Syed University of Engineering & Technology, Web: http://sites.google.com/site/muhammadnaseem105 Email: mnaseem105@yahoo.com

More information

CA ARCserve Backup. Benefits. Overview. The CA Advantage

CA ARCserve Backup. Benefits. Overview. The CA Advantage PRODUCT BRIEF: CA ARCSERVE BACKUP R12.5 CA ARCserve Backup CA ARCSERVE BACKUP, A HIGH-PERFORMANCE, INDUSTRY-LEADING DATA PROTECTION PRODUCT, UNITES INNOVATIVE DATA DEDUPLICATION TECHNOLOGY, POWERFUL STORAGE

More information

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Programming Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Computing Only required amount of CPU and storage can be used anytime from anywhere via network Availability, throughput, reliability

More information

Don t just manage your documents. Mobilize them!

Don t just manage your documents. Mobilize them! Don t just manage your documents Mobilize them! Don t just manage your documents Mobilize them! A simple, secure way to transform how you control your documents across the Internet and in your office.

More information

XenData Product Brief: SX-550 Series Servers for LTO Archives

XenData Product Brief: SX-550 Series Servers for LTO Archives XenData Product Brief: SX-550 Series Servers for LTO Archives The SX-550 Series of Archive Servers creates highly scalable LTO Digital Video Archives that are optimized for broadcasters, video production

More information

SRB Logical Structure

SRB Logical Structure SDSC Storage Resource Broker () Introduction and Applications based on material by Arcot Rajasekar, Reagan Moore et al San Diego Supercomputer Center, UC San Diego A distributed file system (Data Grid),

More information

BlackPearl Customer Created Clients Using Free & Open Source Tools

BlackPearl Customer Created Clients Using Free & Open Source Tools BlackPearl Customer Created Clients Using Free & Open Source Tools December 2017 Contents A B S T R A C T... 3 I N T R O D U C T I O N... 3 B U L D I N G A C U S T O M E R C R E A T E D C L I E N T...

More information

Developing Enterprise Cloud Solutions with Azure

Developing Enterprise Cloud Solutions with Azure Developing Enterprise Cloud Solutions with Azure Java Focused 5 Day Course AUDIENCE FORMAT Developers and Software Architects Instructor-led with hands-on labs LEVEL 300 COURSE DESCRIPTION This course

More information

Advanced Solutions of Microsoft SharePoint Server 2013 Course Contact Hours

Advanced Solutions of Microsoft SharePoint Server 2013 Course Contact Hours Advanced Solutions of Microsoft SharePoint Server 2013 Course 20332 36 Contact Hours Course Overview This course examines how to plan, configure, and manage a Microsoft SharePoint Server 2013 environment.

More information

Data publication and discovery with Globus

Data publication and discovery with Globus Data publication and discovery with Globus Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,

More information

Advanced Solutions of Microsoft SharePoint 2013

Advanced Solutions of Microsoft SharePoint 2013 Course 20332A :Advanced Solutions of Microsoft SharePoint 2013 Page 1 of 9 Advanced Solutions of Microsoft SharePoint 2013 Course 20332A: 4 days; Instructor-Led About the Course This four-day course examines

More information

Object Storage Service. Product Introduction. Issue 04 Date HUAWEI TECHNOLOGIES CO., LTD.

Object Storage Service. Product Introduction. Issue 04 Date HUAWEI TECHNOLOGIES CO., LTD. Issue 04 Date 2017-12-20 HUAWEI TECHNOLOGIES CO., LTD. 2017. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of

More information

The Materials Data Facility

The Materials Data Facility The Materials Data Facility Ben Blaiszik (blaiszik@uchicago.edu), Kyle Chard (chard@uchicago.edu) Ian Foster (foster@uchicago.edu) materialsdatafacility.org What is MDF? We aim to make it simple for materials

More information

Zumobi Brand Integration(Zbi) Platform Architecture Whitepaper Table of Contents

Zumobi Brand Integration(Zbi) Platform Architecture Whitepaper Table of Contents Zumobi Brand Integration(Zbi) Platform Architecture Whitepaper Table of Contents Introduction... 2 High-Level Platform Architecture Diagram... 3 Zbi Production Environment... 4 Zbi Publishing Engine...

More information

I data set della ricerca ed il progetto EUDAT

I data set della ricerca ed il progetto EUDAT I data set della ricerca ed il progetto EUDAT Casalecchio di Reno (BO) Via Magnanelli 6/3, 40033 Casalecchio di Reno 051 6171411 www.cineca.it 1 Digital as a Global Priority 2 Focus on research data Square

More information

Hedvig as backup target for Veeam

Hedvig as backup target for Veeam Hedvig as backup target for Veeam Solution Whitepaper Version 1.0 April 2018 Table of contents Executive overview... 3 Introduction... 3 Solution components... 4 Hedvig... 4 Hedvig Virtual Disk (vdisk)...

More information

CHEM-E Process Automation and Information Systems: Applications

CHEM-E Process Automation and Information Systems: Applications CHEM-E7205 - Process Automation and Information Systems: Applications Cloud computing Jukka Kortela Contents What is Cloud Computing? Overview of Cloud Computing Comparison of Cloud Deployment Models Comparison

More information

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization In computer science, storage virtualization uses virtualization to enable better functionality

More information

Sentinet for BizTalk Server VERSION 2.2

Sentinet for BizTalk Server VERSION 2.2 for BizTalk Server VERSION 2.2 for BizTalk Server 1 Contents Introduction... 2 SOA Repository... 2 Security... 3 Mediation and Virtualization... 3 Authentication and Authorization... 4 Monitoring, Recording

More information

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Robert Collazo Systems Engineer Rackspace Hosting The Rackspace Vision Agenda Truly a New Era of Computing 70 s 80 s Mainframe Era 90

More information

NC Education Cloud Feasibility Report

NC Education Cloud Feasibility Report 1 NC Education Cloud Feasibility Report 1. Problem Definition and rationale North Carolina districts are generally ill-equipped to manage production server infrastructure. Server infrastructure is most

More information

Requirements for data catalogues within facilities

Requirements for data catalogues within facilities Requirements for data catalogues within facilities Milan Prica 1, George Kourousias 1, Alistair Mills 2, Brian Matthews 2 1 Sincrotrone Trieste S.C.p.A, Trieste, Italy 2 Scientific Computing Department,

More information

Assignment 5. Georgia Koloniari

Assignment 5. Georgia Koloniari Assignment 5 Georgia Koloniari 2. "Peer-to-Peer Computing" 1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last

More information

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM OMB No. 3137 0071, Exp. Date: 09/30/2015 DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM Introduction: IMLS is committed to expanding public access to IMLS-funded research, data and other digital products:

More information

Basics of Cloud Computing Lecture 2. Cloud Providers. Satish Srirama

Basics of Cloud Computing Lecture 2. Cloud Providers. Satish Srirama Basics of Cloud Computing Lecture 2 Cloud Providers Satish Srirama Outline Cloud computing services recap Amazon cloud services Elastic Compute Cloud (EC2) Storage services - Amazon S3 and EBS Cloud managers

Genomics on Cisco Metacloud + SwiftStack

Technology is a large component of driving discovery in both research and providing timely answers for clinical treatments. Advances in genomic sequencing have

Policy-Driven Repository Interoperability: Enabling Integration Patterns for iRODS and Fedora

David Pcolar, Carolina Digital Repository (CDR), UNC Chapel Hill, david_pcolar@unc.edu; Alexandra Chassanoff, School

Introduction to Grid Technology

B. Ramamurthy. Arthur C. Clarke's Laws (two of many): "Any sufficiently advanced technology is indistinguishable from magic." "The only way of discovering the limits of the

Advanced Solutions of Microsoft SharePoint Server 2013

Course Duration: 4 Days + 1 day Self Study. Course Pre-requisites: Before attending this course, students must have: Completed Course 20331: Core Solutions of Microsoft SharePoint Server 2013, successful

Understanding StoRM: from introduction to internals

13 November 2007. Outline: Storage Resource Manager; The StoRM service; StoRM components and internals; Deployment configuration; Authorization and ACLs; Conclusions.

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid

September 5, 2000. Copyright 2000, The University of Chicago and The

ONUG SDN Federation/Operability

ONUG SDN Federation/Operability Orchestration. A white paper from the ONUG SDN Federation/Operability Working Group, May 2016. Definition of Open Networking: Open networking is a suite of interoperable software

Introduction to The Storage Resource Broker

http://www.nesc.ac.uk/training http://www.ngs.ac.uk http://www.pparc.ac.uk/ http://www.eu-egee.org/ Policy for re-use: This presentation can be re-used for academic

Science-as-a-Service

The iPlant Foundation. Rion Dooley, Edwin Skidmore, Dan Stanzione, Steve Terry, Matthew Vaughn. Outline: Why, why, why! When duct tape isn't enough. Building an API for the web. Core services

Microsoft SQL Server

Abstract: This white paper outlines the best practices for Microsoft SQL Server Failover Cluster Instance data protection with Cohesity DataPlatform. December 2017. Table of Contents
