Advanced Photon Source Data Management
S. Veseli, N. Schwarz, C. Schmitz (SDM/XSD); R. Sersted, D. Wallis (IT/AES)
APS Data Management - Globus World 2018

Growing Beamline Data Needs
X-ray detector capabilities are constantly improving: bigger frames, higher frame rates => more raw data
APS Upgrade: higher brightness => more x-rays can be focused onto a smaller area => more raw data, in greater detail and in less time
APS 1-ID:
  Today: 4x Hydra GE detectors, 8 MB frame, 8 fps => 256 MB/s data rate (~1 TB/hr)
  Production data rates (including overhead, on a good week): 10 TB/day, 60 TB/week
  Near future (1-3 years): 2x 2923 Dexela (or similar), 23 MB frame, 26 fps => 1.2 GB/s data rate
  Near future: Pilatus 2M (or similar/larger), 9.5 MB frame, 250 fps => 2.3 GB/s data rate
APS 8-ID-I:
  2010-2016: ANL/LBL FCCD, 2 MB frame, 100 fps, compressed in real time with 10% non-zero pixels on average => about 200 MB/s data rate
  2016-Today: Lambda 750K, 1.5 MB frame, about 10% non-zero pixels, 2000 fps => about 300 MB/s data rate
  Production data rates: 8-10 TB/week (max), 1-2 TB/week (average)
  Future: research on VIPIC (>1 MHz frame rate) and UFXC (50 kHz frame rate)
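The quoted rates follow directly from frame size x frame rate x number of detectors, scaled by the fraction of pixels kept when the stream is sparsified. A quick sanity check (a minimal sketch; the helper function is illustrative and not part of the DM software):

"""Quick check of the data rates quoted above (frame sizes in MB, rates in MB/s)."""

def data_rate_mb_s(frame_mb, fps, detectors=1, kept_fraction=1.0):
    """Data rate = frame size x frame rate x detector count x fraction of data kept."""
    return frame_mb * fps * detectors * kept_fraction

print(data_rate_mb_s(8, 8, detectors=4))             # 1-ID Hydra GE: 256 MB/s (~1 TB/hr)
print(data_rate_mb_s(23, 26, detectors=2))           # 1-ID Dexela: 1196 MB/s (~1.2 GB/s)
print(data_rate_mb_s(9.5, 250))                      # 1-ID Pilatus 2M: 2375 MB/s (~2.3 GB/s)
print(data_rate_mb_s(1.5, 2000, kept_fraction=0.1))  # 8-ID-I Lambda 750K: ~300 MB/s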

APS Data Management
Specific data management needs typically vary from beamline to beamline, depending mostly on the types of detectors, X-ray techniques, and data processing tools in use.
However, most data management requirements are related to a set of tasks common to all beamlines:
  Storage area management (e.g., movement of acquired data from local storage to a more permanent location, data archival, etc.)
  Enabling users and applications to easily find and access data (metadata and replica catalogs, remote data access tools)
  Facilitating data processing and analysis with automated or user-initiated processing workflows
The APS Data Management (DM) project strives to help with these tasks.


How Does Globus Fit In?
Globus Connect Server: provides remote data access to APS on-site storage (via the aps#data endpoint)
Globus MyProxy OAuth Server: handles aps#data endpoint authentication
GridFTP servers and clients: used by DM software for internal data transfers between beamline (local) storage and APS on-site storage, for transfers between local storage and HPC resources, etc.
Issue: support for the open-source Globus Toolkit ended in early 2018

Services
Site Services:
  DM Database (PostgreSQL)
  Storage Management Service
  Web Portal
  Automated user account synchronization utilities
Station (Beamline Deployment) Services:
  Data Acquisition Service
  Metadata Catalog
  Processing Service
All services are available via REST interfaces.

Monitoring
Every DM service has a set of monitoring interfaces that enable external applications to find out about its state.
These are used by custom Nagios plugins that provide up-to-date information about the health of the DM station deployments.
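A Nagios plugin is simply an executable that prints a one-line status message and exits with a standard code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). Below is a minimal sketch of such a check against a DM service monitoring interface; the status URL and JSON field names are assumptions, not the actual DM API:

"""Illustrative Nagios-style check for a DM station service.

The status URL and JSON fields are assumptions made for this sketch; the
actual DM monitoring interfaces are not documented here.
"""
import json
import sys
import urllib.request

STATUS_URL = "https://dm-station.example.org/service/status"  # hypothetical endpoint
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3                    # standard Nagios exit codes

def main():
    try:
        with urllib.request.urlopen(STATUS_URL, timeout=10) as response:
            status = json.load(response)
    except Exception as exc:
        print("CRITICAL - cannot reach DM service: %s" % exc)
        return CRITICAL
    if status.get("status") == "running":                      # assumed field name/value
        print("OK - DM service running, uptime %s" % status.get("uptime", "unknown"))
        return OK
    print("WARNING - unexpected service state: %s" % status)
    return WARNING

if __name__ == "__main__":
    sys.exit(main())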

User Interfaces
Web browser access to the DM Web Portal, Nagios web pages, the beamline Metadata Catalog, and Globus Online (for remote data access)
Python REST services are accessible via DM Python and Java APIs
Processing Service provides 0MQ interfaces
Extensive set of command line tools:
  Built on top of the Python APIs
  Session based
  Fully scriptable
  Online usage documentation (accessible via the -h/--help options)
DM Station GUI (C. Schmitz):
  Implemented in PyQt
  Uses the Python REST APIs
  Easiest way to start using the system

System Usage
December 2016: 5 beamline deployments, 250 TB of storage space used
March 2018: 21 beamline deployments, close to 1 PB of storage space used
Three-month average growth of storage space used: 75 TB/month
We are currently using more than 75% of the available storage space
Assuming a 75 TB/month increase, we have 3-4 months before we run out of space

System Usage: Processing @ 8-ID-I (S. Narayanan)
8-ID-I uses the X-ray Photon Correlation Spectroscopy (XPCS) technique to study equilibrium fluctuations, and fluctuations about the evolution to equilibrium, in condensed matter in the Small-Angle X-ray Scattering (SAXS) geometry.
SPEC software is used for instrument control and data acquisition.
For every raw data file, SPEC scripts can start a DM processing job based on one of the implemented workflows.
Batch processing workflow (see the sketch after this slide):
  1) Run a custom shell script to prepare the processing environment.
  2) Copy the raw data file to the APS HPC cluster using GridFTP (the globus-url-copy command).
  3) Append XPCS metadata to the data file by running a custom 8-ID-I utility.
  4) Submit a processing job to the SGE batch scheduler via the qsub command. This job runs a custom 8-ID-I processing executable.
  5) Monitor the batch job by running a shell script that interacts with SGE via the qacct command.
  6) Copy the resulting output file into the designated beamline storage area using GridFTP (the globus-url-copy command).
Jobs are monitored via static web pages generated by a cron job running DM utilities.
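Steps 2 and 4-6 boil down to a handful of external commands. The following is a minimal Python sketch of those calls using hypothetical hostnames, paths, and script names; in production these steps are driven by the DM Processing Service, not by a standalone script like this:

"""Sketch of the staging and job-submission steps listed above.

Hostnames, paths, and script names are hypothetical; only the commands
themselves (globus-url-copy, qsub, qacct) come from the workflow description.
"""
import subprocess

RAW_FILE = "/local/data/xpcs/scan_0001.imm"                    # hypothetical raw data file
HPC_URL = "gsiftp://hpc.example.org/scratch/xpcs/"             # hypothetical GridFTP destination
RESULT_URL = "gsiftp://beamline-storage.example.org/results/"  # hypothetical beamline storage

# Step 2: stage the raw file to the HPC cluster with GridFTP.
subprocess.run(["globus-url-copy", "-vb", "file://" + RAW_FILE, HPC_URL], check=True)

# Step 4: submit the processing job to SGE; with -terse, qsub prints only the job id.
submit = subprocess.run(
    ["qsub", "-terse", "process_xpcs.sh", RAW_FILE],
    check=True, capture_output=True, text=True,
)
job_id = submit.stdout.strip()

# Step 5: once the job has finished, qacct reports its accounting record (exit status, runtime).
subprocess.run(["qacct", "-j", job_id], check=True)

# Step 6: copy the result back into the designated beamline storage area.
subprocess.run(
    ["globus-url-copy", "-vb", HPC_URL + "scan_0001_result.hdf", RESULT_URL],
    check=True,
)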

Future Plans
Develop Data Acquisition Service plugins that handle integration with external cataloging and data publishing systems (DOE Data Explorer, Materials Data Facility)
Storage hardware replacement (purchase approved recently)
Enable beamline managers to organize their experiment data in storage in a manner that best fits their beamline
Further develop functionality offered in the DM Station GUI:
  Improve file metadata and collection views
  Add workflow and processing job management capabilities
Enhance DM system monitoring infrastructure:
  Develop service capabilities for self-diagnosing error or warning conditions and issuing alarms
  Improve support for measuring performance (e.g., data transfer rates, file processing rates)
Further develop beamline management functionality available in the DM Web Portal
Develop a standard set of workflow definitions that can be reused on different beamlines for automating processing pipelines (need use cases)
Develop a policy engine for automated management of experiment data in storage, archiving of old data, etc.
Improve documentation, write a user guide
New beamline deployments?
Archival system?

Conclusions
The DM system has grown significantly over the last couple of years, in terms of both its usage and its capabilities.
The software can be customized and extended to serve individual beamline needs related to data management.
In particular, it can be used to fully automate data acquisition and processing pipelines.
Acknowledgements:
  APS IT group (R. Sersted and D. Wallis) for their work on building and maintaining the APS on-site storage and the virtual machines used to host beamline DM services
  APS IS group (F. Lacap and Y. Huang) for their help and cooperation with accessing APS databases
  APS SDM group (B. Frosik and A. Glowacki) for many useful discussions
  APS 1-ID (J.-S. Park and P. Kenesei) and 8-ID-I (S. Narayanan) beamlines for their patience, support, and help with system testing and troubleshooting since the early prototype versions
  All DM users

Additional Slides

Data Management Challenges?
How do we organize data? Where do we store data? How do we store data? Do we store all data?
How do we find data? How do we access data?
How do we manage stored data? How can we ensure data integrity?
Can we automate data acquisition and processing pipelines?
What can we do about user data analysis and processing?

A Bit Of History
2013: Tao Fusion project (LDRD) acquired XSTOR storage (250 TB)
September 2014: APS Data Management project started
  Goal: provide APS users with the means to easily access their data remotely using Globus Online
October 2015: First successful software deployment, at 6-ID-D (D. Robinson)
January 2017: Transition to DDN storage (1.5 PB) with a high-performance GPFS file system, data redundancy, and 2x10 Gbps network links
March 2018: New VM cluster used exclusively for DM virtual machines
  Total of 512 GB RAM, 144 CPU cores (72x2 due to hyper-threading), 2x10 Gbps network links

Site Services
DM Database (PostgreSQL)
  Maintains information about users, experiments, and beamline deployments
Storage Management Service (Python, CherryPy, SQLAlchemy)
  Runs on the storage head node
  Provides experiment management services (via REST interfaces)
  Interacts with LDAP and APS databases
  Controls storage file system permissions, which enables data access for remote users
Web Portal (Java EE application: GlassFish, JPA, JSF, PrimeFaces)
  Experiment management
  Support for beamline deployments
Automated utilities for synchronizing DM user information with the APS User Database
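For orientation, here is a minimal sketch of what a CherryPy-based REST endpoint in the style of the Storage Management Service could look like. The resource path, port, and returned fields are illustrative assumptions and do not reproduce the actual DM API:

"""Minimal CherryPy sketch of a REST-style experiment service (illustrative only)."""
import cherrypy


class ExperimentService:
    # Toy in-memory store standing in for the DM PostgreSQL database.
    EXPERIMENTS = {
        "e1234": {"name": "e1234", "station": "8-ID-I", "storageDirectory": "/data/8idi/e1234"},
    }

    @cherrypy.expose
    @cherrypy.tools.json_out()
    def experiments(self, name=None):
        # GET /experiments            -> list of all experiments
        # GET /experiments?name=e1234 -> a single experiment
        if name is None:
            return list(self.EXPERIMENTS.values())
        if name not in self.EXPERIMENTS:
            raise cherrypy.HTTPError(404, "Unknown experiment: %s" % name)
        return self.EXPERIMENTS[name]


if __name__ == "__main__":
    cherrypy.config.update({"server.socket_port": 8080})
    cherrypy.quickstart(ExperimentService(), "/")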

Station Services
Each beamline deployment ("DM Station") includes several Python services accessible via REST interfaces: the Data Acquisition (DAQ) Service, the Metadata Catalog, and the Processing Service.
Data Acquisition Service:
  Responsible for data uploads and for monitoring local file storage
  Customizable, plugin-based processing framework
  Plugins handle file transfers, metadata cataloging, and interaction with other services
Metadata Catalog (MongoDB):
  Metadata are arbitrary key/value pairs
  Each experiment has its own file metadata collection
  File metadata can be retrieved using command line or API tools, the DM Station GUI, or the Mongo Express application
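The per-experiment collection layout is easy to picture with a short pymongo sketch; the database and collection names and the metadata keys below are assumptions, and in practice the catalog is accessed through the DM REST APIs and command line tools rather than directly:

"""Sketch of per-experiment file metadata in MongoDB (names and keys are illustrative)."""
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["dm"]                   # assumed database name
files = db["experiment_e1234"]      # one collection per experiment (assumed name)

# Metadata are arbitrary key/value pairs attached to each uploaded file.
files.insert_one({
    "fileName": "scan_0001.imm",
    "experimentName": "e1234",
    "fileSize": 1572864,
    "detector": "Lambda 750K",
    "sample": "colloidal silica",
})

# Retrieve all files recorded for a given detector.
for doc in files.find({"detector": "Lambda 750K"}, {"_id": 0, "fileName": 1, "fileSize": 1}):
    print(doc["fileName"], doc["fileSize"])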

Station Services
Processing Service provides support for managing user-defined workflows, as well as for submitting and monitoring processing jobs based on those workflows.
A DM workflow is a collection of processing steps executed in order (see the sketch after this slide):
  Workflow definitions are described as Python dictionaries
  Each processing step must be associated with an (arbitrary) executable
  Support for input/output variables
  Processing steps are automatically parallelized if possible
The Processing Service can be used either standalone, or together with other DM Station services in support of fully automated beamline data acquisition and processing pipelines.
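As a rough illustration of the "workflow as a Python dictionary" idea, the sketch below defines two steps, each bound to an executable, with an output variable from the first step fed into the second. The key names and variable syntax are assumptions for this sketch, not the documented DM workflow schema:

"""Illustrative workflow definition expressed as a Python dictionary.

Key names and the variable-passing convention are assumptions made for this
sketch; they do not reproduce the documented DM workflow schema.
"""
example_workflow = {
    "name": "example-workflow",
    "owner": "beamline-account",
    "stages": {
        # Steps are executed in order; $filePath is supplied when a job is started.
        "01-checksum": {
            "command": "md5sum $filePath",
            # Capture the command output into a variable for later steps (assumed mechanism).
            "outputVariableRegexList": [r"(?P<md5Sum>\w+) .*"],
        },
        "02-record": {
            # Later steps can reference variables produced by earlier ones.
            "command": "echo 'file $filePath has checksum $md5Sum'",
        },
    },
}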

Software Installation
Each beamline at APS has its own Data Management software installation, visible to all beamline machines.
The deployment area contains the DM software, support software packages, beamline databases, and configuration files, as well as various runtime and log files.
All beamlines are therefore fully independent of each other, which works well in the APS environment (beamline machines typically have different maintenance cycles).
The DM software has fully scripted installation, upgrade, and deployment testing processes, which reduces to a minimum the maintenance overhead due to independent beamline software installations.
The DM services typically run on a designated beamline server machine, which can be either virtual or physical (VMs are preferred).
Services are controlled via a standard set of control scripts suitable for the RHEL operating system used at APS.
Typically, beamline staff have no involvement with DM software installation and maintenance.

Support
Official APS Data Distribution System support policy: although no guarantees are made as to the system's availability, problems are addressed as soon as possible, on a best-effort basis.
Two email lists:
  DM Users mailing list (dm-users@aps.anl.gov) is used for announcing new software releases, system outages, etc.
  DM Admin mailing list (dm-admin@aps.anl.gov) can be used for system inquiries

System Usage: APSU MM Data (A. Jain)
APSU uses the Component Database (CDB) for identifying and tracking components needed to build the new facility.
APSU Magnetic Measurements (MM) software works with and creates many different types of files: various definition/configuration files, measurement and analysis files, scripts, raw measurement data, PV data, processed data, log files, ...
Over 1,000 magnets, each requiring multiple measurements => the MM software will generate large amounts of data and numerous data files.
DM data upload tools will associate each MM experiment with the corresponding magnet QR ID in CDB.
During the DM data upload, each MM data file will be processed as follows (see the sketch below):
  The metadata plugin stores file metadata in the APSU DM metadata database
  The CDB plugin adds a file item to CDB, and also adds the file metadata as the item's property
After the DM data upload, MM experiment item views on the CDB web portal contain links to the corresponding magnet, allow downloading experiment files, etc. (CDB work: D. Jarosz)
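The DAQ Service plugin interface is not spelled out in these slides, so the following is a purely hypothetical sketch of a file-processing plugin that pairs metadata cataloging with a CDB update; the class, method names, and client calls are invented for illustration:

"""Hypothetical sketch of a DAQ-style file-processing plugin for the MM use case.

The structure, method names, and the metadata/CDB client calls are all
invented for illustration; the actual DM plugin interface is not shown here.
"""

class MmFilePlugin:
    """Processes one uploaded Magnetic Measurements file (illustrative only)."""

    def __init__(self, metadata_catalog, cdb_client):
        self.metadata_catalog = metadata_catalog  # e.g., a wrapper around the DM Metadata Catalog
        self.cdb_client = cdb_client              # e.g., a wrapper around the CDB REST API

    def process_file(self, file_path, experiment_name, magnet_qr_id):
        # Step 1: store file metadata in the (APSU) DM metadata database.
        metadata = {
            "fileName": file_path,
            "experimentName": experiment_name,
            "magnetQrId": magnet_qr_id,
        }
        self.metadata_catalog.add_file_metadata(experiment_name, metadata)

        # Step 2: register the file as a CDB item and attach the metadata as its property.
        item_id = self.cdb_client.add_file_item(magnet_qr_id, file_path)
        self.cdb_client.add_item_property(item_id, "fileMetadata", metadata)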