ATLAS & Google "Data Ocean" R&D Project
Authors: Mario Lassnig (CERN), Karan Bhatia (Google), Andy Murphy (Google), Alexei Klimentov (BNL), Kaushik De (UTA), Martin Barisits (CERN), Fernando Barreiro (UTA), Thomas Beermann (CERN), Ruslan Mashinistov (UTA), Torre Wenaus (BNL), Sergey Panitkin (BNL)

ATL-SOFT-PUB December 2017

Contents

Project overview
  Use cases
    User analysis
    Data placement, replication, and popularity
    Data streaming
  Work packages
    WP1 - Data management
    WP2 - Workflow management
    WP3 - Google Cloud Storage Global Redirection
    WP4 - Cost Model
Addendum
  List of key personnel and PIs
  Timeline
  Resources from ATLAS and Google
Objectives and key results
  Namespace handling
  Connecting ATLAS grid storage with Google storage for third-party-copy
  Monitoring third-party-copy
  Reading data from Google storage to Grid worker nodes - File copy-to-scratch
  Monitoring copy-to-scratch transfers
  Reading data from Google storage to Grid worker nodes - Streaming random-io
  Monitoring random-io
  Deletion of data on Google Storage
  Reading data inside Google data centres - Jobs running on Google compute
  Network provisioning
  Transparent global redirection between inter-regional zones on Google Cloud Storage
  Development of an economic cost model
Appendix 1 - Brainstorming document
Appendix 2 - Group photo
Bibliography
Project overview

ATLAS [1] is facing several challenges with respect to its computing requirements for LHC [2] Run-3 ( ) and HL-LHC runs ( ). The challenges are not specific to ATLAS or the LHC, but are common to the HENP computing community. Most importantly, storage continues to be the driving cost factor, and at the current growth rate it cannot absorb the increased physics output of the experiment. Novel computing models with a more dynamic use of storage and computing resources need to be considered.

This project aims to start an R&D effort for evaluating and adopting novel IT technologies for HENP computing. ATLAS and Google plan to launch an R&D project to integrate Google cloud resources (Storage and Compute) into the ATLAS distributed computing environment. After a series of teleconferences, a face-to-face brainstorming meeting in Denver, CO at the Supercomputing 2017 conference resulted in this proposal for a first prototype of the "Data Ocean" project. The idea is threefold: (a) to allow ATLAS to explore the use of different computing models to prepare for High-Luminosity LHC, (b) to allow ATLAS user analysis to benefit from the Google infrastructure, and (c) to give Google real science use cases to improve their cloud platform.

Use cases

User analysis

When analysts use the distributed analysis services to run on the grid, the outputs are deposited on the grid. Making 100% of those outputs available to the analyst quickly is a difficult problem and remains one of the weak points of distributed analysis. Through this R&D, analysis outputs generated on worker nodes around the world could be directed to Google Cloud Storage (GCS), where they become uniformly and reliably available to the analyst anywhere in the world. Analysis data products are small, so the GCS-resident outputs could be regarded as a cache with a limited lifetime and thus a limited storage footprint. The value of reliable accessibility of this hot data to analysts would be enormous.
Data placement, replication, and popularity

The final stages of data analysis by users require access to multiple petabytes of data storage. To ensure a high level of access, ATLAS replicates multiple copies of this data to worldwide computing resources. The Google Cloud Storage service could offer an alternative for hosting these highly used data formats. We plan to store the final derivation of the full ATLAS MC and/or reprocessing data campaigns. This data will then be available to users worldwide through Google Compute and ATLAS Compute resources.

Data streaming

ATLAS Computing is investigating the use of sub-file data products in the analysis chain. A prototype of this "Event Streaming Service" is currently in development and could benefit from fine-grained cloud storage. This use case will evaluate the compute necessary to generate the sub-file data products ("events") from their original files at the scale required by HL-LHC, and the performance gains of highly parallel, small-size data delivery to the analysis software.

Work packages

The proof-of-concept phase of the "Data Ocean" project will consist of four major parts (Work Packages - WP). We envision that these packages will have well-defined common milestones and overlaps. Both ATLAS and Google will commit software engineering effort to this project, initially at the level of 3 FTEs total. The expected official project start is early . Additional partners from US National Laboratories and Universities, and CERN/WLCG are likely to join this project.

WP1 - Data management

This work package connects Google Cloud Storage with the ATLAS Data Management system "Rucio" [3], which will allow writing a full multi-petabyte physics sample to Google Cloud Storage. By taking advantage of Google's and ESnet's fast networks, the sample is then distributed by Google between their continental regions/zones and made available to ATLAS Compute across the globe. ATLAS and Google will work together to understand data popularity and to cache the most popular physics data according to geographical access patterns.

WP2 - Workflow management

ATLAS user analysis jobs, brokered by the "PanDA" workflow management system [4], should be able to run using either file-copy or direct-io with Google Cloud Storage. A strategy of using container formats for user analysis jobs will be developed. In addition, this work package will involve running jobs on Google Compute Platform, accessing data from either ATLAS storage or Google Cloud Storage.

WP3 - Google Cloud Storage Global Redirection

The third work package will involve an improvement to Google Cloud Storage itself. Right now, ATLAS jobs need to retain knowledge of which Google Cloud region is to be used. Google will implement a global redirection between their regions to expose Google Cloud Storage as a single global entity.
WP4 - Cost Model

The fourth work package will deal with the economic model necessary for sustainable use of commercial cloud resources, for example adaptive pricing for cloud resource costs (storage, compute, network).

Addendum

List of key personnel and PIs

Google: Karan Bhatia, Andy Murphy
ATLAS:
  BNL: Alexei Klimentov, Torre Wenaus, Sergey Panitkin
  CERN: Mario Lassnig, Martin Barisits, Thomas Beermann, Tobias Wegner
  UTA: Kaushik De, Fernando Barreiro, Ruslan Mashinistov

Project Management

- The project will be managed jointly by Google and ATLAS PIs
- Progress will be reported and followed on a weekly basis
- Two Technical Interchange Meetings will be organized during the project: one by Google, one by ATLAS

Timeline

The expected official project start is early .

- X+1 month: detailed objectives and key results description
- X+2 month: test ATLAS/Google data transfer
- X+3 month: test ATLAS/Google analysis jobs access
- X+4 month: full ATLAS derived data replica stored by Google
- X+6 month: end-user analysis test
- X+8 month: commissioning and pre-production for selected ATLAS users

Resources from ATLAS and Google

Both ATLAS and Google will commit software engineering effort to this project, initially at the level of 3 FTEs total. It would be highly desirable to have a Google software engineer at CERN to work together with the Rucio and PanDA teams during the PoC implementation and commissioning phase. Additional partners from US National Laboratories and Universities, and CERN/WLCG are likely to join this project. The estimation of Google computing resources (storage, bandwidth, and CPUs) for the PoC phase will be done within one month after the project is launched and the WPs are approved by both parties.
Objectives and key results

These OKRs are only loosely coupled and should be doable in parallel after the two initial steps ("Namespace handling" and "Connecting grid storage") are finished. Names and ETAs are tentative and subject to the official project start.

Namespace handling

Google Storage would become a new endpoint for ATLAS

- Be able to address all derived MC and processing campaign data
- Set up Google storage authorisation and authentication
- Add Google storage hosts to the ATLAS topology system (AGIS)
- Synchronise the topology with the ATLAS data management system (Rucio)

There should be two available buckets, one in the US (Available: 100G Chicago, 10G San Jose, 10G Ashburn; Coming: 100G NY, 100G Seattle) and one in the EU (no ESnet peering with Google).

Rucio Data Identifiers (DIDs)

- Are a globally unique tuple <Scope:Name>, e.g., mc16_13tev:12345.hits.pool.root
- Have an associated collection of metadata, e.g., project, datatype, #events
- Can be either a file, a dataset (collection of files), or a container (collection of datasets)
- Are unique among all three categories and cannot be reused
- We put replication rules on DIDs (declarative data management, e.g., 3 copies of this DID, one must be on tape and all should be on different continents)
- Resolve DIDs to files to actual replicas (root://hostname/storage/file.123)

RSE (Rucio Storage Element)

- Unique logical unit of data storage
- Has different attributes, e.g., is_tape, geoip
- We have a topological split between the endpoint name (e.g., CERN-PROD_SCRATCHDISK) and the associated hosts behind the name (which could be many, each with a different protocol)
- So, e.g., we could have GCS_EUROPE, GCS_USEAST, GCS_USWEST (and once the GCS global redirector exists, just a single GCS). Each one would then have an associated storage endpoint: gcs://bucket/.. or (more likely) s3://bucket/...

Connecting ATLAS grid storage with Google storage for third-party-copy

We can transfer to/from Google storage using our orchestrated mechanisms in Rucio
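
As a rough illustration of the namespace concepts from "Namespace handling" (DIDs, RSEs, and declarative replication rules), the following Python sketch models them with plain data classes. It does not use the Rucio API. The class and function names, the `continent` attribute, the `EXAMPLE_TAPE` endpoint, and the `<prefix>/<scope>/<name>` path layout are assumptions made for illustration only; the DID, the `is_tape` attribute, the GCS endpoint names, and the `s3://bucket` prefix come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DID:
    """A data identifier: a (scope, name) pair written as 'scope:name'."""
    scope: str
    name: str

    @classmethod
    def parse(cls, text: str) -> "DID":
        scope, sep, name = text.partition(":")
        if not sep or not scope or not name:
            raise ValueError(f"not a scope:name identifier: {text!r}")
        return cls(scope, name)

    def __str__(self) -> str:
        return f"{self.scope}:{self.name}"


@dataclass
class RSE:
    """A logical storage endpoint: one name, attributes, many protocol prefixes."""
    name: str
    attributes: Dict[str, object] = field(default_factory=dict)
    prefixes: List[str] = field(default_factory=list)

    def replica_urls(self, did: DID) -> List[str]:
        # Hypothetical path layout: <prefix>/<scope>/<name>
        return [f"{p.rstrip('/')}/{did.scope}/{did.name}" for p in self.prefixes]


def satisfies_rule(placement: List[RSE], copies: int,
                   tape_copies: int = 0,
                   distinct_continents: bool = False) -> bool:
    """Check a proposed placement against a declarative replication rule."""
    if len(placement) != copies:
        return False
    on_tape = sum(1 for rse in placement if rse.attributes.get("is_tape"))
    if on_tape < tape_copies:
        return False
    if distinct_continents:
        # "continent" is an assumed attribute standing in for geoip data.
        continents = [rse.attributes.get("continent") for rse in placement]
        if len(set(continents)) != len(continents):
            return False
    return True


did = DID.parse("mc16_13tev:12345.hits.pool.root")
gcs_europe = RSE("GCS_EUROPE", {"is_tape": False, "continent": "EU"}, ["s3://bucket"])
gcs_useast = RSE("GCS_USEAST", {"is_tape": False, "continent": "NA"}, ["s3://bucket"])
example_tape = RSE("EXAMPLE_TAPE", {"is_tape": True, "continent": "AS"}, [])

urls = gcs_europe.replica_urls(did)
# Rule from the text: 3 copies, one on tape, all on different continents.
rule_ok = satisfies_rule([gcs_europe, gcs_useast, example_tape],
                         copies=3, tape_copies=1, distinct_continents=True)
```

The split between `RSE.name` and `RSE.prefixes` mirrors the topological split described above: rules refer to the endpoint name only, while the hosts and protocols behind it can change without touching the rules.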

- Verify the Rucio transfertool implementation for S3 compatibility with the Google RSE
- Implement changes to the Rucio transfertool if necessary
- Set rules for DIDs on grid storage to create replicas on the Google RSE
- Set rules for DIDs on the Google RSE to create replicas on grid storage

The proposed input volume is between 1 and 6 petabytes, for two possible scenarios:

- 1PB of NTUP for end-user analysis only
- 4-5PB for a complete copy of derivation data for one campaign

The current full analysis produces ~500'000 files of ~40MB each per day, equalling a growth rate of ~20TB/day. For the proof-of-concept it should be sufficient if a small percentage of the jobs (<1%) can be rerouted to write their output to GCS (5000 files, <200GB per day growth rate).

Google has network peering with ESnet, which has connections to several ATLAS Tier-1 and Tier-2 centres in the US and Europe. The connections to US ATLAS sites are very good, whereas the EU peerings are less reliable. BNL might serve as a bridge for EU transfers if necessary. Network monitoring should be considered, especially for the ESnet peering, e.g., using perfSONAR.

Rucio has a multi-queue transfer system (conveyor + transfertool)

- The conveyor decides which transfer requests to take off the queue and process
- The transfertool submits transfer requests to the third-party-copy component
- FTS supports WebDAV to S3 push third-party-copy from DPM and dCache
- Receives acknowledgements and polls the status of transfers
- Updates DIDs, replicas, rules, does the retries, etc.

Monitoring third-party-copy

We are able to understand the performance differences between our existing transfer infrastructure and Google Storage

- Ensure instrumentation events are properly forwarded to the monitoring system
- Create dedicated dashboards

We ship all our transfer events into HDFS and ElasticSearch

- Dashboards, compute durations, historical views, accounting, etc.
- Also a source for our analytics system, e.g., to estimate transfer-time-to-complete using machine learning

Most important metrics

- #files/second transferred and deleted
- Mbps per file and per link
- Space usage over time
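
The volume estimates and the transfer metrics listed above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the `TransferEvent` record and its field names are assumptions, not the format of the real transfer events, and decimal units (1 MB = 10^6 bytes) are assumed.

```python
from dataclasses import dataclass
from typing import List

MB = 10**6
GB = 10**9
TB = 10**12


def daily_growth_bytes(files_per_day: int, avg_file_bytes: int) -> int:
    """Storage growth per day for a given file rate and average file size."""
    return files_per_day * avg_file_bytes


# Full analysis output quoted in the text: ~500'000 files/day at ~40 MB each.
full_growth = daily_growth_bytes(500_000, 40 * MB)

# Proof-of-concept: reroute 1% of the output to GCS.
poc_files = 500_000 // 100
poc_growth = daily_growth_bytes(poc_files, 40 * MB)


@dataclass
class TransferEvent:
    """Hypothetical minimal transfer record (field names are assumptions)."""
    size_bytes: int
    started: float   # seconds since some epoch
    finished: float  # seconds since the same epoch


def throughput_mbps(event: TransferEvent) -> float:
    """Per-file throughput in megabits per second."""
    return event.size_bytes * 8 / (event.finished - event.started) / 1e6


def files_per_second(events: List[TransferEvent]) -> float:
    """Completed files divided by the wall-clock window they span."""
    window = max(e.finished for e in events) - min(e.started for e in events)
    return len(events) / window


events = [
    TransferEvent(size_bytes=40 * MB, started=0.0, finished=4.0),
    TransferEvent(size_bytes=40 * MB, started=1.0, finished=5.0),
]

print(f"full growth: {full_growth / TB:.0f} TB/day")
print(f"PoC growth:  {poc_files} files, {poc_growth / GB:.0f} GB/day")
print(f"first file:  {throughput_mbps(events[0]):.0f} Mbps")
print(f"file rate:   {files_per_second(events):.1f} files/s")
```

With the numbers from the text, the full rate works out to 20 TB/day and the 1% proof-of-concept share to 5000 files and 200 GB/day, which matches the upper bound quoted above.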
Reading data from Google storage to Grid worker nodes - File copy-to-scratch

Jobs can download full input files for processing using rucio-clients

- Access protocols might differ from third-party-copy
- If new protocols are needed, they can be implemented

Monitoring copy-to-scratch transfers

We can follow the job transfers with our existing monitoring

- Every job sends a trace for every file it accesses. A trace is a dictionary containing information such as the location of the file and timestamps (start of the copy, end of the copy).
- These traces are used:
  - To build the popularity of our data
  - To monitor the volume processed by the jobs

Reading data from Google storage to Grid worker nodes - Streaming random-io

It might be necessary to add the Google Cloud Network to LHCONE

Monitoring random-io

Deletion of data on Google Storage

Allow the deletion of data on Google Cloud Storage using Rucio

Reading data inside Google data centres - Jobs running on Google compute
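
The trace-based monitoring described under "Monitoring copy-to-scratch transfers" amounts to aggregating per-file dictionaries. The Python sketch below shows one way this could look. The trace keys (`did`, `rse`, `filesize`, `copy_start`, `copy_end`) and the second DID are assumptions chosen for illustration, not the actual trace schema; the text only states that a trace carries the file location and the copy timestamps.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def summarise_traces(traces: Iterable[dict]) -> Tuple[Counter, Dict[str, int]]:
    """Aggregate traces into per-DID access counts and per-location volume."""
    popularity: Counter = Counter()
    volume: Dict[str, int] = defaultdict(int)
    for trace in traces:
        popularity[trace["did"]] += 1              # one access per trace
        volume[trace["rse"]] += trace["filesize"]  # bytes read from that location
    return popularity, dict(volume)


def copy_duration(trace: dict) -> float:
    """Seconds between the start and the end of the copy."""
    return trace["copy_end"] - trace["copy_start"]


# Hypothetical traces; the key names are assumptions for this sketch.
traces = [
    {"did": "mc16_13tev:12345.hits.pool.root", "rse": "GCS_EUROPE",
     "filesize": 40_000_000, "copy_start": 0.0, "copy_end": 4.0},
    {"did": "mc16_13tev:12345.hits.pool.root", "rse": "CERN-PROD_SCRATCHDISK",
     "filesize": 40_000_000, "copy_start": 1.0, "copy_end": 3.0},
    {"did": "mc16_13tev:67890.hits.pool.root", "rse": "GCS_EUROPE",
     "filesize": 10_000_000, "copy_start": 2.0, "copy_end": 3.0},
]

popularity, volume = summarise_traces(traces)
most_popular = popularity.most_common(1)[0][0]
```

Because the aggregation only depends on the trace contents, the same summary would apply to traces from jobs reading from grid storage and from Google storage alike, which is what allows the existing monitoring to be reused.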
Network provisioning

- Ensure that the full-capacity network is used for data ingress from grid storage to Google Cloud Storage
- Ensure that jobs running in Google Cloud Compute do not overwhelm our research networks

Transparent global redirection between inter-regional zones on Google Cloud Storage

Retrieve a file from Google Cloud Storage using a unique identifier, regardless of which region/zone was used for the initial data ingress

Development of an economic cost model

- Control the cost of ATLAS data on Google Cloud Storage
- Control the cost of ATLAS jobs on Google Cloud Compute

Appendix 1 - Brainstorming document
Appendix 2 - Group photo

Left to right: Karan Bhatia, Alexei Klimentov, Horst Severini, Kaushik De, Thomas Beermann, Mario Lassnig, Sergey Panitkin, Ruslan Mashinistov, Martin Barisits, Fernando Barreiro, Matteo Turilli

Bibliography

[1] ATLAS Collaboration, G. Aad, et al. The ATLAS experiment at the CERN Large Hadron Collider. J. Instrum., 3:S08003.
[2] LHC - The Large Hadron Collider.
[3] Rucio.
[4] T. Maeno, P. Nilsson, K. De, A. Klimentov and T. Wenaus. PanDA Production and Analysis backend. Journal of Physics, vol. 219, 210, 2009.
Early experience with the Run 2 ATLAS analysis model Argonne National Laboratory E-mail: cranshaw@anl.gov During the long shutdown of the LHC, the ATLAS collaboration redesigned its analysis model based
More informationComputing / The DESY Grid Center
Computing / The DESY Grid Center Developing software for HEP - dcache - ILC software development The DESY Grid Center - NAF, DESY-HH and DESY-ZN Grid overview - Usage and outcome Yves Kemp for DESY IT
More informationStreamlining CASTOR to manage the LHC data torrent
Streamlining CASTOR to manage the LHC data torrent G. Lo Presti, X. Espinal Curull, E. Cano, B. Fiorini, A. Ieri, S. Murray, S. Ponce and E. Sindrilaru CERN, 1211 Geneva 23, Switzerland E-mail: giuseppe.lopresti@cern.ch
More informationVolley: Automated Data Placement for Geo-Distributed Cloud Services
Volley: Automated Data Placement for Geo-Distributed Cloud Services Authors: Sharad Agarwal, John Dunagen, Navendu Jain, Stefan Saroiu, Alec Wolman, Harbinder Bogan 7th USENIX Symposium on Networked Systems
More informationZero to Microservices in 5 minutes using Docker Containers. Mathew Lodge Weaveworks
Zero to Microservices in 5 minutes using Docker Containers Mathew Lodge (@mathewlodge) Weaveworks (@weaveworks) https://www.weave.works/ 2 Going faster with software delivery is now a business issue Software
More informationWorldwide Production Distributed Data Management at the LHC. Brian Bockelman MSST 2010, 4 May 2010
Worldwide Production Distributed Data Management at the LHC Brian Bockelman MSST 2010, 4 May 2010 At the LHC http://op-webtools.web.cern.ch/opwebtools/vistar/vistars.php?usr=lhc1 Gratuitous detector pictures:
More informationBig Data Analytics and the LHC
Big Data Analytics and the LHC Maria Girone CERN openlab CTO Computing Frontiers 2016, Como, May 2016 DOI: 10.5281/zenodo.45449, CC-BY-SA, images courtesy of CERN 2 3 xx 4 Big bang in the laboratory We
More informationStorage and I/O requirements of the LHC experiments
Storage and I/O requirements of the LHC experiments Sverre Jarp CERN openlab, IT Dept where the Web was born 22 June 2006 OpenFabrics Workshop, Paris 1 Briefly about CERN 22 June 2006 OpenFabrics Workshop,
More informationUnified System for Processing Real and Simulated Data in the ATLAS Experiment
Unified System for Processing Real and Simulated Data in the ATLAS Experiment Mikhail Borodin Big Data Laboratory, National Research Centre "Kurchatov Institute", Moscow, Russia National Research Nuclear
More informationMonitoring for IT Services and WLCG. Alberto AIMAR CERN-IT for the MONIT Team
Monitoring for IT Services and WLCG Alberto AIMAR CERN-IT for the MONIT Team 2 Outline Scope and Mandate Architecture and Data Flow Technologies and Usage WLCG Monitoring IT DC and Services Monitoring
More informationHammerCloud: A Stress Testing System for Distributed Analysis
HammerCloud: A Stress Testing System for Distributed Analysis Daniel C. van der Ster 1, Johannes Elmsheuser 2, Mario Úbeda García 1, Massimo Paladin 1 1: CERN, Geneva, Switzerland 2: Ludwig-Maximilians-Universität
More informationSoftware and computing evolution: the HL-LHC challenge. Simone Campana, CERN
Software and computing evolution: the HL-LHC challenge Simone Campana, CERN Higgs discovery in Run-1 The Large Hadron Collider at CERN We are here: Run-2 (Fernando s talk) High Luminosity: the HL-LHC challenge
More informationUsing Puppet to contextualize computing resources for ATLAS analysis on Google Compute Engine
Journal of Physics: Conference Series OPEN ACCESS Using Puppet to contextualize computing resources for ATLAS analysis on Google Compute Engine To cite this article: Henrik Öhman et al 2014 J. Phys.: Conf.
More informationSystem upgrade and future perspective for the operation of Tokyo Tier2 center. T. Nakamura, T. Mashimo, N. Matsui, H. Sakamoto and I.
System upgrade and future perspective for the operation of Tokyo Tier2 center, T. Mashimo, N. Matsui, H. Sakamoto and I. Ueda International Center for Elementary Particle Physics, The University of Tokyo
More informationInsight: that s for NSA Decision making: that s for Google, Facebook. so they find the best way to push out adds and products
What is big data? Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
More informationPrompt data reconstruction at the ATLAS experiment
Prompt data reconstruction at the ATLAS experiment Graeme Andrew Stewart 1, Jamie Boyd 1, João Firmino da Costa 2, Joseph Tuggle 3 and Guillaume Unal 1, on behalf of the ATLAS Collaboration 1 European
More informationarxiv: v1 [cs.dc] 12 May 2017
GRID Storage Optimization in Transparent and User-Friendly Way for LHCb Datasets arxiv:1705.04513v1 [cs.dc] 12 May 2017 M Hushchyn 1,2, A Ustyuzhanin 1,3, P Charpentier 4 and C Haen 4 1 Yandex School of
More informationIEPSAS-Kosice: experiences in running LCG site
IEPSAS-Kosice: experiences in running LCG site Marian Babik 1, Dusan Bruncko 2, Tomas Daranyi 1, Ladislav Hluchy 1 and Pavol Strizenec 2 1 Department of Parallel and Distributed Computing, Institute of
More informationStatus of KISTI Tier2 Center for ALICE
APCTP 2009 LHC Physics Workshop at Korea Status of KISTI Tier2 Center for ALICE August 27, 2009 Soonwook Hwang KISTI e-science Division 1 Outline ALICE Computing Model KISTI ALICE Tier2 Center Future Plan
More informationSecurely Access Services Over AWS PrivateLink. January 2019
Securely Access Services Over AWS PrivateLink January 2019 Notices This document is provided for informational purposes only. It represents AWS s current product offerings and practices as of the date
More informationEuropeana Core Service Platform
Europeana Core Service Platform DELIVERABLE D7.1: Strategic Development Plan, Architectural Planning Revision Final Date of submission 30 October 2015 Author(s) Marcin Werla, PSNC Pavel Kats, Europeana
More informationVendor: Microsoft. Exam Code: Exam Name: Developing Microsoft Azure Solutions. Version: Demo
Vendor: Microsoft Exam Code: 70-532 Exam Name: Developing Microsoft Azure Solutions Version: Demo Testlet 1 Topic 1, Web-based Solution Background You are developing a web-based solution that students
More informationImproved ATLAS HammerCloud Monitoring for Local Site Administration
Improved ATLAS HammerCloud Monitoring for Local Site Administration M Böhler 1, J Elmsheuser 2, F Hönig 2, F Legger 2, V Mancinelli 3, and G Sciacca 4 on behalf of the ATLAS collaboration 1 Albert-Ludwigs
More informationMapReduce. U of Toronto, 2014
MapReduce U of Toronto, 2014 http://www.google.org/flutrends/ca/ (2012) Average Searches Per Day: 5,134,000,000 2 Motivation Process lots of data Google processed about 24 petabytes of data per day in
More informationSummary of the LHC Computing Review
Summary of the LHC Computing Review http://lhc-computing-review-public.web.cern.ch John Harvey CERN/EP May 10 th, 2001 LHCb Collaboration Meeting The Scale Data taking rate : 50,100, 200 Hz (ALICE, ATLAS-CMS,
More informationDistributed Data Management with Storage Resource Broker in the UK
Distributed Data Management with Storage Resource Broker in the UK Michael Doherty, Lisa Blanshard, Ananta Manandhar, Rik Tyer, Kerstin Kleese @ CCLRC, UK Abstract The Storage Resource Broker (SRB) is
More informationPhilippe Laurens, Michigan State University, for USATLAS. Atlas Great Lakes Tier 2 collocated at MSU and the University of Michigan
Philippe Laurens, Michigan State University, for USATLAS Atlas Great Lakes Tier 2 collocated at MSU and the University of Michigan ESCC/Internet2 Joint Techs -- 12 July 2011 Content Introduction LHC, ATLAS,
More informationOutline. Infrastructure and operations architecture. Operations. Services Monitoring and management tools
EGI-InSPIRE EGI Operations Tiziana Ferrari/EGI.eu EGI Chief Operations Officer 1 Outline Infrastructure and operations architecture Services Monitoring and management tools Operations 2 Installed Capacity
More informationData Storage. Paul Millar dcache
Data Storage Paul Millar dcache Overview Introducing storage How storage is used Challenges and future directions 2 (Magnetic) Hard Disks 3 Tape systems 4 Disk enclosures 5 RAID systems 6 Types of RAID
More information