The Wuppertal Tier-2 Center and recent software developments on Job Monitoring for ATLAS


1 The Wuppertal Tier-2 Center and recent software developments on Job Monitoring for ATLAS
DESY Computing Seminar
Frank Volkmer, M. Sc.
Bergische Universität Wuppertal

2 Introduction
- Hardware
  - Pleiades Cluster
- Software Development
  - dCache
  - JEM
  - ProdSys Dashboard

3 Hardware

4 Pleiades Cluster
- Tier 2 Computing Centre in LCG DE Cloud
- installed in 2007
- ATLAS, ICECUBE, AUGER
- successor of ALICEnext (512 AMD Dual Cores)

5 Computing
- 64 BL460c (2x Intel Xeon 2.33 GHz)
  - 8 CPU cores each
- 26 BL2x220c (2x 2x Intel Xeon 2.83 GHz)
  - 16 CPU cores each / 416 total
- 10 GBit/s Ethernet interconnect (experimental)
- MCS (single and double) water-cooled racks

6 Worker Nodes

7 Storage
- ~0.5 PB hard drive space
- 8 new storage nodes (SNs) with 48 x 1 TB
- 6 old SNs with 48 x 750 GB
- 44 in RAID 6, 4 hot spare
- DL380 G5
- 4x MSA 60
- using dCache
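The quoted ~0.5 PB can be cross-checked from the disk counts on this slide. The sketch below is only a back-of-the-envelope calculation. It assumes, which the slide does not state, that each storage node runs its 44 active disks as a single RAID 6 set, so that two disks' worth of capacity per node goes to parity. The function name `usable_tb` is made up for this illustration.

```python
# Back-of-the-envelope check of the storage figures quoted on the slide.
DISKS_PER_NODE = 48
HOT_SPARES = 4
RAID6_PARITY_DISKS = 2  # assumption: one RAID 6 set per storage node


def usable_tb(nodes, disk_tb):
    """Usable capacity in TB for `nodes` storage nodes with `disk_tb` TB disks."""
    data_disks = DISKS_PER_NODE - HOT_SPARES - RAID6_PARITY_DISKS
    return nodes * data_disks * disk_tb


new_nodes_tb = usable_tb(8, 1.0)    # 8 new SNs with 48 x 1 TB
old_nodes_tb = usable_tb(6, 0.75)   # 6 old SNs with 48 x 750 GB
total_tb = new_nodes_tb + old_nodes_tb
raw_tb = 8 * DISKS_PER_NODE * 1.0 + 6 * DISKS_PER_NODE * 0.75

print(f"raw: {raw_tb:.0f} TB, usable: {total_tb:.0f} TB")
# prints "raw: 600 TB, usable: 525 TB"
```

Under that assumption the usable figure of about 525 TB is consistent with the "~0.5 PB" on the slide.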

8 Storage Nodes

9 Network
- each row with central switch, 40 GBit/s uplink
- SN: 10 GBit/s each
  - up to 140 GBit/s peak
- old WN: internally with 10 GBit/s each
  - allows MPI calculations
  - 20 GBit/s uplink
- new WN: 16 with 10 GBit/s uplink, 1 GBit/s internally

10 Cluster File System
- Scalable File Share (SFS) / Lustre
- Factory Build (installed by HP)
- Interconnect via Ethernet
- 64 TB
- SFS 3.1

11 UPS

12 Main Cooling Pipes

13 Controlling
- PVSS
- all 36 relevant temperatures instrumented
- information is archived and analyzed
- warning via e-mail and SMS
- MCS water in and out
- Air in and out

14 PVSS

15 Foreseeable Future
- 2010: TB more dCache space
- new cabinet: ~512 Cores

16 Software Development

17 dCache
- dCache Workshop
- performance evaluation tool

18 dCache Workshop
- Cost calculation and hot replication
- Info service, GLUE and Checksum Module
- dCache Mon (introduction of a monitoring tool)
- Mover queues and transfer parameters

19 dCache Performance
- synchronization server for testing
- measure data throughput between SE and WN

20 dCache Performance

21 JEM
- User centric Job Execution Monitor
- wrapping layer around the executed job
- line-by-line execution monitoring of C code, bash and Python scripts
- message passing via stomp

22 JEM History
- v1: Diploma / Master in 2004
  - R-GMA
- v2.2:
  - bash-monitor / named pipes
  - Ganga-Integration
- v3: 2009-present
  - complete rewrite
  - shmem / stomp / trigger

23 Ganga Integration
- GangaJEM
- changes job object, injects wrapper
- live evaluation of what your job is doing
  - peek into stdout / stderr
  - peek into files
  - live info about cpu / mem / etc.

24 Script Monitors
- own precompiled version of bash
- internal Python tracing engine
- line-by-line execution
- shared memory access
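Line-by-line monitoring of a Python script can be illustrated with the interpreter's standard tracing hook, `sys.settrace`. The sketch below does not show JEM's tracing engine, whose internals are not described on the slide. It only demonstrates the general technique. The names `make_line_tracer` and `monitored_job` are invented for this example.

```python
import sys


def make_line_tracer(events):
    """Return a trace function that records (function name, line number) per executed line."""
    def tracer(frame, event, arg):
        if event == "line":
            events.append((frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep receiving events for this frame
    return tracer


def monitored_job():
    # Stand-in for the user job that is being watched.
    total = 0
    for i in range(3):
        total += i
    return total


events = []
sys.settrace(make_line_tracer(events))
try:
    result = monitored_job()
finally:
    sys.settrace(None)  # always switch tracing off again

print(f"job returned {result}; first traced line: {events[0]}")
```

A real monitor would forward each recorded event to a consumer instead of keeping it in a list. The list keeps the sketch self-contained.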

25 ctracer
- line-by-line execution
- can instrument any C / C++ code
- by statically linking against tracer lib

26 stomp Messaging
- messaging service provided by CERN
- always reachable from anywhere inside LCG
- allows the job to call home
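For readers unfamiliar with STOMP: it is a text-based protocol in which each frame consists of a command line, `key:value` header lines, a blank line, the body, and a terminating NUL byte. The sketch below only serializes a `SEND` frame and does not open a connection. The destination name and the message text are made up, and nothing here reflects how JEM formats its messages.

```python
def stomp_send_frame(destination, body):
    """Serialize a STOMP SEND frame: command, headers, blank line, body, NUL byte."""
    payload = body.encode("utf-8")
    headers = [
        "destination:" + destination,
        "content-type:text/plain",
        "content-length:%d" % len(payload),
    ]
    head = "SEND\n" + "\n".join(headers) + "\n\n"
    return head.encode("utf-8") + payload + b"\x00"


# Hypothetical destination and message, for illustration only.
frame = stomp_send_frame("/queue/jem.example", "job 42: line 17 reached")
print(frame)
```

Because the frame is plain text over a single outbound connection, a job on a worker node only needs to reach the broker to report back.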

27 shmem
- several processes involved on WN
  - system monitor
  - script monitors
  - watchdog
- all writing into a shared memory
  - red black tree id hash (manual offset calculation)
- one consumer, publishing the information

28 ringbuffer advantages
- N-to-1 semantics
  - N-to-M semantics possible
- lock-free reading
- fast concurrent writes
- peek ahead and deferred deletion
- C and Python API
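The two consumer-side features named above, peek ahead and deferred deletion, can be shown with a small ring buffer. This is a single-process sketch in plain Python. It does not use shared memory and makes no attempt at the lock-free concurrent access the slide mentions. The class and method names are invented for this example and are not the JEM API.

```python
class RingBuffer:
    """Fixed-size ring buffer with peek-ahead and deferred deletion (single-process sketch)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0    # index of the oldest unreleased item
        self._count = 0   # number of items currently stored

    def push(self, item):
        """Append an item; return False instead of overwriting when the buffer is full."""
        if self._count == self._capacity:
            return False
        self._slots[(self._head + self._count) % self._capacity] = item
        self._count += 1
        return True

    def peek(self, offset=0):
        """Look at the item `offset` places past the head without consuming it."""
        if offset < 0 or offset >= self._count:
            raise IndexError("peek outside stored items")
        return self._slots[(self._head + offset) % self._capacity]

    def release(self, n=1):
        """Deferred deletion: free the n oldest slots once the consumer is done with them."""
        n = min(n, self._count)
        for i in range(n):
            self._slots[(self._head + i) % self._capacity] = None
        self._head = (self._head + n) % self._capacity
        self._count -= n

    def __len__(self):
        return self._count


buf = RingBuffer(4)
for msg in ("a", "b", "c"):
    buf.push(msg)

first, ahead = buf.peek(0), buf.peek(2)  # inspect without consuming
buf.release(2)                           # drop "a" and "b" only now
print(first, ahead, len(buf))            # prints "a c 1"
```

Separating `peek` from `release` lets the consumer inspect several pending messages before deciding which ones it no longer needs.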

29 Trigger Architecture
- all messages are consumed by a single evaluation engine
- triggers
  - register for certain types of chunks
  - mark chunks as approved or discarded
    - right now or later on
  - build up private data structures to evaluate
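The registration scheme described above can be sketched as a small dispatcher. Everything in this block is hypothetical: the chunk layout (a dict with a `type` key), the verdict strings, and the class names are assumptions made for illustration, and verdicts that are postponed ("later on") are left out to keep the example short.

```python
APPROVED, DISCARDED = "approved", "discarded"


class EvaluationEngine:
    """Single consumer that hands each chunk to the triggers registered for its type."""

    def __init__(self):
        self._triggers = {}  # chunk type -> list of registered triggers

    def register(self, chunk_type, trigger):
        self._triggers.setdefault(chunk_type, []).append(trigger)

    def consume(self, chunk):
        """Return the verdicts of all triggers interested in this chunk."""
        return [trigger(chunk) for trigger in self._triggers.get(chunk["type"], [])]


class UncaughtExceptionTrigger:
    """Example trigger with private state: approves only exceptions that were not caught."""

    def __init__(self):
        self.seen = 0  # private bookkeeping, invisible to the engine

    def __call__(self, chunk):
        self.seen += 1
        return DISCARDED if chunk.get("caught") else APPROVED


engine = EvaluationEngine()
trigger = UncaughtExceptionTrigger()
engine.register("exception", trigger)

verdicts = [
    engine.consume({"type": "exception", "caught": True}),
    engine.consume({"type": "exception", "caught": False}),
    engine.consume({"type": "codeline", "line": 12}),  # no trigger registered
]
print(verdicts)  # prints [['discarded'], ['approved'], []]
```

Keeping the state inside the trigger object mirrors the "private data structures" point: the engine only routes chunks and collects verdicts.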

30 Exception Tracker
- registers to exception and code line chunks
- tracks stack traces backwards
- finds out where or if they are caught
- possible refinement: discard all exceptions that were caught in my code

31 Progression Tracker
- most data analysis is running inside a specific loop over all data samples
- register to a certain line of code in main loop
- evaluate progression
  - cpu time per data sample
  - etc.
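The idea of registering to one line inside the main loop can be illustrated with Python's `sys.settrace`. The sketch below takes a CPU-time stamp every time a chosen line executes and derives the time spent per data sample from consecutive stamps. It is not JEM's implementation. The function `analysis`, the way the watched line is chosen (a fixed offset from the `def` line), and all other names are assumptions for this example.

```python
import sys
import time


def analysis(samples):
    total = 0.0
    for sample in samples:
        total += sample ** 2  # watched line: runs once per data sample
    return total


# Hypothetical registration: watch the loop body, 3 lines below the `def`.
WATCHED_LINE = analysis.__code__.co_firstlineno + 3
hits = []  # CPU-time stamp for every execution of the watched line


def tracer(frame, event, arg):
    if frame.f_code is not analysis.__code__:
        return None  # do not trace other functions
    if event == "line" and frame.f_lineno == WATCHED_LINE:
        hits.append(time.process_time())
    return tracer


sys.settrace(tracer)
try:
    analysis(range(5))
finally:
    sys.settrace(None)  # always switch tracing off again

# CPU time between consecutive executions of the watched line.
per_sample = [later - earlier for earlier, later in zip(hits, hits[1:])]
print(f"{len(hits)} samples seen, {len(per_sample)} intervals measured")
```

With the number of hits known, progress is simply hits divided by the expected number of samples, provided the job announces that total.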

32 Outlook
- JEM v3 is almost done and awaits deployment
- UI part needs to be rewritten
- needs a lot of end-user feedback
- maybe some p2p architecture inside one site to analyze whole task meta data

33 ProdSys Dashboard
- monitor all ATLAS tasks submitted into LCG
- monitor site/cloud efficiency

34 Statistics
- ~600 unique visitors/d, ~1000 page loads/d
- Connected to AGIS, ProdDB, Panda, CIC, SAM
- ~33 GB data
  - ~80M records
  - ~31M Job Definitions
  - ~39M Job Executions

35 Overview Text

36 Functional Tests

37 Data Collection
- several services querying different databases to gather all necessary information
- redundant data copies
- missing service monitoring

38 ATLAS Dashboards
[Diagram. Components shown: DDM, AGIS, Panda, Ganga, MONALISA, SRM, CIC, GOCDB, SAM, Fabric; DDM Dashboard, ProdSys Dashboard, Panda Monitor, Analysis Dashboard, Service SLS. Connection labels: HTTP (Python), HTTP (XML), DB.]

39 Current Problems
- Main developer still getting up to speed
- unmaintained for almost a year
- some not so well written SQL queries
- needs special database indexing

40 New Slim Architecture
- standardized monitoring messages
  - direct from source as often as possible
  - publish via HTTP / Django with JSON
- Smart Client (GWT / Java)
  - use precompiled AJAX
  - consume appropriate messages
- no expensive database queries

41 JSON
- simple format
- less overhead than XML
- simple parsing in Python

    {
      "time": " T18:20:55",
      "activetasks": [
        { "type": "reco", "cloud": "FR", "taskid": 95133,
          "jobstates": { "RUNNING": 191, "DONE": 8, "PREPARED": 1 } },
        { "type": "simul", "cloud": "IT", "taskid": 94772,
          "jobstates": { "RUNNING": 1, "DONE": 99 } },
        ...
      ]
    }

    import simplejson as json
    taskstate = json.loads(msg)
    print(taskstate["time"])
    ...
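As an illustration of how a client could consume such a message without any database query, the sketch below parses the two task entries from the slide with the standard-library `json` module and computes the fraction of finished jobs per task. The helper `done_fraction` is hypothetical, and the `time` field is left out because its date part is missing in the transcript.

```python
import json

# The two task entries shown on the slide.
msg = """
{
  "activetasks": [
    {"type": "reco", "cloud": "FR", "taskid": 95133,
     "jobstates": {"RUNNING": 191, "DONE": 8, "PREPARED": 1}},
    {"type": "simul", "cloud": "IT", "taskid": 94772,
     "jobstates": {"RUNNING": 1, "DONE": 99}}
  ]
}
"""


def done_fraction(task):
    """Fraction of a task's jobs that are in state DONE (hypothetical helper)."""
    states = task["jobstates"]
    total = sum(states.values())
    return states.get("DONE", 0) / total if total else 0.0


taskstate = json.loads(msg)
progress = {task["taskid"]: done_fraction(task) for task in taskstate["activetasks"]}

for taskid, fraction in sorted(progress.items()):
    print(f"task {taskid}: {fraction:.0%} done")
# prints:
# task 94772: 99% done
# task 95133: 4% done
```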

42 Future Projects
- squid proxy caching
- establish a continuous development cycle with regular updates
- review documentation (2007)
- establish meta monitoring
- integrate into new architecture
- integrate new plotting architecture
- make it smarter :)

43 Thank you... Any questions?


More information

DELL EMC READY BUNDLE FOR VIRTUALIZATION WITH VMWARE AND ISCSI INFRASTRUCTURE

DELL EMC READY BUNDLE FOR VIRTUALIZATION WITH VMWARE AND ISCSI INFRASTRUCTURE DELL EMC READY BUNDLE FOR VIRTUALIZATION WITH VMWARE AND ISCSI INFRASTRUCTURE Design Guide APRIL 2017 1 The information in this publication is provided as is. Dell Inc. makes no representations or warranties

More information

Storage Optimization with Oracle Database 11g

Storage Optimization with Oracle Database 11g Storage Optimization with Oracle Database 11g Terabytes of Data Reduce Storage Costs by Factor of 10x Data Growth Continues to Outpace Budget Growth Rate of Database Growth 1000 800 600 400 200 1998 2000

More information

ABySS Performance Benchmark and Profiling. May 2010

ABySS Performance Benchmark and Profiling. May 2010 ABySS Performance Benchmark and Profiling May 2010 Note The following research was performed under the HPC Advisory Council activities Participating vendors: AMD, Dell, Mellanox Compute resource - HPC

More information

AMGA metadata catalogue system

AMGA metadata catalogue system AMGA metadata catalogue system Hurng-Chun Lee ACGrid School, Hanoi, Vietnam www.eu-egee.org EGEE and glite are registered trademarks Outline AMGA overview AMGA Background and Motivation for AMGA Interface,

More information

Challenges in HPC I/O

Challenges in HPC I/O Challenges in HPC I/O Universität Basel Julian M. Kunkel German Climate Computing Center / Universität Hamburg 10. October 2014 Outline 1 High-Performance Computing 2 Parallel File Systems and Challenges

More information

Parallel I/O on JUQUEEN

Parallel I/O on JUQUEEN Parallel I/O on JUQUEEN 4. Februar 2014, JUQUEEN Porting and Tuning Workshop Mitglied der Helmholtz-Gemeinschaft Wolfgang Frings w.frings@fz-juelich.de Jülich Supercomputing Centre Overview Parallel I/O

More information

Compiling applications for the Cray XC

Compiling applications for the Cray XC Compiling applications for the Cray XC Compiler Driver Wrappers (1) All applications that will run in parallel on the Cray XC should be compiled with the standard language wrappers. The compiler drivers

More information

EMC SYMMETRIX VMAX 40K SYSTEM

EMC SYMMETRIX VMAX 40K SYSTEM EMC SYMMETRIX VMAX 40K SYSTEM The EMC Symmetrix VMAX 40K storage system delivers unmatched scalability and high availability for the enterprise while providing market-leading functionality to accelerate

More information

EMC SYMMETRIX VMAX 40K STORAGE SYSTEM

EMC SYMMETRIX VMAX 40K STORAGE SYSTEM EMC SYMMETRIX VMAX 40K STORAGE SYSTEM The EMC Symmetrix VMAX 40K storage system delivers unmatched scalability and high availability for the enterprise while providing market-leading functionality to accelerate

More information

System Description. System Architecture. System Architecture, page 1 Deployment Environment, page 4

System Description. System Architecture. System Architecture, page 1 Deployment Environment, page 4 System Architecture, page 1 Deployment Environment, page 4 System Architecture The diagram below illustrates the high-level architecture of a typical Prime Home deployment. Figure 1: High Level Architecture

More information

The CORAL Project. Dirk Düllmann for the CORAL team Open Grid Forum, Database Workshop Barcelona, 4 June 2008

The CORAL Project. Dirk Düllmann for the CORAL team Open Grid Forum, Database Workshop Barcelona, 4 June 2008 The CORAL Project Dirk Düllmann for the CORAL team Open Grid Forum, Database Workshop Barcelona, 4 June 2008 Outline CORAL - a foundation for Physics Database Applications in the LHC Computing Grid (LCG)

More information

Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters

Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters Hari Subramoni, Ping Lai, Sayantan Sur and Dhabhaleswar. K. Panda Department of

More information

Management of batch at CERN

Management of batch at CERN Management of batch at CERN What is this talk about? LSF as a product basic commands user perspective basic commands admin perspective CERN installation Unix users/groups and LSF groups share management

More information

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval New Oracle NoSQL Database APIs that Speed Insertion and Retrieval O R A C L E W H I T E P A P E R F E B R U A R Y 2 0 1 6 1 NEW ORACLE NoSQL DATABASE APIs that SPEED INSERTION AND RETRIEVAL Introduction

More information

in Action Fujitsu High Performance Computing Ecosystem Human Centric Innovation Innovation Flexibility Simplicity

in Action Fujitsu High Performance Computing Ecosystem Human Centric Innovation Innovation Flexibility Simplicity Fujitsu High Performance Computing Ecosystem Human Centric Innovation in Action Dr. Pierre Lagier Chief Technology Officer Fujitsu Systems Europe Innovation Flexibility Simplicity INTERNAL USE ONLY 0 Copyright

More information

EMC Symmetrix DMX Series The High End Platform. Tom Gorodecki EMC

EMC Symmetrix DMX Series The High End Platform. Tom Gorodecki EMC 1 EMC Symmetrix Series The High End Platform Tom Gorodecki EMC 2 EMC Symmetrix -3 Series World s Most Trusted Storage Platform Symmetrix -3: World s Largest High-end Storage Array -3 950: New High-end

More information

Streamlining CASTOR to manage the LHC data torrent

Streamlining CASTOR to manage the LHC data torrent Streamlining CASTOR to manage the LHC data torrent G. Lo Presti, X. Espinal Curull, E. Cano, B. Fiorini, A. Ieri, S. Murray, S. Ponce and E. Sindrilaru CERN, 1211 Geneva 23, Switzerland E-mail: giuseppe.lopresti@cern.ch

More information

Evaluation Guide for ASP.NET Web CMS and Experience Platforms

Evaluation Guide for ASP.NET Web CMS and Experience Platforms Evaluation Guide for ASP.NET Web CMS and Experience Platforms CONTENTS Introduction....................... 1 4 Key Differences...2 Architecture:...2 Development Model...3 Content:...4 Database:...4 Bonus:

More information

Geographical failover for the EGEE-WLCG Grid collaboration tools. CHEP 2007 Victoria, Canada, 2-7 September. Enabling Grids for E-sciencE

Geographical failover for the EGEE-WLCG Grid collaboration tools. CHEP 2007 Victoria, Canada, 2-7 September. Enabling Grids for E-sciencE Geographical failover for the EGEE-WLCG Grid collaboration tools CHEP 2007 Victoria, Canada, 2-7 September Alessandro Cavalli, Alfredo Pagano (INFN/CNAF, Bologna, Italy) Cyril L'Orphelin, Gilles Mathieu,

More information