Phronesis, a diagnosis and recovery tool for system administrators


Journal of Physics: Conference Series, OPEN ACCESS

To cite this article: C Haen et al 2014 J. Phys.: Conf. Ser

C Haen 1, V Barra 2, E Bonaccorsi 3 and N Neufeld 3

1 Univ. Blaise Pascal, Clermont-Ferrand cedex, France
2 LIMOS, UMR 6158 CNRS, Univ. Blaise Pascal, Clermont-Ferrand cedex, France
3 European Organization for Nuclear Research, CERN CH-1211, Genève 23, Switzerland

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd.

Abstract. The LHCb experiment relies on the Online system, which includes a very large and heterogeneous computing cluster. Ensuring the proper behavior of the different tasks running on the more than 2000 servers represents a huge workload for the small operator team and is a 24/7 task. At CHEP 2012, we presented a prototype of a framework that we designed in order to support the experts. The main objective is to provide them with steadily improving diagnosis and recovery solutions in case of misbehavior of a service, without having to modify the original applications. Our framework is based on adapted principles of the Autonomic Computing model, on Reinforcement Learning algorithms, as well as innovative concepts such as Shared Experience. While the submission at CHEP 2012 showed the validity of our prototype on simulations, we here present an implementation with improved algorithms and manipulation tools, and report on the experience gained with running it in the LHCb Online system.

1. Introduction

LHCb [1] is one of the four large experiments at the Large Hadron Collider at CERN. This experiment relies on a large computing infrastructure [2] to (i) control the data acquisition system and the detector, and (ii) manage the data it produces. The team in charge of the installation and the administration of this system comprises fewer than 10 people, with three full-time workers.

To help the system administrators reach their goal of high availability, we have attempted to provide them with software that proposes a diagnosis and a recovery solution in case of problems, improves with experience, and acts as a knowledge and problem history database. The paper we published at CHEP 2012 [3] introduced the concepts we used in our software. The validity of these concepts was proven on several simulations. Since then, the algorithms have been improved, the software code consolidated and manipulation tools developed. Further simulations were run to test the abilities of the software in more depth, and it has now been deployed on a much larger scale in the LHCb Online environment.

2. LISA: LearnIng approach for System Administration

In [3], we presented methods that address problems similar to ours. These methods were expert systems [4] and autonomic computing principles such as the MAPE-K loop [5]. Based on these historical approaches and adding innovative concepts such as the Shared Experience principle, we now define the methodology of our framework as follows:

- Linux systems represent the greatest share of the Online environment. We thus decided to focus only on them. Network or Windows-based machine diagnoses are not addressed.
- Because of the great variety of software running on the LHCb Online HLT farm, our solution needs to be as generic as possible. As files and processes are the components of any application, we decided to use them as basic blocks for our diagnoses. Each type of problem that can be encountered with such entities (wrong file permission, wrong process user, etc.) is associated with a default recovery solution. The approach is generic enough that it could in principle apply to other platforms, but only Linux servers are addressed here.
- Perform no monitoring, but rather wait to be informed of problems by external sources.
- Existing implementations associate one MAPE-K loop instance to one system and rely on multi-agent theory for synchronization and cooperation. Our approach is to have a single loop for all the systems. This allows the software to spot the dependencies between the various systems.
- By using Reinforcement Learning algorithms, we improve the diagnostic speed and scalability by reducing the number of components that are checked before finding the faulty one.
- The Shared Experience principle consists of sharing the experience between similar systems (like two websites). It reduces both the learning phase of the learning algorithms and the description workload of the users.
- Using Convention over Configuration [6] contributes to reducing the configuration work of the software.
- Our software offers a default recovery solution with the full procedure for the fix to be taken into account, as well as information regarding previously encountered situations on the same problematic entity. However, the users have to perform the correction themselves.

3. Phronesis

Our implementation of the above methodology is called Phronesis. It is divided into several modules, described in this section.

3.1. Compiler

We defined a new configuration grammar that allows us to describe services as a composition of files, processes and other services. This grammar is inspired by the object model: the objects are mainly files, processes or services, and the inheritance concept is used to describe the Shared Experience principle. The user can also define two types of rules:

- Dependency rule: this rule states that one service needs another one to be fully functional.
- Recovery rule or Trigger: this rule lists what a given recovery action involves. For example, if the recovery action consists of changing the content of a file, a recovery rule could state that a process must be stopped before the file is changed, and another one that it must be started again after the modification.

The compiler was developed in Python using the pyparsing library [7]. Python was chosen for its dynamic characteristics, such as the introspection mechanism and dynamic typing. The compiler reads the configuration files and produces an SQL script as output. One critical aspect of the compilation is not to lose the experience previously gained by the reinforcement algorithm. This is achieved using custom graph-matching algorithms between the configuration files and the current content of the database.
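The object model behind the grammar, and the way a Recovery rule expands a single fix into a full procedure, can be illustrated with a minimal Python sketch. It does not reproduce the actual Phronesis grammar or compiler output: the class names `Service` and `RecoveryRule`, the function `full_procedure`, the action strings and the file paths are all hypothetical, chosen only to mirror the concepts described above (composition, inheritance for Shared Experience, Dependency rules and Recovery rules).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Service:
    """Hypothetical model of a service: a composition of files and processes."""
    name: str
    files: list = field(default_factory=list)
    processes: list = field(default_factory=list)
    needs: list = field(default_factory=list)   # Dependency rules (names of services)
    parent: Optional["Service"] = None          # inheritance models Shared Experience

    def all_files(self):
        # A child service inherits the files described by its parent.
        inherited = self.parent.all_files() if self.parent else []
        return inherited + self.files

    def experience_key(self):
        # Similar services share what was learned under their common ancestor.
        return self.parent.experience_key() if self.parent else self.name


@dataclass
class RecoveryRule:
    """Hypothetical Recovery rule: actions required before and after a fix."""
    before: list = field(default_factory=list)
    after: list = field(default_factory=list)


def full_procedure(action, rules, _seen=None):
    """Expand one recovery action into the ordered list of steps it involves."""
    seen = set() if _seen is None else _seen
    if action in seen:          # guard against rules that reference each other
        return []
    seen.add(action)
    rule = rules.get(action)
    if rule is None:            # no rule: the action stands on its own
        return [action]
    steps = []
    for pre in rule.before:
        steps += full_procedure(pre, rules, seen)
    steps.append(action)
    for post in rule.after:
        steps += full_procedure(post, rules, seen)
    return steps


# Two similar web sites share one description and one pool of experience.
website = Service("website", files=["/etc/httpd/conf/httpd.conf"], processes=["httpd"])
site_a = Service("site_a", files=["/var/www/site_a/index.html"],
                 needs=["database"], parent=website)

# Changing the configuration file requires stopping, then restarting, the process.
rules = {
    "edit /etc/httpd/conf/httpd.conf": RecoveryRule(before=["stop httpd"],
                                                    after=["start httpd"]),
}
print(full_procedure("edit /etc/httpd/conf/httpd.conf", rules))
```

In this sketch the shared ancestor plays the role of the Shared Experience principle: anything learned about `website` is looked up under the same key for `site_a`, so a second similar site would not start from scratch.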

3.2. Remote Agent

The remote agent is a software program that runs on all the machines the user wants to supervise. Its only purpose is to answer queries from the Core (see 3.3). Its complexity is purely technical and lies in implementation details. Queries can concern any attribute of files, processes or the general environment. The agent is developed in C++, using several Boost libraries [8].

3.3. Core

The Core module of the software is the central part, which contains all the algorithms used to actually diagnose problems and offer recovery solutions. The main algorithms are listed here:

- Sorting algorithm: when several problems are reported at the same time, this algorithm has to decide in which order they are analyzed. The order is very important for performance reasons, but also because there might be situations in which one problem cannot be solved before the others are. This algorithm uses Dependency rules to establish the order.
- Recovery algorithm: once the root cause of a problem is found, it can usually be fixed quite easily (e.g. fix a corrupted file, restart a process). For the changes to be taken into account, extra actions might be required. These actions are defined by Recovery rules. The complication comes from the fact that actions can be required before or after the fix is applied. Computing the full chain of events is a non-trivial task.
- Reinforcement Learning algorithm: the reinforcement learning algorithm is used to optimize the exploration path from a reported problem to its faulty component. The chosen method consists of keeping track of the paths that were successful in previous cases. Each path has an associated counter, which is incremented each time the path leads to the faulty component. When a new problem is reported, one can rely on these counters to choose the most appropriate path. There are two strategies: either sorting the counters in decreasing order, or making a weighted random choice. Simulations (see 4.1) show that, on average, both strategies are equivalent. Although simple, this method based on counters has great advantages. If a path is reinforced when it should not be, the user can very easily correct it. The user can also give a priori knowledge. Finally, from a technical point of view, the application of the Shared Experience principle to this method is straightforward.
- Dependency algorithm: one of the most interesting features of our software is its ability to find dependencies between services based on previous experience. This capacity allows our software to infer new Dependency rules, and thus to provide better diagnoses.

The implementation is done in C++ and uses Boost libraries. It can be run as a daemon, as an interactive program, or to make a full check of all the services known to it.

3.4. Tools

There are two kinds of interactions between the software and the user:

- Output communication, so that the user knows what the software is doing.
- Input communication, for the user to report problems or give feedback.

This bidirectional communication is made possible using an Application Programming Interface (API). The output communication is based on an Observer pattern [9], while the input messages are similar to Remote Procedure Calls. Based on the API, several ready-to-use user interfaces were developed:

- phrutils: a command line tool.
- phrgui: a GUI based on the Qt framework [10], currently being prototyped.
- phrxml: only for output communication. It stores all the output in an XML-file-based ring buffer.

- phrsimu: an interface used by our simulation software to test the algorithms.
- phricinga: an interface that gathers data from Icinga [11], the monitoring software used at LHCb.
- phrweb: a web interface based on phrxml and the Django framework [12].

4. Results

4.1. Simulations

To test our algorithms, it was important to be able to simulate realistic situations. To achieve this, we developed a complete set of tools to produce Monte-Carlo simulations. For simulations, Phronesis needs to be compiled in a particular way. The reason is that the simulation tool tests the algorithms of the Core module, and not the code quality of the Agents: under normal usage, remote servers are queried to get information before processing it; in simulation mode, the query is intercepted and a local Agent is instructed what to return. This allows us to test Phronesis on a single local machine.

Another software program is used to randomly generate problems based on user input, inject signals into the Core to mock the agents' analysis, interact with it to confirm or deny its diagnoses, and produce statistics about the behavior of Phronesis. This tool reproduces almost any kind of environment. Various situations were simulated, which validated the importance of Dependency rules as well as of the Shared Experience principle. The simulations also showed that the two strategies for exploring a faulty service mentioned earlier are equivalent on average.

4.2. Real case application

Phronesis is now being deployed on the entire LHCb Online cluster. It should be noted that it is not a replacement for any solution already in place, but is expected to complement it. At the time of writing, a fair fraction of the LHCb Online system is already covered, and the diagnoses we had the opportunity to trigger proved useful. Systems under Phronesis supervision include the log aggregation cluster, the event filter software, the web services and the monitoring infrastructure.
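As an illustration of the two counter-based exploration strategies of the Core (section 3.3) that the simulations of section 4.1 compare, the following Python sketch runs a toy Monte-Carlo loop. It is not the Phronesis simulation tool: the function names, the component names, the fault probabilities and the `+ 1` smoothing of the weights are assumptions made for the example, and the sketch makes no claim about which strategy performs better.

```python
import random
from collections import Counter


def order_by_counter(counters):
    """Strategy 1: visit candidate components by decreasing counter."""
    return sorted(counters, key=lambda name: counters[name], reverse=True)


def order_by_weighted_draw(counters, rng):
    """Strategy 2: draw components without replacement, weighted by their counter."""
    remaining = dict(counters)
    order = []
    while remaining:
        names = list(remaining)
        # Assumption: +1 keeps never-reinforced paths reachable.
        weights = [remaining[name] + 1 for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order


def checks_needed(order, faulty):
    """Number of components checked before the faulty one is reached."""
    return order.index(faulty) + 1


def simulate(strategy, fault_probs, rounds, seed):
    """Inject random faults and return (average checks per diagnosis, counters)."""
    rng = random.Random(seed)
    counters = Counter({name: 0 for name in fault_probs})
    names = list(fault_probs)
    probs = [fault_probs[name] for name in names]
    total = 0
    for _ in range(rounds):
        faulty = rng.choices(names, weights=probs, k=1)[0]
        if strategy == "sorted":
            order = order_by_counter(counters)
        else:
            order = order_by_weighted_draw(counters, rng)
        total += checks_needed(order, faulty)
        counters[faulty] += 1   # reinforce the path that led to the faulty component
    return total / rounds, counters


# Hypothetical service with three components and made-up fault probabilities.
fault_probs = {"disk": 0.7, "httpd": 0.2, "nfs_mount": 0.1}
for strategy in ("sorted", "weighted"):
    average, learned = simulate(strategy, fault_probs, rounds=500, seed=1)
    print(strategy, round(average, 2), dict(learned))
```

Because the counters are plain integers attached to paths, a user could overwrite one to correct a wrong reinforcement or to seed a priori knowledge, which is the advantage of the counter method highlighted in section 3.3.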
Although there were only a small number of unexpected and unprovoked situations, Phronesis made several correct diagnoses and offered appropriate solutions. Among these, several diagnoses were a direct consequence of the Convention over Configuration approach, because the root cause pointed at elements that the user had not defined manually. Examples of diagnoses are:

- Full inodes for log servers: the log servers store a large number of tiny files (with a median size of 100 Kb) on a clustered file system. As a consequence, the pool of inodes was exhausted well before the actual storage space. The solution, correctly suggested by Phronesis, was to remove files. In fact, this problem was spotted before it actually happened, thanks to the default threshold set to 99% of used inodes. This was fortunate, because otherwise all the new logs that required a new file would have been silently lost.
- Incorrect mount options on a web service: one of the web services required a particular folder to be mounted with the write option, which was not the case. Phronesis suggested remounting it with the appropriate option. Although correct, this would not have worked immediately, because an NFS server over which Phronesis had no control was not configured to accept it.
- Incorrect DIM [13] name server address: the file containing the information was corrupted.
- Various problems on MySQL servers: running out of disk space and errors in the configuration files were among the problems diagnosed by Phronesis on the MySQL database.

- Various problems on the monitoring infrastructure: mail alerts not being sent, tracked down to a process not running; out-of-date results, tracked down to a full disk; and checks not executed because some servers were not running are a few of the issues that Phronesis correctly diagnosed.

In some cases, Phronesis completely missed the root cause of the problems. We have observed two types of failures:

- Errors due to a situation not foreseen in the design. Examples are disk errors or cluster setups. When it did not require heavy modifications, the code was improved. Other cases were left for future developments.
- Errors due to incomplete configuration, such as missing information or an unsupervised service. The configuration was always updated to cover future occurrences of similar cases.

5. Outlook

There is still considerable room for improvement, both in terms of the technical implementation and of functionality. This includes (i) an extension of the configuration grammar, which is unfortunately more verbose than we had hoped at the beginning, (ii) better native support for cluster systems, and (iii) dynamic constraints on the properties of files and processes. The plan is to add more systems under the supervision of Phronesis and to add coverage for the corner cases. We hope to be able to release it as an open source solution that the community would pick up and develop further.

References

[1] Augusto A A et al. (LHCb) 2008 JINST 3 S08005
[2] Neufeld N (LHCb) 2003 Nucl. Phys. Proc. Suppl
[3] Haen C, Barra V, Bonaccorsi E and Neufeld N 2012 Journal of Physics: Conference Series
[4] Ginsberg M 1993 Essentials of artificial intelligence (Morgan Kaufmann)
[5] IBM 2001 An architectural blueprint for autonomic computing
[6] Miller J 2009 Microsoft MSDN magazine: Design for convention over configuration
[7] McGuire P Pyparsing website
[8] Boost-team 2013 Boost libraries
[9] Gamma E, Helm R, Johnson R and Vlissides J 1995 Design patterns: elements of reusable object-oriented software (Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.)
[10] Qt-project 2013 Qt project
[11] Haen C, Bonaccorsi E and Neufeld N 2011 Distributed monitoring system based on Icinga Proceedings of ICALEPCS2011
[12] Foundation D S 2013 Django website
[13] Gaspar C 1993 DIM website

ComPWA: A common amplitude analysis framework for PANDA

ComPWA: A common amplitude analysis framework for PANDA Journal of Physics: Conference Series OPEN ACCESS ComPWA: A common amplitude analysis framework for PANDA To cite this article: M Michel et al 2014 J. Phys.: Conf. Ser. 513 022025 Related content - Partial

More information

CMS - HLT Configuration Management System

CMS - HLT Configuration Management System Journal of Physics: Conference Series PAPER OPEN ACCESS CMS - HLT Configuration Management System To cite this article: Vincenzo Daponte and Andrea Bocci 2015 J. Phys.: Conf. Ser. 664 082008 View the article

More information

SNiPER: an offline software framework for non-collider physics experiments

SNiPER: an offline software framework for non-collider physics experiments SNiPER: an offline software framework for non-collider physics experiments J. H. Zou 1, X. T. Huang 2, W. D. Li 1, T. Lin 1, T. Li 2, K. Zhang 1, Z. Y. Deng 1, G. F. Cao 1 1 Institute of High Energy Physics,

More information

CMS High Level Trigger Timing Measurements

CMS High Level Trigger Timing Measurements Journal of Physics: Conference Series PAPER OPEN ACCESS High Level Trigger Timing Measurements To cite this article: Clint Richardson 2015 J. Phys.: Conf. Ser. 664 082045 Related content - Recent Standard

More information

Evolution of Database Replication Technologies for WLCG

Evolution of Database Replication Technologies for WLCG Journal of Physics: Conference Series PAPER OPEN ACCESS Evolution of Database Replication Technologies for WLCG To cite this article: Zbigniew Baranowski et al 2015 J. Phys.: Conf. Ser. 664 042032 View

More information

Streamlining CASTOR to manage the LHC data torrent

Streamlining CASTOR to manage the LHC data torrent Streamlining CASTOR to manage the LHC data torrent G. Lo Presti, X. Espinal Curull, E. Cano, B. Fiorini, A. Ieri, S. Murray, S. Ponce and E. Sindrilaru CERN, 1211 Geneva 23, Switzerland E-mail: giuseppe.lopresti@cern.ch

More information

The DMLite Rucio Plugin: ATLAS data in a filesystem

The DMLite Rucio Plugin: ATLAS data in a filesystem Journal of Physics: Conference Series OPEN ACCESS The DMLite Rucio Plugin: ATLAS data in a filesystem To cite this article: M Lassnig et al 2014 J. Phys.: Conf. Ser. 513 042030 View the article online

More information

File Access Optimization with the Lustre Filesystem at Florida CMS T2

File Access Optimization with the Lustre Filesystem at Florida CMS T2 Journal of Physics: Conference Series PAPER OPEN ACCESS File Access Optimization with the Lustre Filesystem at Florida CMS T2 To cite this article: P. Avery et al 215 J. Phys.: Conf. Ser. 664 4228 View

More information

Geant4 Computing Performance Benchmarking and Monitoring

Geant4 Computing Performance Benchmarking and Monitoring Journal of Physics: Conference Series PAPER OPEN ACCESS Geant4 Computing Performance Benchmarking and Monitoring To cite this article: Andrea Dotti et al 2015 J. Phys.: Conf. Ser. 664 062021 View the article

More information

An SQL-based approach to physics analysis

An SQL-based approach to physics analysis Journal of Physics: Conference Series OPEN ACCESS An SQL-based approach to physics analysis To cite this article: Dr Maaike Limper 2014 J. Phys.: Conf. Ser. 513 022022 View the article online for updates

More information

Performance of popular open source databases for HEP related computing problems

Performance of popular open source databases for HEP related computing problems Journal of Physics: Conference Series OPEN ACCESS Performance of popular open source databases for HEP related computing problems To cite this article: D Kovalskyi et al 2014 J. Phys.: Conf. Ser. 513 042027

More information

Interoperating AliEn and ARC for a distributed Tier1 in the Nordic countries.

Interoperating AliEn and ARC for a distributed Tier1 in the Nordic countries. for a distributed Tier1 in the Nordic countries. Philippe Gros Lund University, Div. of Experimental High Energy Physics, Box 118, 22100 Lund, Sweden philippe.gros@hep.lu.se Anders Rhod Gregersen NDGF

More information

Monte Carlo Production Management at CMS

Monte Carlo Production Management at CMS Monte Carlo Production Management at CMS G Boudoul 1, G Franzoni 2, A Norkus 2,3, A Pol 2, P Srimanobhas 4 and J-R Vlimant 5 - for the Compact Muon Solenoid collaboration 1 U. C. Bernard-Lyon I, 43 boulevard

More information

System level traffic shaping in disk servers with heterogeneous protocols

System level traffic shaping in disk servers with heterogeneous protocols Journal of Physics: Conference Series OPEN ACCESS System level traffic shaping in disk servers with heterogeneous protocols To cite this article: Eric Cano and Daniele Francesco Kruse 14 J. Phys.: Conf.

More information

Data preservation for the HERA experiments at DESY using dcache technology

Data preservation for the HERA experiments at DESY using dcache technology Journal of Physics: Conference Series PAPER OPEN ACCESS Data preservation for the HERA experiments at DESY using dcache technology To cite this article: Dirk Krücker et al 2015 J. Phys.: Conf. Ser. 66

More information

A first look at 100 Gbps LAN technologies, with an emphasis on future DAQ applications.

A first look at 100 Gbps LAN technologies, with an emphasis on future DAQ applications. 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP21) IOP Publishing Journal of Physics: Conference Series 664 (21) 23 doi:1.188/1742-696/664//23 A first look at 1 Gbps

More information

Database on Demand: insight how to build your own DBaaS

Database on Demand: insight how to build your own DBaaS Journal of Physics: Conference Series PAPER OPEN ACCESS Database on Demand: insight how to build your own DBaaS Related content - DataBase on Demand R Gaspar Aparicio, D Gomez, I Coterillo Coz et al. To

More information

Overview of ATLAS PanDA Workload Management

Overview of ATLAS PanDA Workload Management Overview of ATLAS PanDA Workload Management T. Maeno 1, K. De 2, T. Wenaus 1, P. Nilsson 2, G. A. Stewart 3, R. Walker 4, A. Stradling 2, J. Caballero 1, M. Potekhin 1, D. Smith 5, for The ATLAS Collaboration

More information

Early experience with the Run 2 ATLAS analysis model

Early experience with the Run 2 ATLAS analysis model Early experience with the Run 2 ATLAS analysis model Argonne National Laboratory E-mail: cranshaw@anl.gov During the long shutdown of the LHC, the ATLAS collaboration redesigned its analysis model based

More information

AGIS: The ATLAS Grid Information System

AGIS: The ATLAS Grid Information System AGIS: The ATLAS Grid Information System Alexey Anisenkov 1, Sergey Belov 2, Alessandro Di Girolamo 3, Stavro Gayazov 1, Alexei Klimentov 4, Danila Oleynik 2, Alexander Senchenko 1 on behalf of the ATLAS

More information

Monitoring of large-scale federated data storage: XRootD and beyond.

Monitoring of large-scale federated data storage: XRootD and beyond. Monitoring of large-scale federated data storage: XRootD and beyond. J Andreeva 1, A Beche 1, S Belov 2, D Diguez Arias 1, D Giordano 1, D Oleynik 2, A Petrosyan 2, P Saiz 1, M Tadel 3, D Tuckett 1 and

More information

AN OVERVIEW OF THE LHC EXPERIMENTS' CONTROL SYSTEMS

AN OVERVIEW OF THE LHC EXPERIMENTS' CONTROL SYSTEMS AN OVERVIEW OF THE LHC EXPERIMENTS' CONTROL SYSTEMS C. Gaspar, CERN, Geneva, Switzerland Abstract The four LHC experiments (ALICE, ATLAS, CMS and LHCb), either by need or by choice have defined different

More information

Deploying enterprise applications on Dell Hybrid Cloud System for Microsoft Cloud Platform System Standard

Deploying enterprise applications on Dell Hybrid Cloud System for Microsoft Cloud Platform System Standard Deploying enterprise applications on Dell Hybrid Cloud System for Microsoft Cloud Platform System Standard Date 7-18-2016 Copyright This document is provided as-is. Information and views expressed in this

More information

A self-configuring control system for storage and computing departments at INFN-CNAF Tierl

A self-configuring control system for storage and computing departments at INFN-CNAF Tierl Journal of Physics: Conference Series PAPER OPEN ACCESS A self-configuring control system for storage and computing departments at INFN-CNAF Tierl To cite this article: Daniele Gregori et al 2015 J. Phys.:

More information

b-jet identification at High Level Trigger in CMS

b-jet identification at High Level Trigger in CMS Journal of Physics: Conference Series PAPER OPEN ACCESS b-jet identification at High Level Trigger in CMS To cite this article: Eric Chabert 2015 J. Phys.: Conf. Ser. 608 012041 View the article online

More information

Testing SLURM open source batch system for a Tierl/Tier2 HEP computing facility

Testing SLURM open source batch system for a Tierl/Tier2 HEP computing facility Journal of Physics: Conference Series OPEN ACCESS Testing SLURM open source batch system for a Tierl/Tier2 HEP computing facility Recent citations - A new Self-Adaptive dispatching System for local clusters

More information

Michael Böge, Jan Chrin

Michael Böge, Jan Chrin PAUL SCHERRER INSTITUT SLS-TME-TA-1999-0015 September, 1999 A CORBA Based Client- Model for Beam Dynamics Applications at the SLS Michael Böge, Jan Chrin Paul Scherrer Institut CH-5232 Villigen PSI Switzerland

More information

Pattern-Oriented Development with Rational Rose

Pattern-Oriented Development with Rational Rose Pattern-Oriented Development with Rational Rose Professor Peter Forbrig, Department of Computer Science, University of Rostock, Germany; Dr. Ralf Laemmel, Department of Information Management and Software

More information

The NOvA software testing framework

The NOvA software testing framework Journal of Physics: Conference Series PAPER OPEN ACCESS The NOvA software testing framework Related content - Corrosion process monitoring by AFM higher harmonic imaging S Babicz, A Zieliski, J Smulko

More information

Monitoring System for the GRID Monte Carlo Mass Production in the H1 Experiment at DESY

Monitoring System for the GRID Monte Carlo Mass Production in the H1 Experiment at DESY Journal of Physics: Conference Series OPEN ACCESS Monitoring System for the GRID Monte Carlo Mass Production in the H1 Experiment at DESY To cite this article: Elena Bystritskaya et al 2014 J. Phys.: Conf.

More information

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland Abstract. The Data and Storage Services group at CERN is conducting

More information

Performance quality monitoring system for the Daya Bay reactor neutrino experiment

Performance quality monitoring system for the Daya Bay reactor neutrino experiment Journal of Physics: Conference Series OPEN ACCESS Performance quality monitoring system for the Daya Bay reactor neutrino experiment To cite this article: Y B Liu and the Daya Bay collaboration 2014 J.

More information

A Prototype for Guideline Checking and Model Transformation in Matlab/Simulink

A Prototype for Guideline Checking and Model Transformation in Matlab/Simulink A Prototype for Guideline Checking and Model Transformation in Matlab/Simulink Holger Giese, Matthias Meyer, Robert Wagner Software Engineering Group Department of Computer Science University of Paderborn

More information

Simulation of digital pixel readout chip architectures with the RD53 SystemVerilog-UVM verification environment using Monte Carlo physics data

Simulation of digital pixel readout chip architectures with the RD53 SystemVerilog-UVM verification environment using Monte Carlo physics data Journal of Instrumentation OPEN ACCESS Simulation of digital pixel readout chip architectures with the RD53 SystemVerilog-UVM verification environment using Monte Carlo physics data To cite this article:

More information

Implementation of a PC-based Level 0 Trigger Processor for the NA62 Experiment

Implementation of a PC-based Level 0 Trigger Processor for the NA62 Experiment Implementation of a PC-based Level 0 Trigger Processor for the NA62 Experiment M Pivanti 1, S F Schifano 2, P Dalpiaz 1, E Gamberini 1, A Gianoli 1, M Sozzi 3 1 Physics Dept and INFN, Ferrara University,

More information

Tier 3 batch system data locality via managed caches

Tier 3 batch system data locality via managed caches Journal of Physics: Conference Series PAPER OPEN ACCESS Tier 3 batch system data locality via managed caches To cite this article: Max Fischer et al 2015 J. Phys.: Conf. Ser. 608 012018 Recent citations

More information

A new petabyte-scale data derivation framework for ATLAS

A new petabyte-scale data derivation framework for ATLAS Journal of Physics: Conference Series PAPER OPEN ACCESS A new petabyte-scale data derivation framework for ATLAS To cite this article: James Catmore et al 2015 J. Phys.: Conf. Ser. 664 072007 View the

More information

IBM Monitoring Agent for Citrix Virtual Desktop Infrastructure 7.2 FP3. User's Guide IBM SC
