CERN's Business Computing

CERN's Business Computing: Where the Infinitely Large Meets the Infinitely Small, Accelerated by Pentaho. Jan Janke, Deputy Group Leader, CERN Administrative Information Systems Group

CERN: World's Leading Particle Physics Research Laboratory. Founded in 1954. ~2300 staff, ~1600 other personnel, 12,000 visiting scientists. Budget of ~1 billion US$.

A Worldwide Community: Member States 7115 researchers; Collaborators 992 researchers; Associates 774 researchers; Observers 2558 researchers; Others 817 researchers.

Birthplace of the Web. [Screenshot of the original WWW FAQ page, as seen in the line-mode browser:] FREQUENTLY ASKED QUESTIONS ON W3. An FAQ list is really a cop-out from managed information. You should be able to find everything you want to know by browsing from the WWW project page, as everything should be arranged in a logical way. Here though are things which maybe didn't fit into the structure, with pointers to the answers which maybe did. Its an experiment, started May 92. The questioners are anonymous. I am just starting: how do I find out more?[1] How does www keep track of the available servers?[2] How does W3 compare with WAIS and Gopher[3]? How do I create my own server[4]? 1-10, Up, <RETURN> for more, Quit, or Help:

CERN's Mission: Advance the frontiers of knowledge, e.g. the secrets of the Big Bang: what was matter like within the first moments of the Universe's existence? Develop new technologies for accelerators and detectors: information technology (the Web and the Grid), medicine (diagnosis and therapy). Train the scientists and engineers of tomorrow. Unite people from different countries and cultures.

Next Scientific Challenge: Understand the very first moments of our Universe after the Big Bang. [Timeline graphic: from the Big Bang (scale ~10^-32 cm) to today (scale ~10^28 cm), 13.8 billion years later.]

World-class technology: the coldest temperatures, the highest vacuum.

The LHC

Millions of collisions

Particle Detectors: ATLAS, ALICE, CMS, LHCb

Computer Centre (Tier 0): raw recording rate ~10.5 GByte/sec; 91,000 processing cores; 30 PetaBytes of data storage on disk; 70 PetaBytes of data storage on tape; ~25 PetaBytes/year.

Grid Computing: the largest computing grid.

Our Business Challenges

Our Business Challenges: Large Numbers of Highly Demanding Users

Our Business Challenges: A Unique Organization

What do we do with Pentaho? How do we tackle our business challenges?

Reporting

Performance. [Architecture diagram: ERP data is loaded via PDI into the BI Server's in-memory Mondrian cache (the ORDERS data, refreshed every 1-15 minutes) and served to Analyzer; row-format storage in Oracle 11g is contrasted with the column-format Oracle 12c In-Memory store, ~1 TB.]
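
The key point behind this diagram is the storage layout: the reporting queries aggregate a handful of ORDERS columns across very many rows, which favours a columnar, in-memory copy of the data over classic row storage. Below is a minimal, self-contained sketch of that effect; the toy data and field names are illustrative assumptions, not the actual ORDERS schema or Oracle itself.

```python
# Illustrative only: a toy comparison of row-oriented vs column-oriented access,
# the idea behind keeping a ~1 TB columnar, in-memory copy of the ORDERS data.
# Data volume and field names are made up.
import random
import time

N = 1_000_000

# Row format: each order is one record; an aggregate must touch every whole row.
rows = [{"order_id": i, "amount": random.random(), "status": "OPEN"} for i in range(N)]

# Column format: each attribute is stored contiguously; an aggregate reads one column.
amounts = [r["amount"] for r in rows]

t0 = time.perf_counter()
total_row = sum(r["amount"] for r in rows)      # scan all rows, pick one field each
t1 = time.perf_counter()
total_col = sum(amounts)                        # scan a single dense column
t2 = time.perf_counter()

print(f"row-format scan:    {t1 - t0:.3f}s (total={total_row:.1f})")
print(f"column-format scan: {t2 - t1:.3f}s (total={total_col:.1f})")
```

In a real column store the gap widens further once compression and vectorised scans come into play.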

How do we manage Pentaho? Content management, multi-tenancy, scheduling

Find different content types. Find content in all BA Servers.
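
A rough sketch of what "find content in all BA servers" amounts to: walk each server's repository tree and match file names. The server URLs, credentials, the /api/repo/files/tree endpoint with its parameters, and the JSON shape are assumptions for illustration, not a description of the CERN tool.

```python
# A minimal sketch of searching for content across several BA servers.
# Assumptions (not from the slides): hostnames, credentials, endpoint and
# response structure.
import requests

BA_SERVERS = [
    "https://bi-server-1.example.cern.ch/pentaho",   # hypothetical hostnames
    "https://bi-server-2.example.cern.ch/pentaho",
]
AUTH = ("report_admin", "secret")                     # placeholder credentials


def find_content(name_fragment: str):
    """Return (server, path) pairs whose repository file name contains name_fragment."""
    hits = []
    for server in BA_SERVERS:
        resp = requests.get(
            f"{server}/api/repo/files/tree",          # assumed repository endpoint
            params={"depth": -1, "filter": "*"},
            auth=AUTH,
            headers={"Accept": "application/json"},
            timeout=30,
        )
        resp.raise_for_status()

        def walk(node):
            file_info = node.get("file", {})
            if name_fragment.lower() in file_info.get("name", "").lower():
                hits.append((server, file_info.get("path")))
            for child in node.get("children", []):
                walk(child)

        walk(resp.json())
    return hits


if __name__ == "__main__":
    for server, path in find_content("orders"):
        print(server, path)
```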

How this schema was found: decentralised administration of schemas! Select who can manage which schemas; no need to be too generous with full admin rights.

Multi-Tenancy: Decentralised Administration. Accessible to domain administrators; upload to any available BA server; new creations require approval by the global Pentaho admin team.
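
One way to picture the decentralised model described above, with domain administrators limited to their own schemas and new creations gated by the global team. All names and data structures here are hypothetical, not the actual CERN implementation.

```python
# A hypothetical sketch of decentralised administration: domain administrators
# may manage only the schemas of their own domain, while brand-new schemas
# need approval from the global Pentaho admin team. Names are illustrative.
DOMAIN_ADMINS = {
    "finance":   {"alice", "bob"},
    "logistics": {"carol"},
}
SCHEMA_DOMAIN = {
    "GA Purchase Orders": "finance",
    "GA Stock Levels":    "logistics",
}
GLOBAL_ADMINS = {"pentaho-admin-team"}


def may_manage(user: str, schema: str) -> bool:
    """Domain admins manage schemas of their domain; global admins manage everything."""
    if user in GLOBAL_ADMINS:
        return True
    domain = SCHEMA_DOMAIN.get(schema)
    return domain is not None and user in DOMAIN_ADMINS.get(domain, set())


def may_publish_new(user: str, approved_by_global_team: bool) -> bool:
    """New creations always require approval by the global Pentaho admin team."""
    return user in GLOBAL_ADMINS or approved_by_global_team


print(may_manage("alice", "GA Purchase Orders"))                 # True
print(may_manage("alice", "GA Stock Levels"))                    # False
print(may_publish_new("carol", approved_by_global_team=False))   # False
```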

Uncover Dependencies

Bulk Update Features: update schema names and all dependent reports; exchange a data source for another (new) one.
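
A hypothetical sketch of the data-source exchange, assuming reports have been exported as XML files that reference the data source by name; the file layout and reference mechanism are assumptions, not the CERN tool itself.

```python
# Hypothetical bulk data-source exchange: walk a directory of exported report
# definitions (assumed XML files naming the data source) and point them at a
# new data source.
from pathlib import Path

EXPORT_DIR = Path("exported_reports")        # hypothetical export location
OLD_SOURCE = "UD Purchase Orders"            # data source being replaced
NEW_SOURCE = "GA Purchase Orders"            # replacement data source


def swap_data_source(export_dir: Path, old: str, new: str) -> int:
    """Rewrite every report file that references `old`; return how many changed."""
    changed = 0
    for report_file in export_dir.rglob("*.xml"):
        text = report_file.read_text(encoding="utf-8")
        if old in text:                      # naive match; fine for a sketch
            report_file.write_text(text.replace(old, new), encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    print(f"updated {swap_data_source(EXPORT_DIR, OLD_SOURCE, NEW_SOURCE)} report(s)")
```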

Schema Development Lifecycle: create a new schema and a new data source (by convention, the name starts with UD, for under development); publish to the BA server; create and test reports (reports are updated automatically); promote to production by renaming, prepare the new data source and obtain approval for release (by convention, the name must now start with GA, for generally available).
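
The naming convention is the backbone of this lifecycle. A tiny illustrative helper for the UD-to-GA promotion step; the names and the approval flag are assumptions.

```python
# A minimal sketch of the lifecycle's naming convention: artefacts start as
# "UD ..." (under development) and are renamed to "GA ..." (generally
# available) once approved for release. Purely illustrative.
def promote_to_production(name: str, approved_for_release: bool) -> str:
    """Return the production (GA) name for an under-development (UD) artefact."""
    if not name.startswith("UD "):
        raise ValueError(f"not an under-development name: {name!r}")
    if not approved_for_release:
        raise PermissionError("promotion requires approval for release")
    return "GA " + name[len("UD "):]


print(promote_to_production("UD Purchase Orders", approved_for_release=True))
# -> GA Purchase Orders
```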

Scheduler Enhancements: schedule several reports together; suitable for the simpler, most frequent needs.

Scheduler Enhancements: check previous executions; download results from past runs; base report parameters on SQL.
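
"Base report parameters on SQL" means the parameter values are computed by a query at schedule time rather than typed in by hand. A sketch under stated assumptions: sqlite3 stands in for the real database and submit_schedule() is a stub, since the slides do not show the actual scheduler call.

```python
# Sketch of SQL-derived report parameters. sqlite3 is a stand-in database;
# submit_schedule() only prints the job that would be handed to the scheduler.
import sqlite3


def parameters_from_sql(conn: sqlite3.Connection) -> dict:
    """Derive report parameters from SQL, e.g. all order statuses seen this year."""
    cur = conn.execute(
        "SELECT DISTINCT status FROM orders "
        "WHERE strftime('%Y', created) = strftime('%Y', 'now')"
    )
    return {"status": [row[0] for row in cur.fetchall()]}


def submit_schedule(report_path: str, cron: str, params: dict) -> None:
    """Stub: show the request instead of calling a real scheduler API."""
    print(f"schedule {report_path} with cron '{cron}' and parameters {params}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, created TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, date('now'))",
        [(1, "OPEN"), (2, "CLOSED"), (3, "OPEN")],
    )
    submit_schedule("/reports/orders_overview", "0 7 * * 1-5",
                    parameters_from_sql(conn))
```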

Version Control Integration: enable change management; keep a clean, approved production environment.
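
A hypothetical sketch of what version-controlling the approved production content can look like, assuming the exported content lives in an existing Git working copy; the actual CERN integration is not detailed in the slides.

```python
# Hypothetical version-control step: exported, approved production content is
# committed to a Git repository so every change is traceable. Paths and the
# export step are assumptions; only standard git commands are used.
import subprocess
from pathlib import Path

REPO = Path("pentaho-production-content")     # assumed to be an existing git clone


def commit_approved_export(message: str) -> None:
    """Stage everything under the export tree and record it as one approved change."""
    subprocess.run(["git", "-C", str(REPO), "add", "--all"], check=True)
    # Note: git exits non-zero (and this raises) if there is nothing to commit.
    subprocess.run(["git", "-C", str(REPO), "commit", "-m", message], check=True)


if __name__ == "__main__":
    commit_approved_export("Approved release: GA Purchase Orders schema and reports")
```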

Thank You! We are eager to share our experiences. What are you doing with and around Pentaho? Visit CERN virtually at https://home.cern or in person; book a (free) visit at https://visit.cern/