Clarens remote analysis enabling environment

Similar documents
Clarens Web Services Architecture

Heterogeneous Relational Databases for a Grid-enabled Analysis Environment

Java Development and Grid Computing with the Globus Toolkit Version 3

Copyright

Overview. Steve Fisher Please do interrupt with any questions

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

Gatlet - a Grid Portal Framework

StratusLab Cloud Distribution Installation. Charles Loomis (CNRS/LAL) 3 July 2014

CMS HLT production using Grid tools

User Manual. Admin Report Kit for IIS 7 (ARKIIS)

Vision of J2EE. Why J2EE? Need for. J2EE Suite. J2EE Based Distributed Application Architecture Overview. Umair Javed 1

Distributed Architectures & Microservices. CS 475, Spring 2018 Concurrent & Distributed Systems

How to build Scientific Gateways with Vine Toolkit and Liferay/GridSphere framework

Distributed Multitiered Application

Verteilte Systeme (Distributed Systems)

Basic Properties of Styles

Distributed Environments. CORBA, JavaRMI and DCOM

McAfee epolicy Orchestrator Release Notes

Software MEIC. (Lesson 20)

Protocol Buffers, grpc

Uniform Resource Locators (URL)

CIS-CAT Pro Dashboard Documentation

Oracle Streams Advanced Queuing User's Guide And Reference 10g Release 2

Notes. Submit homework on Blackboard The first homework deadline is the end of Sunday, Feb 11 th. Final slides have 'Spring 2018' in chapter title

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004

Using MATLAB on the TeraGrid. Nate Woody, CAC John Kotwicki, MathWorks Susan Mehringer, CAC

The PRISM infrastructure System Architecture and User Interface

Client/Server-Architecture

GT-OGSA Grid Service Infrastructure

Grid Computing Systems: A Survey and Taxonomy

Java EE 7: Back-End Server Application Development

Policy Settings for Windows Server 2003 (including SP1) and Windows XP (including SP2)

Grid services. Enabling Grids for E-sciencE. Dusan Vudragovic Scientific Computing Laboratory Institute of Physics Belgrade, Serbia

Let's Play... Try to name the databases described on the following slides...

Managing State. Chapter 13

WebSphere Application Server - Overview

A Grid-enabled Interface to Condor for Interactive Analysis on Handheld and. Resource-limited Devices

(9A05803) WEB SERVICES (ELECTIVE - III)

Data Analytics using MapReduce framework for DB2's Large Scale XML Data Processing

MONitoring Agents using a Large Integrated Services Architecture. Iosif Legrand California Institute of Technology

Advanced Topics in Operating Systems

3C05 - Advanced Software Engineering Thursday, April 29, 2004

Data Access and Analysis with Distributed, Federated Data Servers in climateprediction.net

A Simple Mass Storage System for the SRB Data Grid

Virtual Credit Card Processing System

CEPTEST Server Version

UGP and the UC Grid Portals

JAVA COURSES. Empowering Innovation. DN InfoTech Pvt. Ltd. H-151, Sector 63, Noida, UP

Grid Computing Fall 2005 Lecture 5: Grid Architecture and Globus. Gabrielle Allen

Chapter 4 Communication

DataMan. version 6.5.4

04 Webservices. Web APIs REST Coulouris. Roy Fielding, Aphrodite, chp.9. Chp 5/6

The Clarens Web Service Framework for Distributed Scientific Analysis in Grid Projects

Chapter 5. The MapReduce Programming Model and Implementation

Introduction to Cisco TV CDS Software APIs

Exercises. Cacti Installation and Configuration

Exercises. Cacti Installation and Configuration

Java EE Application Assembly & Deployment Packaging Applications, Java EE modules. Model View Controller (MVC)2 Architecture & Packaging EJB Module

DIAL: Distributed Interactive Analysis of Large Datasets

70-532: Developing Microsoft Azure Solutions

Installation of CMSSW in the Grid DESY Computing Seminar May 17th, 2010 Wolf Behrenhoff, Christoph Wissing

Introduction and Overview Socket Programming Lower-level stuff Higher-level interfaces Security. Network Programming. Samuli Sorvakko/Nixu Oy

Software MEIC. (Lesson 20)

Data Communication & Computer Networks MCQ S

70-532: Developing Microsoft Azure Solutions

70-487: Developing Windows Azure and Web Services

Excerpts of Web Application Security focusing on Data Validation. adapted for F.I.S.T. 2004, Frankfurt

ISA 767, Secure Electronic Commerce Xinwen Zhang, George Mason University

Python Twisted. Mahendra

REST Easy with Infrared360

Oracle 10g and IPv6 IPv6 Summit 11 December 2003

Identity Synchronizer User administration in Domino controlled by Active Directory

Deccansoft Software Services. J2EE Syllabus

Remote Procedure Call. Tom Anderson

Improvement to the Smart Data Server with SOAP *

Introduction and Overview Socket Programming Higher-level interfaces Final thoughts. Network Programming. Samuli Sorvakko/Nixu Oy

Web as a Distributed System

Test On Line: reusing SAS code in WEB applications Author: Carlo Ramella TXT e-solutions

SUN Sun Certified Enterprise Architect for J2EE 5. Download Full Version :

J2EE Interview Questions

P.Buncic, A.Peters, P.Saiz for the ALICE Collaboration

Mirth Connect and NwHIN Gateway Integration Training


Gridbus Portlets -- USER GUIDE -- GRIDBUS PORTLETS 1 1. GETTING STARTED 2 2. AUTHENTICATION 3 3. WORKING WITH PROJECTS 4

Architecture Proposal

Developing Microsoft Azure Solutions (70-532) Syllabus

Reliability, load-balancing, monitoring and all that: deployment aspects of UNICORE. Bernd Schuller UNICORE Summit 2016 June 23, 2016

Running Galaxy in an HPC environment requirements, challenges and some solutions : the LIFEPORTAL

CNIT 129S: Securing Web Applications. Ch 3: Web Application Technologies

CO Java EE 7: Back-End Server Application Development

The LGI Pilot job portal. EGI Technical Forum 20 September 2011 Jan Just Keijser Willem van Engen Mark Somers

CS603: Distributed Systems

Cloud Computing & Visualization

Chapter 3. Technology Adopted. 3.1 Introduction


Oracle Data Integrator 12c: Integration and Administration

Googles Approach for Distributed Systems. Slides partially based upon Majd F. Sakr, Vinay Kolar, Mohammad Hammoud and Google PrototBuf Tutorial

Operating Systems. Week 13 Recitation: Exam 3 Preview Review of Exam 3, Spring Paul Krzyzanowski. Rutgers University.

Selenium Testing Course Content

Web Programming Paper Solution (Chapter wise)

Transcription:

Clarens remote analysis enabling environment May 22, 2002 E. Aslakson, J. Bunn, K. Holtman, S. Iqbal, I. Legrand, V. Litvin, H. Newman, E. Soedarmadji, S. Singh, C. D. Steenberg,, R. Wilkinson

Varied components and data flows Many promising alternative service providers Production system and data repositories TAG and AOD extraction/conversion/transport services Tier 0/1/2 ORCA analysis farm(s) (or distributed `farm using grid queues) PIAF/Proof/.. type analysis farm(s) RDBMS based data warehouse(s) Tier 1/2 Production data flow TAGs/AODs data flow Physics Query flow Tool plugin module Local analysis tool: PAW/ROOT/ Data extraction Web service(s) Local disk User Query Web service(s) Web browser Tier 3/4/5

Demonstration for ROOT Examples of distributed analysis services PIAF/PROOF RDBMS-based data warehouse (mysql, pgsql done) Remotely stored ROOT files (rootd, webfile) Remote TAGs/AODs Grid services (files movement, job submission) Need at least protocols for: RDBMS access (ODBC/JDBC/Native DB connect) Remote file access (maybe through Globus) Some connection to access remote TAGS Now repeat the process for a web front-end, Lizard, JAS, C++ code etc. Remember that a web front-end is just another client of the service.

Clarens Architecture I Common protocol spoken by all types of clients to all types of services Implement service once for all clients Implement client access to service once for each client type using common protocol already implemented for all languages (C++, Java, Fortran, etc. :-): Common protocol is XML-RPC with SOAP close to working, CORBA doable, but would require different server above Clarens (uses IIOP, not HTTP) Handles authentication using Grid certificates, connection management, data serialization,, optionally encryption Immplementation uses stable,, well-known server infrastructure (Apache( Apache) ) that is debugged/audited over a long period by many Clarens layer itself implemented in Python, but can be reimplemented in C++ should performance be inadequate

Diagram: Clarens Architecture II Service Clarens Web server http/https RPC Client

Clarens Architecture III Server data flow Client data flow Authentication Authentication Session initialization Session initialization Request deserializing Request serializing Request mashalling Session Worker code invocation Request transmission Worker code invocation Result deserializing Result serializing Session termination Session termination

Server notes Server consists of a pool of processes Any process can respond to a request Every RPC call is a request Server processes cannot make the main server crash - bad compiled modules only affect that service Session data stored in an embedded Berkeley DB Other objects may be re-used by subsequent connections to the same process - e.g. Connections to a RDBMS Compiled modules linked against shared libraries need to have library search path set in the.so file (RPATH for elf files) Compiled modules using shared libraries that apache or python was linked against, must use the same version - e.g. can't use Berkeley DB4 if apache/python was linked against DB3

Server installation Directory structure: <top>/ clarens_server.py /echo init.py method echo /objy init.py method objy clarens_int.so (compiled module - ObjyDB) /file init.py method file ~clarens/ myservice/ init.py method mymethod Dependencies: apache, python, py-xmlrpc, m2crypto, Berkeley-DB3/4, py- bsddb3

Server example Access OBJY Tags/Histograms method_list={"getdbs": getdbs, "gethistos": gethistos,... "gettagcolumninfo":gettagcolumninfo } def getdbs(req,method_name method_name,args): # Handler method try: dblist <Get list of databases> except: return apache.http_error req.write(encode(dblist)) return apache.ok

Server example II File access: def read(req,name name,offset,len): <write len bytes from file name from offset> return apache.ok method_list={"read": read}

Client example Access OBJY Tags/Histograms Python: dbsvr=clarens.clarens_client('http://host/xmlrpc/clarens_server.py') py') dbsvr.objy objy.getdbs('/home/mydir/myfederation.boot') Root: dbsvr=clarens('http://host/xmlrpc/clarens_server.py') dbsvr.call('objy objy.getdbs','/home/mydir/myfederation.boot') Gives a list of databases in 'myfederation' myfederation' Call hierarchy (objy.getdbs): objy=dir to place file in GetDBs=method name To o implement new service: Create the above Create methods in 'method' list'

Client example II File access Python: dbsvr=clarens.clarens_client('http://host/xmlrpc/clarens_server.py') py') dbsvr.file file.read('/home/myfile.root') dbsvr.logout() Root: dbsvr=clarens('http://host/xmlrpc/clarens_server.py') TCWebfile F(dbsvr,'/home/myfile.root') TCWebfile G(dbsvr,'~user/myfile.root') TBrowser T; dbsvr.logout() Browse file with root tree browser Read file like any other Seek method is implemented on client side File offset is persistent data stored by the client

Conclusions Clarens is a simple way to implement web services on the server Provides some basic connectivity functionality common to all services Uses commodity protocols No Globus needed on client side, only certificate Simple to implement clients in scripts and compiled code Needed: more services, E.g.: data warehouse service (Saima) NTUPLE streaming from reconstructed data (Edwin, Julian) Reconstruction on demand (?) Job submission Also TODO: implement discovery mechanisms