Internet2 Meeting September 2005

Similar documents
Grid monitoring with MonAlisa

An Agent Based, Dynamic Service System to Monitor, Control and Optimize Distributed Systems. June 2007

Iosif Legrand. California Institute of Technology

MONitoring Agents using a Large Integrated Services Architecture. Iosif Legrand California Institute of Technology

Monitoring the ALICE Grid with MonALISA

Monitoring and Control of Large Scale Distributed Systems

MONALISA: A MONITORING FRAMEWORK FOR LARGE SCALE COMPUTING SYSTEMS

Table of Contents. LISA User Guide

DIPLOMA PROJECT. Distributed Delay Constrained Multicast Algorithms for the MonALISA Framework

Cisco Configuration Engine 2.0

Oracle Enterprise Manager. 1 Before You Install. System Monitoring Plug-in for Oracle Unified Directory User's Guide Release 1.0

Understanding Feature and Network Services in Cisco Unified Serviceability

ELFms industrialisation plans

Port Usage Information for the IM and Presence Service

Chapter 18 Distributed Systems and Web Services

Port Usage Information for the IM and Presence Service

5 days lecture course and hands-on lab $3,295 USD 33 Digital Version

A User-level Secure Grid File System

Adaptive Cluster Computing using JavaSpaces

IBM WebSphere Application Server 8. Clustering Flexible Management

POLITEHNICA UNIVERSITY OF BUCHAREST THE FACULTY OF COMPUTER SCIENCE AND AUTOMATIC CONTROL DIPLOMA PROJECT DISTRIBUTED ALGORITHMS

<Insert Picture Here> Oracle Application Cache Solution: Coherence

Global Platform for Rich Media Conferencing and Collaboration

The EU DataGrid Fabric Management

System Specification

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments *

Programmable BitPipe. Andreas Gladisch VP Convergent Networks and Infrastructure, Telekom Innovation Labs

AMGA metadata catalogue system

Cisco Nexus Data Broker for Network Traffic Monitoring and Visibility

Grid Computing Fall 2005 Lecture 5: Grid Architecture and Globus. Gabrielle Allen

BIG-IP Acceleration: Network Configuration. Version

Cisco Unified Serviceability

Cisco Plug and Play Feature Guide Cisco Services. Cisco Plug and Play Feature Guide Cisco and/or its affiliates.

System Description. System Architecture. System Architecture, page 1 Deployment Environment, page 4

Grid Tutorial Networking

A Simple Mass Storage System for the SRB Data Grid

Presented By: Ian Kelley

Read the following information carefully, before you begin an upgrade.

Multinode Scalability and WAN Deployments

Optical Networking Activities in NetherLight

vrealize Operations Management Pack for NSX for vsphere Release Notes

Services. Service descriptions. Cisco HCS services

Veeam Cloud Connect. Version 8.0. Administrator Guide

Polycom RealPresence Access Director System

F5 BIG-IQ Centralized Management: Local Traffic & Network Implementations. Version 5.4

ELIMINATE SECURITY BLIND SPOTS WITH THE VENAFI AGENT

Cisco ISE Ports Reference

Cisco Extensible Network Controller

HA solution with PXC-5.7 with ProxySQL. Ramesh Sivaraman Krunal Bauskar

Configure System Parameters

StreamSets Control Hub Installation Guide

Cisco Unified Communications Manager TCP and UDP Port

NetScaler for Apps and Desktops CNS-222; 5 Days; Instructor-led

The Collaboration Cornerstone

Cisco Unified Communications Manager TCP and UDP Port

Symbols INDEX > 12-14

Distribution system how to remotely configure Zabbix infrastructure

Barracuda Link Balancer

vrealize Operations Management Pack for NSX for vsphere 3.5.0

StratusLab Cloud Distribution Installation. Charles Loomis (CNRS/LAL) 3 July 2014

Cisco ISE Ports Reference

SysAid Technical Presentation. Phone (Toll-Free US): Phone: +972 (3)

Service-Oriented Architecture for Command and Control Systems with Dynamic Reconfiguration

Cisco Wide Area Application Services: Secure, Scalable, and Simple Central Management

Relay Proxy User Guide

WebNMS White Paper Motorola (NSN) Element Manager HRPDA (EMH)

Percona XtraDB Cluster ProxySQL. For your high availability and clustering needs

Configuring Trace. Configuring Trace Parameters CHAPTER

Voldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

BlackBerry Enterprise Server for IBM Lotus Domino Version: 5.0. Administration Guide

Distributed Systems Principles and Paradigms

Introduction Mobility Support Handover Management Conclutions. Mobility in IPv6. Thomas Liske. Dresden University of Technology

Installing SmartSense on HDP

MySQL High Availability. Michael Messina Senior Managing Consultant, Rolta-AdvizeX /

Citrix NetScaler Essentials and Unified Gateway

VMware vsphere 6.5: Install, Configure, Manage (5 Days)

Cisco ISE Ports Reference

Cisco Prime Collaboration Deployment Configuration and Administration

Cisco ISE Ports Reference

Installing and Configuring VMware vcenter Orchestrator

Step by Step ASR, Azure Site Recovery

NGFW Security Management Center

vrealize Operations Management Pack for NSX for vsphere 3.0

Communication System Design Projects

Configuring MWTM to Run with Various Networking Options

The CORAL Project. Dirk Düllmann for the CORAL team Open Grid Forum, Database Workshop Barcelona, 4 June 2008

Deployment Scenarios for Standalone Content Engines

[VMICMV6.5]: VMware vsphere: Install, Configure, Manage [V6.5]

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade

Release Notes for Cisco Application Policy Infrastructure Controller Enterprise Module, Release x

WhatsConfigured v3.1 User Guide

Configuration Guide. BlackBerry UEM Cloud

Inca as Monitoring. Kavin Kumar Palanisamy Indiana University Bloomington

Oracle 10g and IPv6 IPv6 Summit 11 December 2003

CONTENTS. Cisco Internet Streamer CDS 3.0 Software Configuration Guide iii OL CHAPTER 1 Product Overview 1-1

Platform Services Controller Administration. Modified on 27 JUN 2018 VMware vsphere 6.7 VMware ESXi 6.7 vcenter Server 6.7

CNS-222EA - EARLY ACCESS: NETSCALER FOR APPS AND DESKTOPS

WLS Neue Optionen braucht das Land

Distributed Systems Principles and Paradigms. Chapter 12: Distributed Web-Based Systems

ehealth SPECTRUM Integration

Transcription:

End User Agents: extending the "intelligence" to the edge in Distributed Systems Internet2 Meeting California Institute of Technology 1

OUTLINE (Monitoring Agents using a Large, Integrated s Architecture) An Agent Based, Dynamic System able to Monitor, Control and Optimize Distributed Systems LISA (Localhost( Information Agent) End User Agent, capable to effectively integrate user applications with Oriented Architectures. Examples : EVO system: a distributed videoconferencing system Data transfers: creating on demand an optical path 2

is A Dynamic, Distributed Architecture Real-time monitoring is an essential part of managing distributed systems. The monitoring information gathered is necessary for developing higher level services, and components that provide automated decisions, to help operate and optimize the workflow in complex systems. The system is designed as an ensemble of autonomous multi-threaded, self-describing agent-based subsystems which are registered as dynamic services, and are able to collaborate and cooperate in performing a wide range of monitoring tasks. These agents can analyze and process the information, in a distributed way, to provide optimization decisions in large scale distributed applications. An agent-based architecture provides the ability to invest the system with increasing degrees of intelligence; to reduce complexity and make global systems manageable in real time 3

The Architecture Provides: Distributed Registration and Discovery for s and Applications. Monitoring all aspects of complex systems : System information for computer nodes and clusters Network information : WAN and LAN Monitoring the performance of Applications, Jobs or services The End User Systems, its performance Can interact with any other services to provide in near real-time customized information based on monitoring data Secure, remote administration for services and applications Agents to supervise applications,, to restart or reconfigure them, and to notify other services when certain conditions are detected. The framework can be used to develop higher level decision services,, implemented as a distributed network of communicating agents, to perform global optimization tasks. Graphical User Interfaces to visualize complex information 4

The Discovery System & s Fully Distributed System with no Single Point of Failure Clients, HL services repositories AGENTS Proxies services Global s or Clients Dynamic load balancing Scalability & Replication Security AAA for Clients Distributed System for gathering and Analyzing Information. 5 Network of JINI-LUSs Secure & Public Distributed Dynamic Discovery- based on a lease Mechanism and REN

service & Data Handling Client (other service) Web client WEB WSDL SOAP MonALSIA Applications Data Stores Data Cache & DB Postgres MySQL Registration Lookup Communications via the ML Proxy data Predicates & Agents Configuration Control (SSL) Lookup Discovery Client (other service) Java User defined loadable Modules to write /sent data 6

7 Registration / Discovery Admin Access and AAA for Clients Application Trust keystore Applications Admin SSL connection Trust keystore Registration (signed certificate) Lookup Lookup Discovery s Proxy Multiplexer s Proxy Multiplexer AAA services Client (other service) Data Filters & Agents Client authentication Client (other service)

OSG Grid3 CMS ALICE VRVS System STAR D0 ABILENE GLORIAD 8 Communities using More than 200 Sites running and it monitors more than 12 000 nodes, more than 60 WAN links and Collects ~ 200 000 parameters /min VRVS It has been used for ABILENE Demonstrations at: SC2003 - - CMS-DC04 Telecom 2003 GRID3 WSIS 2003 SC 2004 I2 2005 http://monalisa.caltech.edu ALICE

Monitoring OSG, GRID3, Running Jobs, I2 Network Traffic, and Topology JOBS TOPOLOGY JOB Evolution ACCOUNTING 9

Monitoring I2 Network Traffic, Grid03 Farms and Jobs 10

Monitoring Network Topology Latency, Routers NETWORKS ROUTERS AS 11

ApMon Application Monitoring CPU Usag e ( % ) Library of APIs (C, C++, Java, Perl. Python) that can be used to send any information to services Flexibility, dynamic configuration, high communication performance Automated system monitoring Accounting information 70 60 50 40 30 20 10 0 No Lost Packages 0 1000 2000 3000 4000 5000 6000 Messages per second 12 APPLICATION App. Monitoring Time;IP;procID parameter1: value parameter2: value... ApMon App. Monitoring Mbps_out: 0.52 Status: reading MB_inout: 562.4 System Monitoring load1: 0.24 processes: 97 pages_in: 83 hosts APPLICATION Config Servlet ApMon ApMon Config UDP/XDR Monitoring Data UDP/XDR Monitoring Data UDP/XDR Monitoring Data dynamic reloading ApMon configuration generated automatically by a servlet / CGI script

Monitoring the Execution of Jobs and the Time Evolution SPLIT JOBS LIFELINES for JOBS Summit a Job Job1 Job Job Job2 13 DAG Job3 Job 31 Job 32

Monitoring ABILENE backbone Network Test for a Land Speed Record ~ 7 Gb/s in a single TCP stream from Geneva to Caltech 14

Monitoring VRVS Reflectors and Communication Topology 15

Communication in the Distributed Collaborative System cornell caltech vrvs 5 starlight vrvs us funet vrvs eu pub Reflectors are hosts that interconnect users by permanent IP tunnels. The active IP tunnels must be selected so that there is no cycle formed. usf inet 2 triumf 16 usp sinica kek Tree The selection is made according to the real-time measurements of the network performance. w( T ) = ( v, u) T w(( v, u)) minimum-spanning tree (MST)

Creating a Dynamic, Global, Minimum Spanning Tree to optimize the connectivity A weighted connected graph G = (V,E) with n vertices and m edges. The quality of connectivity between any two reflectors is measured every 2s. Building in near real time a minimumspanning tree T w ( T ) = ( v, u ) T w (( v, u )) 17

LISA- Localhost Information Agent End To End Monitoring Tool A lightweight Java Web Start application that provides complete monitoring of the end user systems, the network connectivity and can use the framework to optimize client applications It is very easy to deploy and install by simply using any browser. It detects the system architecture, the operating system and selects dynamically the binary parts necessary on each system. It can be easily deployed on any system. It is now used on all versions of Windows, Linux, Mac. It provides complete system monitoring of the host computer: CPU, memory, IO, disk, Hardware detection Main components, Audio, Video equipment, Drivers installed in the system Provides embedded clients for IPERF (or other network monitoring tools, like Web 100 ) A user friendly GUI to present all the monitoring information. 18

19 LISA- Provides an Efficient Integration for Distributed Systems and Applications It is using external services to identify the real IP of the end system, its network ID and AS Discovers services and can select, based on service attributes, different applications and their parameters (location, AS, functionality, load ) Based on information such as AS number or location, it determines a list with the best possible services. Registers as a listener for other service attributes (eg. number of connected clients). Continuously monitors the network connection with several selected services and provides the best one to be used from the client s perspective. Measures network quality, detects faults and informs upper layer services to take appropriate decisions Application Application Registration Application Best Lookup Application Lookup LISA Discovery

EVO: LISA Detects the Best Reflector for each Client and Agents keep the reflectors connected in a MST Dynamic Discovery of Reflectors Creates and maintains, in real-time, the optimal connectivity between reflectors (MST) based on periodic network measurements. Detects and monitor the User configuration, its hardware, the connectivity and its performance. Dynamically connects the client to the best reflector Provides secure administration. It is using alarm triggers to notify unexpected events 20

agents to create on demand on an optical path or tree 2 ML Demon Discovery & Secure Connection ML Agent 3 1 Optical Switch Control and Monitor the switch ML Agent Optical Switch Optical Switch Time to create a path on demand <1s independent of the location and the number of connections Runs a ML Demon >ml_path IP1 IP4 copy file IP4 ML Agent 4 21 ML proxy services used in Agent Communication

Test Setup for Controlling Optical Switches CALIENT (LA) 3 partitions on each switch They are controlled by a service 10G links 1G links Glimmerglass (GE) 3 Simulated Links as L2 VLAN Monitor and control switches using TL1 Interoperability between the two systems End User access to service 22

Monitoring Optical Switches Agents to Create on Demand an Optical Path 23

is a framework capable to correlate information from different layers Networking Farms & Data Serv. Job Job1 Job2 Job3 Job 31 Job 32 Applications Job HELP to create User Vertical Integration NEAR REAL TIME FEEDBACK BETWEEN MAJOR LAYERS IS CRUCIAL FOR DYNAMIC LOAD BALANCING, ADAPATABILITY AND SELF-ORGANIZATION 24