NM-WG Specification Adoption in perfsonar Aaron Brown, Internet2, University of Delaware Martin Swany University of Delaware, Internet2
What is perfsonar A collaboration: production network operators focused on designing and building tools that they will deploy and use on their networks to provide monitoring and diagnostic capabilities to themselves and their user communities. An architecture & a set of protocols: Web Services architecture; protocols based on the Open Grid Forum Network Measurement Working Group (NM-WG) schemas; emerging standards in the Network Markup Language WG (NML-WG). Several interoperable software implementations: Java & Perl. A deployed measurement infrastructure.
perfsonar Goals Increase network awareness Set user expectations accurately Reduce diagnostic costs Performance problems noticed early Performance problems addressed efficiently Network engineers can see & act outside their turf Transform application design Incorporate network intuition into application behavior
Vision: Network Performance Information is Available People can find it (Discovery) Community of trust allows access across administrative domain boundaries Ubiquitous Widely deployed (Paths of interest covered) Reliable (Consistently configured correctly) Valuable Actionable (Analysis suggests course of action) Automatable (Applications act on data)
perfsonar Collaborators RNP ARNES BELNET CARNET CESNET CYNET DANTE DFN ESnet FCCN FERMI GARR GEANT GRNET HEAnet Internet2 ISTF POZNAN UNINETT University of Delaware Renater RedIRIS SLAC SWITCH SURFnet And anybody else we missed
perfsonar Architecture Interoperable network measurement middleware: Modular Web services-based Decentralized Locally controlled Integrates: Network measurement tools Network measurement archives Discovery Authentication and authorization Data manipulation Resource protection Topology Based on: Open Grid Forum Network Measurement Working Group schema.
perfsonar: System Description Domains represented by a set of services Each domain can deploy services important to the domain Analysis clients interact with services across multiple domains
perfsonar: Services (1) Lookup Service Allows the client to discover the existing services and other LS services. Dynamic: services register themselves with the LS and advertise their capabilities; they can also leave, or be removed if a service goes down. AuthN/Z Service Internet2 Middleware Group, GN2-JRA5 (edugain) Authorization functionality for the framework Users can have several roles; authorisation is based on the user role. Trust relationships defined between users affiliated with different administrative domains.
perfsonar Services (2) Transformation Service Transform the data (aggregation, concatenation, correlation, translation, etc). Topology Service Make the network topology information available to the framework. Find the closest MP, provide topology information for visualisation tools Resource protector Arbitrate the consumption of limited resources between multiple services.
Inter-domain perfsonar example interaction [Figure: a client, the services LS A, AA A and MA A in Network A, the services LS B, AA B and MA B in Network B, and a path of links a, b, c, d, e, f spanning both networks.] The client asks LS A: "Where is link utilisation along path a,b,c,d,e,f?" LS A answers: "a,b,c: Network A, MA A, AA A; c,d,e,f: Network B, MA B, AA B." The client tells AA A "Here is who I am, I'd like to access MA A" and receives a token for MA A; it does the same with AA B and receives a token for MA B. The client then sends "Get link utilisation a,b,c" to MA A and "Get link utilisation c,d,e,f" to MA B, and each replies "Here you go." The result: a useful graph.
Schema Key Goals: Extensibility, Normalization, Readability Break representation of performance measurements down into basic elements Data and Metadata Measurement Data A set of measurement events that have some value or values at a particular time Measurement Metadata The details about the set of measurement data
Schema Normalization Can simplify the database representation for many types of measurement data While optimizations are certainly possible, many measurement types can be viewed as one value over time Assists Combination/Concatenation of metrics Creating derived metrics Normalization helps with inferring relationships between types of metrics
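As a minimal sketch of the "one value over time" view, assume a series is simply a list of (timestamp, value) pairs. That representation is chosen here for illustration and is not taken from the schema, and the function names are hypothetical. Under that assumption, concatenating series and creating a derived metric take only a few lines:

```python
def concatenate(*series):
    """Merge several (timestamp, value) series into one, ordered by time.

    If two series contain the same timestamp, the later argument wins.
    """
    merged = {}
    for s in series:
        for timestamp, value in s:
            merged[timestamp] = value
    return sorted(merged.items())


def derive_rate(counter_series):
    """Derive a per-second rate series from an increasing counter series."""
    rates = []
    for (t0, v0), (t1, v1) in zip(counter_series, counter_series[1:]):
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


# Two archives holding consecutive samples of the same octet counter.
first = [(0, 0), (10, 1000)]
second = [(10, 1000), (20, 3000)]
print(derive_rate(concatenate(first, second)))
```

Because every metric has the same shape, the same two helpers would apply to errors, discards or any other counter without change.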
Schema Basic Elements - Metadata Subject The measured/tested entity EventType (Verb) What type of measurement, value, or event occurred Characteristic, tool output, or generic event Parameters (Adjectives and Adverbs) How, or under what conditions, did this event occur?
Schema Basic Elements - Data Some sort of value - Datum Existence of an event might point to the case where there is no additional value As in Link up/down or threshold events Time Must be extensible since even agreement about the right structure is not easy E.g. UNIX timestamp vs NTP time
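To illustrate why the time element needs to be extensible, the sketch below converts between the two representations named above. It assumes the 64-bit NTP timestamp format (32 bits of seconds since 1900-01-01 and 32 bits of fraction), non-negative UNIX times, and no NTP era rollover; the function names are illustrative only.

```python
import math

# Seconds between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800


def unix_to_ntp(unix_seconds):
    """Encode a non-negative UNIX time as a 64-bit NTP timestamp (32.32 fixed point)."""
    whole = math.floor(unix_seconds)
    fraction = int((unix_seconds - whole) * 2**32)
    return ((whole + NTP_UNIX_OFFSET) << 32) | fraction


def ntp_to_unix(ntp_timestamp):
    """Decode a 64-bit NTP timestamp back into UNIX seconds."""
    seconds = ntp_timestamp >> 32
    fraction = ntp_timestamp & 0xFFFFFFFF
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32


stamp = unix_to_ntp(1200873600.5)
print(hex(stamp), ntp_to_unix(stamp))
```

The same instant has two different encodings, which is why a consumer needs to know which structure a given time element uses.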
A Message Message Metadata Data
An Object Store Store Metadata Data
A Data is Linked to a Metadata Metadata <id>someid</id> Data <metadataidref> someid </metadataidref>
A Metadata may be linked to another Metadata <id>someid</id> Metadata <id>someotherid</id> <metadataidref> someid </metadataidref>
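The linking shown on the two slides above can be sketched as follows. This is a hedged illustration, not the normative message format: it assumes that `id` and `metadataIdRef` are carried as attributes and that the base namespace URI is `http://ggf.org/ns/nmwg/base/2.0/`. The helper `metadata_chain` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

NMWG = "http://ggf.org/ns/nmwg/base/2.0/"  # assumed base namespace URI

MESSAGE = f"""
<nmwg:message xmlns:nmwg="{NMWG}">
  <nmwg:metadata id="someid"/>
  <nmwg:metadata id="someotherid" metadataIdRef="someid"/>
  <nmwg:data id="data1" metadataIdRef="someotherid"/>
</nmwg:message>
"""


def metadata_chain(message, data_id):
    """Return the ids of all metadata reachable from a data element, nearest first."""
    metadata = {m.get("id"): m for m in message.findall(f"{{{NMWG}}}metadata")}
    data = next(d for d in message.findall(f"{{{NMWG}}}data") if d.get("id") == data_id)
    chain = []
    ref = data.get("metadataIdRef")
    # Follow references until the chain ends; stop on a cycle.
    while ref is not None and ref not in chain:
        if ref not in metadata:
            raise KeyError(f"dangling metadataIdRef: {ref}")
        chain.append(ref)
        ref = metadata[ref].get("metadataIdRef")
    return chain


root = ET.fromstring(MESSAGE)
print(metadata_chain(root, "data1"))
```

Chaining lets one metadata block refine another, so shared details are stated once and referenced by id.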
Schema Namespaces All measurements have some sort of Data and Time All measurements can be described by the Metadata identifying who, what and how The specific structures of the Data and Metadata elements depend on the measurement Approach: Consistently use Data and Metadata elements and vary the namespaces of the specific elements
Schema Namespaces - 2 We encode the measurement/event type in the namespace And as a standalone element Some components of the system can pass Data and Metadata elements through without understanding their specific structure Allows an implementation to decide whether it supports a particular type of data or not Allows validation based on extended (namespace-specific) schemata
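A component that dispatches on namespace might look like the sketch below. The namespace URIs, element names and handler are hypothetical placeholders rather than registered NM-WG values. The point is only that elements in an unrecognised namespace are passed through untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this illustration.
UTIL_NS = "http://example.org/ns/characteristics/utilization/"
PING_NS = "http://example.org/ns/tools/ping/"

DOC = (
    f'<metadata xmlns:u="{UTIL_NS}" xmlns:p="{PING_NS}">'
    '<u:subject interface="eth0"/>'
    '<p:subject host="host.example.org"/>'
    "</metadata>"
)


def namespace_of(element):
    """Return the namespace URI of an ElementTree element ('' if it has none)."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def handle_utilization(subject):
    return f"utilization of {subject.get('interface')}"


# Only the namespaces this implementation chooses to support.
HANDLERS = {UTIL_NS: handle_utilization}


def process(metadata):
    """Interpret supported elements; pass all others through unchanged."""
    results = []
    for child in metadata:
        handler = HANDLERS.get(namespace_of(child))
        results.append(handler(child) if handler else child)
    return results


for item in process(ET.fromstring(DOC)):
    print(item)
```

Supporting a new measurement type then means registering one more handler; nothing that merely forwards the element has to change.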
Schema Namespaces and Extensibility One key to extensibility is the use of hierarchy with delegation Similar to OIDs in the IETF management world The NM-WG has a hierarchy of network characteristics Good starting point However, not all tools are cleanly mapped onto the Characteristic space Often a matter of some debate
Schema Namespaces and Extensibility - 2 Organization-rooted tools namespace addresses this Some top-level tools ping, traceroute Easy to add new tools in organization-specific namespaces Performance Event Repository Add a schema and get a URI Add Java classes
perfsonar-ps Motivation Create separate implementation of perfsonar standard Use same protocol/standards Proof of interoperability (strengthens the standard) Targeted for NOC deployments Lightweight Easy to deploy/manage (We were unable to convince our primary users to deploy Java services due to the complexity of dependencies)
perfsonar-ps Beta Release (0.06) (1/21/08) Focus on development of major perfsonar components LS - perfsonar_ps::services::ls::ls SNMP MA - perfsonar_ps::services::ma::snmp Status MA - perfsonar_ps::services::ma::status CircuitStatus MA - perfsonar_ps::services::ma::circuitstatus Topology MA - perfsonar_ps::services::ma::topology PingER (SLAC) * Not yet released OWAMP/BWCTL archive (perfsonarbuoy) Not released via CPAN
SNMP Measurement Archive Provide access to network performance data Utilization Errors Discards Numerous tools exist to collect passive measurements (via SNMP): MRTG Cacti Cricket Expose archives from RRD files
SNMP Measurement Archive Current Deployment: Internet2 Network ESnet Georgia Tech/SOX Fermilab
PingER Based MP/MA Joint effort between Fermilab and SLAC Present views of historic PingER data Expose interface to schedule live tests Built with perfsonar-ps infrastructure
Link Status Measurement Archive Provide access to up/down status information about layer2 links Data stored in a SQL database Database schema allows for storing time ranges during which a link had a certain status Minimizes storage costs for rarely changing links Communication/Configuration via XML Target audience is network operators and users interested in obtaining the status of the links over which their data flows
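The storage saving from time ranges can be sketched with an in-memory SQLite table. The table layout and function names are assumptions made for illustration, not the archive's actual schema. The sketch also assumes samples for a link arrive in increasing time order.

```python
import sqlite3


def open_store():
    """Create an in-memory store with one row per (link, status) time range."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE link_status ("
        "link_id TEXT, start_time INTEGER, end_time INTEGER, status TEXT)"
    )
    return db


def record(db, link_id, timestamp, status):
    """Extend the latest range if the status is unchanged, else start a new one."""
    row = db.execute(
        "SELECT rowid, status FROM link_status "
        "WHERE link_id = ? ORDER BY end_time DESC LIMIT 1",
        (link_id,),
    ).fetchone()
    if row and row[1] == status:
        db.execute("UPDATE link_status SET end_time = ? WHERE rowid = ?", (timestamp, row[0]))
    else:
        db.execute(
            "INSERT INTO link_status VALUES (?, ?, ?, ?)",
            (link_id, timestamp, timestamp, status),
        )


def ranges(db, link_id):
    """Return (start_time, end_time, status) rows for a link, oldest first."""
    return db.execute(
        "SELECT start_time, end_time, status FROM link_status "
        "WHERE link_id = ? ORDER BY start_time",
        (link_id,),
    ).fetchall()


db = open_store()
for t, s in [(0, "up"), (60, "up"), (120, "down"), (180, "up"), (240, "up")]:
    record(db, "link-1", t, s)
print(ranges(db, "link-1"))
```

Five samples collapse into three rows here; a link that never changes state occupies a single row no matter how often it is polled.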
Link Status Measurement Archive Collector Allows for the periodic collection of the status of one or more links Can use SNMP, Scripts or simply Constants Can store results directly into a database or into a remote Measurement Archive Future Plans: TL1 Collection
Link Status Measurement Archive Visualization A perfsonar-ui Plugin is available that can display a network and the status of its links Current Deployment Internet2 Network HOPI (in2p3 circuit) Planned Deployment SLAC
Circuit Status Measurement Archive An e2emon-compatible service Integrates with the Link Status MA to provide the information stored in MAs Can work with local MAs directly or with remote MAs Can use the Topology MA to obtain necessary information about nodes Can use a Lookup Service to lookup the MA containing information on each link Target audience is administrators who want to publish circuit status information to e2emon clients
Circuit Status Measurement Archive Visualization Any tool that is compatible with e2emon will work with this service Current Deployment Internet2 Network HOPI (in2p3 circuit) Planned Deployment SLAC
Topology Service Provides a queryable repository for obtaining topology information about a domain Can obtain the entire network XQuery interface allows the construction of complex queries about the network Topology is specified according to the schema in development in the OGF
Topology Service Current Deployments Internet2 SLAC (PingER Topology Information) Planned Deployments DICE Dynamic Circuit Service Sites ESnet
perfsonar Lookup Service Directory service of perfsonar deployments Accepts service registrations Handles queries for service location and capabilities and location of available data Manages the lifetimes of data and services to keep information up to date Web Service interface to XML Database Sleepycat XML Database Service Info/Data kept in native formats Draws the complex query tasks away from otherwise 'busy' services
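The lifetime management described above can be sketched as a registry in which each registration expires unless it is refreshed. The class, method names and keepalive model are hypothetical. The real service stores XML in a database rather than Python objects, and the clock is passed in explicitly to keep the example deterministic.

```python
class LookupRegistry:
    """Toy registry: registrations expire unless refreshed within ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # service id -> (capabilities, expiry time)

    def register(self, service_id, capabilities, now):
        self._entries[service_id] = (frozenset(capabilities), now + self.ttl_seconds)

    def keepalive(self, service_id, now):
        """Extend the lifetime of an existing registration."""
        capabilities, _ = self._entries[service_id]
        self._entries[service_id] = (capabilities, now + self.ttl_seconds)

    def deregister(self, service_id):
        self._entries.pop(service_id, None)

    def find(self, capability, now):
        """Return the ids of live services that advertise the capability."""
        self._expire(now)
        return sorted(
            sid for sid, (caps, _) in self._entries.items() if capability in caps
        )

    def _expire(self, now):
        expired = [sid for sid, (_, expiry) in self._entries.items() if expiry <= now]
        for sid in expired:
            del self._entries[sid]


ls = LookupRegistry(ttl_seconds=60)
ls.register("ma-a", {"utilization"}, now=0)
ls.register("ma-b", {"utilization", "errors"}, now=30)
print(ls.find("utilization", now=50))
print(ls.find("utilization", now=70))
```

A service that goes down silently simply stops refreshing and drops out of query results once its lifetime lapses.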
Lookup Service Also XML based configuration/protocol Native storage/query mechanisms [Xpath/XQuery] Message format to exchange the data Targeted at single domain deployment Single instance to manage multiple services Client components and applications use the LS to find services perfsonar-ui perfadmin
Lookup Service Current Deployment: Internet2 (Ann Arbor) University of Delaware Planned Deployment: IU for Internet2 network and regionals DICE Dynamic Circuit Network sites International Partners
Distributed Lookup Service Federation of individual LS instances into a global system Meta-lookup phase allows a query to find the specific LS that has relevant information Or perhaps the relevant LSes that have said info The specific query is sent directly to the LS in question Recent active design and development
Distributed Lookup Service Service and measurement metadata is summarized for propagation to distant domains IP addresses in service and measurement metadata are compressed into network/netmask pairs in the same way that routes are advertised (CIDR-style) These summarized metadata elements are advertised to external scopes A scope is a set of LSes that are related by e.g. being in the same administrative domain (although multiple scopes within a single domain are possible)
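CIDR-style summarization and the meta-lookup it enables can be sketched with Python's standard `ipaddress` module. This sketch merges only exactly adjacent prefixes, which is an assumption; the summarization actually used between scopes may be coarser. The function names and LS names are hypothetical.

```python
import ipaddress


def summarize(addresses):
    """Collapse IPv4 addresses or prefixes into the fewest covering networks."""
    networks = [ipaddress.ip_network(a) for a in addresses]
    return list(ipaddress.collapse_addresses(networks))


def candidate_lses(advertised, address):
    """Meta-lookup: name the LSes whose advertised summary covers the address."""
    target = ipaddress.ip_address(address)
    return sorted(
        name
        for name, networks in advertised.items()
        if any(target in network for network in networks)
    )


# Summaries that two LS instances might advertise to an external scope.
advertised = {
    "ls-a": summarize(["192.0.2.0", "192.0.2.1", "192.0.2.2", "192.0.2.3"]),
    "ls-b": summarize(["198.51.100.0/25", "198.51.100.128/25"]),
}
print(advertised)
print(candidate_lses(advertised, "192.0.2.2"))
```

Only the LSes returned by the meta-lookup need to receive the specific query, so distant domains can hold short summaries instead of full metadata.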
Weather Maps - Internet2
Gmaps from SLAC
CNM from DFN
perfsonarui from acad.bg
PerfsonarUI 1
PerfsonarUI 2
PerfsonarUI 3
Oscars Circuit plugin - Internet2
Oscars circuit plugin
E2Emon - Monitoring Circuits
E2Emon: Status of E2E link CERN-LHCOPN-FNAL-001 E2Emon generated view of the data for one OPN link [E2EMON]
Traceroute Visualizer Forward direction bandwidth utilization on application path from LBNL to INFN-Frascati (Italy). Traffic is shown as bars on those network device interfaces that have an associated MP service (the first 4 graphs are normalized to 2000 Mb/s, the last to 500 Mb/s); link capacity is also provided.
1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms