D6.06 - Demonstration of integrated BoD testing and validation

Deliverable Number: 06
WP #: 6
Date: 2012-12-30
Authors:
  Jimmy Cullen, The University of Manchester
  Fedde Bloemhof, JIVE
  Paul Boven, JIVE
  Ralph Spencer, The University of Manchester
  Neal Jackson, The University of Manchester

Document Log
Version  Date        Summary
0.1      2012-11-06  Document created
0.2      2012-12-13  Introduction and background written
0.3      2012-12-20  Introduction revised and test results inserted
1.0      2012-12-30  Final revisions
1.1      2013-01-10  Updated the Torun test data
Introduction

This report describes the work performed to integrate testing and validation of Bandwidth on Demand (BoD) links with the BoD reservation tool, so that once a reservation is made, the link is verified automatically. If link validation fails, the BoD team is informed by email and can investigate the cause of the failure.

Very Long Baseline Interferometry (VLBI) is a technique used in astronomy to observe astronomical objects at high resolution at radio frequencies, using multiple telescopes separated by large distances. Traditionally, data are recorded on computer hard disk drives, which must be physically shipped to the correlator for processing once the observation is complete. Electronic VLBI (e-VLBI) uses high speed computer networks to transfer the data to the correlator in real time, allowing immediate processing and notification of any problems at telescopes.

In e-VLBI, real time correlation involves processing synchronised data from all telescopes. Telescope data are generated as a constant bit rate stream and transmitted to the correlator via UDP, which permits high data rates over long links but does not guarantee against packet loss or reordering. Reliable e-VLBI operation requires stable bandwidth with little or no packet loss and reordering, so the network over which the data are to be transmitted must be characterised before any e-VLBI session.

For e-VLBI the main network characteristics of interest are bandwidth, packet loss and packet reordering. For real time correlation the bandwidth must be at least equal to the rate at which data are generated. If the available bandwidth is less than the data generation rate, data may still be transmitted to the correlator, but processing will occur some time after the observation has completed.
For VLBI the signal to noise ratio (SNR) is proportional to the square root of one minus the fractional packet loss, so low-level packet loss is acceptable, as shown in Figure 1. With current software correlators, packet reordering is less of a problem than it used to be, since packets can be buffered and reordered on the fly.

[Figure 1 plot: signal to noise ratio (vertical axis, 0 to 1) against fractional data loss (horizontal axis, 0 to 1)]

Figure 1. A plot showing how the signal to noise ratio changes with fractional data loss. For low-level data loss the SNR remains high; for example, at 19% data loss the SNR remains at 90% of its loss-free value.

Currently the European VLBI Network (EVN) schedules observations approximately once a month. The EVN correlator is based at the Joint Institute for VLBI in Europe (JIVE) in The Netherlands, and many observatories have static light paths to JIVE which they use in e-VLBI observations. Given how infrequently the EVN observes, these light paths are used inefficiently. An alternative to dedicated light paths is to allocate fixed amounts of bandwidth to users dynamically, for a limited time, as and when required, a technique known as Bandwidth on Demand. BoD closely matches the requirements of e-VLBI, as well as those of other e-science disciplines.
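As a quick numerical check of the SNR relation plotted in Figure 1, the following minimal Python sketch evaluates the square root of one minus the fractional loss for a few loss values. The function name relative_snr and the sample loss values are illustrative assumptions, not part of the BoD tooling.

```python
import math


def relative_snr(fractional_loss):
    """Return the SNR relative to the loss-free case: sqrt(1 - loss).

    fractional_loss is the fraction of data lost, between 0 and 1.
    """
    if not 0.0 <= fractional_loss <= 1.0:
        raise ValueError("fractional_loss must be between 0 and 1")
    return math.sqrt(1.0 - fractional_loss)


if __name__ == "__main__":
    # Sample loss fractions chosen for illustration only.
    for loss in (0.0, 0.001, 0.01, 0.19, 0.5):
        print(f"{loss:6.1%} loss -> relative SNR {relative_snr(loss):.3f}")
```

At 19% loss the function returns 0.9, matching the example in the Figure 1 caption, while at the 0.1% level the reduction in SNR is about 0.05%.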
Bandwidth on Demand Reservation Service

Several National Research and Education Networks (NRENs) throughout Europe, as well as DANTE, offer a BoD service, and a BoD reservation tool has been developed at JIVE to interface with these. Figure 2 is a screenshot of the main interface of the reservation tool. Authenticated users can review prior reservations, reserve new links and manage current reservations.

Figure 2. The main interface to the BoD reservation tool. From here authenticated users can view current and previous BoD links, request new BoD links or terminate currently provisioned links.

As described in Deliverable 6.1, the reservation tool is written in PHP and uses the Simple Object Access Protocol (SOAP) to interact with the BoD provider.

Link Testing and Validation

As described previously, it is vital for e-VLBI that the network is characterised before observations so that data can be correlated successfully. UDPmon [1] is a network and end host characterisation application for PCs which was written with e-VLBI in mind. Using the client-server model, the two PCs exchange information about their clock timings before pseudo-random data are transmitted from client to server using UDP. Each datagram includes an application header that allows many aspects of the data transport to be recorded: bandwidth, packet loss and reordering (through the use of packet sequence numbers), and one-way delay and jitter.
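To illustrate how packet sequence numbers allow a receiver to measure loss and reordering, here is a minimal Python sketch. It is not UDPmon's implementation: the function name, the input format (a list of sequence numbers in arrival order) and the definition of a reordered packet (one that arrives after a higher-numbered packet) are assumptions made for this example.

```python
def summarise_sequence(received, sent_count):
    """Count lost and reordered packets from received sequence numbers.

    received   -- sequence numbers in arrival order (hypothetical input)
    sent_count -- packets transmitted, assumed numbered 0 .. sent_count - 1
    """
    if sent_count <= 0:
        raise ValueError("sent_count must be positive")

    seen = set()
    highest = -1      # highest sequence number seen so far
    reordered = 0

    for seq in received:
        if seq in seen:
            continue  # ignore duplicates
        seen.add(seq)
        if seq < highest:
            reordered += 1   # arrived after a later-numbered packet
        else:
            highest = seq

    lost = sent_count - len(seen)
    return {
        "received": len(seen),
        "lost": lost,
        "reordered": reordered,
        "loss_percent": 100.0 * lost / sent_count,
    }


if __name__ == "__main__":
    # Six packets sent; packet 4 never arrives and packet 2 arrives late.
    print(summarise_sequence([0, 1, 3, 2, 5], sent_count=6))
```

Because the sender numbers every datagram, the receiver needs no acknowledgements to compute these figures, which suits a one-way constant bit rate stream.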
Automated testing

The bandwidth reservation system allows authenticated users to request bandwidth between two end points. Incorporated into the reservation system is a service that allows the requested BoD link to be tested before data transfers begin. The requestor selects the delay before the tests are run, the length of each test in seconds and the size of the packets. The system then performs UDPmon tests between preconfigured end hosts using the reserved BoD bandwidth. Tests are run in each direction at full bandwidth, followed by tests in each direction at half bandwidth, and the results are recorded in a database. If the percentage packet loss in any of the four tests is greater than 0.1%, the test is considered a failure and an email is sent to the BoD requestor, who can then contact the appropriate authority to resolve the problem.

Figure 3 shows the Entity-Relationship Diagram of the tables in the integrated reservation and testing system. The connection test data, including the test results, are stored in the t_bodconnectiontests table.

Figure 3. Entity-Relationship Diagram displaying the main tables involved in the integrated BoD reservation and automated testing system.

All previous link reservations and automated test results are stored in the database and made available to users through the web interface. Historic test results have many uses, for example allowing users to see how the stability of a particular link evolves over time, or how a particular link's performance compares with that of similar links.

End host configuration

Each end host must be pre-configured before it can be used in the automated testing and validation of links. A limited access account is configured on each end host and a copy of the UDPmon software is installed. On the BoD reservation server public-private cryptographic keys were generated, and the public key was distributed to all end hosts. This configuration allows the BoD reservation server to connect securely to the end hosts and run the testing and validation software.

Demonstration of working system

In deliverable D6.08, International Bandwidth-on-Demand at 10Gb/s, BoD links were made between JIVE and Onsala and between JIVE and Torun using the integrated reservation and validation system.
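The pass/fail rule described under Automated testing, which decides the outcome of the demonstrations reported here, can be sketched in Python as follows. This is an illustration under stated assumptions, not the reservation tool's code (which is written in PHP): the names LinkTestResult, plan_tests, failing_tests and link_passes, the direction labels and the sample loss values are all hypothetical. Only the four-test plan and the 0.1% threshold come from the report.

```python
from dataclasses import dataclass

# From the report: packet loss above 0.1% in any test marks the link as failed.
LOSS_THRESHOLD_PERCENT = 0.1


@dataclass
class LinkTestResult:
    direction: str        # hypothetical label, e.g. "A->B"
    rate_gbps: float      # rate at which the test was run
    loss_percent: float   # measured packet loss in percent


def plan_tests(reserved_gbps):
    """Return the four tests: both directions at full, then at half bandwidth."""
    return [
        (direction, reserved_gbps * fraction)
        for fraction in (1.0, 0.5)
        for direction in ("A->B", "B->A")
    ]


def failing_tests(results):
    """Return the results whose packet loss exceeds the threshold."""
    return [r for r in results if r.loss_percent > LOSS_THRESHOLD_PERCENT]


def link_passes(results):
    """The link passes only if no test exceeds the threshold."""
    return not failing_tests(results)


if __name__ == "__main__":
    # Loss values below are made up for illustration, not measured data.
    sample_losses = [0.5, 0.3, 0.0, 0.0]
    results = [
        LinkTestResult(direction, rate, loss)
        for (direction, rate), loss in zip(plan_tests(10.0), sample_losses)
    ]
    if link_passes(results):
        print("Link validated")
    else:
        for r in failing_tests(results):
            print(f"FAIL {r.direction} at {r.rate_gbps} Gbps: {r.loss_percent}% loss")
```

Returning the individual failing tests, rather than a single boolean, is what would let a notification email say which direction and rate caused the failure.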
Onsala - JIVE

The integrated reservation and validation system was used to create a BoD reservation between Onsala and JIVE at 10 Gbps. The automated tests were run for 10 minutes with 8972 byte payload packets. Figure 4 shows a screenshot of the reservation window.

Figure 4. Screenshot showing a 10 Gbps reservation between Onsala and JIVE. The automated tests ran for 10 minutes.

The BoD link test was marked as a failure by the system because of packet loss in two of the tests, both at 10 Gbps. Figure 5 is a screenshot of the notification email generated by the system to inform the user of the automated test failure; the email gives the link requestor an additional notification that the link's test failed.

Figure 5. Screenshot of an automated BoD testing and validation failure notice.

Further detailed testing may help identify potential sources of problems in the system. Both tests at 50% of the requested bandwidth showed minimal packet loss, which suggests that the packet loss depends on the transmitted data rate. Subsequent tests showed packet loss of 3.5% at 9 Gbps, and minimal, acceptable packet loss at 8 Gbps.
Based on the test information, further, more detailed investigations into the link and end hosts are warranted. This demonstrates the value of automated testing: the parties concerned can investigate the packet loss and rectify it before real e-VLBI observation data are transmitted at 10 Gbps.

Torun - JIVE

A 10 Gbps reservation was made between JIVE and Torun using the BoD system, with an automated test which ran for one hour, as shown in Figure 6. The results were very good, with less than 0.1% packet loss and data rates greater than 9.7 Gbps.

Figure 6. Connection details of the BoD reservation and hour long automated test between Torun and JIVE.

The test was deemed a success and an email was sent informing the requestor of the success.

Using the automated tests, an interesting feature of this link was discovered. The packet size value for the automated tests is the payload size of the UDP datagram. The Maximum Transmission Unit (MTU) is the largest packet size permissible at the Ethernet layer, and for most high data rate networks this is 9000 bytes, which after the IP and UDP headers (20 and 8 bytes respectively) leaves a maximum of 8972 bytes for the UDP payload. When tests were run with an 8972 byte packet size, they failed with 100% packet loss. The problem was investigated further on the PCs, and it was quickly found that the machines could ping one another, proving that the link was up and working. UDPmon was then run from the command line and it was discovered that the MTU of the link was 8996 bytes. The PCs were configured for a 9000 byte MTU, so the 8996 byte MTU must be imposed by some network equipment on the link. With an 8996 byte MTU the largest UDP payload that fits is 8968 bytes, so datagrams carrying 8972 byte payloads exceed the link MTU.

Conclusions

The integration of optional automated testing and validation of BoD links before operational use of the network for e-VLBI observations has been demonstrated and shown to be a useful tool. In both demonstrations the tests highlighted link characteristics which are important for e-VLBI observations. The tests have been designed to simulate e-VLBI transfers, allowing the requestor to have confidence that the link will perform in production as it did in the test.

Acknowledgements

The authors would like to thank all those who helped make this demonstration possible, including colleagues at SURFnet, NORDUnet, SUNET, GEANT, PSNC, Torun and JIVE.

References

[1] UDPmon, http://www.hep.man.ac.uk/u/rich/net/ (accessed on 2012-12-21)