Work Project Report: Benchmark for 100 Gbps Ethernet network analysis
CERN Summer Student Programme 2016
Student: Iraklis Moutidis
Main supervisor: Balazs Voneki
Second supervisor: Dr. Niko Neufeld
Division: EP-LBC
Project ID:
Note reference: LHCb-PUB, 09/2016
Project name: Automatized benchmark for automatic launching, scheduling and analysing various aspects of the network.

Introduction

During the Long Shutdown 2 the LHCb experiment will be upgraded in order to reach very high precision on the main observables of the b- and c-quark sectors. In its current state the LHCb experiment reads out events at a rate of 1.1 MHz within a fixed latency, which limits the usable collision rate. Removing this bottleneck is one of the main objectives of the LHCb upgrade. It will be achieved by implementing a trigger-less readout system in which the functionality of the trigger is executed in software. The trigger-less readout system must be able to handle a bandwidth on the order of 4 TBytes/s. For that reason the design of the system includes readout boards that can transmit at a rate of 100 Gbits/s and a high-throughput local area network [1].

The main goal of my project was to create an automated benchmark for launching, scheduling and analysing the performance of the readout boards and the network, in order to decide which manufacturer offers the best solution for our implementation, which network configuration offers the best performance, and which configuration of the readout system is the most cost efficient.

Project Implementation

The automated benchmark was implemented in bash [2]. Its goal is to coordinate the connected nodes so that they transfer data to each other while the achieved bandwidth is measured for each configuration applied to the nodes. An example of the node network is given in Figure 1.

For transmitting data between the nodes we used the iperf tool (release of 6 Mar 2015). iperf is an open-source tool for active measurements of the maximum achievable bandwidth on IP networks. It supports tuning of various parameters related to timing, buffers and protocols (TCP, UDP, SCTP with IPv4 and IPv6), and for each test it reports the bandwidth, loss and other parameters. iperf was originally developed by NLANR/DAST [3].
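As a minimal illustration of a single point-to-point measurement (a sketch only, not the benchmark itself; the host name, window size and duration are example values), an iperf 2-style run looks as follows:

```bash
# On the receiving node: start an iperf server for TCP streams.
iperf -s

# On the sending node: transmit for 10 s with a 176 KByte TCP window
# and report the achieved bandwidth in Gbits/sec (-f g).
iperf -c lab26 -w 176K -t 10 -f g
```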
To accurately test the performance of the nodes we had to make them transmit to each other simultaneously. To achieve that we used iperf for the data transfers and the MPICH interface to synchronize the nodes. MPICH is a high-performance and widely portable implementation of the Message Passing Interface (MPI) standard (MPI-1, MPI-2 and MPI-3). The goals of MPICH are (1) to provide an MPI implementation that efficiently supports different computation and communication platforms, including commodity clusters (desktop systems, shared-memory systems, multicore architectures), high-speed networks (10 Gigabit Ethernet, InfiniBand, Myrinet, Quadrics) and proprietary high-end computing systems (Blue Gene, Cray), and (2) to enable cutting-edge research in MPI through an easy-to-extend modular framework for other derived implementations [4].

We varied four configuration parameters in the tested networks:
- Number of processes (processors) per node: we tested 2 to 32 processes, with no CPU pinning.
- Transmitting window size: from 8 Kbyte to 416 Kbyte, increased by 8 Kbyte per test.
- Transmission time: 10, 20, 30 and 40 seconds.
- Number of nodes on the network: 3 to 7 nodes.

[Figure 1: Node network example. Each node sends data simultaneously to every node of the network.]

Lastly, in some cases I used the Python programming language to generate the scripts for the different tests, because it was easier to dynamically form each mpi/iperf command with the parameters of the desired configuration. A bash rendering of such a generated sweep is sketched below.
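The following sketch shows one way such a sweep can be driven, assuming MPICH's mpiexec and a host file. The file names, target host and parameter stepping are illustrative, and the real benchmark coordinates all-to-all transfers between the nodes rather than streams to a single target:

```bash
#!/usr/bin/env bash
# Sweep sketch: for each (process count, window size) pair, mpiexec starts
# the iperf clients together so that the transfers run simultaneously.
HOSTFILE=nodes.txt   # one node name per line (hypothetical file)
DURATION=10          # seconds per test; 10, 20, 30 and 40 were used

for NPROC in $(seq 2 32); do            # processes per node: 2..32
  for WIN_KB in $(seq 8 8 416); do      # window size: 8..416 Kbyte, step 8
    mpiexec -f "$HOSTFILE" -n "$NPROC" \
      iperf -c lab26 -w "${WIN_KB}K" -t "$DURATION" -f g \
      > "result_p${NPROC}_w${WIN_KB}.log"
  done
done
```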
Results

We ran a number of experiments using the installed 100 Gbit/sec network cards in the configurations described in Project Implementation. Examining the test results led to two main observations.

First, some nodes run faster than others; the cause of this behaviour still has to be investigated. To verify that the behaviour is systematic we ran tests with different transmission durations (10, 20, 30 and 40 seconds) and also a test in which the nodes transmit data sequentially. In all of our runs some nodes (e.g. lab26) performed better than others, even though all nodes had identical hardware: 2 sockets, CPU: Intel(R) Xeon(R) CPU E GHz with 8 cores and 16 threads, RAM: 64 GB (8 x 8 GB at 2133 MHz, but configured clock speed 1866 MHz), and BIOS with default settings.

Second, the network often gets overloaded when the experiments run on more than 4 nodes with a window size above roughly 100 Kbytes. In many cases the tests could not be completed for this reason; the error messages were "write failed: connection reset by peer" and "connect failed: no route to host".

3 Nodes performance results

The highest bandwidth that we achieved using three nodes was Gbits/sec, with 28 processes and a window size of 176 Kbytes. Table 1 lists the top 5 configurations of the test.

[Table 1: Configurations with the best performance for 3 nodes]

Figure 2 presents the overall performance of the 3 node network.

4 Nodes performance results

For the 4 node configuration the highest achieved bandwidth was Gbits/sec, with 24 processes and a window size of 44 Kbytes. Table 2 lists the top 5 configurations of the test, and Figure 3 presents the overall performance of the 4 node network.

[Table 2: Configurations with the best performance for 4 nodes]
[Figure 2: Overall results for 3 node tests]

[Figure 3: Overall results for 4 node tests]
5 Nodes performance results

Running the test on 5 nodes was very problematic: most of the time the network was overloaded and the test had to be terminated. We tried the whole range of TCP buffer sizes, but it did not help, and the root cause is not clear. We could measure the bandwidth only for 15 and 16 processes. The highest bandwidth that we achieved was Gbits/sec, with 15 processes and a window size of 148 Kbytes. Table 3 lists the top 5 configurations of the test.

[Table 3: Configurations with the best performance for 5 nodes]

Figure 4 presents the overall performance of the 5 node network.

7 Nodes performance results

For the 7 node configuration we could run the tests with 2 to 15 processes; using more than 15 processes overloaded the network. The highest achieved bandwidth was Gbits/sec, with 13 processes and a window size of 196 Kbytes. Table 4 lists the top 5 configurations of the test, and Figure 5 presents the overall performance of the 7 node network.

[Table 4: Configurations with the best performance for 7 nodes]
[Figure 4: Overall results for 5 node tests]

[Figure 5: Overall results for 7 node tests]
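The bandwidth values behind the tables and figures above have to be pulled out of the raw iperf reports; as noted in the Conclusion, awk was used for this kind of string manipulation. A minimal post-processing sketch, assuming iperf 2-style "[SUM] ... Gbits/sec" summary lines and the hypothetical log names from the launch sketch earlier:

```bash
#!/usr/bin/env bash
# Emit one "processes window_KB bandwidth_Gbps" row per saved test log.
for LOG in result_p*_w*.log; do
  CFG=$(basename "$LOG" .log)
  NPROC=${CFG#result_p}; NPROC=${NPROC%%_*}   # result_p<N>_w<KB>.log -> N
  WIN_KB=${CFG##*_w}                          # -> KB

  # The last [SUM] line aggregates all parallel streams; the bandwidth is
  # the field just before the "Gbits/sec" unit.
  awk -v p="$NPROC" -v w="$WIN_KB" \
      '/\[SUM\]/ { bw = $(NF-1) } END { if (bw != "") print p, w, bw }' "$LOG"
done
```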
Conclusion

During my stay at CERN I implemented an automated benchmark for launching, scheduling and analysing the performance of the readout boards and the network. While working on this project I acquired a lot of knowledge about bash scripting and learned how to use tools for network benchmarking (iperf) and string manipulation (awk). The implemented benchmarking tool can test a given network under various configurations and help the user identify the best set-up for maximum performance.

References

[1] LHCb Trigger and Online Upgrade Technical Design Report. European Organization for Nuclear Research (CERN), CERN/LHCC.
[2] "Wikipedia," [Online]. Available:
[3] "iperf - The network bandwidth measurement tool," [Online]. Available:
[4] "MPICH," [Online]. Available: