Congestion Management for Ethernet-based Lossless DataCenter Networks
1 Congestion Management for Ethernet-based Lossless DataCenter Networks Pedro Javier Garcia 1, Jesus Escudero-Sahuquillo 1, Francisco J. Quiles 1 and Jose Duato 2 1: University of Castilla-La Mancha (UCLM) 2: Technical University València (UPV) NENDICA DCN: ICne
2 Abstract This paper describes congestion phenomena in lossless data center networks and their negative consequences. It explores proposed solutions, analyzing their pros and cons to determine which are suited to the requirements of modern data centers. The conclusions identify important issues that should be addressed in the future.
3 Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions
5 Introduction: On-Line Data Intensive (OLDI) Services [Congdon18]. OLDI services require immediate answers to requests that arrive at a high rate. End-user experience is highly dependent on system responsiveness. The network becomes a significant component of overall DC latency when congestion occurs. [Figure: request fan-out tree with per-level deadlines: the request reaches a top-level aggregator (deadline = 250 ms), which fans out to aggregators (deadline = 50 ms), which in turn fan out to workers (deadline = 10 ms).]
6 Introduction: Data-Center Networks (DCNs). Today's DCNs require a flexible fabric that carries, in a converged way, traffic from different types of applications, storage, or control. Latency is a concern: fabric design for DCNs must minimize or eliminate packet loss, provide high throughput, and maintain low latency. These goals are crucial for OLDI applications, Deep Learning, NVMe over Fabrics, and Cloudified Central Offices. However, congestion threatens these applications.
7 Introduction: Why is congestion isolation needed? HoL blocking dramatically degrades network performance (e.g., PFC does not have enough granularity and does not identify congested flows) [Garcia05]. Classical end-to-end (e2e) congestion control for lossless networks is difficult to tune, reacts slowly, and may introduce oscillations and instability [Escudero11]. [Figure: normalized network throughput vs. time (0 to 5e+06 nanoseconds) for the 1Q, ITh and VOQnet schemes in a 64-node Clos network with 4 hot-spots; the start and end of the hot-spot (HS) traffic, i.e. traffic injected to the hot-spot destination, are marked.]
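The HoL-blocking effect named on this slide can be illustrated with a toy model. The sketch below is not taken from the slides: it assumes one input port that forwards at most one packet per cycle toward two outputs, where output X is a hot spot that accepts a packet only every third cycle and output Y is always free. The function names `hot_spot_free`, `run_fifo` and `run_voq` and the arrival pattern are hypothetical.

```python
from collections import deque

def hot_spot_free(cycle):
    """Assumption: the hot-spot output X accepts a packet only every 3rd cycle."""
    return cycle % 3 == 0

def run_fifo(arrivals, cycles):
    """One shared FIFO: only the head packet may leave, so a blocked
    head packet for X also stalls the Y packets queued behind it."""
    queue = deque(arrivals)
    delivered = {"X": 0, "Y": 0}
    for cycle in range(cycles):
        if not queue:
            break
        if queue[0] == "Y" or hot_spot_free(cycle):
            delivered[queue.popleft()] += 1
    return delivered

def run_voq(arrivals, cycles):
    """One queue per destination: a blocked X packet never delays Y."""
    queues = {"X": deque(), "Y": deque()}
    for dst in arrivals:
        queues[dst].append(dst)
    delivered = {"X": 0, "Y": 0}
    for cycle in range(cycles):
        # Still at most one packet per cycle, but any ready queue may send.
        if queues["X"] and hot_spot_free(cycle):
            delivered[queues["X"].popleft()] += 1
        elif queues["Y"]:
            delivered[queues["Y"].popleft()] += 1
    return delivered

arrivals = ["X", "Y"] * 6          # 12 packets, alternating destinations
print(run_fifo(arrivals, 12))      # {'X': 4, 'Y': 4}
print(run_voq(arrivals, 12))       # {'X': 4, 'Y': 6}
```

In this toy run the hot-spot flow gets the same service in both cases, but the non-congested flow to Y loses a third of its deliveries when it shares a FIFO with the congested flow.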
8 Introduction: Why is congestion isolation needed? [Figure: sources Src. A to Src. E send through switches Sw. 1 to Sw. 9 toward destinations Dst. X, Dst. Y and Dst. Z. Congested flows are addressed to Dst. X; non-congested flows are addressed to Dst. Y and Dst. Z. Links are labeled with loads of 33% and 66%. The figure marks where low-order and high-order HoL blocking occur, with flows annotated as "33% Sending" or "33% Stopped".]
9 Introduction: Why is congestion isolation needed? We need a congestion isolation (CI) mechanism that reacts quickly when transient congestion appears, preventing the network performance degradation caused by HoL blocking. We want a CI mechanism that complements other technologies available in DCNs, so that CI improves their performance while they reduce the complexity of CI.
11 Congestion Dynamics in DCNs: Appearance of Congestion. [Figure: four switch panels in which sources inject at 100% of the link bandwidth (full rate), with switch speedups of 1, 2, 2 and 1.5. Each panel marks where congestion appears; some panels distinguish the congestion point at time t0 from the one at time t0+t.]
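The dependence on switch speedup can be made concrete with a simple fluid model. This is an illustrative assumption, not the analysis behind the slide: `num_inputs` input ports each inject at full link rate (1.0) toward the same output link, which drains at rate 1.0, and the crossbar can deliver up to `speedup` link rates into that output. The function name `queue_growth` is hypothetical.

```python
def queue_growth(num_inputs, speedup):
    """Return (input_growth, output_growth) in link-rate units.

    Assumed fluid model: traffic that the crossbar cannot move stays in
    the input buffers; traffic that the output link cannot drain piles
    up in the output buffer.
    """
    offered = float(num_inputs)                      # every input at full rate
    crossbar_rate = min(offered, float(speedup))     # what crosses the switch
    output_growth = max(0.0, crossbar_rate - 1.0)    # output link drains at 1.0
    input_growth = offered - crossbar_rate           # what stays at the inputs
    return input_growth, output_growth

for s in (1, 1.5, 2):
    print(s, queue_growth(2, s))
# 1   -> (1.0, 0.0): backlog builds only at the input side
# 1.5 -> (0.5, 0.5): backlog builds on both sides
# 2   -> (0.0, 1.0): backlog builds first at the output side
```

Under these assumptions, a higher speedup shifts the initial congestion point from the input buffers to the output buffer; once a finite output buffer fills, lossless flow control would push the backlog back to the inputs, which is one possible reading of the t0 and t0+t labels in the figure.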
12 Congestion Dynamics in DCNs: Growth of Congestion Trees (from root to leaves). [Figure: Switch 1 to Switch 5, switch speedup = 1.5. Legend: packet flows; congestion point.]
13 Congestion Dynamics in DCNs: Growth of Congestion Trees (from leaves to root). [Figure: Switch 1 to Switch 7, switch speedup = 1.5. Legend: packet flows; congestion point.]
14 Congestion Dynamics in DCNs: Growth of Congestion Trees (roots movement). [Figure: Switch 1 to Switch 3 shown at two moments, switch speedup = 1.5. Legend: packet flows (start); packet flows (after); congestion point.]
15 Congestion Dynamics in DCNs: Growth of Congestion Trees (in-network roots). [Figure: Switch 1 to Switch 8 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X; packet flows addressed to Y; congestion point.]
16 Congestion Dynamics in DCNs: Growth of Congestion Trees (overlapping). [Figure: Switch 1 to Switch 9 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X; packet flows addressed to Y; congestion point.]
17 Congestion Dynamics in DCNs: Growth of Congestion Trees (vanishing). [Figure: Switch 1 to Switch 3 shown at two moments, switch speedup = 1.5. Legend: permanent packet flows; packet flows disappearing first; congestion point first appeared in the switch.]
19 Reducing Congestion Incast congestion reduction - ECMP
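As a reminder of how ECMP spreads traffic, the sketch below hashes a flow's 5-tuple and uses the result to pick one of several equal-cost next hops. It is a generic illustration, not the configuration discussed in the slides: real switches use vendor-specific hash functions, and the name `ecmp_next_hop`, the field order and the example addresses are assumptions.

```python
import zlib

def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for a flow by hashing its 5-tuple.

    flow: (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops: list of equal-cost next hops (any hashable labels)
    Assumption: CRC32 stands in for the switch's hardware hash.
    """
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

uplinks = ["spine-1", "spine-2", "spine-3", "spine-4"]   # hypothetical names
flow = ("10.0.0.1", "10.0.1.7", "tcp", 49152, 4791)
print(ecmp_next_hop(flow, uplinks))
```

Because the choice depends only on header fields, every packet of a flow follows the same path (no reordering), but the hash is blind to load: two heavy flows can land on the same uplink while others stay idle.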
20 Reducing Congestion: In-network congestion reduction - ECN. [Figure: Switch 1 to Switch 9 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X; packet flows addressed to Y; victim flow; congestion point.]
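The switch-side half of ECN can be sketched as threshold-based marking: instead of dropping, the switch sets a Congestion Experienced (CE) mark on ECN-capable packets that arrive while the queue is long, and the receiver later echoes the mark so that the sender slows down. The code below is a minimal sketch under assumptions: the `Packet` class, the `enqueue_with_ecn` function and the threshold value are hypothetical, and real switches typically use probabilistic marking profiles rather than a single threshold.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    flow_id: str
    ecn_capable: bool = True   # sender negotiated ECN (assumed)
    ce_marked: bool = False    # Congestion Experienced mark

def enqueue_with_ecn(queue, packet, mark_threshold):
    """Append the packet; mark it if the queue already holds
    at least mark_threshold packets (assumed marking rule)."""
    if packet.ecn_capable and len(queue) >= mark_threshold:
        packet.ce_marked = True
    queue.append(packet)
    return packet

queue = []
marks = [enqueue_with_ecn(queue, Packet("to_X"), mark_threshold=3).ce_marked
         for _ in range(5)]
print(marks)   # [False, False, False, True, True]
```

The mark only takes effect after it has travelled to the receiver and back to the sender, so the reaction time of this loop is at least one round-trip time.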
21 Reducing Congestion: Limitations of current technologies [Escudero19]. These technologies may work together to eliminate loss in the cloud data center network. Load balancing and destination scheduling are end-to-end solutions that incur RTT delays when congestion appears. However, there is no time for loss in the network due to congestion, and congestion trees grow very quickly. Transient congestion may still produce HoL blocking that leads to increased latency, lower throughput and buffer overflow, significantly degrading performance. Even with these mechanisms, we still need something that deals with HoL blocking locally and fast.
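A back-of-the-envelope calculation shows why an RTT-scale reaction can be too slow. The sketch below estimates how many bytes accumulate at one egress port during a single RTT when several senders converge on it at full rate. The function name `backlog_bytes` and the example figures (4 senders, 100 Gb/s links, 10 µs RTT) are assumptions chosen for illustration, not measurements from the slides.

```python
def backlog_bytes(senders, link_gbps, rtt_us):
    """Bytes queued at one egress port after one RTT.

    Assumptions: every sender injects at full link rate toward the same
    egress link, which drains at link rate, and nobody reacts until one
    RTT has elapsed.
    """
    excess_gbps = (senders - 1) * link_gbps
    # Gb/s * microseconds = kilobits, so multiply by 1000 for bits, divide by 8 for bytes.
    return excess_gbps * rtt_us * 1000 / 8

print(backlog_bytes(senders=4, link_gbps=100, rtt_us=10))   # 375000.0
```

With these assumed numbers, roughly 375 KB pile up before any end-to-end mechanism can act, which is the kind of backlog that either overflows a buffer or, in a lossless network, triggers pause frames and spreads the congestion tree upstream.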
23 Combining Congestion Management Mechanisms: CI is needed to react locally and very fast, eliminating HoL blocking immediately. Previous technologies reduce the use of PFC and ECN, but their closed-loop and open-loop approaches still incur delays. Congestion trees appear suddenly, are difficult to predict (even more so when load balancing is applied) and grow quickly. New techniques can be applied in combination with the previous technologies, improving their behavior.
24 Combining Congestion Management Mechanisms: Dynamic Virtual Lanes (DVL). [Figure: Switch A and Switch B, each with ports P1 to P4 holding CFQ and ncfq queues; a congestion root is marked and a Congestion Isolation Packet (CIP) travels between the two switches. Legend: output port requested by the packet on top; congestion root; Congestion Isolation Packets (CIP); packets from congested flows; packets from non-congested flows.]
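The elements in the figure legend suggest the following loose sketch of congestion isolation with two queues per port. It is not the DVL or P802.1Qcz algorithm: the `Port` class, the occupancy threshold and the rule that flags the dominant flow in the ncfq as congested are all assumptions made for illustration. The sketch also assumes that CFQ holds packets of congested flows and ncfq holds packets of non-congested flows, as the legend indicates.

```python
from collections import Counter, deque

class Port:
    """Hypothetical port with a queue for non-congested flows (ncfq)
    and a queue for congested flows (cfq)."""

    def __init__(self, name, threshold, upstream=None):
        self.name = name
        self.threshold = threshold   # assumed detection threshold, in packets
        self.upstream = upstream     # neighbor port that receives CIPs
        self.ncfq = deque()
        self.cfq = deque()
        self.congested_flows = set()

    def receive_cip(self, flow_id):
        """A CIP from downstream asks this port to isolate the flow too."""
        self.congested_flows.add(flow_id)

    def enqueue(self, flow_id):
        """Queue one packet of flow_id; return the queue it was placed in."""
        over = len(self.ncfq) >= max(1, self.threshold)
        if flow_id not in self.congested_flows and over:
            # Assumed rule: the flow holding most of the ncfq is the culprit.
            dominant_flow, _ = Counter(self.ncfq).most_common(1)[0]
            if flow_id == dominant_flow:
                self.congested_flows.add(flow_id)
                if self.upstream is not None:
                    self.upstream.receive_cip(flow_id)   # notify the neighbor
        target = "cfq" if flow_id in self.congested_flows else "ncfq"
        getattr(self, target).append(flow_id)
        return target

switch_a = Port("A.P3", threshold=2)
switch_b = Port("B.P3", threshold=2, upstream=switch_a)
print([switch_b.enqueue("to_X") for _ in range(3)])   # ['ncfq', 'ncfq', 'cfq']
print(switch_a.enqueue("to_X"))                       # 'cfq' (isolated via CIP)
print(switch_a.enqueue("to_Y"))                       # 'ncfq'
```

The point of the design is locality: once the downstream port flags a flow, the upstream port starts isolating it as soon as the CIP arrives, without waiting for an end-to-end round trip, and packets of other flows keep using the ncfq.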
26 References
[Duato03] J. Duato, S. Yalamanchili, and L. M. Ni, Interconnection Networks: An Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann Publishers.
[Garcia05] P. J. Garcia, J. Flich, J. Duato, I. Johnson, F. J. Quiles, and F. Naven, Dynamic Evolution of Congestion Trees: Analysis and Impact on Switch Architecture, in High Performance Embedded Architectures and Compilers, ser. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, Nov. 2005.
[Congdon18] Paul Congdon, IEEE 802 Nendica Report: The Lossless Network for Data Centers, IEEE-SA Industry Connections White Paper, August.
[Leiserson85] C. E. Leiserson, Fat-trees: Universal networks for hardware-efficient supercomputing, IEEE Transactions on Computers, vol. C-34, Oct.
[Escudero11] Jesús Escudero-Sahuquillo, Ernst Gunnar Gran, Pedro Javier García, Jose Flich, Tor Skeie, Olav Lysne, Francisco J. Quiles, José Duato: Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks. ICPP 2011.
[Escudero19] Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles, José Duato: P802.1Qcz interworking with other data center technologies. IEEE Plenary Meeting, San Diego, CA, USA, July 8, 2018 (cz-escudero-sahuquillo-ci-internetworking-0718-v1.pdf).
More informationRoGUE: RDMA over Generic Unconverged Ethernet
RoGUE: RDMA over Generic Unconverged Ethernet Yanfang Le with Brent Stephens, Arjun Singhvi, Aditya Akella, Mike Swift RDMA Overview RDMA USER KERNEL Zero Copy Application Application Buffer Buffer HARWARE
More informationRevisiting Network Support for RDMA
Revisiting Network Support for RDMA Radhika Mittal 1, Alex Shpiner 3, Aurojit Panda 1, Eitan Zahavi 3, Arvind Krishnamurthy 2, Sylvia Ratnasamy 1, Scott Shenker 1 (1: UC Berkeley, 2: Univ. of Washington,
More informationEnabling High Performance Data Centre Solutions and Cloud Services Through Novel Optical DC Architectures. Dimitra Simeonidou
Enabling High Performance Data Centre Solutions and Cloud Services Through Novel Optical DC Architectures Dimitra Simeonidou Challenges and Drivers for DC Evolution Data centres are growing in size and
More informationDeploying Data Center Switching Solutions
Deploying Data Center Switching Solutions Choose the Best Fit for Your Use Case 1 Table of Contents Executive Summary... 3 Introduction... 3 Multivector Scaling... 3 Low On-Chip Memory ASIC Platforms...4
More informationAccelerating Development and Troubleshooting of Data Center Bridging (DCB) Protocols Using Xgig
Accelerating Development and Troubleshooting of Data Center Bridging (DCB) Protocols Using Xgig The new Data Center Bridging (DCB) protocols provide important mechanisms for enabling priority and managing
More informationAlizadeh, M. et al., " CONGA: distributed congestion-aware load balancing for datacenters," Proc. of ACM SIGCOMM '14, 44(4): , Oct
CONGA Paper Review By Buting Ma and Taeju Park Paper Reference Alizadeh, M. et al., " CONGA: distributed congestion-aware load balancing for datacenters," Proc. of ACM SIGCOMM '14, 44(4):503-514, Oct.
More informationEvaluate Data Center Network Performance
Downloaded from orbit.dtu.dk on: Sep 02, 2018 Evaluate Data Center Network Performance Pilimon, Artur Publication date: 2018 Document Version Publisher's PDF, also known as Version of record Link back
More informationCONGA: Distributed Congestion-Aware Load Balancing for Datacenters
CONGA: Distributed Congestion-Aware Load Balancing for Datacenters By Alizadeh,M et al. Motivation Distributed datacenter applications require large bisection bandwidth Spine Presented by Andrew and Jack
More informationSPARTA: Scalable Per-Address RouTing Architecture
SPARTA: Scalable Per-Address RouTing Architecture John Carter Data Center Networking IBM Research - Austin IBM Research Science & Technology IBM Research activities related to SDN / OpenFlow IBM Research
More informationSCALABLE STRATEGIES FOR ALLEVIATING THE HOL BLOCKING PRODUCED BY CONGESTION TREES IN LOSSLESS INTERCONNECTION NETWORKS
SCALABLE STRATEGIES FOR ALLEVIATING THE HOL BLOCKING PRODUCED BY CONGESTION TREES IN LOSSLESS INTERCONNECTION NETWORKS P. Nicolas Kokkalis, Njuguna Njoroge, Ernesto Staroswiecki EE382C Interconnection
More informationVM Aware Fibre Channel
White Paper White Paper VM-ID VM Aware Fibre Channel Virtual Machine Traffic Visibility for SANs StorFusion VM-ID feature on QLogic Gen6 (32G) and Enhanced Gen5 (16G) Fibre Channel KEY BENEFITS Increases
More information