Congestion Management in HPC

1 Congestion Management in HPC Interconnection Networks. Pedro J. García, Universidad de Castilla-La Mancha (Spain).

2 Outline: Why may congestion become a problem? Should we care about congestion in current HPC systems? How can congestion be managed? Challenges.

3 Why may congestion become a problem? For three decades, the goal of computer architects has been to keep the processors busy (top performance). Interconnects were usually cheap and never a bottleneck. Now, global performance in large systems is limited by the interconnection network. Network saturation leads to congestion situations that may drastically degrade network performance.

4 Contention: Several packets from different flows request the same output port in a switch. One packet makes progress; the others wait. (Figure: network contention.)

5 Congestion: Persistent contention, mainly when the network is saturated. Buffers containing packets that belong to flows involved in contention become full. (Figure: persistent network contention.)

6 Congestion propagation: In saturated lossless networks, congestion is quickly propagated by flow control, forming congestion trees. (Figure labels: flow control; persistent network contention.)

7 Congestion propagation: In saturated lossless networks, congestion is quickly propagated by flow control, forming congestion trees. Congestion propagation may reach the sources. (Figure: congestion tree structure, showing the root, branches and leaves of the tree.)

8 Congestion trees and Head-of-Line blocking: Congestion trees may cause Head-of-Line (HoL) blocking. Non-congested packets advance at the same speed as congested ones. Congestion affects sources that do not cause congestion.
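
The effect described on this slide can be sketched with a toy model. The snippet below is only an illustration under simplifying assumptions that are not in the slides: a single input port that forwards at most one packet per cycle, packets represented by the output port they request, and a hypothetical function name `forwarded_per_flow`. With one shared FIFO queue, a packet for a blocked ("hot") output at the head of the queue stops the packets behind it; with one queue per requested output, the other packets still advance.

```python
from collections import deque


def forwarded_per_flow(queue_packets, blocked_outputs, cycles, separate_queues=False):
    """Count packets forwarded per requested output over `cycles` cycles.

    queue_packets: requested output ports, in arrival order (toy packets).
    blocked_outputs: outputs that cannot accept packets (congested).
    separate_queues: if True, keep one queue per requested output.
    """
    if separate_queues:
        queues = {}
        for out in queue_packets:
            queues.setdefault(out, deque()).append(out)
    else:
        queues = {"shared": deque(queue_packets)}

    sent = {}
    for _ in range(cycles):
        # Assumption: at most one packet leaves the input port per cycle.
        for q in queues.values():
            if q and q[0] not in blocked_outputs:
                out = q.popleft()
                sent[out] = sent.get(out, 0) + 1
                break
    return sent


packets = ["hot", "cold", "cold"]
print(forwarded_per_flow(packets, {"hot"}, cycles=3))
print(forwarded_per_flow(packets, {"hot"}, cycles=3, separate_queues=True))
```

In the shared-queue case nothing is forwarded, although two of the three packets request an output that is free; that is the HoL blocking the slide refers to.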

9 Network performance at saturation: At saturation, network performance drops dramatically due to congestion situations. (Figure: HS = traffic injected to the hot-spot destination; the plot marks where HS starts and where HS ends.)

10 Should we currently care about congestion? Conflicting interests: cost vs. performance. Saturation was traditionally avoided by overdimensioning the interconnection network.

11 Network overdimensioning: Many more components than really necessary. Offered network bandwidth is much higher than the bandwidth requested by end nodes.

12 Network overdimensioning. Advantage: low link utilization, so congestion is unlikely. Disadvantages: expensive (processors cheaper relative to interconnects); power consumption increases (growing link speed). (Figure: latency vs. injected traffic, showing the working zone and the saturation zone.)

13 Should we currently care about congestion? Conflicting interests: cost vs. performance. Saturation was traditionally avoided by overdimensioning the interconnection network, which is currently not suitable. No network overdimensioning?

14 Network not overdimensioned: Only the components strictly necessary to interconnect all the processing nodes. Offered network bandwidth decreases.

15 Network not overdimensioned. Advantages: cheaper, less power consumption. Disadvantage: high link utilization, so congestion is likely. (Figure: latency vs. injected traffic, showing the working zone and the saturation zone.)

16 Should we currently care about congestion? Conflicting interests: cost vs. performance. Saturation was traditionally avoided by overdimensioning the interconnection network, which is currently not suitable. Without overdimensioning, there is danger when working with high traffic loads (close to the saturation point). Network performance (throughput, latency) should be good under very different traffic patterns and load scenarios. Traffic load may vary significantly over time, reaching saturation. Some strategy to deal with congestion is required.

17 The big picture. (Diagram; node labels: growing processor speed; growing link speed; power consumption increases; processor prices drop (demand); relative interconnect cost increases; power management; smaller networks; bandwidth decreases; saturation point reached with lower traffic load; congestion probability grows; performance degradation; congestion management strategies.)

18 Benefits of congestion management. Stable performance when the network reaches saturation: no performance drop; delivers maximum achievable throughput. Reacts quickly when power management has turned some components off and demand suddenly increases: prevents performance degradation due to power management; enables more aggressive power-saving strategies without risk. Helps to keep performance when faults occur and fault tolerance techniques enable alternative paths: alternative paths may become congested (fewer resources are available).

19 How can congestion be managed? Different approaches to congestion management: packet dropping; proactive techniques; reactive techniques; HoL blocking elimination techniques; hybrid techniques; related techniques.

20 Packet dropping: Packets in congested buffers are discarded. Suitable for computer networks (like the Internet), but not for most current HPC parallel applications. Both congested and non-congested packets may be discarded. Discarded packets must be retransmitted, which increases final packet latency.

21 Proactive congestion management, a.k.a. congestion prevention: Path setup before data transmission [1]. Used in ATM and computer networks (QoS). Optimal performance requires knowing in advance the resource requirements of each transmission and the network status. Knowledge about network status is not always available. High overhead, high setup latencies, poor link utilization (not suitable for HPC). [1] P. Yew, N. Tzeng, D. H. Lawrie, Distributing Hot Spot Addressing in Large Scale Multiprocessors, IEEE Transactions on Computers, 36(4).

22 Reactive congestion management, a.k.a. congestion recovery: Injection limitation techniques (injection throttling) using closed-loop feedback. Does not scale well with network size and link bandwidth: notification delay (proportional to distance / number of hops); link and buffer capacity (proportional to clock frequency). May produce traffic oscillations (closed-loop system with pure delay).
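
The interaction between notification delay and injection throttling can be illustrated with a small discrete-time sketch. Everything in it is an assumption made for illustration and not taken from the slides: a single bottleneck queue, a source that halves its rate when the (delayed) queue occupancy it observes exceeds a threshold and otherwise increases it by 0.1, and the hypothetical function name `peak_queue`. It only shows that the same control law overshoots much more when the feedback arrives late.

```python
def peak_queue(delay, steps=200, service=1.0, threshold=5.0):
    """Return the peak queue occupancy seen under delayed feedback.

    delay: number of steps before the source sees the queue occupancy.
    service: packets drained from the queue per step.
    threshold: occupancy above which the source is told to slow down.
    """
    rate, queue = 1.0, 0.0
    history = [0.0] * (delay + 1)  # what the source sees before feedback arrives
    peak = 0.0
    for _ in range(steps):
        queue = max(0.0, queue + rate - service)
        history.append(queue)
        observed = history[-1 - delay]  # occupancy as it was `delay` steps ago
        if observed > threshold:
            rate *= 0.5   # throttle injection (assumed control law)
        else:
            rate += 0.1   # probe for more bandwidth (assumed control law)
        peak = max(peak, queue)
    return peak


print(peak_queue(delay=0))
print(peak_queue(delay=10))
```

With no delay the queue stays close to the threshold; with a delay of ten steps the source keeps accelerating after the queue has already filled, so the occupancy peaks several times higher before the correction arrives.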

23 Reactive congestion management. Example: InfiniBand FECN/BECN mechanism [2]. Two bits in the packet header are reserved for congestion notification. If a switch port is considered congested, the Forward Explicit Congestion Notification (FECN) bit is set in the header of packets crossing that port. Upon receiving a FECN-marked packet, a destination returns to the source a Congestion Notification Packet (CNP) whose header has the Backward Explicit Congestion Notification (BECN) bit set. Any source receiving a BECN-marked packet then reduces its packet injection rate for that traffic flow. [2] E. G. Gran, M. Eimot, S. A. Reinemo, T. Skeie, O. Lysne, L. Huse, G. Shainer, First experiences with congestion control in InfiniBand hardware, in Proceedings of IPDPS 2010.
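
The notification loop described on this slide can be sketched as three small steps. This is a minimal illustration, not the InfiniBand implementation: the `Packet` class, the occupancy threshold used to decide that a port is congested, and the multiplicative `decrease` factor are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: int
    dst: int
    fecn: bool = False  # Forward Explicit Congestion Notification bit
    becn: bool = False  # Backward Explicit Congestion Notification bit


def switch_forward(pkt, queue_occupancy, threshold):
    """Set FECN if the crossed port is considered congested (assumed criterion)."""
    if queue_occupancy > threshold:
        pkt.fecn = True
    return pkt


def destination_receive(pkt):
    """Return a BECN-marked CNP towards the source, or None if not marked."""
    if pkt.fecn:
        return Packet(src=pkt.dst, dst=pkt.src, becn=True)
    return None


def source_receive(cnp, rates, decrease=0.5):
    """Reduce the injection rate of the flow identified by the CNP."""
    if cnp is not None and cnp.becn:
        flow = (cnp.dst, cnp.src)  # (original source, original destination)
        rates[flow] = rates[flow] * decrease
    return rates


rates = {(1, 9): 1.0}
marked = switch_forward(Packet(src=1, dst=9), queue_occupancy=12, threshold=8)
source_receive(destination_receive(marked), rates)
print(rates)
```

Only the flow that crossed the congested port is throttled; a packet that crosses an uncongested port produces no CNP and leaves the rate table unchanged.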

24 HoL blocking elimination techniques. Key idea: the real problem is not congestion itself, but its negative effect (HoL blocking). By eliminating HoL blocking, congestion becomes harmless.

25 Example of HoL blocking due to congestion: Should congested flows be throttled? (Figure: network of sources (Src), switches (Sw) and destinations (Dst), annotated with link utilization percentages for congested and non-congested flows.)

26 Example of real-life HoL blocking: the A-31 highway metaphor. The flow is affected by the bottleneck of the A-31 highway. (Figure: map showing the A-31 and A-43 highways and the bottleneck; map source: Google Maps.)

27 HoL blocking elimination techniques: In general, these techniques rely on having different queues at each port to separate different packet flows. They differ mainly in the criteria used to map packets to queues and in the number of queues required per port.

28 HoL blocking elimination techniques: VOQnet (Virtual Output Queuing at network level) [3]. A separate queue at each input port for every destination. Packets with the same destination are stored in the same queue: Selected_Queue = Packet_Destination. Completely eliminates HoL blocking. The number of required buffer resources increases at least quadratically with network size. [3] W. Dally, P. Carvey, L. Dennison, Architecture of the Avici terabit switch/router, in Proceedings of 6th Hot Interconnects, 1998.
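
The quadratic growth mentioned on this slide follows from a simple count: every port needs one queue per destination, and the number of ports itself grows with the number of destinations. The sketch below just multiplies these factors; the function name `voqnet_total_queues` and the example network sizes are hypothetical and chosen only to show the trend.

```python
def voqnet_total_queues(num_destinations, num_switches, ports_per_switch):
    """Total queues in the network when each port has one queue per destination."""
    queues_per_port = num_destinations
    return num_switches * ports_per_switch * queues_per_port


# Hypothetical sizes: doubling both the end nodes and the switches.
small = voqnet_total_queues(num_destinations=64, num_switches=48, ports_per_switch=8)
large = voqnet_total_queues(num_destinations=128, num_switches=96, ports_per_switch=8)
print(small, large, large // small)
```

In this example, doubling the network size multiplies the number of queues by four.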

29 HoL blocking elimination techniques: VOQsw (Virtual Output Queuing at switch level) [4] and DAMQs (Dynamically Allocated Multi-Queues) [5]. A separate queue at every input port for every output port. Packets requesting the same output port are stored in the same queue: Selected_Queue = Requested_Output_Port. Better than nothing, but they do not completely eliminate HoL blocking. Effectiveness depends on topology and traffic pattern. [4] T. Anderson, S. Owicki, J. Saxe, C. Thacker, High speed switch scheduling for local area networks, ACM Transactions on Computer Systems, vol. 11 (4), November. [5] Y. Tamir, G. Frazier, Dynamically allocated multi queue buffers for VLSI communication switches, IEEE Transactions on Computers, vol. 41 (6), June.

30 HoL blocking elimination techniques: DBBM (Destination Based Buffer Management) [6]. Several groups of destinations are defined. A separate queue for each group at every port (q queues per port). Packets with destinations in the same group are stored in the same queue: Selected_Queue = Packet_Destination MOD q. Does not completely eliminate HoL blocking. Effectiveness depends on the number of queues, topology and traffic pattern. [6] T. Nachiondo, J. Flich, J. Duato, Buffer management strategies to reduce HoL blocking, IEEE Transactions on Parallel and Distributed Systems, vol. 21 (6).

31 HoL blocking elimination techniques: OBQA (Output Based Queue Assignment) [7]. Suitable for fat trees with DESTRO routing. Queue assignment is linked with the topology and the routing algorithm. Reduces HoL blocking with the minimum number of queues per port (q): Selected_Queue = Requested_Output_Port MOD q, with q smaller than half the switch radix. Does not completely eliminate HoL blocking. Effectiveness depends on the number of queues. [7] J. Escudero Sahuquillo, P. J. García, F. J. Quiles, J. Duato, An efficient strategy for reducing head of line blocking in fat trees, in Proceedings of the 16th International Euro-Par Conference (II), LNCS vol. 6272, Ischia, Italy, Sept.
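
The four queue-selection rules given on the VOQnet, VOQsw, DBBM and OBQA slides can be put side by side as one-line functions. This is only a sketch of the mapping formulas as stated on the slides; the function names are hypothetical, and destinations and output ports are assumed to be numbered with non-negative integers.

```python
def voqnet_queue(packet_destination):
    """VOQnet: Selected_Queue = Packet_Destination (one queue per destination)."""
    return packet_destination


def voqsw_queue(requested_output_port):
    """VOQsw: Selected_Queue = Requested_Output_Port (one queue per output port)."""
    return requested_output_port


def dbbm_queue(packet_destination, q):
    """DBBM: Selected_Queue = Packet_Destination MOD q (q queues per port)."""
    return packet_destination % q


def obqa_queue(requested_output_port, q):
    """OBQA: Selected_Queue = Requested_Output_Port MOD q (q queues per port)."""
    return requested_output_port % q


# Destinations 5 and 13 with q = 4 queues per port.
print(voqnet_queue(5), voqnet_queue(13))    # separate queues under VOQnet
print(dbbm_queue(5, 4), dbbm_queue(13, 4))  # same queue under DBBM
```

The example shows why the modulo-based schemes reduce but do not eliminate HoL blocking: destinations 5 and 13 share a queue under DBBM with q = 4, so a congested flow towards one of them can still delay packets for the other.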

32 Performance comparison: uniform traffic simulation results. (Figure: network latency (cycles) vs. normalized generated traffic; panels: 4-ary 4-tree with 8x8 switches, and 16-ary 2-tree with 32x32 switches.)

33 HoL blocking elimination techniques: RECN (Regional Explicit Congestion Notification) [8] and FBICM (Flow Based Implicit Congestion Management) [9]. RECN has been proposed for networks with source-based routing, and FBICM for networks with distributed table-based routing. The key difference with respect to the previous techniques is that they completely and dynamically isolate congested flows. Basics: explicit identification of congested flows; storage of congestion information; dynamic queue allocation to isolate congested flows. [8] P. J. García, J. Flich, J. Duato, I. Johnson, F. J. Quiles, F. Naven, Efficient, scalable congestion management for interconnection networks, IEEE Micro, vol. 26 (5), September. [9] J. Escudero Sahuquillo, P. J. García, F. J. Quiles, J. Flich, J. Duato, Cost effective congestion management for interconnection networks using distributed deterministic routing, in Proceedings of ICPADS 2010, Shanghai, China, December.

34 RECN/FBICM basic procedure: Congested points are detected at any port of the network by measuring queue occupancy. The location of any detected congested point is stored in a control memory (a CAM line) at any port forwarding packets towards the congested point. RECN: an explicit route is stored. FBICM: a list of destinations is stored to implicitly locate the point. A special queue associated with the CAM line is also allocated to store exclusively the packets addressed to that congested point. Congestion information is progressively notified to any port in other switches crossed by congested flows, where new CAM lines and special queues are allocated. A packet arriving at a port is stored in the standard queue only if its routing information does not match any CAM line.
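
The last step of this procedure (matching an arriving packet against the CAM lines of a port) can be sketched as follows. This is a simplified, FBICM-style illustration in which each CAM line holds a set of destinations; the `Port` class, its method names and the limit on special queues are hypothetical, and congestion detection and notification between switches are left out.

```python
from collections import deque


class Port:
    def __init__(self, max_special_queues=2):
        self.standard_queue = deque()
        self.cam = []  # list of (congested destinations, special queue)
        self.max_special_queues = max_special_queues

    def allocate_cam_line(self, congested_destinations):
        """Allocate a CAM line and its special queue; False if none is free."""
        if len(self.cam) >= self.max_special_queues:
            return False
        self.cam.append((frozenset(congested_destinations), deque()))
        return True

    def store(self, destination):
        """Store a packet (represented by its destination); return the queue used."""
        for index, (destinations, queue) in enumerate(self.cam):
            if destination in destinations:
                queue.append(destination)
                return f"special-{index}"
        # No CAM line matches: the packet is not considered congested.
        self.standard_queue.append(destination)
        return "standard"


port = Port(max_special_queues=1)
port.allocate_cam_line({7})
print(port.store(7), port.store(3))
print(port.allocate_cam_line({9}))
```

Packets for destination 7 are isolated in the special queue, so they cannot block the packet for destination 3 in the standard queue. The final call returns False because the only special queue is already in use.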

35 RECN/FBICM queue requirements: Non-congested packets can share queues without suffering significant HoL blocking, so only one standard queue per port is needed. Special queues are allocated and deallocated when required, so congested packets can be buffered separately using a small number of special queues per port. HoL blocking produced by congested packets is eliminated in a scalable way.

36 RECN/FBICM drawbacks: In scenarios with many different congested points, it is possible to run out of special queues at some ports. The need for CAMs at switch ports increases implementation cost and the required silicon area per port.

37 Hybrid congestion management strategies. Example: combining injection throttling and FBICM [10]. Use FBICM to quickly and locally eliminate HoL blocking, propagating congestion information and allocating queues as necessary. Use reactive congestion management to slowly eliminate congestion, deallocating FBICM queues whenever possible. FBICM provides an immediate response and allows reactive congestion management to be tuned for slow reaction, thus avoiding oscillations. Reactive congestion management drastically reduces FBICM buffer requirements (just one or two queues per port). [10] J. Escudero Sahuquillo, E. G. Gran, P. J. García, J. Flich, T. Skeie, O. Lysne, F. J. Quiles, J. Duato, Combining Congested Flow Isolation and Injection Throttling in HPC Interconnection Networks, to appear in Proceedings of ICPP.

38 Performance comparison: hot-spot scenario simulation results. (Figure: network normalized throughput vs. time; panels: 4-ary 3-tree with 1 hot spot, and 4-ary 3-tree with 4 hot spots.)

39 Related techniques. Adaptive routing / traffic balancing: may help to delay the occurrence of congestion; useless when heavy congestion arises; problems regarding in-order packet delivery; existing congestion management techniques do not work correctly with adaptive routing (congested points may vary); adaptive routing may spread congestion over more links. Virtual channels: performance depends on channel (queue) assignment.

40 Challenges: To develop congestion management techniques that react locally and immediately when congestion arises. To make congestion management techniques truly scalable. To achieve coordination among end nodes without explicit communication among them. To eliminate instabilities and oscillatory responses. To minimize the number of extra resources needed to handle congestion. To make congestion management compatible with adaptive routing.

41 Acknowledgements: Jose Duato (Universitat Politecnica de Valencia), who generously gave us the main ideas behind our congestion management proposals. Jose Flich (Universitat Politecnica de Valencia) and Jesus Escudero Sahuquillo (Universidad de Castilla-La Mancha), who have developed alongside me all our congestion management proposals. The technique combining reactive congestion management and FBICM has been developed in collaboration with Simula Research Laboratory (Oslo).

42 Thanks! Any questions?


More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

CSMA based Medium Access Control for Wireless Sensor Network

CSMA based Medium Access Control for Wireless Sensor Network CSMA based Medium Access Control for Wireless Sensor Network H. Hoang, Halmstad University Abstract Wireless sensor networks bring many challenges on implementation of Medium Access Control protocols because

More information

CS519: Computer Networks

CS519: Computer Networks Lets start at the beginning : Computer Networks Lecture 1: Jan 26, 2004 Intro to Computer Networking What is a for? To allow two or more endpoints to communicate What is a? Nodes connected by links Lets

More information

A Cost and Scalability Comparison of the Dragonfly versus the Fat Tree. Frank Olaf Sem-Jacobsen Simula Research Laboratory

A Cost and Scalability Comparison of the Dragonfly versus the Fat Tree. Frank Olaf Sem-Jacobsen Simula Research Laboratory A Cost and Scalability Comparison of the Dragonfly versus the Fat Tree Frank Olaf Sem-Jacobsen frankose@simula.no Simula Research Laboratory HPC Advisory Council Workshop Barcelona, Spain, September 12,

More information

Overview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers

Overview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers Overview 15-441 Computer Networking TCP & router queuing Lecture 10 TCP & Routers TCP details Workloads Lecture 10: 09-30-2002 2 TCP Performance TCP Performance Can TCP saturate a link? Congestion control

More information

Future Routing Schemes in Petascale clusters

Future Routing Schemes in Petascale clusters Future Routing Schemes in Petascale clusters Gilad Shainer, Mellanox, USA Ola Torudbakken, Sun Microsystems, Norway Richard Graham, Oak Ridge National Laboratory, USA Birds of a Feather Presentation Abstract

More information

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Dr. Vinod Vokkarane Assistant Professor, Computer and Information Science Co-Director, Advanced Computer Networks Lab University

More information

This Lecture. BUS Computer Facilities Network Management. Switching Network. Simple Switching Network

This Lecture. BUS Computer Facilities Network Management. Switching Network. Simple Switching Network This Lecture BUS0 - Computer Facilities Network Management Switching networks Circuit switching Packet switching gram approach Virtual circuit approach Routing in switching networks Faculty of Information

More information

Introduction. Introduction. Router Architectures. Introduction. Recent advances in routing architecture including

Introduction. Introduction. Router Architectures. Introduction. Recent advances in routing architecture including Router Architectures By the end of this lecture, you should be able to. Explain the different generations of router architectures Describe the route lookup process Explain the operation of PATRICIA algorithm

More information

Revisiting Network Support for RDMA

Revisiting Network Support for RDMA Revisiting Network Support for RDMA Radhika Mittal 1, Alex Shpiner 3, Aurojit Panda 1, Eitan Zahavi 3, Arvind Krishnamurthy 2, Sylvia Ratnasamy 1, Scott Shenker 1 (1: UC Berkeley, 2: Univ. of Washington,

More information

Mobile Transport Layer

Mobile Transport Layer Mobile Transport Layer 1 Transport Layer HTTP (used by web services) typically uses TCP Reliable transport between TCP client and server required - Stream oriented, not transaction oriented - Network friendly:

More information

Computer Networking. Queue Management and Quality of Service (QOS)

Computer Networking. Queue Management and Quality of Service (QOS) Computer Networking Queue Management and Quality of Service (QOS) Outline Previously:TCP flow control Congestion sources and collapse Congestion control basics - Routers 2 Internet Pipes? How should you

More information

Congestion Control for High Bandwidth-delay Product Networks. Dina Katabi, Mark Handley, Charlie Rohrs

Congestion Control for High Bandwidth-delay Product Networks. Dina Katabi, Mark Handley, Charlie Rohrs Congestion Control for High Bandwidth-delay Product Networks Dina Katabi, Mark Handley, Charlie Rohrs Outline Introduction What s wrong with TCP? Idea of Efficiency vs. Fairness XCP, what is it? Is it

More information

Congestion Control. Daniel Zappala. CS 460 Computer Networking Brigham Young University

Congestion Control. Daniel Zappala. CS 460 Computer Networking Brigham Young University Congestion Control Daniel Zappala CS 460 Computer Networking Brigham Young University 2/25 Congestion Control how do you send as fast as possible, without overwhelming the network? challenges the fastest

More information

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including Introduction Router Architectures Recent advances in routing architecture including specialized hardware switching fabrics efficient and faster lookup algorithms have created routers that are capable of

More information

Interconnection Networks: Topology. Prof. Natalie Enright Jerger

Interconnection Networks: Topology. Prof. Natalie Enright Jerger Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design

More information

ADVANCED COMPUTER NETWORKS

ADVANCED COMPUTER NETWORKS ADVANCED COMPUTER NETWORKS Congestion Control and Avoidance 1 Lecture-6 Instructor : Mazhar Hussain CONGESTION CONTROL When one part of the subnet (e.g. one or more routers in an area) becomes overloaded,

More information

Chapter 6: Congestion Control and Resource Allocation

Chapter 6: Congestion Control and Resource Allocation Chapter 6: Congestion Control and Resource Allocation CS/ECPE 5516: Comm. Network Prof. Abrams Spring 2000 1 Section 6.1: Resource Allocation Issues 2 How to prevent traffic jams Traffic lights on freeway

More information

Address InterLeaving for Low- Cost NoCs

Address InterLeaving for Low- Cost NoCs Address InterLeaving for Low- Cost NoCs Miltos D. Grammatikakis, Kyprianos Papadimitriou, Polydoros Petrakis, Marcello Coppola, and Michael Soulie Technological Educational Institute of Crete, GR STMicroelectronics,

More information

CHAPTER 3 EFFECTIVE ADMISSION CONTROL MECHANISM IN WIRELESS MESH NETWORKS

CHAPTER 3 EFFECTIVE ADMISSION CONTROL MECHANISM IN WIRELESS MESH NETWORKS 28 CHAPTER 3 EFFECTIVE ADMISSION CONTROL MECHANISM IN WIRELESS MESH NETWORKS Introduction Measurement-based scheme, that constantly monitors the network, will incorporate the current network state in the

More information

Dynamic Scheduling Algorithm for input-queued crossbar switches

Dynamic Scheduling Algorithm for input-queued crossbar switches Dynamic Scheduling Algorithm for input-queued crossbar switches Mihir V. Shah, Mehul C. Patel, Dinesh J. Sharma, Ajay I. Trivedi Abstract Crossbars are main components of communication switches used to

More information

In-Order Packet Delivery in Interconnection Networks using Adaptive Routing

In-Order Packet Delivery in Interconnection Networks using Adaptive Routing In-Order Packet Delivery in Interconnection Networks using Adaptive Routing J.C. Martínez, J. Flich, A. Robles, P. López, and J. Duato Dept. of Computer Engineering Universidad Politécnica de Valencia

More information

DUE to the increasing computing power of microprocessors

DUE to the increasing computing power of microprocessors IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 13, NO. 7, JULY 2002 693 Boosting the Performance of Myrinet Networks José Flich, Member, IEEE, Pedro López, M.P. Malumbres, Member, IEEE, and

More information

Basic Switch Organization

Basic Switch Organization NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization

More information

Congestion. Can t sustain input rate > output rate Issues: - Avoid congestion - Control congestion - Prioritize who gets limited resources

Congestion. Can t sustain input rate > output rate Issues: - Avoid congestion - Control congestion - Prioritize who gets limited resources Congestion Source 1 Source 2 10-Mbps Ethernet 100-Mbps FDDI Router 1.5-Mbps T1 link Destination Can t sustain input rate > output rate Issues: - Avoid congestion - Control congestion - Prioritize who gets

More information

Deadlock-free Routing in InfiniBand TM through Destination Renaming Λ

Deadlock-free Routing in InfiniBand TM through Destination Renaming Λ Deadlock-free Routing in InfiniBand TM through Destination Renaming Λ P. López, J. Flich and J. Duato Dept. of Computing Engineering (DISCA) Universidad Politécnica de Valencia, Valencia, Spain plopez@gap.upv.es

More information

Congestion Control and Resource Allocation

Congestion Control and Resource Allocation Congestion Control and Resource Allocation Lecture material taken from Computer Networks A Systems Approach, Third Edition,Peterson and Davie, Morgan Kaufmann, 2007. Advanced Computer Networks Congestion

More information

Unit 2 Packet Switching Networks - II

Unit 2 Packet Switching Networks - II Unit 2 Packet Switching Networks - II Dijkstra Algorithm: Finding shortest path Algorithm for finding shortest paths N: set of nodes for which shortest path already found Initialization: (Start with source

More information

Chapter II. Protocols for High Speed Networks. 2.1 Need for alternative Protocols

Chapter II. Protocols for High Speed Networks. 2.1 Need for alternative Protocols Chapter II Protocols for High Speed Networks 2.1 Need for alternative Protocols As the conventional TCP suffers from poor performance on high bandwidth delay product links [47] meant for supporting transmission

More information

Routing and Fault-Tolerance Capabilities of the Fabriscale FM compared to OpenSM

Routing and Fault-Tolerance Capabilities of the Fabriscale FM compared to OpenSM Routing and Fault-Tolerance Capabilities of the Fabriscale FM compared to OpenSM Jesus Camacho Villanueva, Tor Skeie, and Sven-Arne Reinemo Fabriscale Technologies E-mail: {jesus.camacho,tor.skeie,sven-arne.reinemo}@fabriscale.com

More information

IEEE TRANSACTIONS ON COMPUTERS, VOL. 52, NO. 7, JULY Applying In-Transit Buffers to Boost the Performance of Networks with Source Routing

IEEE TRANSACTIONS ON COMPUTERS, VOL. 52, NO. 7, JULY Applying In-Transit Buffers to Boost the Performance of Networks with Source Routing IEEE TRANSACTIONS ON COMPUTERS, VOL. 52, NO. 7, JULY 2003 1 Applying In-Transit Buffers to Boost the Performance of Networks with Source Routing José Flich, Member, IEEE, Pedro López, Member, IEEE Computer

More information

Unicast Routing in Mobile Ad Hoc Networks. Dr. Ashikur Rahman CSE 6811: Wireless Ad hoc Networks

Unicast Routing in Mobile Ad Hoc Networks. Dr. Ashikur Rahman CSE 6811: Wireless Ad hoc Networks Unicast Routing in Mobile Ad Hoc Networks 1 Routing problem 2 Responsibility of a routing protocol Determining an optimal way to find optimal routes Determining a feasible path to a destination based on

More information

Bandwidth Allocation & TCP

Bandwidth Allocation & TCP Bandwidth Allocation & TCP The Transport Layer Focus Application Presentation How do we share bandwidth? Session Topics Transport Network Congestion control & fairness Data Link TCP Additive Increase/Multiplicative

More information

Application of SDN: Load Balancing & Traffic Engineering

Application of SDN: Load Balancing & Traffic Engineering Application of SDN: Load Balancing & Traffic Engineering Outline 1 OpenFlow-Based Server Load Balancing Gone Wild Introduction OpenFlow Solution Partitioning the Client Traffic Transitioning With Connection

More information

Frame Relay. Frame Relay: characteristics

Frame Relay. Frame Relay: characteristics Frame Relay Andrea Bianco Telecommunication Network Group firstname.lastname@polito.it http://www.telematica.polito.it/ Network management and QoS provisioning - 1 Frame Relay: characteristics Packet switching

More information

TCP and BBR. Geoff Huston APNIC

TCP and BBR. Geoff Huston APNIC TCP and BBR Geoff Huston APNIC Computer Networking is all about moving data The way in which data movement is controlled is a key characteristic of the network architecture The Internet protocol passed

More information

Network Control and Signalling

Network Control and Signalling Network Control and Signalling 1. Introduction 2. Fundamentals and design principles 3. Network architecture and topology 4. Network control and signalling 5. Network components 5.1 links 5.2 switches

More information

Ethernet Hub. Campus Network Design. Hubs. Sending and receiving Ethernet frames via a hub

Ethernet Hub. Campus Network Design. Hubs. Sending and receiving Ethernet frames via a hub Campus Network Design Thana Hongsuwan Ethernet Hub 2003, Cisco Systems, Inc. All rights reserved. 1-1 2003, Cisco Systems, Inc. All rights reserved. BCMSN v2.0 1-2 Sending and receiving Ethernet frames

More information

A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes

A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes N.A. Nordbotten 1, M.E. Gómez 2, J. Flich 2, P.López 2, A. Robles 2, T. Skeie 1, O. Lysne 1, and J. Duato 2 1 Simula Research

More information

Computer Networks. Sándor Laki ELTE-Ericsson Communication Networks Laboratory

Computer Networks. Sándor Laki ELTE-Ericsson Communication Networks Laboratory Computer Networks Sándor Laki ELTE-Ericsson Communication Networks Laboratory ELTE FI Department Of Information Systems lakis@elte.hu http://lakis.web.elte.hu Based on the slides of Laurent Vanbever. Further

More information

Chapter 7 CONCLUSION

Chapter 7 CONCLUSION 97 Chapter 7 CONCLUSION 7.1. Introduction A Mobile Ad-hoc Network (MANET) could be considered as network of mobile nodes which communicate with each other without any fixed infrastructure. The nodes in

More information

P802.1Qcz Congestion Isolation

P802.1Qcz Congestion Isolation P802.1Qcz Congestion Isolation IEEE 802 / IETF Workshop on Data Center Networking Bangkok November 2018 Paul Congdon (Huawei/Tallac) The Case for Low-latency, Lossless, Large-Scale DCNs More and more latency-sensitive

More information

PLEASE READ CAREFULLY BEFORE YOU START

PLEASE READ CAREFULLY BEFORE YOU START MIDTERM EXAMINATION #2 NETWORKING CONCEPTS 03-60-367-01 U N I V E R S I T Y O F W I N D S O R - S c h o o l o f C o m p u t e r S c i e n c e Fall 2011 Question Paper NOTE: Students may take this question

More information

Lecture 21. Reminders: Homework 6 due today, Programming Project 4 due on Thursday Questions? Current event: BGP router glitch on Nov.

Lecture 21. Reminders: Homework 6 due today, Programming Project 4 due on Thursday Questions? Current event: BGP router glitch on Nov. Lecture 21 Reminders: Homework 6 due today, Programming Project 4 due on Thursday Questions? Current event: BGP router glitch on Nov. 7 http://money.cnn.com/2011/11/07/technology/juniper_internet_outage/

More information

End-to-End Adaptive Packet Aggregation for High-Throughput I/O Bus Network Using Ethernet

End-to-End Adaptive Packet Aggregation for High-Throughput I/O Bus Network Using Ethernet Hot Interconnects 2014 End-to-End Adaptive Packet Aggregation for High-Throughput I/O Bus Network Using Ethernet Green Platform Research Laboratories, NEC, Japan J. Suzuki, Y. Hayashi, M. Kan, S. Miyakawa,

More information

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching.

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching. Switching/Flow Control Overview Interconnection Networks: Flow Control and Microarchitecture Topology: determines connectivity of network Routing: determines paths through network Flow Control: determine

More information

Transmission Control Protocol. ITS 413 Internet Technologies and Applications

Transmission Control Protocol. ITS 413 Internet Technologies and Applications Transmission Control Protocol ITS 413 Internet Technologies and Applications Contents Overview of TCP (Review) TCP and Congestion Control The Causes of Congestion Approaches to Congestion Control TCP Congestion

More information

CONNECTION-BASED ADAPTIVE ROUTING USING DYNAMIC VIRTUAL CIRCUITS

CONNECTION-BASED ADAPTIVE ROUTING USING DYNAMIC VIRTUAL CIRCUITS Proceedings of the International Conference on Parallel and Distributed Computing and Systems, Las Vegas, Nevada, pp. 379-384, October 1998. CONNECTION-BASED ADAPTIVE ROUTING USING DYNAMIC VIRTUAL CIRCUITS

More information