Tagger: Practical PFC Deadlock Prevention in Data Center Networks
1 Tagger: Practical PFC Deadlock Prevention in Data Center Networks
Shuihai Hu* (HKUST), Yibo Zhu, Peng Cheng, Chuanxiong Guo* (Toutiao), Kun Tan* (Huawei), Jitendra Padhye, Kai Chen (HKUST)
Microsoft
CoNEXT 2017, Incheon, South Korea
* Work done while at Microsoft

2 RDMA is Being Widely Deployed
RDMA: Remote Direct Memory Access
- High throughput, low latency with low CPU overhead
- Microsoft, Google, etc. are deploying RDMA
[Figure: two hosts, each running an RDMA application that bypasses the kernel to reach its RDMA NIC; the NICs are connected by a lossless network (with PFC).]

3 Priority Flow Control (PFC)
- PAUSE upstream switch when PFC threshold reached
- Avoid packet drop due to buffer overflow
[Figure: a congested switch with PFC threshold of 3 pkts sends PAUSE to its upstream switch.]

4 A Simple Illustration of PFC Deadlock
- Due to Cyclic Buffer Dependency (CBD): A->B->C->A
- Not just a theoretical problem, we have seen it in our datacenters too!
[Figure: switches A, B and C, each at its PFC threshold, exchanging PAUSE frames in a cycle.]

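The cyclic buffer dependency A->B->C->A on this slide can be checked mechanically. The following is a minimal sketch, assuming the buffer dependency graph is supplied as a dictionary that maps each switch to the switches whose buffers it waits on. The function name `has_cbd` and the graph encoding are illustrative assumptions, not something from the talk.

```python
def has_cbd(deps):
    """Return True if the buffer dependency graph `deps` contains a cycle.

    `deps` maps a switch name to the switches whose buffers it depends on
    (hypothetical encoding chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:                  # back edge: a cycle exists
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(deps))


# The three-switch example from the slide, and the same chain without the
# closing edge C->A.
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
acyclic = {"A": ["B"], "B": ["C"], "C": []}
```

Here `has_cbd(cyclic)` reports a cycle and `has_cbd(acyclic)` does not, because removing a single dependency edge is enough to break the CBD.
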
5 CBD in the Clos Network

6 CBD in the Clos Network
Consider two flows (flow 1 and flow 2) that initially follow shortest UP-DOWN paths.

7 CBD in the Clos Network
Due to link failures, both flows are locally rerouted to non-shortest paths.

8 CBD in the Clos Network
These two DOWN-UP bounced flows create CBD.
[Figure: buffer dependency graph over the switches' RX queues for flow 1 and flow 2, showing the CBD cycle.]

9 Real in Production Data Centers?
Packet reroute measurements in more than 20 data centers: ~100,000 DOWN-UP reroutes!

10 Handling Deadlock is Important
#1: a transient problem -> PERMANENT deadlock
- Transient loops due to link failures
- Packet flooding
#2: a small deadlock can cause a large deadlock
[Figure: PAUSE frames spreading outward from a deadlock.]

11 Three Key Challenges
What are the challenges in designing a practical deadlock prevention solution?
- No change to existing routing protocols or hardware
- Link failures & routing errors are unavoidable at scale
- Switches support at most 8 lossless priorities (and typically only two can be used)

12 The Existing Deadlock Prevention Solutions
#1: deadlock-free routing protocols
- not supported by commodity switches (fail challenge #1)
- do not work with link failures or routing errors (fail challenge #2)
#2: buffer management schemes
- require a lot of lossless priorities (fail challenge #3)
Our answer: Tagger

13 TAGGER DESIGN

14 Important Observation
- Fat-tree [Sigcomm 08], VL2 [Sigcomm 09]: desired path set is all shortest paths
- BCube [Sigcomm 09], HyperX [SC 09]: desired path set is dimension-order paths
Takeaway: In a data center, we can ask the operator to supply a set of expected lossless paths (ELP)!

15 Basic Idea of Tagger
1. Ask operators to provide the topology & expected lossless paths (ELP)
2. Packets carry tags when in the network
3. Pre-install match-action rules at switches for tag manipulation and packet queueing
- packets that travel over ELP: lossless queues & CBD never forms
- packets that deviate from ELP: lossy queue, thus PFC is not triggered

16 Illustrating Tagger for Clos Topology
Root cause of CBD: packets deviate from UP-DOWN routing!
ELP = all shortest paths (CBD-free)
[Figure: flow 1 and flow 2 in the Clos topology.]

17 Illustrating Tagger for Clos Topology
- Under Tagger, packets carry tags when travelling in the network
- Initially, tag value = NoBounce
- At switches, Tagger pre-installs match-action rules for tag manipulation
[Figure: flow 1 with tag = NoBounce; the match-action rules installed at switches match on Tag, InPort and OutPort and set NewTag, here rewriting NoBounce to Bounced.]

18 Illustrating Tagger for Clos Topology
Packet received by switch with tag = NoBounce.
[Figure: flow 1 and the same match-action rules as on the previous slide.]

19 Illustrating Tagger for Clos Topology
DOWN-UP bounce observed! The tag is rewritten from NoBounce to Bounced once a DOWN-UP bounce is detected.
[Figure: flow 1 and the same match-action rules as on the previous slide.]

20 Illustrating Tagger for Clos Topology
tag = Bounced
- The next switch knows flow 1's packet is a bounced packet that deviates from ELP -> placed in the lossy queue
- No PFC PAUSE is sent back to the previous switch -> the buffer dependency between the two switches is removed

21 Illustrating Tagger for Clos Topology
- Tagger will do the same for packets of flow 2
- 2 buffer dependency edges are removed -> CBD is eliminated
[Figure: buffer dependency graph over the switches' RX queues with the CBD cycle broken.]

22 What If ELP Has CBD?
ELP = shortest paths + 1-bounce paths (ELP has CBD now!)

23 Segmenting ELP into CBD-free Subsets
Two bounced paths (flow 1 and flow 2) are in ELP now.
- path segments before bounce (only have UP-DOWN paths, no CBD)
- path segments after bounce (only have UP-DOWN paths, no CBD)

24 Isolating Path Segments with Tags
- tag 1 -> path segments before bounce
- tag 2 -> path segments after bounce

25 Isolating Path Segments with Tags
flow 1: tag = 1 before the bounce, tag = 2 after the bounce
Adding a rule at switch : (Tag = 1, InPort = , OutPort = ) -> NewTag = 2

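The rule on this slide can be read as one entry in a lookup table keyed by (Tag, InPort, OutPort). Below is a minimal sketch of such a table, assuming a plain dictionary stands in for the switch's match-action rules. The port names `in_a` and `out_b` are hypothetical placeholders (the slide's actual port labels are not available), and leaving the tag unchanged when no rule matches is a simplification made for this sketch.

```python
from typing import Dict, Tuple

# Match key: (tag, in_port, out_port). Action: the new tag to write.
RuleTable = Dict[Tuple[int, str, str], int]


def rewrite_tag(rules: RuleTable, tag: int, in_port: str, out_port: str) -> int:
    """Return the tag a packet carries after this switch.

    If a rule matches (tag, in_port, out_port) the tag is rewritten;
    otherwise it is left unchanged (an assumption of this sketch).
    """
    return rules.get((tag, in_port, out_port), tag)


# One rule of the form shown on the slide:
# (Tag = 1, InPort = in_a, OutPort = out_b) -> NewTag = 2
rules: RuleTable = {(1, "in_a", "out_b"): 2}

bounced = rewrite_tag(rules, 1, "in_a", "out_b")    # rule matches
straight = rewrite_tag(rules, 1, "out_b", "in_a")   # no rule for this pair
```

Keying on the port pair rather than on the flow means the table size depends on the number of ports and tags at a switch, not on the number of flows crossing it.
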
26 No CBD after Segmentation
packets with tag i -> i-th lossless queue
[Figure: buffer dependency graph with the segments of flow 1 and flow 2 separated into tag 1 and tag 2; no CBD remains.]

27 What If k-bounce Paths all in ELP?
ELP = shortest up-down paths + 1-bounce paths + ... + k-bounce paths
Solution: just segment ELP into k CBD-free subsets based on the number of bounces!

28 Summary: Tagger Design for Clos Topology
1. Initially, packets carry tag = 1
2. Pre-install match-action rules at switches:
- DOWN-UP bounce: increase tag by 1
- Enqueue packets with tag i to the i-th lossless queue (i <= k+1)
- Enqueue packets with tag i to the lossy queue (i > k+1)
For Clos topology, Tagger is optimal in terms of # of lossless priorities.

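The summary rules above can be traced on a single path. This is a minimal sketch under stated assumptions: each switch hop is described by the direction the packet arrived in and the direction it leaves in, a hop that arrives going "down" and leaves going "up" counts as a DOWN-UP bounce, and the queue is chosen from the tag after the rewrite. The function names and the queue labels are illustrative, and the sketch does not model the separate pipeline steps of a real switch.

```python
LOSSY = "lossy"


def next_tag(tag, in_dir, out_dir):
    """Increase the tag by 1 on a DOWN-UP bounce, otherwise keep it."""
    return tag + 1 if (in_dir == "down" and out_dir == "up") else tag


def queue_for(tag, k):
    """Tag i maps to the i-th lossless queue while i <= k+1, else lossy."""
    return f"lossless-{tag}" if tag <= k + 1 else LOSSY


def walk(hops, k):
    """Trace a packet over `hops`, a list of (in_dir, out_dir) per switch.

    Returns the queue chosen at each switch; the packet starts with tag = 1.
    """
    tag = 1
    queues = []
    for in_dir, out_dir in hops:
        tag = next_tag(tag, in_dir, out_dir)
        queues.append(queue_for(tag, k))
    return queues


# A path that goes up, comes down, bounces back up once, then comes down.
one_bounce_path = [("up", "up"), ("up", "down"), ("down", "up"),
                   ("up", "down"), ("down", "down")]
```

With `k = 0` (only shortest paths are expected to be lossless) the packet falls to the lossy queue from the bounce onwards; with `k = 1` the same path stays lossless but moves to the second lossless queue after the bounce.
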
29 How to Implement Tagger?
- DSCP field in the IP header as the tag carried in the packets
- Build a 3-step match-action pipeline with basic ACL rules available in commodity switches

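Carrying the tag in DSCP means writing it into the upper six bits of the second byte of the IPv4 header (the former ToS byte), whose lower two bits are used for ECN. The following is a minimal sketch of that bit layout only. The helper names `set_dscp` and `get_dscp` are illustrative, and which DSCP values Tagger actually assigns to which tags is not stated on the slide.

```python
def set_dscp(tos_byte, dscp):
    """Write a 6-bit DSCP value into the ToS byte, preserving the ECN bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field (0..63)")
    return ((dscp << 2) | (tos_byte & 0x03)) & 0xFF


def get_dscp(tos_byte):
    """Read the 6-bit DSCP value from the ToS byte."""
    return (tos_byte & 0xFF) >> 2


# Example: tag value 2 written into a byte whose ECN bits are 0b01.
tagged = set_dscp(0b00000001, 2)
```

Because the ECN bits are masked and preserved, rewriting the tag at a switch does not disturb ECN-based congestion signalling carried in the same byte.
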
30 Tagger Meets All the Three Challenges
1. Works with existing routing protocols & hardware
2. Works with link failures & routing errors
3. Works with a limited number of lossless queues

31 More Details in the Paper
- Proof of deadlock freedom
- Analysis & discussions: algorithm complexity, optimality, compression of match-action rules

32 Evaluation-1: Tagger prevents Deadlock
Scenario: two flows form a CBD.
Tagger avoids the CBD caused by bounced flows, and prevents deadlock!
[Figure: flow 1 and flow 2, with the deadlock marked.]

33 Evaluation-2: Scalability of Tagger
[Table: match-action rules and priorities required for Jellyfish topology. * The last entry includes an additional 20,000 random paths.]
Tagger is scalable in terms of number of lossless priorities and ACL rules.

34 Evaluation-3: Overhead of Tagger
Tagger rules have no impact on throughput and latency.

35 Conclusion
Tagger: a tagging system that guarantees deadlock freedom
Practical:
- requires no change to existing routing protocols
- implementable with existing commodity switching ASICs
- works with a limited number of lossless priorities
General:
- works with any topology
- works with any ELP

36 Thanks!