Reliable Communication for Datacenters

Size: px
Start display at page:

Download "Reliable Communication for Datacenters"

Transcription

1 Cornell University

2 Datacenters Internet Services (90s) Websites, Search, Online Stores Since then: # of low-end volume servers Millions Installed Server Base 00-05: Commodity up by 100% High/Mid down by 40% Today: Datacenters are ubiquitous How have they evolved? Data partially sourced from IDC press releases (

3 Networks of Datacenters Why? Business Continuity, Client Locality, Distributed Datasets or Operations... Any modern enterprise! RTT: 110 ms 100 ms 200 ms 220 ms 100 ms 110 ms 210 ms N W E S

4 Networks of Real-Time Datacenters Finance, Aerospace, Military, Search and Rescue documents, chat, , games, videos, photos, blogs, social networks The Datacenter is the Computer! Not hard real-time: real fast, highly responsive, time-critical

5 Networks of Real-Time Datacenters Finance, Aerospace, Military, Search and Rescue documents, chat, , games, videos, photos, blogs, social networks The Datacenter is the Computer! Not hard real-time: real fast, highly responsive, time-critical Gartner Survey: Real-Time Infrastructure (RTI): reaction time in secs/mins 73%: RTI is important or very important 85%: Have no RTI capability

6 The Real-Time Datacenter Systems Challenges How do we recover from failures within seconds? Real-World, Real-Time

7 The Real-Time Datacenter Systems Challenges How do we recover from failures within seconds? Software Stack Crashes Overloads Bugs Exploits Disk Failure, Disasters Packet Loss Compilers, Databases, Distributed Systems, Machine Learning, Filesystems, Operating Systems, Networking... Real-World, Real-Time

8 The Real-Time Datacenter Systems Challenges How do we recover from failures within seconds? Software Stack Crashes Overloads Bugs Exploits Disk Failure, Disasters Tempest KyotoFS Plato SMFS Packet Loss Ricochet Maelstrom Compilers, Databases, Distributed Systems, Machine Learning, Filesystems, Operating Systems, Networking... Real-World, Real-Time

9 Reliable Communication Goal: Recover lost packets fast! Existing protocols react to loss: too much, too late We want proactive recovery: stable overhead, low latencies Maelstrom: Reliability between datacenters [NSDI 2008] Ricochet: Reliability within datacenters [NSDI 2007]

10 Reliable Communication between Datacenters TCP fails in three ways: 1. Throughput Collapse 100ms RTT, 0.1% Loss, 40 Gbps Tput < 10 Mbps! 2. Massive Buffers required for High-Rate Traffic 3. Recovery Delays for Time-Critical Traffic

11 Reliable Communication between Datacenters TCP fails in three ways: 1. Throughput Collapse 100ms RTT, 0.1% Loss, 40 Gbps Tput < 10 Mbps! 2. Massive Buffers required for High-Rate Traffic 3. Recovery Delays for Time-Critical Traffic Current Solutions: Rewrite Apps: One Flow Multiple Split Flows Resize Buffers Spend (infinite) money!

12 TeraGrid: Supercomputer Network

13 TeraGrid: Supercomputer Network End-to-End UDP Probes: Zero Congestion, Non-Zero Loss! Possible Reasons: transient congestion degraded fiber malfunctioning HW misconfigured HW switching contention low receiver power end-host overflow... % of Measurements % of Lost Packets

14 TeraGrid: Supercomputer Network End-to-End UDP Probes: Zero Congestion, Non-Zero Loss! Possible Reasons: transient congestion degraded fiber malfunctioning HW misconfigured HW switching contention low receiver power end-host overflow... Electronics: Cluttered Pathways % of Measurements % of Lost Packets Optics: Lossy Fiber

15 Problem Statement Run unmodified TCP/IP over lossy high-speed long-distance networks

16 The Maelstrom Network Appliance Router Router Sending End-hosts Commodity TCP Packet Loss Receiving End-hosts Commodity TCP

17 The Maelstrom Network Appliance Maelstrom Send-Side Appliance Maelstrom Receive-Side Appliance Router Router Sending End-hosts Commodity TCP Packet Loss Receiving End-hosts Commodity TCP Transparent: No modification to end-host or network

18 The Maelstrom Network Appliance Maelstrom Send-Side Appliance Maelstrom Receive-Side Appliance Encode FEC Decode Router Router Sending End-hosts Commodity TCP Packet Loss Receiving End-hosts Commodity TCP Transparent: No modification to end-host or network FEC = Forward Error Correction

19 What is FEC? Maelstrom Ricochet Conclusion A B C D E X X X C D E A B 3 repair packets from every 5 data packets Receiver can recover from any 3 lost packets Rate : (r, c) c repair packets for every r data packets. Pro: Recovery Latency independent of RTT Constant Data Overhead: c r+c Packet-level FEC at End-hosts: Inexpensive, No extra HW

20 What is FEC? Maelstrom Ricochet Conclusion A B C D E X X X C D E A B 3 repair packets from every 5 data packets Receiver can recover from any 3 lost packets Rate : (r, c) c repair packets for every r data packets. Pro: Recovery Latency independent of RTT Constant Data Overhead: c r+c Packet-level FEC at End-hosts: Inexpensive, No extra HW Con: Recovery Latency dependent on channel data rate

21 What is FEC? Maelstrom Ricochet Conclusion A B C D E X X X C D E A B 3 repair packets from every 5 data packets Receiver can recover from any 3 lost packets Rate : (r, c) c repair packets for every r data packets. Pro: Recovery Latency independent of RTT Constant Data Overhead: c r+c Packet-level FEC at End-hosts: Inexpensive, No extra HW Con: Recovery Latency dependent on channel data rate FEC in the Network: Where and What?

22 The Maelstrom Network Appliance Maelstrom Send-Side Appliance Maelstrom Receive-Side Appliance Encode FEC Decode Router Router Sending End-hosts Commodity TCP Packet Loss Receiving End-hosts Commodity TCP Transparent: No modification to end-host or network FEC = Forward Error Correction Where: at the appliance, What: aggregated data

23 Maelstrom Mechanism Send-Side Appliance: Snoop IP packets Create repair packet = XOR + recipe of data packet IDs Lambda Jumbo MTU LAN MTU Recipe List : 25,26,27,28,29 XOR Recovered Packet Appliance Appliance 27 X LOSS

24 Maelstrom Mechanism Send-Side Appliance: Snoop IP packets Create repair packet = XOR + recipe of data packet IDs Receive-Side Appliance: Lost packet recovered using XOR and other data packets At receiver end-host: out of order, no loss Lambda Jumbo MTU LAN MTU Recipe List : 25,26,27,28,29 XOR Recovered Packet Appliance Appliance 27 X LOSS

25 Layered Interleaving for Bursty Loss Recovery Latency Actual Burst Size, not Max Burst Size Data Stream XORs: X3 X2 X1 XORs at different interleaves Recovery latency degrades gracefully with loss burstiness: X1 catches random singleton losses X2 catches loss bursts of 10 or less X3 catches bursts of 100 or less 2

26 Layered Interleaving for Bursty Loss Recovery Latency Actual Burst Size, not Max Burst Size Data Stream XORs: X3 X2 X1 XORs at different interleaves Recovery latency degrades gracefully with loss burstiness: X1 catches random singleton losses X2 catches loss bursts of 10 or less X3 catches bursts of 100 or less Recovery probability Reed Solomon Maelstrom Loss probability Comparison of Recovery Probability: r=7, c=2 2

27 Maelstrom Modes Maelstrom Ricochet Conclusion TCP Traffic: Two Flow Control Modes End-Host Appliance Appliance End-Host A) End-to-End Flow Control End-Host Appliance Appliance End-Host B) Split Flow Control Split Mode avoids client buffer resizing (PeP)

28 Implementation Details In Kernel Linux Module Commodity Box: 3 Ghz, 1 Gbps NIC ( 800$) Max speed: 1 Gbps, Memory Footprint: 10 MB 50-60% CPU NIC is the bottleneck (for c = 3) How do we efficiently store/access/clean a gigabit of data every second? Scaling to Multi-Gigabit: Partition IP space across proxies

29 Evaluation: FEC Mode and Loss Claim: Maelstrom effectively hides loss from TCP/IP Throughput (Mbps) TCP/IP without loss TCP (No loss) TCP (0.1% loss) TCP (1% loss) One Way Link Latency (ms) TCP/IP with loss

30 Evaluation: FEC Mode and Loss Claim: Maelstrom effectively hides loss from TCP/IP Throughput (Mbps) TCP/IP without loss TCP (No loss) TCP (0.1% loss) TCP (1% loss) Maelstrom (No loss) Maelstrom (0.1%) Maelstrom (1%) One Way Link Latency (ms) TCP/IP with loss

31 Evaluation: FEC Mode and Loss Claim: Maelstrom effectively hides loss from TCP/IP Data + FEC = 1 Gbps Throughput (Mbps) TCP/IP without loss TCP (No loss) { TCP (0.1% loss) TCP (1% loss) Maelstrom (No loss) 600 Maelstrom (0.1%) 500 Maelstrom (1%) One Way Link Latency (ms) TCP/IP with loss

32 Evaluation: FEC Mode and Loss Claim: Maelstrom effectively hides loss from TCP/IP Data + FEC = 1 Gbps Throughput (Mbps) TCP/IP without loss TCP (No loss) { TCP (0.1% loss) TCP (1% loss) Maelstrom (No loss) 600 Maelstrom (0.1%) 500 Maelstrom (1%) Tput RTT One Way Link Latency (ms) TCP/IP with loss

33 Evaluation: Split Mode and Buffering Claim: Maelstrom eliminates the need for large end-host buffers Throughput (Mbps) Throughput as a function of latency loss-rate= One-way link latency (ms) Tput independent of RTT!

34 Evaluation: Delivery Latency Claim: Maelstrom eliminates TCP/IP s loss-related jitter Delivery Latency (ms) TCP/IP: 0.1% Loss Delivery Latency (ms) Maelstrom: 0.1% Loss Packet # Packet # Sources of Jitter: Receive-side buffering due to sequencing Send-side buffering due to congestion control

35 Evaluation: Layered Interleaving Claim: Recovery Latency depends on Actual Burst Length Burst Length = % Recovered Recovery Latency (ms) % Recovered Recovery Latency (ms) % Recovered Recovery Latency (ms) Longer Burst Lengths Longer Recovery Latency

36 Next Step: SMFS - The Smoke and Mirrors Filesystem Classic Mirroring Trade-off: Fast return to user after sending to mirror Safe return to user after ACK from mirror SMFS return to user after sending enough FEC Maelstrom: Lossy Network Lossless Network Disk! Result: Fast, Safe Mirroring independent of link length! General Principle: Gray-box Exposure of Protocol State

37 The Big Picture Maelstrom Ricochet Conclusion Software Stack Crashes Overloads Bugs Exploits Disk Failure, Disasters Tempest KyotoFS Plato SMFS Packet Loss Ricochet Maelstrom Compilers, Databases, Distributed Systems, Machine Learning, Filesystems, Operating Systems, Networking... Real-World, Real-Time

38 Motivation Design and Implementation Evaluation From Long-Haul to Multicast Between Datacenters High RTT Data acks Sender Receiver Feedback Loop Infeasible: Inter-Datacenter Long-Haul: RTT too high

39 Motivation Design and Implementation Evaluation From Long-Haul to Multicast Within Datacenter Between Datacenters Many Receivers acks Data Data acks Data Sender High RTT Data acks Receiver Multicast Group Feedback Loop Infeasible: Inter-Datacenter Long-Haul: RTT too high Intra-Cluster Multicast: Too many receivers

40 Motivation Design and Implementation Evaluation How is Multicast Used? service replication/partitioning, publish-subscribe, data caching... Financial Pub-Sub Example: Each equity is mapped to a multicast group Each node is interested in a different set of equities... Tracking Portfolio Tracking S&P 500

41 Motivation Design and Implementation Evaluation How is Multicast Used? service replication/partitioning, publish-subscribe, data caching... Financial Pub-Sub Example: Each equity is mapped to a multicast group Each node is interested in a different set of equities... Tracking Portfolio Tracking S&P 500 Each node in many groups = Low per-group data rate

42 Motivation Design and Implementation Evaluation How is Multicast Used? service replication/partitioning, publish-subscribe, data caching... Financial Pub-Sub Example: Each equity is mapped to a multicast group Each node is interested in a different set of equities... Tracking Portfolio Tracking S&P 500 Each node in many groups = Low per-group data rate High per-node data rate = Overload

43 Motivation Design and Implementation Evaluation Where does loss occur in a Datacenter? Packet Loss occurs at end-hosts: independent and bursty Overloaded Node burst length (packets) receiver r 1 : loss bursts in A and B receiver r 1 : data bursts in A and B Data Rate Loss Bursts 180 Less Loaded Node burst length (packets) time in seconds receiver r 2 : loss bursts in A receiver r 2 : data bursts in A

44 Problem Statement Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation Recover lost packets rapidly! Scalability: Multicast Data Lost Packet

45 Problem Statement Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation Recover lost packets rapidly! Scalability: Number of Receivers Multicast Data Lost Packet

46 Problem Statement Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation Recover lost packets rapidly! Scalability: Number of Receivers Number of Senders Multicast Data Lost Packet

47 Problem Statement Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation Recover lost packets rapidly! Scalability: Number of Receivers Number of Senders Number of Groups Multicast Data Lost Packet

48 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? Loss Sender data

49 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? 1. acks: implosion data Loss Sender acks

50 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? naks Loss 1. acks: implosion 2. naks data Sender

51 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? Sender xors data Loss 1. acks: implosion 2. naks 3. Sender-based FEC recovery latency 1 datarate data rate: one sender, one group

52 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? Sender xors data Loss 1. acks: implosion 2. naks 3. Sender-based FEC recovery latency 1 datarate data rate: one sender, one group FEC in the network: Where and What?

53 Motivation Design and Implementation Evaluation Design Space for Reliable Multicast How does latency scale? 1. acks: implosion Sender data Loss xors 2. naks 3. Sender-based FEC recovery latency 1 datarate data rate: one sender, one group FEC in the network: Where and What? Receiver-based FEC: at receivers, from incoming data

54 Motivation Design and Implementation Evaluation Receiver-Based Forward Error Correction Receiver generates an XOR of r incoming multicast packets and exchanges with other receivers Each XOR sent to c other random receivers Rate: (r, c) S E N D E R S Multicast Data

55 Motivation Design and Implementation Evaluation Receiver-Based Forward Error Correction Receiver generates an XOR of r incoming multicast packets and exchanges with other receivers Each XOR sent to c other random receivers Rate: (r, c) 1 latency P s datarate data rate: across all senders, in a single group S E N D E R S Multicast Data

56 Motivation Design and Implementation Evaluation Lateral Error Correction: Principle Receiver R1 Receiver R2 Group A Group B A1 A1 INCOMING DATA PACKETS A2 A3 B1 B2 A4 B3 B4 A5 A6 B5 Loss Repair Packet I: (A1,A2,A3,A4,A5) Repair Packet II: (B1,B2,B3,B4,B5) A2 A3 B1 B2 A4 B3 B4 A5 A6 B5 INCOMING DATA PACKETS Single-Group RFEC Recovery

57 Motivation Design and Implementation Evaluation Lateral Error Correction: Principle Receiver R1 Receiver R2 Group A Group B A1 A1 INCOMING DATA PACKETS A2 A3 B1 B2 A4 B3 B4 A5 Loss Repair Packet I: (A1,A2,A3,B1,B2) Recovery A2 A3 B1 B2 A4 B3 B4 A5 INCOMING DATA PACKETS Single-Group RFEC Lateral Error Correction Create XORs from multiple groups faster recovery! What about complex overlap? A6 B5 Repair Packet II: (A4,B3,B4,A5,A6) A6 B5

58 Motivation Design and Implementation Evaluation Nodes and Disjoint Regions Receiver n 1 belongs to groups A, B, and C

59 Motivation Design and Implementation Evaluation Nodes and Disjoint Regions Receiver n 1 belongs to groups A, B, and C Divides groups into disjoint regions

60 Motivation Design and Implementation Evaluation Nodes and Disjoint Regions Receiver n 1 belongs to groups A, B, and C Divides groups into disjoint regions Is unaware of groups it does not belong to (D) Works with any conventional Group Membership Service

61 Regional Selection Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation A A 1

62 Regional Selection Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation A Select targets for XORs from regions, not groups A 1

63 Motivation Design and Implementation Evaluation Regional Selection Select targets for XORs from regions, not groups A a A abc ca ab From each region, select proportional fraction of c A : c x A = x A c A A ac

64 Regional Selection Maelstrom Ricochet Conclusion Motivation Design and Implementation Evaluation Select targets for XORs from regions, not groups A a A abc ca ab From each region, select proportional fraction of c A : c x A = x A c A 1 latency P P s A ac g datarate data rate: across all senders, in intersections of groups

65 Motivation Design and Implementation Evaluation Scalability in Groups Claim: Ricochet scales to hundreds of groups. 100 Recovery % Average Recovery Latency LEC Recovery Percentage Microseconds Groups per Node (Log Scale) Groups per Node (Log Scale) Comparision: At 128 groups, NAK/SFEC latency is 8 seconds. Ricochet is 400 times faster!

66 Motivation Design and Implementation Evaluation Distribution of Recovery Latency Claim: Ricochet is reliable and time-critical Recovery Percentage 96.8% LEC + 3.2% NAK Milliseconds Recovery Percentage 92% LEC + 8% NAK Milliseconds Recovery Percentage % LEC + 16% NAK LEC NAK Milliseconds (a) 10% Loss Rate (b) 15% Loss Rate (c) 20% Loss Rate Most lost packets recovered < 50ms by LEC. Remainder via reactive NAKs. Bursty Loss: 100 packet burst 90% recovered at 50 ms avg

67 Motivation Design and Implementation Evaluation Next Step: Dr. Multicast IP Multicast has a bad reputation! Unscalable filtering at routers/switches/nics Insecure

68 Motivation Design and Implementation Evaluation Next Step: Dr. Multicast IP Multicast has a bad reputation! Unscalable filtering at routers/switches/nics Insecure Insight: IP Multicast is a shared, controlled resource Transparent interception of socket system calls Logical address Set of network (uni/multi)cast addresses Enforcement of IP Multicast policies Gossip-based tracking of membership/mappings

69 The Big Picture Maelstrom Ricochet Conclusion Software Stack Crashes Overloads Bugs Exploits Disk Failure, Disasters Packet Loss Compilers, Databases, Distributed Systems, Machine Learning, Filesystems, Operating Systems, Networking... Tempest DSN 2008 KyotoFS HotOS XI Plato SRDS 2006 Ricochet NSDI 2007 Mistral MobiHoc 2006 Real-World, Real-Time SMFS Submitted Maelstrom NSDI 2008 Sequoia PODC 2007

70 The Big Picture Maelstrom Ricochet Conclusion Software Stack Crashes Overloads Bugs Exploits Disk Failure, Disasters Packet Loss Compilers, Databases, Distributed Systems, Machine Learning, Filesystems, Operating Systems, Networking... Tempest DSN 2008 KyotoFS HotOS XI Plato SRDS 2006 Ricochet NSDI 2007 Mistral MobiHoc 2006 What about new abstractions? Real-World, Real-Time SMFS Submitted Maelstrom NSDI 2008 Sequoia PODC 2007

71 Future Work Maelstrom Ricochet Conclusion Service Invocation (ID )

72 Future Work Instance Partition for Scalability Instance Partition (ID2 ) (ID1 ) (ID0 ) Instance

73 Future Work Group Group Partition (ID2 ) (ID1 ) (ID0 ) Partition for Scalability Replicate for Availability / Fault-Tolerance Group Replicate

74 Future Work Maelstrom Ricochet Conclusion The Datacenter is the Computer Replicate Group Rethink Old Abstractions Processes, Threads, Address Space, Protection, Locks, IPC/RPC, Sockets, Files... Group Group Partition Invent New Abstractions! (ID2 ) (ID1 ) (ID0 ) Partition for Scalability Replicate for Availability / Fault-Tolerance The DatacenterOS Concerns: Performance, Parallelism, Privacy, Power...

75 Conclusion Maelstrom Ricochet Conclusion The Real-Time Datacenter Recover from failures within seconds Reliable Communication: FEC in the Network Recover lost packets in milliseconds Maelstrom: between Datacenters Ricochet: within Datacenters

76 Conclusion Maelstrom Ricochet Conclusion The Real-Time Datacenter Recover from failures within seconds Reliable Communication: FEC in the Network Recover lost packets in milliseconds Maelstrom: between Datacenters Ricochet: within Datacenters Thank You!

77 Extra Slide: FEC and Bursty Loss Existing solution: interleaving Interleave i and rate (r, c) tolerates (c i) burst......with i times the latency A B C D E A C E G I B D F H J F G H I J X X X X X X Figure: Interleave of 2 Even and Odd packets encoded separately

78 Extra Slide: FEC and Bursty Loss Existing solution: interleaving Interleave i and rate (r, c) tolerates (c i) burst......with i times the latency A B C D E A C E G I B D F H J F G H I J X X X X X X Figure: Interleave of 2 Even and Odd packets encoded separately Wanted: Graceful degradation of recovery latency with actual burst size for constant overhead

79 Extra Slide: Maelstrom Evaluation Maelstrom goodput is near theoretical maximum Throughput (Mbits/sec) theoretical goodput M-FEC goodput FEC Layers (c)

80 Extra Slide: Layered Interleaving Recovery Latency (Milliseconds) Reed Solomon Layered Interleaving Recovered Packet #

81 Evaluation: Layered Interleaving Claim: Recovery Latency depends on Actual Burst Length Burst Length = % Recovered Recovery Latency (ms) % Recovered Recovery Latency (ms) % Recovered Recovery Latency (ms) Longer Burst Lengths Longer Recovery Latency

82 TeraGrid: Supercomputer Network End-to-End UDP Probes: Zero Congestion, Non-Zero Loss! Possible Reasons: transient congestion degraded fiber malfunctioning HW misconfigured HW switching contention low receiver power end-host overflow... % of Measurements % of Lost Packets

83 TeraGrid: Supercomputer Network End-to-End UDP Probes: Zero Congestion, Non-Zero Loss! Possible Reasons: transient congestion degraded fiber malfunctioning HW misconfigured HW switching contention low receiver power end-host overflow... % of Measurements All Measurements Without Indiana U. Site % of Lost Packets

84 TeraGrid: Supercomputer Network End-to-End UDP Probes: Zero Congestion, Non-Zero Loss! Possible Reasons: transient congestion degraded fiber malfunctioning HW misconfigured HW switching contention low receiver power end-host overflow... % of Measurements All Measurements Without Indiana U. Site % of Lost Packets Problem Statement: Run unmodified TCP/IP over lossy high-speed long-distance networks

85 Evaluation: Split Mode and Buffering Claim: Maelstrom eliminates the need for large end-host buffers 600 throughput 0.1% loss 500 Throughput (Mbits/sec) client-buf M-ALL M-FEC M-BUF tcp RTT 100ms

86 Evaluation: Split Mode and Buffering Claim: Maelstrom eliminates the need for large end-host buffers 600 throughput 0.1% loss Throughput (Mbits/sec) Maelstrom FEC 0 client-buf M-ALL M-FEC M-BUF tcp RTT 100ms

87 Evaluation: Split Mode and Buffering Claim: Maelstrom eliminates the need for large end-host buffers 600 throughput 0.1% loss Throughput (Mbits/sec) client-buf M-ALL M-FEC M-BUF tcp RTT 100ms Maelstrom FEC Maelstrom FEC + Hand-Tuned Buffers

88 Evaluation: Split Mode and Buffering Claim: Maelstrom eliminates the need for large end-host buffers Maelstrom FEC + Split Mode throughput 0.1% loss Throughput (Mbits/sec) client-buf M-ALL M-FEC M-BUF tcp RTT 100ms Maelstrom FEC Maelstrom FEC + Hand-Tuned Buffers

89 Evaluation: FEC mode and loss Claim: Maelstrom works at high loss rates Throughput (Mbps) Maelstrom TCP Packet Loss Rate % Link RTT = 100ms

Maelstrom: An Enterprise Continuity Protocol for Financial Datacenters

Maelstrom: An Enterprise Continuity Protocol for Financial Datacenters Maelstrom: An Enterprise Continuity Protocol for Financial Datacenters Mahesh Balakrishnan, Tudor Marian, Hakim Weatherspoon Cornell University, Ithaca, NY Datacenters Internet Services (90s) Websites,

More information

CS 514: Transport Protocols for Datacenters

CS 514: Transport Protocols for Datacenters Department of Computer Science Cornell University Outline Motivation 1 Motivation 2 3 Commodity Datacenters Blade-servers, Fast Interconnects Different Apps: Google -> Search Amazon -> Etailing Computational

More information

Maelstrom: Transparent Error Correction for Lambda Networks

Maelstrom: Transparent Error Correction for Lambda Networks Maelstrom: Transparent Error Correction for Lambda Networks Mahesh Balakrishnan, Tudor Marian, Ken Birman, Hakim Weatherspoon, Einar Vollset {mahesh, tudorm, ken, hweather, einar}@cs.cornell.edu Cornell

More information

presenter Hakim Weatherspoon CS 6464 February 5 th, 2009

presenter Hakim Weatherspoon CS 6464 February 5 th, 2009 presenter Hakim Weatherspoon CS 6464 February 5 th, 2009 U.S. Department of Treasury Study Financial Sector vulnerable to significant data loss in disaster Need new technical options Risks are real, technology

More information

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 3, JUNE

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 3, JUNE IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 3, JUNE 2011 617 Maelstrom: Transparent Error Correction for Communication Between Data Centers Mahesh Balakrishnan, Tudor Marian, Kenneth P. Birman, Fellow,

More information

Supporting Scalability and Adaptability via ADAptive Middleware And Network Transports (ADAMANT)

Supporting Scalability and Adaptability via ADAptive Middleware And Network Transports (ADAMANT) Supporting Scalability and Adaptability via ADAptive Middleware And Network Transports (ADAMANT) Joe Hoffert, Doug Schmidt Vanderbilt University Mahesh Balakrishnan, Ken Birman Cornell University Motivation

More information

Chapter 13 TRANSPORT. Mobile Computing Winter 2005 / Overview. TCP Overview. TCP slow-start. Motivation Simple analysis Various TCP mechanisms

Chapter 13 TRANSPORT. Mobile Computing Winter 2005 / Overview. TCP Overview. TCP slow-start. Motivation Simple analysis Various TCP mechanisms Overview Chapter 13 TRANSPORT Motivation Simple analysis Various TCP mechanisms Distributed Computing Group Mobile Computing Winter 2005 / 2006 Distributed Computing Group MOBILE COMPUTING R. Wattenhofer

More information

Mobile Transport Layer

Mobile Transport Layer Mobile Transport Layer 1 Transport Layer HTTP (used by web services) typically uses TCP Reliable transport between TCP client and server required - Stream oriented, not transaction oriented - Network friendly:

More information

WarpTCP WHITE PAPER. Technology Overview. networks. -Improving the way the world connects -

WarpTCP WHITE PAPER. Technology Overview. networks. -Improving the way the world connects - WarpTCP WHITE PAPER Technology Overview -Improving the way the world connects - WarpTCP - Attacking the Root Cause TCP throughput reduction is often the bottleneck that causes data to move at slow speed.

More information

Outline 9.2. TCP for 2.5G/3G wireless

Outline 9.2. TCP for 2.5G/3G wireless Transport layer 9.1 Outline Motivation, TCP-mechanisms Classical approaches (Indirect TCP, Snooping TCP, Mobile TCP) PEPs in general Additional optimizations (Fast retransmit/recovery, Transmission freezing,

More information

Presentation_ID. 2002, Cisco Systems, Inc. All rights reserved.

Presentation_ID. 2002, Cisco Systems, Inc. All rights reserved. 1 Gigabit to the Desktop Session Number 2 Gigabit to the Desktop What we are seeing: Today s driver for Gigabit Ethernet to the Desktop is not a single application but the simultaneous use of multiple

More information

Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance

Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Hakim Weatherspoon, Lakshmi Ganesh, Tudor Marian, Mahesh Balakrishnan, and Ken Birman Cornell University,

More information

Mobile Communications Chapter 9: Mobile Transport Layer

Mobile Communications Chapter 9: Mobile Transport Layer Prof. Dr.-Ing Jochen H. Schiller Inst. of Computer Science Freie Universität Berlin Germany Mobile Communications Chapter 9: Mobile Transport Layer Motivation, TCP-mechanisms Classical approaches (Indirect

More information

CSE 4215/5431: Mobile Communications Winter Suprakash Datta

CSE 4215/5431: Mobile Communications Winter Suprakash Datta CSE 4215/5431: Mobile Communications Winter 2013 Suprakash Datta datta@cse.yorku.ca Office: CSEB 3043 Phone: 416-736-2100 ext 77875 Course page: http://www.cse.yorku.ca/course/4215 Some slides are adapted

More information

Engineering Fault-Tolerant TCP/IP servers using FT-TCP. Dmitrii Zagorodnov University of California San Diego

Engineering Fault-Tolerant TCP/IP servers using FT-TCP. Dmitrii Zagorodnov University of California San Diego Engineering Fault-Tolerant TCP/IP servers using FT-TCP Dmitrii Zagorodnov University of California San Diego Motivation Reliable network services are desirable but costly! Extra and/or specialized hardware

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Adam Belay et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Presented by Han Zhang & Zaina Hamid Challenges

More information

RPT: Re-architecting Loss Protection for Content-Aware Networks

RPT: Re-architecting Loss Protection for Content-Aware Networks RPT: Re-architecting Loss Protection for Content-Aware Networks Dongsu Han, Ashok Anand ǂ, Aditya Akella ǂ, and Srinivasan Seshan Carnegie Mellon University ǂ University of Wisconsin-Madison Motivation:

More information

Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance

Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Hakim Weatherspoon, Lakshmi Ganesh, Tudor Marian, Mahesh Balakrishnan, and Ken Birman Cornell University,

More information

HyperIP : SRDF Application Note

HyperIP : SRDF Application Note HyperIP : SRDF Application Note Introduction HyperIP is a Linux software application that quantifiably and measurably enhances large data movement over big bandwidth and long-haul IP networks. HyperIP

More information

Good Ideas So Far Computer Networking. Outline. Sequence Numbers (reminder) TCP flow control. Congestion sources and collapse

Good Ideas So Far Computer Networking. Outline. Sequence Numbers (reminder) TCP flow control. Congestion sources and collapse Good Ideas So Far 15-441 Computer Networking Lecture 17 TCP & Congestion Control Flow control Stop & wait Parallel stop & wait Sliding window Loss recovery Timeouts Acknowledgement-driven recovery (selective

More information

Overview. A Survey of Packet-Loss Recovery Techniques. Outline. Overview. Mbone Loss Characteristics. IP Multicast Characteristics

Overview. A Survey of Packet-Loss Recovery Techniques. Outline. Overview. Mbone Loss Characteristics. IP Multicast Characteristics A Survey of Packet-Loss Recovery Techniques Overview Colin Perkins, Orion Hodson and Vicky Hardman Department of Computer Science University College London (UCL) London, UK IEEE Network Magazine Sep/Oct,

More information

Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX

Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX Inventing Internet TV Available in more than 190 countries 104+ million subscribers Lots of Streaming == Lots of Traffic

More information

CSE 123A Computer Networks

CSE 123A Computer Networks CSE 123A Computer Networks Winter 2005 Lecture 14 Congestion Control Some images courtesy David Wetherall Animations by Nick McKeown and Guido Appenzeller The bad news and the good news The bad news: new

More information

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste Outline 15-441 Computer Networking Lecture 18 TCP Performance Peter Steenkiste Fall 2010 www.cs.cmu.edu/~prs/15-441-f10 TCP congestion avoidance TCP slow start TCP modeling TCP details 2 AIMD Distributed,

More information

Network-Adaptive Video Coding and Transmission

Network-Adaptive Video Coding and Transmission Header for SPIE use Network-Adaptive Video Coding and Transmission Kay Sripanidkulchai and Tsuhan Chen Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213

More information

Network Management & Monitoring

Network Management & Monitoring Network Management & Monitoring Network Delay These materials are licensed under the Creative Commons Attribution-Noncommercial 3.0 Unported license (http://creativecommons.org/licenses/by-nc/3.0/) End-to-end

More information

FLEXible Middleware And Transports (FLEXMAT) for Real-time Event Stream Processing (RT-ESP) Applications

FLEXible Middleware And Transports (FLEXMAT) for Real-time Event Stream Processing (RT-ESP) Applications FLEXible Middleware And Transports (FLEXMAT) for Real-time Event Stream Processing (RT-ESP) Applications Joe Hoffert, Doug Schmidt Vanderbilt University 1 Challenges and Goals Real-time Event Stream Processing

More information

Optimizing Performance: Intel Network Adapters User Guide

Optimizing Performance: Intel Network Adapters User Guide Optimizing Performance: Intel Network Adapters User Guide Network Optimization Types When optimizing network adapter parameters (NIC), the user typically considers one of the following three conditions

More information

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar and Kartik Gopalan Computer Science, State University of New York at Binghamton

More information

WebSphere MQ Low Latency Messaging V2.1. High Throughput and Low Latency to Maximize Business Responsiveness IBM Corporation

WebSphere MQ Low Latency Messaging V2.1. High Throughput and Low Latency to Maximize Business Responsiveness IBM Corporation WebSphere MQ Low Latency Messaging V2.1 High Throughput and Low Latency to Maximize Business Responsiveness 2008 IBM Corporation WebSphere MQ Low Latency Messaging Extends the WebSphere MQ messaging family

More information

Why Your Application only Uses 10Mbps Even the Link is 1Gbps?

Why Your Application only Uses 10Mbps Even the Link is 1Gbps? Why Your Application only Uses 10Mbps Even the Link is 1Gbps? Contents Introduction Background Information Overview of the Issue Bandwidth-Delay Product Verify Solution How to Tell Round Trip Time (RTT)

More information

Advanced Computer Networks. Flow Control

Advanced Computer Networks. Flow Control Advanced Computer Networks 263 3501 00 Flow Control Patrick Stuedi Spring Semester 2017 1 Oriana Riva, Department of Computer Science ETH Zürich Last week TCP in Datacenters Avoid incast problem - Reduce

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Linux Plumbers Conference TCP-NV Congestion Avoidance for Data Centers

Linux Plumbers Conference TCP-NV Congestion Avoidance for Data Centers Linux Plumbers Conference 2010 TCP-NV Congestion Avoidance for Data Centers Lawrence Brakmo Google TCP Congestion Control Algorithm for utilizing available bandwidth without too many losses No attempt

More information

Overview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers

Overview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers Overview 15-441 Computer Networking TCP & router queuing Lecture 10 TCP & Routers TCP details Workloads Lecture 10: 09-30-2002 2 TCP Performance TCP Performance Can TCP saturate a link? Congestion control

More information

Internet Design Principles and Architecture

Internet Design Principles and Architecture Internet Design Principles and Architecture Venkat Padmanabhan Microsoft Research 2 April 2001 Venkat Padmanabhan 1 Lecture Outline A brief history of the Internet How is the Internet different from the

More information

A TRANSPORT PROTOCOL FOR DEDICATED END-TO-END CIRCUITS

A TRANSPORT PROTOCOL FOR DEDICATED END-TO-END CIRCUITS A TRANSPORT PROTOCOL FOR DEDICATED END-TO-END CIRCUITS MS Thesis Final Examination Anant P. Mudambi Computer Engineering University of Virginia December 6, 2005 Outline Motivation CHEETAH Background UDP-based

More information

Recap. TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness

Recap. TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness Recap TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness 81 Feedback Signals Several possible signals, with different

More information

Lecture 21: Congestion Control" CSE 123: Computer Networks Alex C. Snoeren

Lecture 21: Congestion Control CSE 123: Computer Networks Alex C. Snoeren Lecture 21: Congestion Control" CSE 123: Computer Networks Alex C. Snoeren Lecture 21 Overview" How fast should a sending host transmit data? Not to fast, not to slow, just right Should not be faster than

More information

CSCI-1680 Transport Layer III Congestion Control Strikes Back Rodrigo Fonseca

CSCI-1680 Transport Layer III Congestion Control Strikes Back Rodrigo Fonseca CSCI-1680 Transport Layer III Congestion Control Strikes Back Rodrigo Fonseca Based partly on lecture notes by David Mazières, Phil Levis, John Jannotti, Ion Stoica Last Time Flow Control Congestion Control

More information

Be Fast, Cheap and in Control with SwitchKV. Xiaozhou Li

Be Fast, Cheap and in Control with SwitchKV. Xiaozhou Li Be Fast, Cheap and in Control with SwitchKV Xiaozhou Li Goal: fast and cost-efficient key-value store Store, retrieve, manage key-value objects Get(key)/Put(key,value)/Delete(key) Target: cluster-level

More information

CSCI-1680 Link Layer Reliability Rodrigo Fonseca

CSCI-1680 Link Layer Reliability Rodrigo Fonseca CSCI-1680 Link Layer Reliability Rodrigo Fonseca Based partly on lecture notes by David Mazières, Phil Levis, John Janno< Last time Physical layer: encoding, modulation Link layer framing Today Getting

More information

An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks

An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks An Empirical Study of Reliable Multicast Protocols over Ethernet Connected Networks Ryan G. Lane Daniels Scott Xin Yuan Department of Computer Science Florida State University Tallahassee, FL 32306 {ryanlane,sdaniels,xyuan}@cs.fsu.edu

More information

Mobile Communications Chapter 9: Mobile Transport Layer

Mobile Communications Chapter 9: Mobile Transport Layer Prof. Dr.-Ing Jochen H. Schiller Inst. of Computer Science Freie Universität Berlin Germany Mobile Communications Chapter 9: Mobile Transport Layer Motivation, TCP-mechanisms Classical approaches (Indirect

More information

Traffic Characteristics of Bulk Data Transfer using TCP/IP over Gigabit Ethernet

Traffic Characteristics of Bulk Data Transfer using TCP/IP over Gigabit Ethernet Traffic Characteristics of Bulk Data Transfer using TCP/IP over Gigabit Ethernet Aamir Shaikh and Kenneth J. Christensen Department of Computer Science and Engineering University of South Florida Tampa,

More information

Congestion Control in Datacenters. Ahmed Saeed

Congestion Control in Datacenters. Ahmed Saeed Congestion Control in Datacenters Ahmed Saeed What is a Datacenter? Tens of thousands of machines in the same building (or adjacent buildings) Hundreds of switches connecting all machines What is a Datacenter?

More information

CS 138: Communication I. CS 138 V 1 Copyright 2012 Thomas W. Doeppner. All rights reserved.

CS 138: Communication I. CS 138 V 1 Copyright 2012 Thomas W. Doeppner. All rights reserved. CS 138: Communication I CS 138 V 1 Copyright 2012 Thomas W. Doeppner. All rights reserved. Topics Network Metrics Layering Reliability Congestion Control Routing CS 138 V 2 Copyright 2012 Thomas W. Doeppner.

More information

The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook)

The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook) Workshop on New Visions for Large-Scale Networks: Research & Applications Vienna, VA, USA, March 12-14, 2001 The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook) Wu-chun Feng feng@lanl.gov

More information

Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware

Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware What You Will Learn The goal of zero latency in financial services has caused the creation of

More information

CSCI-1680 Link Layer I Rodrigo Fonseca

CSCI-1680 Link Layer I Rodrigo Fonseca CSCI-1680 Link Layer I Rodrigo Fonseca Based partly on lecture notes by David Mazières, Phil Levis, John Jannotti Last time Physical layer: encoding, modulation Today Link layer framing Getting frames

More information

NET ID. CS519, Prelim (March 17, 2004) NAME: You have 50 minutes to complete the test. 1/17

NET ID. CS519, Prelim (March 17, 2004) NAME: You have 50 minutes to complete the test. 1/17 CS519, Prelim (March 17, 2004) NAME: You have 50 minutes to complete the test. 1/17 Q1. 2 points Write your NET ID at the top of every page of this test. Q2. X points Name 3 advantages of a circuit network

More information

iscsi : A loss-less Ethernet fabric with DCB Jason Blosil, NetApp Gary Gumanow, Dell

iscsi : A loss-less Ethernet fabric with DCB Jason Blosil, NetApp Gary Gumanow, Dell iscsi : A loss-less Ethernet fabric with DCB Jason Blosil, NetApp Gary Gumanow, Dell SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individual

More information

Randomization. Randomization used in many protocols We ll study examples:

Randomization. Randomization used in many protocols We ll study examples: Randomization Randomization used in many protocols We ll study examples: Ethernet multiple access protocol Router (de)synchronization Switch scheduling 1 Ethernet Single shared broadcast channel 2+ simultaneous

More information

Randomization used in many protocols We ll study examples: Ethernet multiple access protocol Router (de)synchronization Switch scheduling

Randomization used in many protocols We ll study examples: Ethernet multiple access protocol Router (de)synchronization Switch scheduling Randomization Randomization used in many protocols We ll study examples: Ethernet multiple access protocol Router (de)synchronization Switch scheduling 1 Ethernet Single shared broadcast channel 2+ simultaneous

More information

The Best Protocol for Real-time Data Transport

The Best Protocol for Real-time Data Transport The Definitive Guide to: The Best Protocol for Real-time Data Transport Assessing the most common protocols on 6 important categories Identifying the Best Protocol For strategic applications using real-time

More information

CSCI-1680 Link Layer Reliability John Jannotti

CSCI-1680 Link Layer Reliability John Jannotti CSCI-1680 Link Layer Reliability John Jannotti Based partly on lecture notes by David Mazières, Phil Levis, Rodrigo Fonseca Roadmap Last time Physical layer: encoding, modulation Link layer framing Today

More information

TCP so far Computer Networking Outline. How Was TCP Able to Evolve

TCP so far Computer Networking Outline. How Was TCP Able to Evolve TCP so far 15-441 15-441 Computer Networking 15-641 Lecture 14: TCP Performance & Future Peter Steenkiste Fall 2016 www.cs.cmu.edu/~prs/15-441-f16 Reliable byte stream protocol Connection establishments

More information

1. What is a Computer Network? interconnected collection of autonomous computers connected by a communication technology

1. What is a Computer Network? interconnected collection of autonomous computers connected by a communication technology Review Questions for exam Preparation (22-07-2017) 1. What is a Computer Network? interconnected collection of autonomous computers connected by a communication technology 2. What is the Internet? "network

More information

ETSF10 Internet Protocols Transport Layer Protocols

ETSF10 Internet Protocols Transport Layer Protocols ETSF10 Internet Protocols Transport Layer Protocols 2012, Part 2, Lecture 2.1 Kaan Bür, Jens Andersson Transport Layer Protocols Process-to-process delivery [ed.4 ch.23.1] [ed.5 ch.24.1] Transmission Control

More information

PLEASE READ CAREFULLY BEFORE YOU START

PLEASE READ CAREFULLY BEFORE YOU START MIDTERM EXAMINATION #2 NETWORKING CONCEPTS 03-60-367-01 U N I V E R S I T Y O F W I N D S O R - S c h o o l o f C o m p u t e r S c i e n c e Fall 2011 Question Paper NOTE: Students may take this question

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information

Adapting Enterprise Distributed Real-time and Embedded (DRE) Pub/Sub Middleware for Cloud Computing Environments

Adapting Enterprise Distributed Real-time and Embedded (DRE) Pub/Sub Middleware for Cloud Computing Environments Adapting Enterprise Distributed Real-time and Embedded (DRE) Pub/Sub Middleware for Cloud Computing Environments Joe Hoffert, Douglas Schmidt, and Aniruddha Gokhale Vanderbilt University Nashville, TN,

More information

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Dr. Vinod Vokkarane Assistant Professor, Computer and Information Science Co-Director, Advanced Computer Networks Lab University

More information

TCP Congestion Control

TCP Congestion Control 6.033, Spring 2014 TCP Congestion Control Dina Katabi & Sam Madden nms.csail.mit.edu/~dina Sharing the Internet How do you manage resources in a huge system like the Internet, where users with different

More information

FUJITSU Software Interstage Information Integrator V11

FUJITSU Software Interstage Information Integrator V11 FUJITSU Software V11 An Innovative WAN optimization solution to bring out maximum network performance October, 2013 Fujitsu Limited Contents Overview Key technologies Supported network characteristics

More information

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup Chapter 4 Routers with Tiny Buffers: Experiments This chapter describes two sets of experiments with tiny buffers in networks: one in a testbed and the other in a real network over the Internet2 1 backbone.

More information

Communications Software. CSE 123b. CSE 123b. Spring Lecture 3: Reliable Communications. Stefan Savage. Some slides couresty David Wetherall

Communications Software. CSE 123b. CSE 123b. Spring Lecture 3: Reliable Communications. Stefan Savage. Some slides couresty David Wetherall CSE 123b CSE 123b Communications Software Spring 2002 Lecture 3: Reliable Communications Stefan Savage Some slides couresty David Wetherall Administrativa Home page is up and working http://www-cse.ucsd.edu/classes/sp02/cse123b/

More information

IsoStack Highly Efficient Network Processing on Dedicated Cores

IsoStack Highly Efficient Network Processing on Dedicated Cores IsoStack Highly Efficient Network Processing on Dedicated Cores Leah Shalev Eran Borovik, Julian Satran, Muli Ben-Yehuda Outline Motivation IsoStack architecture Prototype TCP/IP over 10GE on a single

More information

Adaptive Video Multicasting

Adaptive Video Multicasting Adaptive Video Multicasting Presenters: Roman Glistvain, Bahman Eksiri, and Lan Nguyen 1 Outline Approaches to Adaptive Video Multicasting Single rate multicast Simulcast Active Agents Layered multicasting

More information

CSE 124: Networked Services Lecture-16

CSE 124: Networked Services Lecture-16 Fall 2010 CSE 124: Networked Services Lecture-16 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa10/cse124 11/23/2010 CSE 124 Networked Services Fall 2010 1 Updates PlanetLab experiments

More information

Wireless TCP Performance Issues

Wireless TCP Performance Issues Wireless TCP Performance Issues Issues, transport layer protocols Set up and maintain end-to-end connections Reliable end-to-end delivery of data Flow control Congestion control Udp? Assume TCP for the

More information

The Present and Future of Congestion Control. Mark Handley

The Present and Future of Congestion Control. Mark Handley The Present and Future of Congestion Control Mark Handley Outline Purpose of congestion control The Present: TCP s congestion control algorithm (AIMD) TCP-friendly congestion control for multimedia Datagram

More information

Network Layer Flow Control via Credit Buffering

Network Layer Flow Control via Credit Buffering Network Layer Flow Control via Credit Buffering Fibre Channel maintains throughput in the data center by using flow control via buffer to buffer credits Nominally switches provide credit buffering up to

More information

Fundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin,

Fundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin, Fundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin, ydlin@cs.nctu.edu.tw Chapter 1: Introduction 1. How does Internet scale to billions of hosts? (Describe what structure

More information

Streaming Video and TCP-Friendly Congestion Control

Streaming Video and TCP-Friendly Congestion Control Streaming Video and TCP-Friendly Congestion Control Sugih Jamin Department of EECS University of Michigan jamin@eecs.umich.edu Joint work with: Zhiheng Wang (UofM), Sujata Banerjee (HP Labs) Video Application

More information

Loss Tolerant TCP (LT-TCP): Implementation and Evaluation

Loss Tolerant TCP (LT-TCP): Implementation and Evaluation Loss Tolerant TCP (LT-TCP): Implementation and Evaluation Koushik Kar Rensselaer Polytechnic Institute Bishwaroop Ganguly MIT Lincoln Labs Nathan Hourt Rensselaer Polytechnic Institute Outline Motivation

More information

Lecture 5: Performance Analysis I

Lecture 5: Performance Analysis I CS 6323 : Modeling and Inference Lecture 5: Performance Analysis I Prof. Gregory Provan Department of Computer Science University College Cork Slides: Based on M. Yin (Performability Analysis) Overview

More information

Multiple unconnected networks

Multiple unconnected networks TCP/IP Life in the Early 1970s Multiple unconnected networks ARPAnet Data-over-cable Packet satellite (Aloha) Packet radio ARPAnet satellite net Differences Across Packet-Switched Networks Addressing Maximum

More information

ECS-087: Mobile Computing

ECS-087: Mobile Computing ECS-087: Mobile Computing TCP over wireless TCP and mobility Most of the Slides borrowed from Prof. Sridhar Iyer s lecture IIT Bombay Diwakar Yagyasen 1 Effect of Mobility on Protocol Stack Application:

More information

CSE 124: Networked Services Fall 2009 Lecture-19

CSE 124: Networked Services Fall 2009 Lecture-19 CSE 124: Networked Services Fall 2009 Lecture-19 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa09/cse124 Some of these slides are adapted from various sources/individuals including but

More information

GFS: The Google File System. Dr. Yingwu Zhu

GFS: The Google File System. Dr. Yingwu Zhu GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can

More information

Messaging Overview. Introduction. Gen-Z Messaging

Messaging Overview. Introduction. Gen-Z Messaging Page 1 of 6 Messaging Overview Introduction Gen-Z is a new data access technology that not only enhances memory and data storage solutions, but also provides a framework for both optimized and traditional

More information

Data Center Networks and Switching and Queueing and Covert Timing Channels

Data Center Networks and Switching and Queueing and Covert Timing Channels Data Center Networks and Switching and Queueing and Covert Timing Channels Hakim Weatherspoon Associate Professor, Dept of Computer Science CS 5413: High Performance Computing and Networking March 10,

More information

Chelsio Communications. Meeting Today s Datacenter Challenges. Produced by Tabor Custom Publishing in conjunction with: CUSTOM PUBLISHING

Chelsio Communications. Meeting Today s Datacenter Challenges. Produced by Tabor Custom Publishing in conjunction with: CUSTOM PUBLISHING Meeting Today s Datacenter Challenges Produced by Tabor Custom Publishing in conjunction with: 1 Introduction In this era of Big Data, today s HPC systems are faced with unprecedented growth in the complexity

More information

Distributed Systems Multicast & Group Communication Services

Distributed Systems Multicast & Group Communication Services Distributed Systems 600.437 Multicast & Group Communication Services Department of Computer Science The Johns Hopkins University 1 Multicast & Group Communication Services Lecture 3 Guide to Reliable Distributed

More information

RECOMMENDATION ITU-R BT.1720 *

RECOMMENDATION ITU-R BT.1720 * Rec. ITU-R BT.1720 1 RECOMMENDATION ITU-R BT.1720 * Quality of service ranking and measurement methods for digital video broadcasting services delivered over broadband Internet protocol networks (Question

More information

Outline. Internet. Router. Network Model. Internet Protocol (IP) Design Principles

Outline. Internet. Router. Network Model. Internet Protocol (IP) Design Principles Outline Internet model Design principles Internet Protocol (IP) Transmission Control Protocol (TCP) Tze Sing Eugene Ng Department of Computer Science Carnegie Mellon University Tze Sing Eugene Ng eugeneng@cs.cmu.edu

More information

CS505: Distributed Systems

CS505: Distributed Systems Cristina Nita-Rotaru CS505: Distributed Systems Protocols. Slides prepared based on material by Prof. Ken Birman at Cornell University, available at http://www.cs.cornell.edu/ken/book/ Required reading

More information

Can Congestion-controlled Interactive Multimedia Traffic Co-exist with TCP? Colin Perkins

Can Congestion-controlled Interactive Multimedia Traffic Co-exist with TCP? Colin Perkins Can Congestion-controlled Interactive Multimedia Traffic Co-exist with TCP? Colin Perkins Context: WebRTC WebRTC project has been driving interest in congestion control for interactive multimedia Aims

More information

Google File System (GFS) and Hadoop Distributed File System (HDFS)

Google File System (GFS) and Hadoop Distributed File System (HDFS) Google File System (GFS) and Hadoop Distributed File System (HDFS) 1 Hadoop: Architectural Design Principles Linear scalability More nodes can do more work within the same time Linear on data size, linear

More information

TRANSMISSION CONTROL PROTOCOL

TRANSMISSION CONTROL PROTOCOL COMP 635: WIRELESS & MOBILE COMMUNICATIONS TRANSMISSION CONTROL PROTOCOL Jasleen Kaur Fall 2017 1 Impact of Wireless on Protocol Layers Application layer Transport layer Network layer Data link layer Physical

More information

! " Lecture 5: Networking for Games (cont d) Packet headers. Packet footers. IP address. Edge router (cable modem, DSL modem)

!  Lecture 5: Networking for Games (cont d) Packet headers. Packet footers. IP address. Edge router (cable modem, DSL modem) Lecture 5: Networking for Games (cont d) Special Send case: to NAT 123.12.2.10 network 192.168.1.101 17.4.9.33 192.168.1.100 123.12.2.[0-128] IP address 23.11.3.10 Edge router (cable modem, DSL modem)

More information

Module objectives. Integrated services. Support for real-time applications. Real-time flows and the current Internet protocols

Module objectives. Integrated services. Support for real-time applications. Real-time flows and the current Internet protocols Integrated services Reading: S. Keshav, An Engineering Approach to Computer Networking, chapters 6, 9 and 4 Module objectives Learn and understand about: Support for real-time applications: network-layer

More information

Exercises TCP/IP Networking With Solutions

Exercises TCP/IP Networking With Solutions Exercises TCP/IP Networking With Solutions Jean-Yves Le Boudec Fall 2009 3 Module 3: Congestion Control Exercise 3.2 1. Assume that a TCP sender, called S, does not implement fast retransmit, but does

More information

RCRT:Rate-Controlled Reliable Transport Protocol for Wireless Sensor Networks

RCRT:Rate-Controlled Reliable Transport Protocol for Wireless Sensor Networks RCRT:Rate-Controlled Reliable Transport Protocol for Wireless Sensor Networks JEONGYEUP PAEK, RAMESH GOVINDAN University of Southern California 1 Applications that require the transport of high-rate data

More information

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552)

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552) Packet Scheduling in Data Centers Lecture 17, Computer Networks (198:552) Datacenter transport Goal: Complete flows quickly / meet deadlines Short flows (e.g., query, coordination) Large flows (e.g., data

More information

1-800-OVERLAYS: Using Overlay Networks to Improve VoIP Quality. Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, Andreas Terzis

1-800-OVERLAYS: Using Overlay Networks to Improve VoIP Quality. Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, Andreas Terzis 1-800-OVERLAYS: Using Overlay Networks to Improve VoIP Quality Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, Andreas Terzis Voice over IP To call or not to call The Internet started as an overlay

More information

Student ID: CS457: Computer Networking Date: 3/20/2007 Name:

Student ID: CS457: Computer Networking Date: 3/20/2007 Name: CS457: Computer Networking Date: 3/20/2007 Name: Instructions: 1. Be sure that you have 9 questions 2. Be sure your answers are legible. 3. Write your Student ID at the top of every page 4. This is a closed

More information

Recommended Readings

Recommended Readings Lecture 11: Media Adaptation Scalable Coding, Dealing with Errors Some slides, images were from http://ip.hhi.de/imagecom_g1/savce/index.htm and John G. Apostolopoulos http://www.mit.edu/~6.344/spring2004

More information

Using Proxies to Enhance TCP Performance over Hybrid Fiber Coaxial Networks

Using Proxies to Enhance TCP Performance over Hybrid Fiber Coaxial Networks Using Proxies to Enhance TCP Performance over Hybrid Fiber Coaxial Networks Reuven Cohen * Dept. of Computer Science, Technion, Haifa 32, Israel and Srinivas Ramanathan Hewlett-Packard Laboratories, 151

More information