Internet Traffic Managers

Ibrahim Matta
matta@cs.bu.edu
www.cs.bu.edu/faculty/matta
Computer Science Department, Boston University, Boston, MA 02215

Joint work with members of the WING group: Azer Bestavros, John Byers, Mark Crovella, Liang Guo, Khaled Harfoush

DIMACS Workshop on Quality of Service Issues in the Internet, Feb 8-9, 2001

Internet QoS
Quality of Service (QoS): e.g. low delay, predictable (consistent), fair service
Smarter traffic management:
- Avoid the costly practice of over-provisioning
- Enhance commercial competitiveness
Traffic managers placed in strategic places:
- Fast packet classification
- Intelligent transmission control of packets
Keep the rest of the Internet simple: core routers may simply differentiate based on class (not per-flow!)
Placement and Functionality of Traffic Managers
- Aggregate Control: flows controlled as a bundle
- Isolation Control: flows isolated
- Proxy Control: wireless proxy
Goal: improved user and network performance
[Figure: general architecture]
TCP Control

Window changes in TCP Reno:

On every ACK:
  if (window < threshold)
    window += 1            // slow start phase: double window every RTT
  else
    window += 1 / window   // congestion avoidance: increment by 1 every RTT
On timeout:
  threshold = window / 2
  window = 1
On duplicate acknowledgments:
  threshold = window = window / 2   // fast recovery

The majority of the bytes on the Internet are attributed to TCP, mostly long-lived flows. Short TCP transfers usually generate smaller and more variable packet bursts (windows).

Aggregate TCP Control
Control congestion for collections of TCP flows that share the same bottleneck.
Experiment: 1 MB transfers, 32 concurrent TCP flows (no cross traffic).
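The Reno rules above can be sketched as plain functions. This is an illustrative toy, not a real TCP stack; window and threshold are measured in packets, and the fractional increment models the "+1 window per RTT" behavior of congestion avoidance.

```python
def on_ack(window, threshold):
    """Grow the window on each new ACK."""
    if window < threshold:
        return window + 1            # slow start: doubles the window each RTT
    return window + 1.0 / window     # congestion avoidance: +1 window per RTT

def on_timeout(window):
    """Timeout: restart from a window of 1; threshold = half the old window."""
    return 1, window / 2

def on_dup_acks(window):
    """Fast recovery on duplicate ACKs: window and threshold both halved."""
    half = window / 2
    return half, half
```

For example, a flow with window 16 that times out restarts at `(1, 8.0)`, while the same flow seeing duplicate ACKs continues at `(8.0, 8.0)` - the difference Reno's fast recovery exploits.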
Why Aggregate TCP Control?

Setup: n clients send to a server through a RED bottleneck (access links: 32Mbps, 25ms each; bottleneck bandwidth = 5Mbps; total transfer = 64K packets).
- Lightly loaded: delay-bandwidth product of 4 packets
- Heavily congested: delay-bandwidth product of 1 packet

Lightly Loaded Conditions
[Figure: congested-link bandwidth vs. bottleneck bandwidth over time, for n = 1, 8, 16, 32 flows]
Heavily Congested Conditions
[Figure: bottleneck bandwidth over time for n = 1, 8, 16, 32 flows; delay-bandwidth product = 1 packet]

Aggregate vs. Individual TCP Control
- Under individual TCP congestion control, each flow can alternate between sending packets at its minimum possible congestion window of 1 and timing out.
- Under heavily congested conditions, a congestion window of 1 can be more than the flow's fair share of the congested link: bad utilization of network resources.
- Aggregate TCP control allows individual flows to have congestion windows of values less than 1; under heavy load, a flow's fair share of the available bandwidth can actually be less than 1 packet per RTT.
- Result: improved stability and throughput.
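A minimal sketch of the idea, assuming the shared controller simply divides the aggregate window evenly among active flows (the function names are ours, not from the talk's implementation). A per-flow share below 1 packet is realized by spacing that flow's packets over multiple RTTs.

```python
def per_flow_share(aggregate_window, n_flows):
    """Evenly divide one aggregate congestion window among active flows.

    Unlike standalone TCP, the result may be fractional (< 1 packet).
    """
    return aggregate_window / n_flows

def rtts_per_packet(share):
    """With a fractional share, a flow sends one packet every 1/share RTTs."""
    return max(1.0, 1.0 / share)
```

With an aggregate window of 8 packets shared by 32 flows, each flow's share is 0.25 packets, i.e. one packet every 4 RTTs - a rate no individual Reno flow can sustain without repeated timeouts.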
Implementation
A common control (virtual socket) can keep track of:
- a count of active flows, N
- an aggregate congestion window
(in addition to other parameters to manage constituent flows)

The virtual socket would make sure that:
- the available bandwidth (in terms of aggregate congestion window packets) is divided evenly among active flows
- under high congestion (timeout) conditions, if needed, the aggregate sending rate is reduced below N/RTT

Identification of Shared Congestion
For each link j, a packet-pair loss model: p_j^0, p_j^1, p_j^{2+} denote the probabilities that zero, one, or two-plus packets of a probe pair are lost on link j, with

  q_j = 1 - p_j
  L_j = p_j^1 + p_j^{2+}
Using Unicast Probes
Probe structures: PS_a(1) (a single packet to receiver a), PS_b(1), and PS_ab(1,ε) (a packet pair, one packet to a and one to b, spaced ε apart), sent across the shared link L.

Algorithm Idea
Use PS_2(1), PS_3(1) and PS_23(1,ε) to estimate p^1.
Test: if p^1 > 0, then there are shared losses.
Get the following estimates:

  u_2 = q_1 q_2                                          (1)
  u_3 = q_1 q_3                                          (2)
  u_23 = p^{2+} q_2 q_3 = q_1 q_2 q_3 - p^1 q_2 q_3      (3)
  u_23 = q_1 + 0.5 p^1 (q_2 + q_3) + p^{2+} (q_2 q_3)    (4)
  p^0 + p^1 + p^{2+} = 1                                 (5)
Experimental Results
Performance measures:
1. Accuracy
2. Convergence time (rate)

Base Models
[Figure: four base-model topologies BM1-BM4, each a tree with receivers 2 and 3, with high/medium/low loss assigned to the links]

Accuracy and Convergence
[Figure: BM1 accuracy and BM1 convergence rate over time]
Size-Based Isolation Control
Routing aggregates of packet flows with divergent characteristics on (logically) separate communication paths.
The length (lifetime and size) of TCP flows is heavy-tailed: few long-lived TCP flows carry most of the bytes.
We want long-lived flows to operate in a predictable mode, so we isolate long flows from short ones:
- short flows arrive in a more bursty fashion
- short flows change their windows more drastically in slow start

Isolating Burstiness of Short Flows
- Load-balanced assignment policy of UDP flows vs. size-based assignment: short flows on Path 1, long flows on Path 2
- CBR UDP flow lengths and interarrivals are Pareto distributed
- Isolation reduces load variation on the path taken by long flows
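Size-based assignment can be sketched as a simple classifier; the threshold value and names below are assumptions for illustration, not parameters from the experiments.

```python
LONG_FLOW_THRESHOLD = 100_000   # bytes; an assumed cutoff, not from the talk

def assign_path(bytes_sent):
    """Short flows stay on Path 1; once a flow's byte count crosses the
    threshold, its remaining packets are routed on Path 2, isolating the
    long-lived flows from the burstier short ones."""
    return 2 if bytes_sent > LONG_FLOW_THRESHOLD else 1
```

Because flow sizes are heavy-tailed, a single byte-count threshold like this catches the few flows that carry most of the bytes.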
Isolating Window Dynamics of Short Flows
- 8 TCP flows (light load)
- 2 long and 4 short TCP flows
- 60% of bandwidth taken by long flows
- A flow that terminates is replaced by another of the same size
- Without isolation, long flows are prevented from stabilizing to their fair share, and short flows can be completely shut off

More Predictability
- Long TCP flows with 60% of total resources
- Short TCP flows with 40% of total resources
More Fairness
- Fairness improves, especially for short TCP flows
- RED provides lower fairness among short TCP flows, as it penalizes flows with smaller packet bursts
- Isolation also provides control over QoS

With Web Background
Flow assignment policies: load balanced, size based, and threshold based.
Chiu and Jain's fairness index:

  f(g_1, ..., g_N) = (sum_{i=1}^{N} g_i)^2 / (N * sum_{i=1}^{N} g_i^2)
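The fairness index is straightforward to compute; here g_i is the goodput of flow i. It equals 1 when all N flows get equal goodput and falls toward 1/N as one flow takes everything.

```python
def fairness_index(goodputs):
    """Chiu and Jain's fairness index: (sum g_i)^2 / (N * sum g_i^2)."""
    n = len(goodputs)
    total = sum(goodputs)
    return total * total / (n * sum(g * g for g in goodputs))
```

For example, four flows with equal goodput score 1.0, while one flow monopolizing the link among four scores 0.25.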
With Web Background
- RED achieves fairness by sacrificing the goodput of short flows
- Isolated short TCP flows enjoy both high fairness and high goodput

QoS-Based Isolation Control
- Short flows are usually interactive, delay-sensitive
- Long flows are usually bulk, throughput-sensitive
- Allocate more bandwidth to short flows: response time is reduced for the majority of flows (that are short)
- Fairness is also maintained for both short and long flows
Large Pipe w/ telnet Background
- Sharing results in unfairness
- Isolated short flows do not suffer any packet loss
- Isolation provides predictability without sacrificing overall goodput

Small Pipe w/ telnet Background
- Isolated short flows still enjoy high fairness
- Isolation provides predictability
Implementation
- Increase predictability by class-based isolation and aggregate control
- Only traffic managers at the edges do per-flow management: a diffserv-like architecture

Interaction among short and long TCP transfers:
- Increase predictability for long flows by isolating the burstiness and window dynamics of short flows
- Short flows also benefit by not being shut off or unable to get their fair share before they end!
- Isolation improves fairness for both short and long TCP flows
- Isolation can provide faster service to the many interactive short flows
- The traffic manager marks packets as belonging to long-lived flows, and directs long-lived flows to another CBQ queue or label-switched MPLS path

A Control-Theoretic View
Describe the system behavior by a discrete-time model (state changes at discrete instants of time); evaluate stability, convergence, fairness and performance.

Sharing of PID controllers:

  w_{k+1} = w_k + α (B - q_k) + β [(B - q_k) - (B - q_{k-1})]
  q_{k+1} = max(q_k + N w_{k+1} - C, 0)

Stability condition: α + 2β < 4/N   (N: # flows)

Fair sharing: w_i^s = w_i + (C - sum_{j=1}^{N} w_j) / N
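A discrete-time simulation in the spirit of the PID-controller model above can illustrate convergence; this is our sketch, and the parameter values are illustrative choices that satisfy a stability condition of the form α + 2β < 4/N, not values from the talk.

```python
def simulate(n_flows=10, capacity=100.0, target_queue=50.0,
             alpha=0.01, beta=0.05, steps=500):
    """Window update: proportional-derivative control on the queue error
    (B - q_k). Queue update: integrate aggregate input minus capacity C."""
    assert alpha + 2 * beta < 4.0 / n_flows   # assumed stability condition
    w, q, prev_err = 1.0, 0.0, target_queue
    for _ in range(steps):
        err = target_queue - q                             # B - q_k
        w = max(w + alpha * err + beta * (err - prev_err), 0.0)
        prev_err = err
        q = max(q + n_flows * w - capacity, 0.0)           # queue dynamics
    return w, q
```

Run with the defaults, the per-flow window settles near its fair share C/N = 10 packets and the queue near its target B = 50, after a damped oscillation.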
Sharing vs. Isolation & Aggregation
- Short flows can affect the stability of long flows
- Isolation promotes stability and fairness (or weighted sharing)

Proxy Control
- Mask variability over shorter time scales to avoid disrupting control loops operating over longer time scales
- Estimate the length and characteristics of the control loops
- Must not compromise the end-to-end semantics
Wireless Proxy
- Hides local recovery times at a wireless access point from TCP
- Transient wireless losses do not infiltrate the estimation of Round Trip Time used to detect congestion (buffer overflow) losses over the (wired) Internet

Programmable Traffic Manager
- Lucent's NEPPI (Network Element for Programmable Packet Injection)
- Control programs can instruct the dispatcher to forward certain packets to them
- Or instruct the dispatcher to create a fast path for packets
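The dispatcher idea might be sketched as follows; the class and method names are ours for illustration, not NEPPI's actual API. A control program registers a filter, matching packets are handed to it, and everything else takes the fast path.

```python
class Dispatcher:
    """Toy dispatcher: route packets to registered control programs."""

    def __init__(self):
        self.rules = []   # list of (predicate, control_program) pairs

    def register(self, predicate, program):
        """A control program asks to receive packets matching predicate."""
        self.rules.append((predicate, program))

    def dispatch(self, packet):
        """Hand the packet to the first matching program, else fast path."""
        for predicate, program in self.rules:
            if predicate(packet):
                return program(packet)
        return "fast-path"

# Example: a classifier program asks for TCP SYN packets to detect new flows
dispatcher = Dispatcher()
dispatcher.register(lambda p: p.get("flags") == "SYN",
                    lambda p: "to-classifier")
```

Keeping the filter table in the dispatcher lets the common case (no match) stay on the fast path, which is the point of a programmable element like NEPPI.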
Packet Processing in NEPPI
- E.g., a flow isolation program may instruct the dispatcher to count a certain number of packets from a flow, then mark subsequent packets as long-lived
- Or, it may request to receive TCP packets with SYN/FIN/ACK flags to detect the beginning of a TCP connection, then classify the flow based on transfer size and network state

Conclusions
- Traffic managers can be placed in strategic places in the Internet to provide efficient QoS support: in front of clients/servers, or at exchange/peering points between administrative domains
- Reduce cost and enhance the commercial competitiveness of Internet service providers and carriers
- Basic research in the control of complex dynamical systems, that of the Internet
- Experimental research in the implementation of a programmable traffic manager: a programming interface to soft services, i.e. capabilities can be turned on/off and control parameters dynamically adjusted