Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics. Shuhao Liu, Li Chen, Baochun Li University of Toronto July 12, 2018

Size: px
Start display at page:

Download "Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics. Shuhao Liu, Li Chen, Baochun Li University of Toronto July 12, 2018"

Transcription

1 Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics Shuhao Liu, Li Chen, Baochun Li University of Toronto July 12, 2018

2 What is a Coflow? One stage in a data analytic job Map 1 Reduce 1 Map 2 Reduce 2 Map 3 Reduce 3 Map 4 Reduce 4

3 What is a Coflow? One stage in a data analytic job Map 1 Reduce 1 Map 2 Reduce 2 Map 3 Reduce 3 Map 4 Reduce 4 Map tasks

4 What is a Coflow? One stage in a data analytic job Map 1 Reduce 1 Map 2 Reduce 2 Map 3 Reduce 3 Map 4 Reduce 4 Map tasks Reduce tasks

5 What is a Coflow? One stage in a data analytic job Map 1 Reduce 1 Map 2 Reduce 2 Map 3 Reduce 3 Map 4 Reduce 4 all-to-all shuffle

6 What is a Coflow? One stage in a data analytic job Map 1 Reduce 1 Map 2 Reduce 2 Map 3 Reduce 3 Map 4 Reduce 4 Coflow: considered done only when all flows finish

7 Coflow Scheduling Objective: minimizing average coflow completion time Network model: datacenter networking Big switch abstraction network core is congestion-free Job 1 Job 2

8 Coflow Scheduling Objective: minimizing average coflow completion time Network model: datacenter networking Big switch abstraction network core is congestion-free Job 1 Job 2

9 Coflow Scheduling Objective: minimizing average coflow completion time Network model: datacenter networking Big switch abstraction network core is congestion-free Job 1 Job 2

10 Coflow Scheduling Objective: minimizing average coflow completion time Network model: datacenter networking Big switch abstraction network core is congestion-free Job 1 Job 2

11 Coflow Scheduling Objective: minimizing average coflow completion time Network model: datacenter networking Big switch abstraction network core is congestion-free Job 1 Job 2

12 Coflow Scheduling Objective: minimizing average coflow completion time Non-blocking Switch 3 Job 1 Job 2 Coflow 1 Coflow 2

13 Coflow Scheduling Objective: minimizing average coflow completion time Non-blocking Switch 2 3 Job 1 Job 2 Coflow 1 Coflow 2

14 Coflow Scheduling Objective: minimizing average coflow completion time Non-blocking Switch 2 3 Job 1 Job 2 Coflow 1 Coflow 2

15 Wide-Area Data Analytics

16 Wide-Area Data Analytics Data 1 Map 1 Map 2 Map 3 Map 4 Reduce 1 Reduce 2 Datacenter 1 Wide Area Network Data 2 Data 3 Data 4 Datacenter 2 Datacenter 3 Datacenter 4

17 Wide-Area Data Analytics Data 1 Map 1 Datacenter 1 Wide Area Network Reduce 1 Reduce 2 Map 2 Map 3 Map 4 Data 2 Datacenter 2 Data 3 Datacenter 3 Data 4 Datacenter 4

18 With tasks placed in different datacenter, what about their generated inter-datacenter coflows?

19 Challenges Dumb bell network model: inter-datacenter links are the only bottleneck Inter-datacenter link Datacenter A Datacenter B

20 Challenges Constantly changing available bandwidth CA-EU US-EU Measured Bandwidth (Mbps) in a 100s interval

21 Can existing heuristics work? Link 1 Link Estimated Flow Completion Time

22 Coflow scheduling should consider the distribution of available bandwidth.

23 Monte Carlo Simulation

24 Monte Carlo Simulation Scheduling Decision Tree

25 Monte Carlo Simulation Scheduling Decision Tree A [0/0] B [0/0] C [0/0]

26 Monte Carlo Simulation Scheduling Decision Tree A [0/0] B [0/0] C [0/0] B C A C A B

27 Monte Carlo Simulation Scheduling Decision Tree A [0/0] B [0/0] C [0/0] B C A C A B

28 Monte Carlo Simulation Scheduling Decision Tree A [0/0] B [0/0] C [0/0] B C A C A B

29 Monte Carlo Simulation Scheduling Decision Tree A [1/1] [0/0] B [0/1] [0/0] C [0/1] [0/0] B C A C A B

30 Monte Carlo Simulation Scheduling Decision Tree A [29/100] B [68/100] C [3/100] B C A C A B

31 Monte Carlo Simulation Scheduling Decision Tree A [29/100] B [68/100] C [3/100] B C A C A B Complexity? 100 * O(n!)

32 Reduced Simulation Complexity

33 Reduced Simulation Complexity Bounded Search Depth (t n d )

34 Reduced Simulation Complexity Bounded Search Depth (t n d ) Reduced Search Breath (Early termination)

35 <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> Reduced Simulation Complexity Bounded Search Depth (t n d ) Reduced Search Breath (Early termination) O(t n d )

36 <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> Reduced Simulation Complexity Bounded Search Depth (t n d ) Reduced Search Breath (Early termination) O(t n d ) Online Incremental Search

37 <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="srro9rjslifs1siux4d32dxwnfw=">aaab/xicjvdlsgnbeoz1gemrpm5ebomql2fxbd0gvxgzgnlasobz2ukyzhz2mekv4hl8fs8efphqf3jzb5w8diokfjquvd1uu0eihuhx/xdm5hcwl5zzk/nvtfwnzclwdt3eqwa8xmiz62zadzdc8rokllyzae6jqpjgmdgf+41bro2i1tuoe+5htkdevzckvuoudi9jisbpo4i4ieomc0fksfmoemv3avi3kcim1u7hvr3gli24qiapms3ptddpqebbjb/l26nhcwud2umtsxw1yx42+x5edqwskm6s7sgke/xrruyjy4zrydcjin3z0xulv3mtflunfizukijxbbrutsxbmiyrikhqnkecwkkzfvzxwvpuu4a2spz/sqgflt237f0dfytnszpysaf7uaiptqacf1cfgjc4gwd4gmfn3nl0xpzx6eqcm7vzgw9w3j4bru6t0a==</latexit> <latexit sha1_base64="jk9ygumiiksilpdmmqutzvtt2rq=">aaab/3icjvdlsgnbeoynrxhfq4ixl4nbiafdrgh6dhrxzgtzgcsg2ckkgti7u8z0cmhnwv/x4kerr/6gn//gyeogombbq1hvttuvxfiy9lwpjzm3v7c4lf3orayurw+4m1tveywa8qqlzktratvccsurkfdyeqw5dqpja8hgfozxbrk2illxoix5k6q9jbqcubrs2925jawcpiki5iaom7rz6i/iqdvn+0vvavi3ycmm5bb73uxelam5qiapmq3fi7gvuo2cst7knrpdy8ogtmcblipq41rp5p8r2bdkh3qjbuchmahfl1iagjmma7szuuybn95y/m1rjng9bavcxqlyxazb3uqsjmi4dnirmjouq0so08l+slifasrqvpb7xwnvo6lvff2r43zpbfzhfnzhdwrgwwmu4alkuaegd/aat/ds3dupzovzol3nolobbfgg5+0tkpiuqg==</latexit> <latexit sha1_base64="jk9ygumiiksilpdmmqutzvtt2rq=">aaab/3icjvdlsgnbeoynrxhfq4ixl4nbiafdrgh6dhrxzgtzgcsg2ckkgti7u8z0cmhnwv/x4kerr/6gn//gyeogombbq1hvttuvxfiy9lwpjzm3v7c4lf3orayurw+4m1tveywa8qqlzktratvccsurkfdyeqw5dqpja8hgfozxbrk2illxoix5k6q9jbqcubrs2925jawcpiki5iaom7rz6i/iqdvn+0vvavi3ycmm5bb73uxelam5qiapmq3fi7gvuo2cst7knrpdy8ogtmcblipq41rp5p8r2bdkh3qjbuchmahfl1iagjmma7szuuybn95y/m1rjng9bavcxqlyxazb3uqsjmi4dnirmjouq0so08l+slifasrqvpb7xwnvo6lvff2r43zpbfzhfnzhdwrgwwmu4alkuaegd/aat/ds3dupzovzol3nolobbfgg5+0tkpiuqg==</latexit> <latexit sha1_base64="jk9ygumiiksilpdmmqutzvtt2rq=">aaab/3icjvdlsgnbeoynrxhfq4ixl4nbiafdrgh6dhrxzgtzgcsg2ckkgti7u8z0cmhnwv/x4kerr/6gn//gyeogombbq1hvttuvxfiy9lwpjzm3v7c4lf3orayurw+4m1tveywa8qqlzktratvccsurkfdyeqw5dqpja8hgfozxbrk2illxoix5k6q9jbqcubrs2925jawcpiki5iaom7rz6i/iqdvn+0vvavi3ycmm5bb73uxelam5qiapmq3fi7gvuo2cst7knrpdy8ogtmcblipq41rp5p8r2bdkh3qjbuchmahfl1iagjmma7szuuybn95y/m1rjng9bavcxqlyxazb3uqsjmi4dnirmjouq0so08l+slifasrqvpb7xwnvo6lvff2r43zpbfzhfnzhdwrgwwmu4alkuaegd/aat/ds3dupzovzol3nolobbfgg5+0tkpiuqg==</latexit> <latexit sha1_base64="jk9ygumiiksilpdmmqutzvtt2rq=">aaab/3icjvdlsgnbeoynrxhfq4ixl4nbiafdrgh6dhrxzgtzgcsg2ckkgti7u8z0cmhnwv/x4kerr/6gn//gyeogombbq1hvttuvxfiy9lwpjzm3v7c4lf3orayurw+4m1tveywa8qqlzktratvccsurkfdyeqw5dqpja8hgfozxbrk2illxoix5k6q9jbqcubrs2925jawcpiki5iaom7rz6i/iqdvn+0vvavi3ycmm5bb73uxelam5qiapmq3fi7gvuo2cst7knrpdy8ogtmcblipq41rp5p8r2bdkh3qjbuchmahfl1iagjmma7szuuybn95y/m1rjng9bavcxqlyxazb3uqsjmi4dnirmjouq0so08l+slifasrqvpb7xwnvo6lvff2r43zpbfzhfnzhdwrgwwmu4alkuaegd/aat/ds3dupzovzol3nolobbfgg5+0tkpiuqg==</latexit> Reduced Simulation Complexity Bounded Search Depth (t n d ) Reduced Search Breath (Early termination) O(t n d ) Online Incremental Search O(t n d 1 )

38 How to enforce the scheduling decisions?

39 Siphon: System Overview Form a software-defined overlay network Aggregators: measure bandwidth, schedule coflows based on priority assignment Controller: compute priority based on Monte Carlo simulation Datacenter 1 Datacenter 2 Worker Process Worker Process Worker Process Worker Process Aggregator Daemon Aggregator Aggregator Daemon Aggregator Siphon Inter-Datacenter Software-Defined Overlay Controller

40 Performance: Coflow Scheduling Available Bandwidth (Mbps) Normalized CCT (%) Average th Percentile T-V T-M V-T V-M M-T M-V <25% 25-49% 50-74% 75% All

41 With Siphon, we can do more

42 Intra-Coflow Scheduling Largest Flow Group First Flow Group 1 ends Flow Group 2 ends DC M1 M2 M3 100MB 50MB 50MB DC1 to DC2 DC3 to DC2 50MB R1 50MB 50MB R2 DC 3-2 R1 R2 DC 1-2 DC Schedule 1 (LFGFS) Flow Group 2 ends Flow Group 1 ends Time R1 200 R Time Schedule 2

43 Coflow Multipath Routing 35~45Mbps N. Carolina Taiwan Tokyo 84~83Mbps ~586Mbps Taiwan Tokyo Oregon S. Carolina Belgium Figure 13: The summary of inter-datacenter traffic in the shuffle

44 Performance: Intra-Coflow Scheduling Reduce Stage Execution Time (s) Task Execution Shuffle Read Spark Naive Multipath Siphon

45 Performance: Benchmark Workloads Application Run Time (s) Siphon Spark ALS PCA BMM Pearson W2V FG

46 Takeaway Siphon is a software-defined inter-datacenter overlay that realizes: Coflow scheduling in wide-area data analytics: Monte Carlo Simulation Intra-coflow scheduling: Largest Flow Group First Coflow multipath rerouting Shorter coflow completion time leads to better job-level performance

47 Thank you!

Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics

Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics Shuhao Liu, Li Chen and Baochun Li Department of Electrical and Computer Engineering, University of Toronto Abstract It is increasingly

More information

A Hierarchical Synchronous Parallel Model for Wide-Area Graph Analytics

A Hierarchical Synchronous Parallel Model for Wide-Area Graph Analytics A Hierarchical Synchronous Parallel Model for Wide-Area Graph Analytics Shuhao Liu*, Li Chen, Baochun Li, Aiden Carnegie University of Toronto April 17, 2018 Graph Analytics What is Graph Analytics? 2

More information

A Network-aware Scheduler in Data-parallel Clusters for High Performance

A Network-aware Scheduler in Data-parallel Clusters for High Performance A Network-aware Scheduler in Data-parallel Clusters for High Performance Zhuozhao Li, Haiying Shen and Ankur Sarker Department of Computer Science University of Virginia May, 2018 1/61 Data-parallel clusters

More information

Exploiting Inter-Flow Relationship for Coflow Placement in Data Centers. Xin Sunny Huang, T. S. Eugene Ng Rice University

Exploiting Inter-Flow Relationship for Coflow Placement in Data Centers. Xin Sunny Huang, T. S. Eugene Ng Rice University Exploiting Inter-Flow Relationship for Coflow Placement in Data Centers Xin Sunny Huang, T S Eugene g Rice University This Work Optimizing Coflow performance has many benefits such as avoiding application

More information

Coflow. Recent Advances and What s Next? Mosharaf Chowdhury. University of Michigan

Coflow. Recent Advances and What s Next? Mosharaf Chowdhury. University of Michigan Coflow Recent Advances and What s Next? Mosharaf Chowdhury University of Michigan Rack-Scale Computing Datacenter-Scale Computing Geo-Distributed Computing Coflow Networking Open Source Apache Spark Open

More information

Information-Agnostic Flow Scheduling for Commodity Data Centers. Kai Chen SING Group, CSE Department, HKUST May 16, Stanford University

Information-Agnostic Flow Scheduling for Commodity Data Centers. Kai Chen SING Group, CSE Department, HKUST May 16, Stanford University Information-Agnostic Flow Scheduling for Commodity Data Centers Kai Chen SING Group, CSE Department, HKUST May 16, 2016 @ Stanford University 1 SING Testbed Cluster Electrical Packet Switch, 1G (x10) Electrical

More information

Sincronia: Near-Optimal Network Design for Coflows. Shijin Rajakrishnan. Joint work with

Sincronia: Near-Optimal Network Design for Coflows. Shijin Rajakrishnan. Joint work with Sincronia: Near-Optimal Network Design for Coflows Shijin Rajakrishnan Joint work with Saksham Agarwal Akshay Narayan Rachit Agarwal David Shmoys Amin Vahdat Traditional Applications: Care about performance

More information

Optimizing Shuffle in Wide-Area Data Analytics

Optimizing Shuffle in Wide-Area Data Analytics Optimizing Shuffle in Wide-Area Data Analytics Shuhao Liu, Hao Wang, Baochun Li Department of Electrical and Computer Engineering University of Toronto Toronto, Canada {shuhao, haowang, bli}@ece.toronto.edu

More information

Saath: Speeding up CoFlows by Exploiting the Spatial Dimension. Chengkok-Koh

Saath: Speeding up CoFlows by Exploiting the Spatial Dimension. Chengkok-Koh Saath: Speeding up CoFlows by Exploiting the Spatial Dimension Akshay Jajoo Rohan Gandhi Y. Charlie Hu Chengkok-Koh 1 Analytics Jobs in Big Data Analytics jobs in data-centers Process huge amount of data

More information

Running Ad Hoc Reports

Running Ad Hoc Reports Access: WAE Live > Analytics, click New Report, and choose Report Type: Ad hoc Access: WAE Live > Explore, choose objects, and click Run Report Ad hoc reports offer a flexible means of customizing reports

More information

Episode 5. Scheduling and Traffic Management

Episode 5. Scheduling and Traffic Management Episode 5. Scheduling and Traffic Management Part 3 Baochun Li Department of Electrical and Computer Engineering University of Toronto Outline What is scheduling? Why do we need it? Requirements of a scheduling

More information

Varys. Efficient Coflow Scheduling. Mosharaf Chowdhury, Yuan Zhong, Ion Stoica. UC Berkeley

Varys. Efficient Coflow Scheduling. Mosharaf Chowdhury, Yuan Zhong, Ion Stoica. UC Berkeley Varys Efficient Coflow Scheduling Mosharaf Chowdhury, Yuan Zhong, Ion Stoica UC Berkeley Communication is Crucial Performance Facebook analytics jobs spend 33% of their runtime in communication 1 As in-memory

More information

Fast and Accurate Load Balancing for Geo-Distributed Storage Systems

Fast and Accurate Load Balancing for Geo-Distributed Storage Systems Fast and Accurate Load Balancing for Geo-Distributed Storage Systems Kirill L. Bogdanov 1 Waleed Reda 1,2 Gerald Q. Maguire Jr. 1 Dejan Kostic 1 Marco Canini 3 1 KTH Royal Institute of Technology 2 Université

More information

Lecture 7: Data Center Networks

Lecture 7: Data Center Networks Lecture 7: Data Center Networks CSE 222A: Computer Communication Networks Alex C. Snoeren Thanks: Nick Feamster Lecture 7 Overview Project discussion Data Centers overview Fat Tree paper discussion CSE

More information

15-744: Computer Networking. Data Center Networking II

15-744: Computer Networking. Data Center Networking II 15-744: Computer Networking Data Center Networking II Overview Data Center Topology Scheduling Data Center Packet Scheduling 2 Current solutions for increasing data center network bandwidth FatTree BCube

More information

Episode 5. Scheduling and Traffic Management

Episode 5. Scheduling and Traffic Management Episode 5. Scheduling and Traffic Management Part 2 Baochun Li Department of Electrical and Computer Engineering University of Toronto Outline What is scheduling? Why do we need it? Requirements of a scheduling

More information

Sparrow. Distributed Low-Latency Spark Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Sparrow. Distributed Low-Latency Spark Scheduling. Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica Outline The Spark scheduling bottleneck Sparrow s fully distributed, fault-tolerant technique

More information

Heuristic Search in MDPs 3/5/18

Heuristic Search in MDPs 3/5/18 Heuristic Search in MDPs 3/5/18 Thinking about online planning. How can we use ideas we ve already seen to help with online planning? Heuristics? Iterative deepening? Monte Carlo simulations? Other ideas?

More information

HotCloud 17. Lube: Mitigating Bottlenecks in Wide Area Data Analytics. Hao Wang* Baochun Li

HotCloud 17. Lube: Mitigating Bottlenecks in Wide Area Data Analytics. Hao Wang* Baochun Li HotCloud 17 Lube: Hao Wang* Baochun Li Mitigating Bottlenecks in Wide Area Data Analytics iqua Wide Area Data Analytics DC Master Namenode Workers Datanodes 2 Wide Area Data Analytics Why wide area data

More information

6.888 Lecture 8: Networking for Data Analy9cs

6.888 Lecture 8: Networking for Data Analy9cs 6.888 Lecture 8: Networking for Data Analy9cs Mohammad Alizadeh ² Many thanks to Mosharaf Chowdhury (Michigan) and Kay Ousterhout (Berkeley) Spring 2016 1 Big Data Huge amounts of data being collected

More information

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552)

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552) Packet Scheduling in Data Centers Lecture 17, Computer Networks (198:552) Datacenter transport Goal: Complete flows quickly / meet deadlines Short flows (e.g., query, coordination) Large flows (e.g., data

More information

Coflow. Big Data. Data-Parallel Applications. Big Datacenters for Massive Parallelism. Recent Advances and What s Next?

Coflow. Big Data. Data-Parallel Applications. Big Datacenters for Massive Parallelism. Recent Advances and What s Next? Big Data Coflow The volume of data businesses want to make sense of is increasing Increasing variety of sources Recent Advances and What s Next? Web, mobile, wearables, vehicles, scientific, Cheaper disks,

More information

Linux Plumbers Conference TCP-NV Congestion Avoidance for Data Centers

Linux Plumbers Conference TCP-NV Congestion Avoidance for Data Centers Linux Plumbers Conference 2010 TCP-NV Congestion Avoidance for Data Centers Lawrence Brakmo Google TCP Congestion Control Algorithm for utilizing available bandwidth without too many losses No attempt

More information

LAN design. Chapter 1

LAN design. Chapter 1 LAN design Chapter 1 1 Topics Networks and business needs The 3-level hierarchical network design model Including voice and video over IP in the design Devices at each layer of the hierarchy Cisco switches

More information

Episode 5. Scheduling and Traffic Management

Episode 5. Scheduling and Traffic Management Episode 5. Scheduling and Traffic Management Part 2 Baochun Li Department of Electrical and Computer Engineering University of Toronto Keshav Chapter 9.1, 9.2, 9.3, 9.4, 9.5.1, 13.3.4 ECE 1771: Quality

More information

Buffer Requirements for Zero Loss Flow Control with Explicit Congestion Notification. Chunlei Liu Raj Jain

Buffer Requirements for Zero Loss Flow Control with Explicit Congestion Notification. Chunlei Liu Raj Jain Buffer Requirements for Zero Loss Flow Control with Explicit Congestion Notification Chunlei Liu Raj Jain Department of Computer and Information Science The Ohio State University, Columbus, OH 432-277

More information

Quality-Assured Cloud Bandwidth Auto-Scaling for Video-on-Demand Applications

Quality-Assured Cloud Bandwidth Auto-Scaling for Video-on-Demand Applications Quality-Assured Cloud Bandwidth Auto-Scaling for Video-on-Demand Applications Di Niu, Hong Xu, Baochun Li University of Toronto Shuqiao Zhao UUSee, Inc., Beijing, China 1 Applications in the Cloud WWW

More information

TO reduce management costs and improve performance, Delay-Optimized Video Traffic Routing in Software-Defined Inter-Datacenter Networks

TO reduce management costs and improve performance, Delay-Optimized Video Traffic Routing in Software-Defined Inter-Datacenter Networks 1 Delay-Optimized Video Traffic Routing in Software-Defined Inter-Datacenter Networks Yinan Liu, Di Niu, Member, IEEE and Baochun Li, Fellow, IEEE Abstract Many video streaming applications operate their

More information

Wide-Area Spark Streaming: Automated Routing and Batch Sizing

Wide-Area Spark Streaming: Automated Routing and Batch Sizing Wide-Area Spark Streaming: Automated Routing and Batch Sizing Wenxin Li, Di Niu, Yinan Liu, Shuhao Liu, Baochun Li University of Toronto & Dalian University of Technology University of Alberta University

More information

CAVA: Exploring Memory Locality for Big Data Analytics in Virtualized Clusters

CAVA: Exploring Memory Locality for Big Data Analytics in Virtualized Clusters 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing : Exploring Memory Locality for Big Data Analytics in Virtualized Clusters Eunji Hwang, Hyungoo Kim, Beomseok Nam and Young-ri

More information

To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha

To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai, Mosharaf Chowdhury, Harsha Madhyastha Background Over 40 Data Centers (DCs) on EC2, Azure, Google Cloud A geographically denser set of DCs across

More information

CS644 Advanced Networks

CS644 Advanced Networks Outline CS644 Advanced Networks Lecture 9 Intra Domain Routing Andreas Terzis Spring 2004 1 So far we have talked about E2E mechanisms Routing is the other big component of the network Largest distributed

More information

PREDICTIVE DATACENTER ANALYTICS WITH STRYMON

PREDICTIVE DATACENTER ANALYTICS WITH STRYMON PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri kalavriv@inf.ethz.ch QCon San Francisco 14 November 2017 Support: ABOUT ME Postdoc at ETH Zürich Systems Group: https://www.systems.ethz.ch/ PMC

More information

DevoFlow: Scaling Flow Management for High-Performance Networks

DevoFlow: Scaling Flow Management for High-Performance Networks DevoFlow: Scaling Flow Management for High-Performance Networks Andy Curtis Jeff Mogul Jean Tourrilhes Praveen Yalagandula Puneet Sharma Sujata Banerjee Software-defined networking Software-defined networking

More information

Maximizing Link Utilization with Coflow-Aware Scheduling in Datacenter Networks

Maximizing Link Utilization with Coflow-Aware Scheduling in Datacenter Networks Maximizing Link Utilization with Coflow-Aware Scheduling in Datacenter Networks Jingie Jiang, Shiyao Ma, Bo Li, Baochun Li, and Jiangchuan Liu Department of Computer Science and Engineering, Hong Kong

More information

Sincronia: Near-Optimal Network Design for Coflows

Sincronia: Near-Optimal Network Design for Coflows Sincronia: Near-Optimal Network Design for Coflows Saksham Agarwal Cornell University Rachit Agarwal Cornell University Shijin Rajakrishnan Cornell University David Shmoys Cornell University Akshay Narayan

More information

Bohr: Similarity Aware Geo-distributed Data Analytics. Hangyu Li, Hong Xu, Sarana Nutanong City University of Hong Kong

Bohr: Similarity Aware Geo-distributed Data Analytics. Hangyu Li, Hong Xu, Sarana Nutanong City University of Hong Kong Bohr: Similarity Aware Geo-distributed Data Analytics Hangyu Li, Hong Xu, Sarana Nutanong City University of Hong Kong 1 Big Data Analytics Analysis Generate 2 Data are geo-distributed Frankfurt US Oregon

More information

Accelerating Analytical Workloads

Accelerating Analytical Workloads Accelerating Analytical Workloads Thomas Neumann Technische Universität München April 15, 2014 Scale Out in Big Data Analytics Big Data usually means data is distributed Scale out to process very large

More information

Delay Tolerant Bulk Data Transfers on the Internet

Delay Tolerant Bulk Data Transfers on the Internet Delay Tolerant Bulk Data Transfers on the Internet by N. Laoutaris et al., SIGMETRICS 09 Ilias Giechaskiel Cambridge University, R212 ig305@cam.ac.uk March 4, 2014 Conclusions Takeaway Messages Need to

More information

Optimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa

Optimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa Optimizing Network Performance in Distributed Machine Learning Luo Mai Chuntao Hong Paolo Costa Machine Learning Successful in many fields Online advertisement Spam filtering Fraud detection Image recognition

More information

Optimizing Apache Spark with Memory1. July Page 1 of 14

Optimizing Apache Spark with Memory1. July Page 1 of 14 Optimizing Apache Spark with Memory1 July 2016 Page 1 of 14 Abstract The prevalence of Big Data is driving increasing demand for real -time analysis and insight. Big data processing platforms, like Apache

More information

CODA: Toward Automatically Identifying and Scheduling COflows in the DArk

CODA: Toward Automatically Identifying and Scheduling COflows in the DArk : Toward Automatically Identifying and Scheduling COflows in the DArk Hong Zhang Li Chen Bairen Yi Kai Chen Mosharaf Chowdhury 2 Yanhui Geng 3 SING Group, Hong Kong University of Science and Technology

More information

Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics

Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics Presented by: Dishant Mittal Authors: Juwei Shi, Yunjie Qiu, Umar Firooq Minhas, Lemei Jiao, Chen Wang, Berthold Reinwald and Fatma

More information

Aggregation on the Fly: Reducing Traffic for Big Data in the Cloud

Aggregation on the Fly: Reducing Traffic for Big Data in the Cloud Aggregation on the Fly: Reducing Traffic for Big Data in the Cloud Huan Ke, Peng Li, Song Guo, and Ivan Stojmenovic Abstract As a leading framework for processing and analyzing big data, MapReduce is leveraged

More information

Application-Aware SDN Routing for Big-Data Processing

Application-Aware SDN Routing for Big-Data Processing Application-Aware SDN Routing for Big-Data Processing Evaluation by EstiNet OpenFlow Network Emulator Director/Prof. Shie-Yuan Wang Institute of Network Engineering National ChiaoTung University Taiwan

More information

System upgrade and future perspective for the operation of Tokyo Tier2 center. T. Nakamura, T. Mashimo, N. Matsui, H. Sakamoto and I.

System upgrade and future perspective for the operation of Tokyo Tier2 center. T. Nakamura, T. Mashimo, N. Matsui, H. Sakamoto and I. System upgrade and future perspective for the operation of Tokyo Tier2 center, T. Mashimo, N. Matsui, H. Sakamoto and I. Ueda International Center for Elementary Particle Physics, The University of Tokyo

More information

Better Never than Late: Meeting Deadlines in Datacenter Networks

Better Never than Late: Meeting Deadlines in Datacenter Networks Better Never than Late: Meeting Deadlines in Datacenter Networks Christo Wilson, Hitesh Ballani, Thomas Karagiannis, Ant Rowstron Microsoft Research, Cambridge User-facing online services Two common underlying

More information

Appendix B. Standards-Track TCP Evaluation

Appendix B. Standards-Track TCP Evaluation 215 Appendix B Standards-Track TCP Evaluation In this appendix, I present the results of a study of standards-track TCP error recovery and queue management mechanisms. I consider standards-track TCP error

More information

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BRETT WENINGER, MANAGING DIRECTOR 10/21/2014 ADURANT APPROACH TO BIG DATA Align to Un/Semi-structured Data Instead of Big Scale out will become Big Greatest

More information

MixApart: Decoupled Analytics for Shared Storage Systems

MixApart: Decoupled Analytics for Shared Storage Systems MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto, NetApp Abstract Data analytics and enterprise applications have very

More information

Knowledge-Defined Network Orchestration in a Hybrid Optical/Electrical Datacenter Network

Knowledge-Defined Network Orchestration in a Hybrid Optical/Electrical Datacenter Network Knowledge-Defined Network Orchestration in a Hybrid Optical/Electrical Datacenter Network Wei Lu (Postdoctoral Researcher) On behalf of Prof. Zuqing Zhu University of Science and Technology of China, Hefei,

More information

Efficient Coflow Scheduling with Varys

Efficient Coflow Scheduling with Varys Efficient Coflow Scheduling with Varys Mosharaf Chowdhury 1, Yuan Zhong 2, Ion Stoica 1 1 UC Berkeley, 2 Columbia University {mosharaf, istoica}@cs.berkeley.edu, yz2561@columbia.edu ABSTRACT Communication

More information

CSE 124: THE DATACENTER AS A COMPUTER. George Porter November 20 and 22, 2017

CSE 124: THE DATACENTER AS A COMPUTER. George Porter November 20 and 22, 2017 CSE 124: THE DATACENTER AS A COMPUTER George Porter November 20 and 22, 2017 ATTRIBUTION These slides are released under an Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) Creative

More information

Overview Computer Networking What is QoS? Queuing discipline and scheduling. Traffic Enforcement. Integrated services

Overview Computer Networking What is QoS? Queuing discipline and scheduling. Traffic Enforcement. Integrated services Overview 15-441 15-441 Computer Networking 15-641 Lecture 19 Queue Management and Quality of Service Peter Steenkiste Fall 2016 www.cs.cmu.edu/~prs/15-441-f16 What is QoS? Queuing discipline and scheduling

More information

Utilizing Datacenter Networks: Centralized or Distributed Solutions?

Utilizing Datacenter Networks: Centralized or Distributed Solutions? Utilizing Datacenter Networks: Centralized or Distributed Solutions? Costin Raiciu Department of Computer Science University Politehnica of Bucharest We ve gotten used to great applications Enabling Such

More information

Merits of Open Loop. Siamack Ayandeh Onex Communications Corp. a subsidiary of. TranSwitch Corp.

Merits of Open Loop. Siamack Ayandeh Onex Communications Corp. a subsidiary of. TranSwitch Corp. One TM Merits of Open Loop Siamack Ayandeh sayandeh@onexco.com Onex Communications Corp a subsidiary of TranSwitch Corp. Outline Allows for dynamic partitioning between the High and Low priority traffic

More information

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks. Congestion Control in Today s Internet

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks. Congestion Control in Today s Internet Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks Ion Stoica CMU Scott Shenker Xerox PARC Hui Zhang CMU Congestion Control in Today s Internet Rely

More information

A Talari Networks White Paper. Turbo Charging WAN Optimization with WAN Virtualization. A Talari White Paper

A Talari Networks White Paper. Turbo Charging WAN Optimization with WAN Virtualization. A Talari White Paper A Talari Networks White Paper Turbo Charging WAN Optimization with WAN Virtualization A Talari White Paper Turbo Charging WAN Optimization with WAN Virtualization 2 Introduction WAN Virtualization is revolutionizing

More information

Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks

Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks HPI-DC 09 Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks Diego Lugones, Daniel Franco, and Emilio Luque Leonardo Fialho Cluster 09 August 31 New Orleans, USA Outline Scope

More information

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter Glenn Judd Morgan Stanley 1 Introduction Datacenter computing pervasive Beyond the Internet services domain BigData, Grid Computing,

More information

Optimal Slice Allocation in 5G Core Networks

Optimal Slice Allocation in 5G Core Networks arxiv:182.4655v3 [cs.ni] 19 Dec 18 Optimal Slice Allocation in 5G Core Networks Danish Sattar Ashraf Matrawy Department of Systems and Computer Engineering Carleton University Ottawa, Canada danish.sattar@carleton.ca

More information

Lecture 15: Datacenter TCP"

Lecture 15: Datacenter TCP Lecture 15: Datacenter TCP" CSE 222A: Computer Communication Networks Alex C. Snoeren Thanks: Mohammad Alizadeh Lecture 15 Overview" Datacenter workload discussion DC-TCP Overview 2 Datacenter Review"

More information

Amdahl s Law in the Datacenter Era! A Market for Fair Processor Allocation!

Amdahl s Law in the Datacenter Era! A Market for Fair Processor Allocation! Amdahl s Law in the Datacenter Era! A Market for Fair Processor Allocation! Seyed Majid Zahedi* (Duke University), Qiuyun Llull* (VMware/ Duke University), Benjamin C. Lee (Duke University)! *Equal Contributions!

More information

R-Store: A Scalable Distributed System for Supporting Real-time Analytics

R-Store: A Scalable Distributed System for Supporting Real-time Analytics R-Store: A Scalable Distributed System for Supporting Real-time Analytics Feng Li, M. Tamer Ozsu, Gang Chen, Beng Chin Ooi National University of Singapore ICDE 2014 Background Situation for large scale

More information

Deploying Multiple Service Chain (SC) Instances per Service Chain BY ABHISHEK GUPTA FRIDAY GROUP MEETING APRIL 21, 2017

Deploying Multiple Service Chain (SC) Instances per Service Chain BY ABHISHEK GUPTA FRIDAY GROUP MEETING APRIL 21, 2017 Deploying Multiple Service Chain (SC) Instances per Service Chain BY ABHISHEK GUPTA FRIDAY GROUP MEETING APRIL 21, 2017 Virtual Network Function (VNF) Service Chain (SC) 2 Multiple VNF SC Placement and

More information

Pricing Intra-Datacenter Networks with

Pricing Intra-Datacenter Networks with Pricing Intra-Datacenter Networks with Over-Committed Bandwidth Guarantee Jian Guo 1, Fangming Liu 1, Tao Wang 1, and John C.S. Lui 2 1 Cloud Datacenter & Green Computing/Communications Research Group

More information

Accelerate Applications Using EqualLogic Arrays with directcache

Accelerate Applications Using EqualLogic Arrays with directcache Accelerate Applications Using EqualLogic Arrays with directcache Abstract This paper demonstrates how combining Fusion iomemory products with directcache software in host servers significantly improves

More information

R-Storm: A Resource-Aware Scheduler for STORM. Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell

R-Storm: A Resource-Aware Scheduler for STORM. Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell R-Storm: A Resource-Aware Scheduler for STORM Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell Introduction STORM is an open source distributed real-time data stream processing system

More information

IT has now become commonly accepted that the volume of

IT has now become commonly accepted that the volume of 1 Time- and Cost- Efficient Task Scheduling Across Geo-Distributed Data Centers Zhiming Hu, Member, IEEE, Baochun Li, Fellow, IEEE, and Jun Luo, Member, IEEE Abstract Typically called big data processing,

More information

MixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp

MixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp Hadoop Pig, Hive Hadoop + Enterprise storage?! Shared storage

More information

Network Support for Multimedia

Network Support for Multimedia Network Support for Multimedia Daniel Zappala CS 460 Computer Networking Brigham Young University Network Support for Multimedia 2/33 make the best of best effort use application-level techniques use CDNs

More information

SparkBench: A Comprehensive Spark Benchmarking Suite Characterizing In-memory Data Analytics

SparkBench: A Comprehensive Spark Benchmarking Suite Characterizing In-memory Data Analytics SparkBench: A Comprehensive Spark Benchmarking Suite Characterizing In-memory Data Analytics Min LI,, Jian Tan, Yandong Wang, Li Zhang, Valentina Salapura, Alan Bivens IBM TJ Watson Research Center * A

More information

How Data Volume Affects Spark Based Data Analytics on a Scale-up Server

How Data Volume Affects Spark Based Data Analytics on a Scale-up Server How Data Volume Affects Spark Based Data Analytics on a Scale-up Server Ahsan Javed Awan EMJD-DC (KTH-UPC) (https://www.kth.se/profile/ajawan/) Mats Brorsson(KTH), Vladimir Vlassov(KTH) and Eduard Ayguade(UPC

More information

Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization

Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization Wei Chen, Jia Rao*, and Xiaobo Zhou University of Colorado, Colorado Springs * University of Texas at Arlington Data Center

More information

Numerical approach estimate

Numerical approach estimate Simulation Nature of simulation Numericalapproachfor investigating models of systems. Data are gathered to estimatethe true characteristics of the model. Garbage in garbage out! One of the techniques of

More information

NaaS Network-as-a-Service in the Cloud

NaaS Network-as-a-Service in the Cloud NaaS Network-as-a-Service in the Cloud joint work with Matteo Migliavacca, Peter Pietzuch, and Alexander L. Wolf costa@imperial.ac.uk Motivation Mismatch between app. abstractions & network How the programmers

More information

Why Highly Distributed Computing Matters. Tom Leighton, Chief Scientist Mike Afergan, Chief Technology Officer J.D. Sherman, Chief Financial Officer

Why Highly Distributed Computing Matters. Tom Leighton, Chief Scientist Mike Afergan, Chief Technology Officer J.D. Sherman, Chief Financial Officer Why Highly Distributed Computing Matters Tom Leighton, Chief Scientist Mike Afergan, Chief Technology Officer J.D. Sherman, Chief Financial Officer The Akamai Platform The world s largest on-demand, distributed

More information

Streaming & Apache Storm

Streaming & Apache Storm Streaming & Apache Storm Recommended Text: Storm Applied Sean T. Allen, Matthew Jankowski, Peter Pathirana Manning 2010 VMware Inc. All rights reserved Big Data! Volume! Velocity Data flowing into the

More information

Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks

Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks Peng Wang, Hong Xu, Zhixiong Niu, Dongsu Han, Yongqiang Xiong ACM SoCC 2016, Oct 5-7, Santa Clara Motivation Datacenter networks

More information

LIME: A Framework for Lustre Global QoS Management. Li Xi DDN/Whamcloud Zeng Lingfang - JGU

LIME: A Framework for Lustre Global QoS Management. Li Xi DDN/Whamcloud Zeng Lingfang - JGU LIME: A Framework for Lustre Global QoS Management Li Xi DDN/Whamcloud Zeng Lingfang - JGU Why QoS of Lustre? Quality of Service (QoS) is a mechanism to ensure a "guaranteed performance ls latency needs

More information

Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds

Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds Kevin Hsieh Aaron Harlap, Nandita Vijaykumar, Dimitris Konomis, Gregory R. Ganger, Phillip B. Gibbons, Onur Mutlu Machine Learning and Big

More information

Introduction Optimizing applications with SAO: IO characteristics Servers: Microsoft Exchange... 5 Databases: Oracle RAC...

Introduction Optimizing applications with SAO: IO characteristics Servers: Microsoft Exchange... 5 Databases: Oracle RAC... HP StorageWorks P2000 G3 FC MSA Dual Controller Virtualization SAN Starter Kit Protecting Critical Applications with Server Application Optimization (SAO) Technical white paper Table of contents Introduction...

More information

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters CONGA: Distributed Congestion-Aware Load Balancing for Datacenters By Alizadeh,M et al. Motivation Distributed datacenter applications require large bisection bandwidth Spine Presented by Andrew and Jack

More information

End-to-end available bandwidth estimation

End-to-end available bandwidth estimation End-to-end available bandwidth estimation Constantinos Dovrolis Computer and Information Sciences University of Delaware Constantinos Dovrolis - dovrolis@cis.udel.edu, IPAM workshop, March 2002 1 of 28%

More information

Project: IEEE P Working Group for Wireless Personal Area Networks (WPANs)

Project: IEEE P Working Group for Wireless Personal Area Networks (WPANs) Project: IEEE P802.15 Working Group for Wireless Personal Area Networks (WPANs) Title: [Simulation Results for Final Proposal 15-3-0380] Date Submitted: [Aug 2013] Source: [Hongkun Li, Zhuo Chen, Chonggang

More information

Advanced Computer Networks. Flow Control

Advanced Computer Networks. Flow Control Advanced Computer Networks 263 3501 00 Flow Control Patrick Stuedi Spring Semester 2017 1 Oriana Riva, Department of Computer Science ETH Zürich Last week TCP in Datacenters Avoid incast problem - Reduce

More information

Fair Coflow Scheduling without Prior Knowledge

Fair Coflow Scheduling without Prior Knowledge Fair Coflow Scheduling without Prior Knowledge Luping Wang, Wei Wang Hong Kong University of Science and Technology {lwangbm, weiwa}@cse.ust.h bstract Coflow scheduling improves the networing performance

More information

VrooM. Abstract. Defne Gurel, Mycal Tucker, Zack Drach DP2 Report Section R11 & R12 May 9, 2014

VrooM. Abstract. Defne Gurel, Mycal Tucker, Zack Drach DP2 Report Section R11 & R12 May 9, 2014 VrooM Defne Gurel, Mycal Tucker, Zack Drach DP2 Report Section R11 & R12 May 9, 2014 Abstract These days, it is common to run distributed applications in public data centers. In this setting, network congestion

More information

Parallel Systems. Part 7: Evaluation of Computers and Programs. foils by Yang-Suk Kee, X. Sun, T. Fahringer

Parallel Systems. Part 7: Evaluation of Computers and Programs. foils by Yang-Suk Kee, X. Sun, T. Fahringer Parallel Systems Part 7: Evaluation of Computers and Programs foils by Yang-Suk Kee, X. Sun, T. Fahringer How To Evaluate Computers and Programs? Learning objectives: Predict performance of parallel programs

More information

Qos support and adaptive video

Qos support and adaptive video Qos support and adaptive video QoS support in ad hoc networks MAC layer techniques: 802.11 e - alternation of contention based and contention free periods; differentiated (per class) Interframe Spacing

More information

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017 UNIFY DATA AT MEMORY SPEED Haoyuan (HY) Li, CEO @ Alluxio Inc. VAULT Conference 2017 March 2017 HISTORY Started at UC Berkeley AMPLab In Summer 2012 Originally named as Tachyon Rebranded to Alluxio in

More information

The Controlled Delay (CoDel) AQM Approach to fighting bufferbloat

The Controlled Delay (CoDel) AQM Approach to fighting bufferbloat The Controlled Delay (CoDel) AQM Approach to fighting bufferbloat BITAG TWG Boulder, CO February 27, 2013 Kathleen Nichols Van Jacobson Background The persistently full buffer problem, now called bufferbloat,

More information

CSE/EE 461 Lecture 11. Inter-domain Routing. This Lecture. Structure of the Internet. Focus How do we make routing scale?

CSE/EE 461 Lecture 11. Inter-domain Routing. This Lecture. Structure of the Internet. Focus How do we make routing scale? CSE/EE 461 Lecture 11 Inter-domain Routing This Lecture Focus How do we make routing scale? Inter-domain routing ASes and BGP Application Presentation Session Transport Network Data Link Physical sdg //

More information

Page 1. Quality of Service. CS 268: Lecture 13. QoS: DiffServ and IntServ. Three Relevant Factors. Providing Better Service.

Page 1. Quality of Service. CS 268: Lecture 13. QoS: DiffServ and IntServ. Three Relevant Factors. Providing Better Service. Quality of Service CS 268: Lecture 3 QoS: DiffServ and IntServ Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley,

More information

Workload Characterization and Optimization of TPC-H Queries on Apache Spark

Workload Characterization and Optimization of TPC-H Queries on Apache Spark Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research

More information

Lecture 10.1 A real SDN implementation: the Google B4 case. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

Lecture 10.1 A real SDN implementation: the Google B4 case. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it Lecture 10.1 A real SDN implementation: the Google B4 case Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it WAN WAN = Wide Area Network WAN features: Very expensive (specialized high-end

More information

Mohammad Hossein Manshaei 1393

Mohammad Hossein Manshaei 1393 Mohammad Hossein Manshaei manshaei@gmail.com 1393 Voice and Video over IP Slides derived from those available on the Web site of the book Computer Networking, by Kurose and Ross, PEARSON 2 Multimedia networking:

More information

4.1 Interval Scheduling

4.1 Interval Scheduling 41 Interval Scheduling Interval Scheduling Interval scheduling Job j starts at s j and finishes at f j Two jobs compatible if they don't overlap Goal: find maximum subset of mutually compatible jobs a

More information

XCP: explicit Control Protocol

XCP: explicit Control Protocol XCP: explicit Control Protocol Dina Katabi MIT Lab for Computer Science dk@mit.edu www.ana.lcs.mit.edu/dina Sharing the Internet Infrastructure Is fundamental Much research in Congestion Control, QoS,

More information

Proposal for a Resource Allocation Protocol based on 802.1CS LRP

Proposal for a Resource Allocation Protocol based on 802.1CS LRP Proposal for a Resource Allocation Protocol based on 802.1CS LRP Industrial Requirements Feng Chen, Franz-Josef Götz Siemens AG IEEE 802.1 Interim Meeting May 2017, Stuttgart, Germany Siemens AG 2017 History

More information