LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning


1 LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning. Tianyi Chen, Georgios Giannakis, Tao Sun, Wotao Yin. UMN, ECE / UCLA, Math. NeurIPS 2018.

2 Overview
Distributed learning with a parameter server and a set of workers M := {1,...,M}.
[J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le et al., "Large-scale distributed deep networks," Proc. NIPS, Lake Tahoe, NV, 2012]

3 Overview (cont.)
- Solvers: gradient descent (GD), momentum methods
- Our method improves GD: the same convergence rate in theory, reduced communication in theory, and more than 90% communication saving in practice
[J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le et al., "Large-scale distributed deep networks," Proc. NIPS, Lake Tahoe, NV, 2012]
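For reference, the setup the slides build on can be written out as the following aggregate-loss minimization. The per-worker losses L_m are the paper's notation; the display itself is our reconstruction of a standard distributed-learning formulation:

```latex
% Distributed learning: M workers, each holding a local loss L_m,
% jointly minimizing the aggregate loss at the parameter server.
\begin{equation*}
  \min_{\theta \in \mathbb{R}^d} \;
  L(\theta) \;=\; \sum_{m \in \mathcal{M}} L_m(\theta),
  \qquad \mathcal{M} := \{1, \dots, M\}.
\end{equation*}
```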

4 Vanilla GD implementation
θ^{k+1} = θ^k − α ∑_{m∈M} ∇L_m(θ^k)
- Every worker m computes and uploads its fresh local gradient ∇L_m(θ^k) at each iteration
- Per-iteration communication overhead: M uploads (one per worker)

5 Prior art
- Communication-efficient distributed learning:
  - Quantized gradient descent [Kashyap et al., '07], [Alistarh et al., '17], [Suresh et al., '17]
  - Increasing computation before communication [Jaggi et al., '14], [Ma et al., '17], [Smith et al., '17]
  - Sparse SGD transmitting only large entries [Aji-Heafield '17], [Sun et al., '17], [Lin et al., '18], [Stich et al., '18]
  → but the number of communication rounds is not reduced
- Our contribution: adaptively skip communication, with provable communication reduction

6 Our LAG implementation
θ^{k+1} = θ^k − α ( ∑_{m∈M^k} ∇L_m(θ^k) + ∑_{m∈M∖M^k} ∇L_m(θ̂_m^k) )
(first sum: fresh gradients; second sum: old gradients)
- Select a subset of workers M^k ⊆ M to upload fresh gradients
- Remaining workers in M∖M^k do not upload; the server reuses their old gradients

7 LAG: GD under two alternative communication rules
- Worker-side rule (LAG-WK): include worker m in M^k if its gradient innovation ‖∇L_m(θ^k) − ∇L_m(θ̂_m^k)‖ outweighs the recent optimization progress
[T. Chen, G. B. Giannakis, T. Sun, and W. Yin, "LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning," Proc. of NIPS, Montreal, Canada, December 3-8, 2018]

8 LAG: GD under two alternative communication rules (cont.)
- Worker-side rule (LAG-WK): include worker m in M^k if
  ‖∇L_m(θ^k) − ∇L_m(θ̂_m^k)‖² ≥ c ∑_{d=1}^{D} ‖θ^{k+1−d} − θ^{k−d}‖²
  (gradient innovation ≥ optimization progress, with a threshold constant c > 0 set as in the paper)
- Server-side rule (LAG-PS): include worker m in M^k if
  L_m² ‖θ^k − θ̂_m^k‖² ≥ c ∑_{d=1}^{D} ‖θ^{k+1−d} − θ^{k−d}‖²
  where L_m is the smoothness constant of the local loss L_m
- LAG-PS is a sufficient condition for LAG-WK
[T. Chen, G. B. Giannakis, T. Sun, and W. Yin, "LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning," Proc. of NIPS, Montreal, Canada, December 3-8, 2018]

9 Iteration and communication complexity
Settings: (nonconvex) each local loss is smooth; (convex) the loss is additionally convex; (strongly convex) the loss is (restricted) strongly convex.
Theorem 1. In all three cases, LAG enjoys the same convergence rate as GD.
Theorem 2. If the local objectives are heterogeneous, LAG needs fewer uploads than GD to reach a given accuracy; e.g., as few as 1/M of GD's uploads.
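For readers who want the rates spelled out, these are the standard GD guarantees that Theorem 1 says LAG matches; the displays are our paraphrase with constants omitted, not the slide's own statement:

```latex
% K = iteration count; theta* = minimizer; rho < 1 a contraction factor.
\begin{align*}
  \text{(strongly convex)} &\quad L(\theta^K) - L(\theta^\star) = \mathcal{O}(\rho^K), \\
  \text{(convex)}          &\quad L(\theta^K) - L(\theta^\star) = \mathcal{O}(1/K), \\
  \text{(nonconvex)}       &\quad \min_{k \le K} \|\nabla L(\theta^k)\|^2 = \mathcal{O}(1/K).
\end{align*}
```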

10 Linear prediction
- Real datasets distributed on M = 9 workers
- Baselines Cyc-IAG / Num-IAG: cyclic / non-uniform update of the incremental aggregated gradient
[M. Lichman, UCI machine learning repository, 2013. Online]

11 Logistic regression
- Real datasets distributed on M = 9 workers
→ LAG needs the same number of iterations but fewer uploads

12 Conclusions
- Adaptive communication rules for distributed learning
- Communication is reduced without degrading convergence
Thank You!
Poster: Thu Dec 6th, 05:00--07:00 PM, Room 210 & 230 AB, #8


More information

Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks

Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks Shizhao Sun 1, Wei Chen 2, Jiang Bian 2, Xiaoguang Liu 1, and Tie-Yan Liu 2 1 College of Computer and Control Engineering,

More information

Neural Network Optimization and Tuning / Spring 2018 / Recitation 3

Neural Network Optimization and Tuning / Spring 2018 / Recitation 3 Neural Network Optimization and Tuning 11-785 / Spring 2018 / Recitation 3 1 Logistics You will work through a Jupyter notebook that contains sample and starter code with explanations and comments throughout.

More information

Hyperparameter optimization. CS6787 Lecture 6 Fall 2017

Hyperparameter optimization. CS6787 Lecture 6 Fall 2017 Hyperparameter optimization CS6787 Lecture 6 Fall 2017 Review We ve covered many methods Stochastic gradient descent Step size/learning rate, how long to run Mini-batching Batch size Momentum Momentum

More information

In stochastic gradient descent implementations, the fixed learning rate η is often replaced by an adaptive learning rate that decreases over time,

In stochastic gradient descent implementations, the fixed learning rate η is often replaced by an adaptive learning rate that decreases over time, Chapter 2 Although stochastic gradient descent can be considered as an approximation of gradient descent, it typically reaches convergence much faster because of the more frequent weight updates. Since

More information

TGNet: Learning to Rank Nodes in Temporal Graphs. Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2

TGNet: Learning to Rank Nodes in Temporal Graphs. Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2 TGNet: Learning to Rank Nodes in Temporal Graphs Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2 1 2 3 Node Ranking in temporal graphs Temporal graphs have been

More information

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models DB Tsai Steven Hillion Outline Introduction Linear / Nonlinear Classification Feature Engineering - Polynomial Expansion Big-data

More information

CS Deep Reinforcement Learning HW4: Model-Based RL due October 18th, 11:59 pm

CS Deep Reinforcement Learning HW4: Model-Based RL due October 18th, 11:59 pm CS294-112 Deep Reinforcement Learning HW4: Model-Based RL due October 18th, 11:59 pm 1 Introduction The goal of this assignment is to get experience with model-learning in the context of RL, and to use

More information

EFFICIENT NEURAL NETWORK ARCHITECTURE FOR TOPOLOGY IDENTIFICATION IN SMART GRID. Yue Zhao, Jianshu Chen and H. Vincent Poor

EFFICIENT NEURAL NETWORK ARCHITECTURE FOR TOPOLOGY IDENTIFICATION IN SMART GRID. Yue Zhao, Jianshu Chen and H. Vincent Poor EFFICIENT NEURAL NETWORK ARCHITECTURE FOR TOPOLOGY IDENTIFICATION IN SMART GRID Yue Zhao, Jianshu Chen and H. Vincent Poor Department of Electrical and Computer Engineering, Stony Brook University, Stony

More information

Optimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa

Optimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa Optimizing Network Performance in Distributed Machine Learning Luo Mai Chuntao Hong Paolo Costa Machine Learning Successful in many fields Online advertisement Spam filtering Fraud detection Image recognition

More information

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell Constrained Convolutional Neural Networks for Weakly Supervised Segmentation Deepak Pathak, Philipp Krähenbühl and Trevor Darrell 1 Multi-class Image Segmentation Assign a class label to each pixel in

More information

Guarding user Privacy with Federated Learning and Differential Privacy

Guarding user Privacy with Federated Learning and Differential Privacy Guarding user Privacy with Federated Learning and Differential Privacy Brendan McMahan mcmahan@google.com DIMACS/Northeast Big Data Hub Workshop on Overcoming Barriers to Data Sharing including Privacy

More information

I How does the formulation (5) serve the purpose of the composite parameterization

I How does the formulation (5) serve the purpose of the composite parameterization Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:

More information

Accelerated tokamak transport simulations

Accelerated tokamak transport simulations Accelerated tokamak transport simulations via Neural-Network based regression of TGLF turbulent energy, particle and momentum fluxes by Teobaldo Luda 1 O. Meneghini 2, S. Smith 2, G. Staebler 2 J. Candy

More information

Latent Space Model for Road Networks to Predict Time-Varying Traffic. Presented by: Rob Fitzgerald Spring 2017

Latent Space Model for Road Networks to Predict Time-Varying Traffic. Presented by: Rob Fitzgerald Spring 2017 Latent Space Model for Road Networks to Predict Time-Varying Traffic Presented by: Rob Fitzgerald Spring 2017 Definition of Latent https://en.oxforddictionaries.com/definition/latent Latent Space Model?

More information

A low rank based seismic data interpolation via frequencypatches transform and low rank space projection

A low rank based seismic data interpolation via frequencypatches transform and low rank space projection A low rank based seismic data interpolation via frequencypatches transform and low rank space projection Zhengsheng Yao, Mike Galbraith and Randy Kolesar Schlumberger Summary We propose a new algorithm

More information

A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING

A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING 017 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 5 8, 017, TOKYO, JAPAN A LAYER-BLOCK-WISE PIPELINE FOR MEMORY AND BANDWIDTH REDUCTION IN DISTRIBUTED DEEP LEARNING Haruki

More information

Scale-Out Acceleration for Machine Learning

Scale-Out Acceleration for Machine Learning Scale-Out Acceleration for Machine Learning Jongse Park Hardik Sharma Divya Mahajan Joon Kyung Kim Preston Olds Hadi Esmaeilzadeh Alternative Computing Technologies (ACT) Lab Georgia Institute of Technology

More information

ChainerMN: Scalable Distributed Deep Learning Framework

ChainerMN: Scalable Distributed Deep Learning Framework ChainerMN: Scalable Distributed Deep Learning Framework Takuya Akiba Preferred Networks, Inc. akiba@preferred.jp Keisuke Fukuda Preferred Networks, Inc. kfukuda@preferred.jp Shuji Suzuki Preferred Networks,

More information

Parallel Stochastic Gradient Descent

Parallel Stochastic Gradient Descent University of Montreal August 11th, 2007 CIAR Summer School - Toronto Stochastic Gradient Descent Cost to optimize: E z [C(θ, z)] with θ the parameters and z a training point. Stochastic gradient: θ t+1

More information

Neural Networks (pp )

Neural Networks (pp ) Notation: Means pencil-and-paper QUIZ Means coding QUIZ Neural Networks (pp. 106-121) The first artificial neural network (ANN) was the (single-layer) perceptron, a simplified model of a biological neuron.

More information

arxiv: v1 [cs.cr] 2 Jan 2019

arxiv: v1 [cs.cr] 2 Jan 2019 Secure Computation for Machine Learning With SPDZ Valerie Chen Yale University v.chen@yale.edu Valerio Pastro Yale University valerio.pastro@yale.edu Mariana Raykova Yale University mariana.raykova@yale.edu

More information

Distributed Compressed Estimation Based on Compressive Sensing for Wireless Sensor Networks

Distributed Compressed Estimation Based on Compressive Sensing for Wireless Sensor Networks Distributed Compressed Estimation Based on Compressive Sensing for Wireless Sensor Networks Joint work with Songcen Xu and Vincent Poor Rodrigo C. de Lamare CETUC, PUC-Rio, Brazil Communications Research

More information

Department of Electrical and Computer Engineering

Department of Electrical and Computer Engineering LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION Yi-Le Huang, Jiang Hu and Weiping Shi Department of Electrical and Computer Engineering Texas A&M University OUTLINE Introduction and motivation

More information

COMP6237 Data Mining Data Mining & Machine Learning with Big Data. Jonathon Hare

COMP6237 Data Mining Data Mining & Machine Learning with Big Data. Jonathon Hare COMP6237 Data Mining Data Mining & Machine Learning with Big Data Jonathon Hare jsh2@ecs.soton.ac.uk Contents Going to look at two case-studies looking at how we can make machine-learning algorithms work

More information

CS489/698: Intro to ML

CS489/698: Intro to ML CS489/698: Intro to ML Lecture 14: Training of Deep NNs Instructor: Sun Sun 1 Outline Activation functions Regularization Gradient-based optimization 2 Examples of activation functions 3 5/28/18 Sun Sun

More information

Perceptron Introduction to Machine Learning. Matt Gormley Lecture 5 Jan. 31, 2018

Perceptron Introduction to Machine Learning. Matt Gormley Lecture 5 Jan. 31, 2018 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Perceptron Matt Gormley Lecture 5 Jan. 31, 2018 1 Q&A Q: We pick the best hyperparameters

More information

Cache-efficient Gradient Descent Algorithm

Cache-efficient Gradient Descent Algorithm Cache-efficient Gradient Descent Algorithm Imen Chakroun, Tom Vander Aa and Thomas J. Ashby Exascience Life Lab, IMEC, Leuven, Belgium Abstract. Best practice when using Stochastic Gradient Descent (SGD)

More information

End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow

End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow Ehsan Variani, Tom Bagby, Erik McDermott, Michiel Bacchiani Google Inc, Mountain View, CA, USA

More information

The Alternating Direction Method of Multipliers

The Alternating Direction Method of Multipliers The Alternating Direction Method of Multipliers With Adaptive Step Size Selection Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein October 8, 2015 1 / 30 Introduction Presentation Outline 1 Convex

More information

Inception Network Overview. David White CS793

Inception Network Overview. David White CS793 Inception Network Overview David White CS793 So, Leonardo DiCaprio dreams about dreaming... https://m.media-amazon.com/images/m/mv5bmjaxmzy3njcxnf5bml5banbnxkftztcwnti5otm0mw@@._v1_sy1000_cr0,0,675,1 000_AL_.jpg

More information