LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning
1 LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning
Tianyi Chen, Georgios Giannakis (UMN, ECE); Tao Sun, Wotao Yin (UCLA, Math)
NeurIPS 2018
2 Overview
Setup: $M$ workers, indexed by the set $\mathcal{M} := \{1, \ldots, M\}$
J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le et al., "Large-scale distributed deep networks," Proc. NIPS, Lake Tahoe, NV, 2012
3 Overview
Setup: $M$ workers, indexed by the set $\mathcal{M} := \{1, \ldots, M\}$
- Solvers: gradient descent (GD), momentum methods
- Our method improves GD:
  - same convergence rate in theory
  - reduced communication in theory
  - more than 90% communication saving in practice
4 Vanilla GD implementation

$\theta^{k+1} = \theta^k - \alpha \sum_{m \in \mathcal{M}} \nabla \ell_m(\theta^k)$

- Per-iteration communication overhead: $M$ uploads (one per worker)
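The vanilla implementation can be simulated in a few lines. This is a minimal sketch with toy quadratic local losses (the data `A`, `b`, the step size, and iteration count are illustrative, not from the slides): every worker uploads its fresh local gradient each round, so communication grows as $M$ uploads per iteration.

```python
import numpy as np

# Toy setup: M workers, worker m holds a local quadratic loss
# l_m(theta) = 0.5 * ||A_m @ theta - b_m||^2, so its gradient is A_m.T @ (A_m @ theta - b_m).
rng = np.random.default_rng(0)
M, dim = 5, 3
A = [rng.standard_normal((10, dim)) for _ in range(M)]
b = [rng.standard_normal(10) for _ in range(M)]

def local_grad(m, theta):
    """Gradient of worker m's local loss at theta."""
    return A[m].T @ (A[m] @ theta - b[m])

def vanilla_gd(theta, alpha, iters):
    """Every worker uploads a fresh gradient at every iteration: M uploads per round."""
    uploads = 0
    for _ in range(iters):
        grads = [local_grad(m, theta) for m in range(M)]  # all M workers communicate
        uploads += M
        theta = theta - alpha * sum(grads)
    return theta, uploads

theta, uploads = vanilla_gd(np.zeros(dim), alpha=0.005, iters=100)
print(uploads)  # 100 iterations x 5 workers = 500 uploads
```

The upload counter makes the communication cost of the baseline explicit, which is the quantity LAG then reduces.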
5 Prior art
- Communication-efficient distributed learning:
  - Quantized gradient descent [Kashyap et al., '07], [Alistarh et al., '17], [Suresh et al., '17]
  - Increasing computation before communication [Jaggi et al., '14], [Ma et al., '17], [Smith et al., '17]
  - Sparse SGD that transmits only large entries [Aji-Heafield '17], [Sun et al., '17], [Lin et al., '18], [Stich et al., '18]
  - But: the number of communication rounds is not reduced
- Our contribution: adaptively skip communication, with provable communication reduction
6 Our LAG implementation

$\theta^{k+1} = \theta^k - \alpha \Big( \underbrace{\sum_{m \in \mathcal{M}^k} \nabla \ell_m(\theta^k)}_{\text{fresh gradients}} + \underbrace{\sum_{m \in \mathcal{M} \setminus \mathcal{M}^k} \nabla \ell_m(\hat{\theta}_m^k)}_{\text{old gradients}} \Big)$

- Select a subset of workers $\mathcal{M}^k \subseteq \mathcal{M}$ to upload fresh gradients
- Remaining workers in $\mathcal{M} \setminus \mathcal{M}^k$ do not upload; the server reuses their old gradients
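The lazy-aggregation mechanics can be sketched as follows: the server caches one gradient per worker and refreshes only the gradients of the selected subset, reusing stale entries for everyone else. The selection rule here is a deliberately naive round-robin placeholder (one worker per round), not LAG's adaptive rule; the data, step size, and iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
M, dim = 5, 3
A = [rng.standard_normal((10, dim)) for _ in range(M)]
b = [rng.standard_normal(10) for _ in range(M)]

def local_grad(m, theta):
    return A[m].T @ (A[m] @ theta - b[m])

def lag_step(theta, cached, selected, alpha):
    """One lazily aggregated step: only workers in `selected` upload a fresh
    gradient; the server reuses the cached (old) gradient for everyone else."""
    for m in selected:
        cached[m] = local_grad(m, theta)  # fresh gradient: one upload
    # stale gradients of the non-selected workers stay in `cached` as-is
    return theta - alpha * sum(cached)

theta = np.zeros(dim)
cached = [local_grad(m, theta) for m in range(M)]  # initial round: everyone uploads
uploads = M
for k in range(100):
    selected = {k % M}  # toy rule: round-robin, one upload per round
    theta = lag_step(theta, cached, selected, alpha=0.001)
    uploads += 1
print(uploads)  # 5 + 100 = 105 uploads, vs. 500 for vanilla GD on the same run
```

With a small enough step size the stale gradients still approximate the full gradient, so the iterates keep descending while communication drops by roughly a factor of $M$ per round.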
7 LAG: GD under two alternative communication rules

- Worker-side rule (LAG-WK): include worker $m$ in $\mathcal{M}^k$ if the gradient innovation $\|\nabla \ell_m(\theta^k) - \nabla \ell_m(\hat{\theta}_m^k)\|^2$ exceeds a threshold measuring the recent optimization progress $\sum_d \|\theta^{k+1-d} - \theta^{k-d}\|^2$

T. Chen, G. B. Giannakis, T. Sun, and W. Yin, "LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning," Proc. of NIPS, Montreal, Canada, December 3-8, 2018
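The worker-side trigger can be sketched as below. This is a simplified stand-in for the LAG-WK condition: the paper's threshold uses per-delay weights $\xi_d$ over a window of $D$ recent iterate differences, while the single weight `xi` and the default window here are illustrative assumptions.

```python
import numpy as np

def lag_wk_should_upload(grad_new, grad_old, theta_history, alpha, M, xi=0.5, D=10):
    """Simplified worker-side check (constants are illustrative, not the paper's
    exact ones): upload only if the gradient innovation exceeds a threshold
    proportional to recent optimization progress."""
    innovation = np.linalg.norm(grad_new - grad_old) ** 2
    # Progress: squared lengths of up to D most recent iterate updates.
    # With fewer than two iterates the threshold is zero, so the worker uploads.
    diffs = [np.linalg.norm(theta_history[-d] - theta_history[-d - 1]) ** 2
             for d in range(1, min(D, len(theta_history) - 1) + 1)]
    progress = (xi / (alpha ** 2 * M ** 2)) * sum(diffs)
    return bool(innovation >= progress)
```

Each worker evaluates this locally against the gradient it last uploaded, so skipping communication requires no coordination with the server.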
8 LAG: GD under two alternative communication rules

- Worker-side rule (LAG-WK): include worker $m$ in $\mathcal{M}^k$ if the gradient innovation $\|\nabla \ell_m(\theta^k) - \nabla \ell_m(\hat{\theta}_m^k)\|^2$ exceeds a threshold measuring the recent optimization progress $\sum_d \|\theta^{k+1-d} - \theta^{k-d}\|^2$
LAG: GD under two alternative communication rules
q Worker-side rule (LAG-WK): Include worker m in M^k if its gradient innovation exceeds the optimization progress:
‖∇L_m(θ^k) − ∇L_m(θ̂_m^{k−1})‖² ≥ (1/(α²M²)) Σ_{d=1}^{D} ξ_d ‖θ^{k+1−d} − θ^{k−d}‖²
where θ̂_m^{k−1} is the iterate at worker m's last upload (the old gradient), α is the stepsize, and ξ_d are weights.
q Server-side rule (LAG-PS): Include worker m in M^k if (with L_m the smoothness constant of the local loss L_m):
L_m² ‖θ^k − θ̂_m^{k−1}‖² ≥ (1/(α²M²)) Σ_{d=1}^{D} ξ_d ‖θ^{k+1−d} − θ^{k−d}‖²
Skipping under LAG-PS is sufficient for skipping under LAG-WK: by smoothness, ‖∇L_m(θ^k) − ∇L_m(θ̂_m^{k−1})‖ ≤ L_m ‖θ^k − θ̂_m^{k−1}‖, so a small iterate change implies a small gradient innovation.
T. Chen, G. B. Giannakis, T. Sun, and W. Yin, "LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning," Proc. of NeurIPS, Montreal, Canada, December 3-8, 2018.
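The LAG-WK trigger above can be sketched as a small helper. Function and argument names are illustrative, the weight xi is a hypothetical choice, and the stepsize factor 1/α² from the paper's threshold is folded into xi for simplicity:

```python
import numpy as np

def lag_wk_should_upload(grad_new, grad_old, recent_diffs, M, xi=0.5):
    """LAG-WK check at worker m: upload iff the gradient innovation
    outweighs the (scaled) recent optimization progress.

    grad_new:     local gradient at the current iterate theta^k
    grad_old:     local gradient at the last uploaded iterate theta_hat
    recent_diffs: the D most recent iterate differences theta^{k+1-d} - theta^{k-d}
    M:            number of workers
    """
    innovation = float(np.linalg.norm(grad_new - grad_old)) ** 2
    progress = sum(float(np.linalg.norm(d)) ** 2 for d in recent_diffs)
    # Upload only when the stale gradient has drifted enough to matter;
    # otherwise the server keeps reusing the last uploaded gradient.
    return innovation >= xi * progress / M ** 2
```

A worker whose gradient barely changed (innovation near zero) skips the upload; one whose data makes the gradient move a lot triggers it.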
9 Iteration and communication complexity
(nonconvex) Each local loss is smooth. (convex) The loss is convex. (strongly convex) The loss is (restricted) strongly convex.
Theorem 1: In all cases, LAG enjoys the same convergence rate as GD.
Theorem 2: If the local objectives are heterogeneous, LAG requires a smaller number of uploads than GD to reach a given accuracy; e.g., as few as 1/M of GD's uploads.
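As a toy illustration of the communication saving in Theorem 2, the sketch below runs the lazily aggregated loop on heterogeneous local quadratics L_m(θ) = ½θᵀA_mθ − b_mᵀθ. All names and constants (lr, xi, D) are illustrative choices, not the paper's code:

```python
import numpy as np

def run_lag(local_A, local_b, steps=200, lr=0.05, xi=0.5, D=1):
    """Server loop sketch: GD steps along lazily aggregated (possibly stale)
    worker gradients, counting how many uploads actually happen."""
    M, dim = len(local_A), local_b[0].size
    x = np.zeros(dim)
    # Round 0: every worker uploads its gradient once.
    stale = [local_A[m] @ x - local_b[m] for m in range(M)]
    uploads, diffs = M, []
    for _ in range(steps):
        x_new = x - lr * sum(stale)          # descend along aggregated gradient
        diffs = (diffs + [x_new - x])[-D:]   # keep the D most recent differences
        x = x_new
        progress = sum(float(np.linalg.norm(d)) ** 2 for d in diffs)
        for m in range(M):
            g = local_A[m] @ x - local_b[m]
            # LAG-WK trigger: upload only if innovation beats scaled progress.
            if float(np.linalg.norm(g - stale[m])) ** 2 >= xi * progress / M ** 2:
                stale[m] = g
                uploads += 1
    return x, uploads
```

With two nearly flat workers (small smoothness constant) and one steep worker, the flat workers rarely trigger an upload, so the total upload count stays well below the M·steps that plain GD would use, while the iterates still reach the minimizer.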
10 Linear prediction
q Real datasets distributed on M = 9 workers
Cyc-/Num-IAG: cyclic/non-uniform update of the incremental aggregated gradient
M. Lichman, UCI machine learning repository, 2013. [Online]
11 Logistic regression
q Real datasets distributed on M = 9 workers
Ø LAG needs the same number of iterations but fewer uploads
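In this experiment each worker evaluates the gradient of a local regularized logistic loss over its own samples. A minimal version of that gradient, in its standard textbook form (not code from the authors), might look like:

```python
import numpy as np

def local_logistic_grad(theta, X, y, lam=1e-3):
    """Gradient of the average regularized logistic loss on one worker:
    (1/n) * sum_i log(1 + exp(-y_i * x_i' theta)) + (lam/2) * ||theta||^2.

    X: (n, d) local features; y: (n,) labels in {-1, +1}.
    """
    z = y * (X @ theta)
    sigm = 1.0 / (1.0 + np.exp(z))          # sigmoid(-z) = exp(-z)/(1+exp(-z))
    return -(X.T @ (y * sigm)) / len(y) + lam * theta
```

The server only sees these gradient vectors (fresh or stale), never the raw samples, which is what makes uploads the natural unit of communication cost.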
12 Conclusions
q Adaptive communication rules for distributed learning
q Communication is reduced without degrading convergence
Thank You! Poster: Thu Dec 6th, 05:00 -- 07:00 PM, Room 210 & 230 AB, #8
More information