Introduction to Cloud Computing
Dr. Zhenlin Wang
Michigan Tech

Very Short Bio
BS, MS, Peking University
How did I get there? Why CS?
PhD, University of Massachusetts, Amherst
Professor at Tech

Hobbies?
Teaching and Research
Go
Tea & Sports Games
Well, PE was the only course I couldn't get an A in

Cloud Computing is here!
Google Docs
Dropbox, Overleaf: I am using them now
Tencent, Twitter, Facebook
WeChat: 600M users and counting
Netflix, Amazon Prime: I am a subscriber

What is Cloud Computing?
Let's hear from the experts

What is Cloud Computing?
A few years back: the infinite wisdom of the crowds (via Google Suggest)

What is Cloud Computing?
Now

What is Cloud Computing?
"We've redefined Cloud Computing to include everything that we already do.... I don't understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads."
Larry Ellison, Co-founder, CEO of Oracle

What is Cloud Computing?
"It's stupidity. It's worse than stupidity: it's a marketing hype campaign"
Richard Stallman, GNU

What is Cloud Computing?
"Cloud Computing will become a focal point of our work in security. I'm optimistic"
Ron Rivest, the R of RSA

What is Cloud Computing?
It's about jobs! It's about small business!

So, what really is Cloud Computing?
Cloud computing is a new computing paradigm, involving data and/or computation outsourcing, with
Infinite and elastic resource scalability
On-demand, just-in-time provisioning
No upfront cost: pay-as-you-go
That is, use as much or as little as you need, use only when you want, and pay only for what you use.

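A minimal sketch of the pay-as-you-go idea, comparing it with provisioning for peak demand. The demand figures and the price (in cents per server-hour) are made up for illustration and do not come from any real provider.

```python
def pay_as_you_go_cost(hourly_demand, cents_per_server_hour):
    """Cost when renting exactly the number of servers needed each hour."""
    return sum(hourly_demand) * cents_per_server_hour


def peak_provisioned_cost(hourly_demand, cents_per_server_hour):
    """Cost when enough servers for the peak hour are kept running all the time."""
    return max(hourly_demand) * len(hourly_demand) * cents_per_server_hour


# Hypothetical workload: servers needed in each of six hours, with one spike.
demand = [2, 2, 3, 10, 4, 2]
price = 10  # hypothetical price, in cents per server-hour

print("pay-as-you-go:", pay_as_you_go_cost(demand, price), "cents")
print("provision for peak:", peak_provisioned_cost(demand, price), "cents")
```

The gap between the two numbers grows with how spiky the demand is, which is the case where elasticity pays off most.
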
Netflix, Version 1
[Diagram labels: Netflix Home; Movies: master copies; Amazon.com]

What's new in Today's Clouds?
Besides massive scale, three major features:
I. On-demand access: pay-as-you-go, no upfront commitment. Anyone can access it (e.g., Washington Post Hillary Clinton example)
II. Data-intensive nature: what was MBs has now become TBs, PBs. Daily logs, forensics, Web data, photos, videos, etc. Do you know the size of a Wikipedia dump?
III. New cloud programming paradigms: MapReduce/Hadoop, Pig Latin, and many others. High in accessibility and ease of programmability
A combination of one or more of these gives rise to novel and unsolved distributed computing problems in cloud computing.

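To give a feel for the MapReduce paradigm, here is a single-machine sketch of word count in plain Python. It only mimics the map, shuffle, and reduce steps; it does not use Hadoop or any real framework API, and all function names are chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["the cloud", "The grid and the cloud"]))
```

In a real deployment the map and reduce calls run in parallel on many machines and the framework handles the shuffle; the programmer writes only the two small functions.
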
The real story
Computing Utility: holy grail of computer science in the 1960s. Code name: MULTICS (Multiplexed Information and Computing Service)
Why did it fail?
Ahead of its time: lack of communication tech. (In other words, there was NO (public) Internet)
And personal computers became cheaper and stronger

The real story
Mid to late 90s: Grid computing was proposed to link and share computing resources

The real story, continued
Post-dot-com bust, big companies ended up with large data centers, with low utilization
Solution: throw in virtualization technology, and sell the excess computing power
And thus, Cloud Computing was born

Cloud computing provides numerous economic advantages
For clients:
  No upfront commitment in buying/leasing hardware
  Can scale usage according to demand
  Barriers to entry lowered for startups
For providers:
  Increased utilization of datacenter resources

Cloud computing means selling "X as a service"
IaaS: Infrastructure as a Service. Selling virtualized hardware
PaaS: Platform as a Service. Access to a configurable platform/API
SaaS: Software as a Service. Software that runs on top of a cloud

Cloud computing architecture
[Layered diagram, top to bottom]
e.g., Web browser
SaaS, e.g., Google Docs
PaaS, e.g., Google AppEngine
IaaS, e.g., Amazon EC2

Top 10 Obstacles (Berkeley '09)
Obstacle: Opportunity
Availability of Service: Use Multiple Cloud Providers; Use Elasticity to Prevent DDOS
Data Lock-In: Standardize APIs; Compatible SW to enable Surge Computing
Data Confidentiality and Auditability: Deploy Encryption, VLANs, Firewalls; Geographical Data Storage
Data Transfer Bottlenecks: FedExing Disks; Data Backup/Archival; Higher BW Switches
Performance Unpredictability (I/O interferences): Improved VM Support; Flash Memory; Gang Schedule VMs

Top 10 Obstacles
Obstacle: Opportunity
Scalable Storage: Invent Scalable Store
Bugs in Large Distributed Systems: Invent Debugger that relies on Distributed VMs
Scaling Quickly: Invent Auto-Scaler that relies on machine learning; Snapshots for Conservation
Reputation Fate Sharing: Offer reputation-guarding services like those for email
Software Licensing: Pay-for-use licenses; Bulk use sales

My Research
Memory system modeling and virtualization
Dynamic data center resource management

Memory Balancing
Dynamic memory balancing for virtual machines (Zhao & Wang VEE '09, Wang et al. ATC '11)
[Figure labels: 2G? 2G? 2G?]

Memory Balancing
[Figure labels: 473.astar; 2G? 2G? 2G?]

Memory Balancing: Demand Prediction
[Layered diagram]
Control plane: phase detection; miss ratio curve; WSS estimation
Kernel: intermittent memory tracking; dynamic hot set; AVL-tree-based LRU (operations: restore, resize, revoke)
Hardware: L1, L2, DTLB monitoring

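The miss ratio curve and working set size (WSS) estimate named above can be illustrated with a small sketch. It computes LRU stack distances over a page trace with a plain Python list (the slide mentions an AVL-tree-based LRU, which is faster but omitted here), then derives the miss ratio for each memory size. The function names, the trace, and the threshold rule for picking the WSS are assumptions made for illustration, not the method of the cited papers.

```python
def lru_stack_distances(trace):
    """Return the LRU stack distance of each access (None for a first access)."""
    stack = []  # least recently used at the front, most recently used at the end
    distances = []
    for page in trace:
        if page in stack:
            idx = stack.index(page)
            # Depth counted from the most recently used end, starting at 1.
            distances.append(len(stack) - idx)
            stack.pop(idx)
        else:
            distances.append(None)  # cold miss: page never seen before
        stack.append(page)
    return distances


def miss_ratio_curve(trace, max_size):
    """Miss ratio of an LRU memory holding 1..max_size pages."""
    distances = lru_stack_distances(trace)
    curve = []
    for size in range(1, max_size + 1):
        misses = sum(1 for d in distances if d is None or d > size)
        curve.append(misses / len(trace))
    return curve


def estimate_wss(curve, target_miss_ratio):
    """Smallest size whose miss ratio meets the target (hypothetical rule)."""
    for size, ratio in enumerate(curve, start=1):
        if ratio <= target_miss_ratio:
            return size
    return None


trace = ["a", "b", "c", "a", "b", "c", "a", "d"]
curve = miss_ratio_curve(trace, 4)
print(curve)
print(estimate_wss(curve, 0.5))
```

A balancer could compare such curves across virtual machines and move memory toward the one whose miss ratio drops fastest per extra page.
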
Key-Value Store Management
LAMA: Optimized Locality-aware Memory Allocation for Key-value Cache (Hu et al. ATC '15)
[Figure labels: class A; class B]
How to dynamically adjust cache allocation?

Cross-Architecture Co-Tenancy Prediction
NSF CSR '14 with Dr. Laura Brown (CCGRID '15, AAAI PhD Consortium '15)
[Figure: prediction workflow across two hardware configurations]
Hardware Configuration 1 (HW1) and Hardware Configuration 2 (HW2): each has Core 1 and Core 2 with shared cache/memory
Profiling: latency-sensitive programs (astar, gcc) run with batch programs as interference; each batch program reports a pressure score
Sensitivity curves (performance degradation vs. the batch program's pressure score), obtained by curve fitting: y = f_astar(x), y = f_gcc(x) and y = g_astar(x), y = g_gcc(x)
Training (regression) for the cross-architectural mapping:
  Model: g_program = func(f_program, hw1, hw2)
  Model: p_program = func(q_program, hw1, hw2), with pressure scores p_astar, p_gcc and q_astar, q_gcc
Input: latency-sensitive program A's sensitivity function on HW1; batch program B's pressure score on HW1
Output: program A's sensitivity curve on HW2; program B's pressure score on HW2
Final prediction: performance degradation

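As a rough illustration of the curve-fitting step, the sketch below fits a sensitivity function y = f(x) from profiled (pressure score, degradation) pairs and evaluates it at a new pressure score. The linear form, the sample data, and the function names are assumptions for illustration only; the slide does not state which functional form or regression method the actual work uses.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_sensitivity_function(xs, ys):
    """Return f(x): predicted degradation at batch pressure score x."""
    slope, intercept = fit_linear(xs, ys)
    return lambda x: slope * x + intercept


# Hypothetical profiling data for one latency-sensitive program:
# pressure score of the co-running batch program vs. degradation (%).
pressure_scores = [0, 1, 2, 3]
degradation = [1, 3, 5, 7]

f_program = make_sensitivity_function(pressure_scores, degradation)
print(f_program(4))  # predicted degradation at an unprofiled pressure score
```

The cross-architecture models on the slide would then map such a fitted function from one hardware configuration to another; that mapping is not sketched here.
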
Systems research is exciting!
Students are always welcome!
Junior year is the best time to join