Basics of Cloud Computing MTAT.08.027 Basics of Cloud Computing (3 ECTS) MTAT.08.011 Basics of Grid and Cloud Computing Satish Srirama satish.srirama@ut.ee
Course Purpose Introduce cloud computing concepts Introduce cloud providers Introduction to distributed computing algorithms like MapReduce Glance of research at distributed systems group in cloud computing domain http://courses.cs.ut.ee/2012/cloud/ 4/4/2012 Satish Srirama 2/33
Questions Is everyone comfortable with data structures? How comfortable you are with algorithms? How comfortable you are with programming? Java? External APIs? Python I assume you are Web programming Were you able to submit all exercises in Grid part? 4/4/2012 Satish Srirama 3/33
Outline Cloud computing Cloud providers SciCloud- http://ds.cs.ut.ee/research/scicloud MapReduce MapReduce in different domains Cloud platforms - Aneka Mobile Cloud 4/4/2012 Satish Srirama 4/33
Grading Written exam 50% Labs 20% 4 lab exercises Active participation in the lectures (Max 5%) Project 30% 1 man week task/per person Duration of 4 weeks Groups should deliver bigger tasks Advertised after the 3 rd lecture 4/4/2012 Satish Srirama 5/33
Course schedule 04.04Basics of Cloud computing 11.04Cloud Providers 18.04 MapReduce No lecture Me and Pelle will be on a EU project meeting 02.05 Working with Hadoop 09.05MapReduce Algorithms 16.05MapReduce in Information Retrieval 23.05Aneka and other PaaS 4/4/2012 Satish Srirama 6/33
Course schedule -continued Labs 05-10.04 Starting with a cloud 12-17.04 Working with SciCloud 19-24.04 MapReduce. Projects will be advertised. Remaining lab will be advertised in sometime Projects will have 2 meetings Intermediary presentation Final presentation Deliverables will be discussed when projects are advertised Dates will be announced later based on consensus 4/4/2012 Satish Srirama 7/33
Course schedule -continued Exam 30 th May 2012 Second exam 5 th June 2012 Students who want their results by end of th May should submit their projects by 29 th May Other students can submit projects by 12 th June Examination for second attempt students 13 th June 4/4/2012 Satish Srirama 8/33
Lecture 1 CLOUD COMPUTING 4/4/2012 Satish Srirama 9
It s nothing new It s a trap...we ve redefined Cloud It s worse than stupidity: it s Computing to include everything marketing hype. Somebody is that we already do... I don t saying this is inevitable and understand what we would do whenever you hear that, it s very differently... other than change likely to be a set of businesses the wording of some of our ads. campaigning to make it true. WHAT IS CLOUD COMPUTING? Larry Ellison, CEO, Oracle (Wall Richard Stallman, Founder, Free Street Journal, Sept. 26, 2008) Software Foundation (The Guardian, Sept. 29, 2008) No consistent answer! Everyone thinks it is something else 4/4/2012 Satish Srirama 10 Slide taken from Professor Anthony D. Joseph s lecture at RWTH Aachen
What is Cloud Computing? Computing as a utility Utility services e.g. water, electricity, gas etc Consumers pay based on their usage Cloud Computing characteristics Illusion of infinite resources No up-front cost Fine-grained billing (e.g. hourly) Gartner: Cloud computing is a style of computing where massively scalable IT-related capabilities are provided as a service across the Internet to multiple external customers 4/4/2012Satish Srirama 11/33
Timeline 4/4/2012 Satish Srirama 12/33
How Cloud & Grid are related Share a lot commonality Intention, architecture and technology Differences Programming model, business model, compute model, applications, and Virtualization. The problems are mostly the same Manage large facilities; Define methods by which consumers discover, request and use resources provided by the central facilities; Implement the often highly parallel computations that execute on those resources. 4/4/2012 Satish Srirama 13/33
Virtualization Virtualization techniques are the basis of the cloud computing Virtualization technologies partition hardware and thus provide flexible and scalable computing platforms Virtual machine techniques App App App VMware and Xen OpenNebula Amazon EC2 Grid OS OS OS Hypervisor Hardware Virtualized Stack do not rely on virtualization as much as Clouds do, each individual organization maintain full control of their resources For cloud, virtualization is almost an indispensable ingredient 4/4/2012 Satish Srirama 14/33
Enabling Grids for E-sciencE - EGEE 4/4/2012 Satish Srirama 15/33
Clouds -Why Now (not then)? Commoditization of HW x86 as universal ISA, plus fast virtualization Bet: Can statistically multiplex multiple instances onto a single box without interference between instances Web 2.0 Standard software stack, largely open source (LAMP) Asynchronous JavaScript and XML (AJAX) Novel economic model: fine grain billing Earlier examples: Sun, Intel Computing Services longer commitment, more $$$/hour Infrastructure software: e.g. Google FileSystem, HDFS Operational expertise: failover, DDoS, firewalls... More pervasive broadband Internet ISA - Instruction Set Architecture LAMP Linux, Apache Http Server, MySQL, PHP, perlor python 4/4/2012 Satish Srirama 16/33
Cloud Computing -Services Software as a Service SaaS A way to access applications hosted on the web through your web browser Platform as a Service PaaS A pay-as-you-go model for IT resources accessed over the Internet Infrastructure as a Service IaaS Use of commodity computers, distributed across Internet, to perform parallel processing, distributed storage, indexing and mining of data Virtualization SaaS Facebook, Flikr,Myspace.com, Google maps API, Gmail PaaS Google App Engine, Force.com, Hadoop, Azure, Amazon S3, etc IaaS Amazon EC2, SciCloud, Joyent Accelerators, Nirvanix Storage Delivery Network, etc. Level of Abstraction 4/4/2012 Satish Srirama 17/33
Cloud Computing -Themes Massively scalable On-demand & dynamic Only use what you need -Elastic No upfront commitments, use on short term basis Accessible via Internet, location independent Transparent Complexity concealed from users, virtualized, abstracted Service oriented Easy to use SLAs SLA Service Level Agreement 4/4/2012Satish Srirama 18/33
Cloud Models Internal (private) cloud Cloud with in an organization Community cloud Cloud infrastructure jointly owned by several organizations Public cloud Cloud infrastructure owned by an organization, provided to general public as service Hybrid cloud Composition of two or more cloud models 4/4/2012 Satish Srirama 19/33
Short Term Implications of Clouds Startups and prototyping Minimize infrastructure risk Lower cost of entry Batch jobs One-off tasks Washington post, NY Times Cost associatively for scientific applications Research at scale 4/4/2012 Satish Srirama 20/33
Cloud Application Demand Many cloud applications have cyclical demand curves Daily, weekly, monthly, Re esources Demand Workload spikes are more frequent and significant When some event happens like a pop star has expired: More # tweets, Wikipedia traffic increases 22% of tweets, 20% of Wikipedia traffic when Michael Jackson expired in 2009 Google thought they are under attack Time 4/4/2012 Satish Srirama 21/33
Economics of Cloud Users Pay by use instead of provisioning for peak Resource es Capacity Demand Resource es Capacity Demand Time Static data center Time Data center in the cloud Unused resources 4/4/2012 Satish Srirama 22/33
Economics of Cloud Users -continued Risk of over-provisioning: underutilization Huge sunk cost in infrastructure Capacity Unused resources Resources Demand Time Static data center 4/4/2012 Satish Srirama 23/33
Economics of Cloud Users -continued Heavy penalty for under-provisioning Resources s 1 2 3 Time (days) Capacity Demand Resources Resources 1 2 3 Time (days) Lost revenue 1 2 3 Time (days) Lost users 4/4/2012 Satish Srirama 24/33 Capacity Demand Capacity Demand
Economics of Cloud Providers Building a very large-scale datacenter is very expensive $100+ Million (Minimum) Large Internet Companies Already Building Huge DCs Google, Amazon, Microsoft 5-7x economies of scale [Hamilton 2008] Resource Cost in Medium DC Cost in Very Large DC Ratio Network $95 / Mbps / month $13 / Mbps / month 7.3x Storage $2.20 / GB / month $0.40 / GB / month 5.5x Administration 140 servers/admin >1000 servers/admin 7.1x 4/4/2012 Satish Srirama 25/33
Power Economics of Cloud Providers - Price per KWH Where continued Possible Reasons Why 3.6 Idaho Hydroelectric power; not sent long distance 10.0 California Electricity transmitted long distance over the grid; limited transmission lines in Bay Area; no coal fired electricity allowed in California. 18.0 Hawaii Must ship fuel to generate electricity Cooling is also expensive Build data centers near rivers Extra benefits Amazon: utilize off-peak capacity Microsoft: sell.net tools Google: reuse existing infrastructure 4/4/2012 Satish Srirama 26/33
Economics of Cloud Providers -Failures Cloud Computing providers bring a shift from high reliability/availability servers to commodity servers At least one failure per day in large datacenter Why? Significant economic incentives much lower per-server cost Caveat: User software has to adapt to failures Very hard problem! Solution: Replicate data and computation MapReduce & Distributed File System (Will discuss later in Lecture 3) 4/4/2012 Satish Srirama 27/33
Adoption Challenges Availability Data lock-in Challenge Data Confidentiality and Auditability Opportunity Multiple providers & Use elasticity to prevent DDoS attacks Standardization Encryption, VLANs, Firewalls; Geographical Data Storage 4/4/2012 Satish Srirama 28/33
Growth Challenges Challenge Data transfer bottlenecks Performance unpredictability Scalable storage Bugs in large distributed systems Scaling quickly Opportunity FedEx-ing disks, Data Backup/Archival Improved VM support, flash memory, scheduling VMs Invent scalable store Invent Debugger that relies on Distributed VMs Invent Auto-Scaler; Snapshots for conservation 4/4/2012 Satish Srirama 29/33
Policy and Business Challenges Challenge Opportunity Reputation Fate Sharing Offer reputation-guarding services like those for email Software Licensing Pay-for-use licenses; Bulk use sales 4/4/2012 Satish Srirama 30/33
Long Term Implications of clouds Application software: Cloud & client parts, disconnection tolerance Infrastructure software: Resource accounting, VM awareness Hardware systems: Containers, energy proportionality 4/4/2012 Satish Srirama 31/33
Next lecture Cloud providers Amazon EC2, S3, EBS Eucalyptus SciCloud 4/4/2012 Satish Srirama 32/33
References Several of the slides are taken from Prof. Anthony D. Joseph s lecture at RWTH Aachen (March 2010) Papers to read M. Armbrustet al., Above the Clouds, A Berkeley View of Cloud Computing, Technical Report, University of California, Feb, 2009. The Cloud: Battle of the Tech Titans Cover story in Businessweek http://www.businessweek.com/magazine/content/11_11/b4219052599182.htm 4/4/2012 Satish Srirama 33/33