CS454/654 Distributed Systems M. Tamer Özsu DC 3350 tozsu@uwaterloo.ca Course Objective This course provides an introduction to the fundamentals of distributed computer systems, assuming the availability of facilities for data transmission. The structure of distributed systems using multiple levels of software is emphasized. Specific topics include: distributed algorithms distributed file systems distributed databases, security and protection distributed services such as the world-wide web, and examples of research and commercial distributed systems CS454/654 0-2 1
Course Modules Module 1: Fundamental Architectures and Models of Distributed Systems Module 2: Brief Overview of Computer Networks Module 3: Distributed Objects and Remote Invocation Module 4: Distributed Naming Module 5: Distributed File Systems Module 6: Synchronization Module 7: Data Replication Module 8: Fault Tolerance Module 9: Security CS454/654 0-3 Course Information Intended Audience: CS 454 is a course for CS major students and is normally completed in the fourth year. Related Courses: Prerequisites: CS 354 (CS 350)and fourth-year standing in a CS major program. Antirequisites: CS 436, E&CE 454. CS 456 is not a prerequisite but provides information about the underlying facilities assumed in this course. CS454/654 0-4 2
Course Information Lectures: TR 8:30-9:30 & 11:30-12:50 in MC 4042 Office Hours: T 10:00-11:15 in DC 3350 Assignments: 3 assignments, a combination of answering questions and implementation work. Exams: A Midterm and a Final Midterm scheduled for 23 June, 7:00-9:00 PM in MC 2034 & 2035 CS454/654 0-5 Course Documents Textbook: A.S. Tanenbaum and M. van Steen, Distributed Systems: Principles and Paradigms, Prentice-Hall, 2002. Other References: G. Colouris, J. Dollimore, and T. Kindberg, Distributed Systems: Concepts and Design, 3rd edition, Addison-Wesley, 2001. M.L. Liu, Distributed Computing Principles and Applications, Addison-Wesley, 2004. J. Kurose and K. Ross, Computer Networking: A Top-Down Approach Featuring the Internet, Addison-Wesley, 2001 Course Home Page: http://db.uwaterloo.ca/~tozsu/courses/cs454 Home page will include all the slides used in the course. Slides will also be available for purchase as course notes. Course Newsgroup: uw.cs.cs454 CS454/654 0-6 3
Administravia Grading CS 454 CS 654 Assignments 30% 22% (Assignments 2&3 only) Midterm 30% 20% Final 40% 40% Project 18% Assignments 1& 3 are 8%, 2 is 14% (programming) Students have to pass the exams to obtain a passing grade in the course. Announcements In class, on the course Web page, and on the newsgroup; material will also be available electronically Collaboration Collaborate on assignments, but do not merely copy. CS454/654 0-7 What s a Distributed System? A distributed system is a collection of independent computers that appear to the users of the system as a single computer Example: a network of workstations allocated to users a pool of processors in the machine room allocated dynamically a single file system (all users access files with the same path name) user command executed in the best place (user workstation, a workstation belonging to someone else, or on an unassigned processor in the machine room) CS454/654 0-8 4
Why Distributed? Economics Speed Inherent distribution Reliability Incremental growth Microprocessors offer a better price/ performance than mainframes A distributed system may have more total computing power than a mainframe Some applications involve spatially separated machines If one machine crashes, the system as a whole can still survive Computing power can be added in small increments CS454/654 0-9 Primary Features Multiple computers Concurrent execution Independent operation and failures Communications Ability to communicate No tight synchronization (no global clock) Virtual Computer Transparency CS454/654 0-10 5
Types of Transparency Access Location Concurrency Replication Failure Mobility Performance Scaling local and remote resources are accessed using identical operations resources are accessed without knowledge of their location several processes operate concurrently using shared resources without interference between them multiple instances of resources appeare as a single instance the concealment of faults from users the movement of resources and clients within a system (also called migration transparency) the system can be reconfigured to improve performance the system and applications can expand in scale without change to the system structure or the application algorithms CS454/654 0-11 Typical Layering in DSs Applications, services Middleware Operating system Platform Computer and network hardware CS454/654 0-12 6
Example Internet (Portion of it) ISP intranet backbone satellite link desktop computer: server: network link: From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. CS454/654 Addison-Wesley Publishers 2000 0-13 Example An Intranet print and other servers email server Desktop computers Web server Local area network email server File server print other servers the rest of the Internet router/firewall From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. CS454/654 Addison-Wesley Publishers 2000 0-14 7
Example Mobile Environment Internet Host intranet Wireless LAN WAP gateway Home intranet Printer Camera Mobile phone Laptop Host site From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. CS454/654 Addison-Wesley Publishers 2000 0-15 Example - Web www.google.com Web servers http://www.google.comlsearch?q=kindberg Browsers www.cdk3.net www.w3c.org Internet http://www.cdk3.net/ File system of www.w3c.org Protocols http://www.w3c.org/protocols/activity.html Activity.html From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. CS454/654 Addison-Wesley Publishers 2000 0-16 8
Heterogeneity Networks Hardware OS Programming Languages Implementations Openness Challenges (I) Can you extend and re-implement the system? Public and published interfaces Uniform communication mechanism CS454/654 0-17 Scalability Challenges (II) Can the system behave properly if the number of users and components increase? Scale-up: increase the number of users while keeping the system performance unaffected Speed-up: improvement in the system performance as system resources are increased Impediments Centralized data A single file Centralized services A single server Centralized algorithms Algorithms that know it all CS454/654 0-18 9
Failure handling Partial failures Challenges (III) Can non-failed components continue operation? Can the failed components easily recover? Failure detection Failure masking Failure tolerance Recovery An important technique is replication Why does the space shuttle has 3 on-board computers? CS454/654 0-19 Challenges (IV) Concurrency Multiple clients sharing a resource how to you maintain integrity of the resource and proper operation (without interference) of these clients? Example: bank account Assume your account has a balance of $100 You are depositing from an ATM a cheque for $50 into that account Someone else is withdrawing from the same account $30 using another ATM but at the same time What should the final balance of the account be? $120 $70 $150 CS454/654 0-20 10
Challenges (V) Transparency Not easy to maintain Example: EMP(ENO,ENAME,TITLE,LOC) PROJECT(PNO,PNAME,LOC) PAY(TITLE,SAL) ASG(ENO,PNO,DUR) SELECT ENAME,SAL FROM EMP,ASG,PAY WHERE DUR > 12 AND EMP.ENO = ASG.ENO AND PAY.TITLE = EMP.TITLE Boston Boston projects Boston employees Boston assignments New York Tokyo Communication Network Boston projects New York employees New York projects New York assignments Paris Paris projects Paris employees Paris assignments Boston employees Montreal Montreal projects Paris projects New York projects with budget > 200000 Montreal employees Montreal assignments CS454/654 0-21 11