Send me up to 5 good questions in your opinion, I ll use top ones Via direct message at slack. Can be a group effort. Try to add some explanation.

Similar documents
Discussion #4 CSS VS XSLT. Multiple stylesheet types with cascading priorities. One stylesheet type

CS454/654 Midterm Exam Fall 2004

Cloud Computing CS

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

Case Study Schedule. NoSQL/NewSQL Database Distributed File System Peer-to-peer (P2P) computing Cloud computing Big Data Internet of Thing

Scalability of web applications

REST Easy with Infrared360

Announcements. me your survey: See the Announcements page. Today. Reading. Take a break around 10:15am. Ack: Some figures are from Coulouris

Operating Systems. Week 13 Recitation: Exam 3 Preview Review of Exam 3, Spring Paul Krzyzanowski. Rutgers University.

CS 416: Operating Systems Design April 22, 2015

Chapter 18 Distributed Systems and Web Services

Internet Technology. 06. Exam 1 Review Paul Krzyzanowski. Rutgers University. Spring 2016

Internet Technology 3/2/2016

TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica

ReST 2000 Roy Fielding W3C

Distributed Architectures & Microservices. CS 475, Spring 2018 Concurrent & Distributed Systems

Scaling DreamFactory

Module 6 Node.js and Socket.IO

Distributed Systems. 21. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2018

CS November 2017

CS November 2018

Traditional Web Based Systems

Question 1 (6 points) Compare circuit-switching and packet-switching networks based on the following criteria:

Information Network Systems The application layer. Stephan Sigg

Achieving Scalability and High Availability for clustered Web Services using Apache Synapse. Ruwan Linton WSO2 Inc.

Internet Technology 2/18/2016

PLEASE READ CAREFULLY BEFORE YOU START

EEC-682/782 Computer Networks I

Large-Scale Web Applications

CSE 123b Communications Software

Chapter 10 DISTRIBUTED OBJECT-BASED SYSTEMS

CAS 703 Software Design

PLEASE READ CAREFULLY BEFORE YOU START

PLEASE READ CAREFULLY BEFORE YOU START

Data Management in Application Servers. Dean Jacobs BEA Systems

IERG 4080 Building Scalable Internet-based Services

Operating Systems Design Exam 3 Review: Spring Paul Krzyzanowski

Today s class. CSE 123b Communications Software. Telnet. Network File System (NFS) Quick descriptions of some other sample applications

Deployment Scenarios for Standalone Content Engines

Background. 20: Distributed File Systems. DFS Structure. Naming and Transparency. Naming Structures. Naming Schemes Three Main Approaches

BlackBerry Enterprise Server for IBM Lotus Domino Version: 5.0. Administration Guide

Distributed System: Definition

CSE/EE 461 HTTP and the Web

02 - Distributed Systems

CS 537: Introduction to Operating Systems Fall 2015: Midterm Exam #4 Tuesday, December 15 th 11:00 12:15. Advanced Topics: Distributed File Systems

02 - Distributed Systems

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction

Distributed Systems Principles and Paradigms

PLEASE READ CAREFULLY BEFORE YOU START

CPS 512 midterm exam #1, 10/7/2016

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Session 8. Reading and Reference. en.wikipedia.org/wiki/list_of_http_headers. en.wikipedia.org/wiki/http_status_codes

AWS Lambda + nodejs Hands-On Training

Microservices at Netflix Scale. First Principles, Tradeoffs, Lessons Learned Ruslan

Joseph Faber Wonderful Talking Machine (1845)

CS 677 Distributed Operating Systems. Programming Assignment 3: Angry birds : Replication, Fault Tolerance and Cache Consistency

SaaS Providers. ThousandEyes for. Summary

Application Level Protocols

Application Layer Introduction; HTTP; FTP

Advanced Distributed Systems

Notes. Submit homework on Blackboard The first homework deadline is the end of Sunday, Feb 11 th. Final slides have 'Spring 2018' in chapter title

Four times Microservices: REST, Kubernetes, UI Integration, Async. Eberhard Fellow

COMP6218: Content Caches. Prof Leslie Carr

RKN 2015 Application Layer Short Summary

Distributed File Systems. CS 537 Lecture 15. Distributed File Systems. Transfer Model. Naming transparency 3/27/09

CCDP-ARCH. Section 11. Content Networking

GFS-python: A Simplified GFS Implementation in Python

416 practice questions (PQs)

SPDY - A Web Protocol. Mike Belshe Velocity, Dec 2009

ThousandEyes for. Application Delivery White Paper

Yahoo Traffic Server -a Powerful Cloud Gatekeeper

Controller/server communication

Sharding & CDNs. CS 475, Spring 2018 Concurrent & Distributed Systems

Using peer to peer. Marco Danelutto Dept. Computer Science University of Pisa

Review for Internet Introduction

Replication in Distributed Systems

CSCI 466 Midterm Networks Fall 2013

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

Enabling Embedded Systems to access Internet Resources

System Models for Distributed Systems

JXTA TM Technology for XML Messaging

Last Class: RPCs and RMI. Today: Communication Issues

AWS Lambda: Event-driven Code in the Cloud

Unit 28 Website Production ASSIGNMENT 1

BUILDING MICROSERVICES ON AZURE. ~ Vaibhav

Summary of Data Communications

BlackBerry Enterprise Server for Microsoft Exchange Version: 5.0. Feature and Technical Overview

殷亚凤. Processes. Distributed Systems [3]

WHITE PAPER NGINX An Open Source Platform of Choice for Enterprise Website Architectures

ENHANCE APPLICATION SCALABILITY AND AVAILABILITY WITH NGINX PLUS AND THE DIAMANTI BARE-METAL KUBERNETES PLATFORM

Lecture 8: February 17

CS6601 DISTRIBUTED SYSTEM / 2 MARK

CS603: Distributed Systems

GroupWise Architecture and Best Practices. WebAccess. Kiran Palagiri Team Lead GroupWise WebAccess

Java Architectures A New Hope. Eberhard Wolff

System Specification

CSC 401 Data and Computer Communications Networks

Electronic Mail. Three Components: SMTP SMTP. SMTP mail server. 1. User Agents. 2. Mail Servers. 3. SMTP protocol

CSC 4900 Computer Networks:

Transcription:

Notes Midterm reminder Second midterm next week (04/03), regular class time 20 points, more questions than midterm 1 non-comprehensive exam: no need to study modules before midterm 1 Online testing like Midterm 1. Same exam structure. Remember to bring your computer Send me up to 5 good questions in your opinion, I ll use top ones Via direct message at slack. Can be a group effort. Try to add some explanation. If you want to learn more on Distributed Systems (including Cloud Computing) and/or Big Data, please consider enroll 'Independent Study' and/or 'Master Thesis' with me You could do in the summer/winter or regular semesters IS 651: Distributed Systems 1

Case Study Notes The goal of case study Learn the latest techniques in distributed systems know and collaborate with other team members Learn from other teams Process Team building: results have been sent out Select topic: talk to your team members on which topic to work on after class Inform your selection: post your topic on slack (case-study channel) after class. For a topic, the team who post earliest get it. Search and select paper/project you want to work on: https://scholar.google.com/, http://www.apache.org/ Send me an email and get my approval Work as a team Present as a team on week 14 (05/01): every member presents IS 651: Distributed Systems 2

Case Study Topics Team Topic 1 NoSQL/NewSQL Database 2 Parallel Computing 3 Distributed File System 4 Peer-to-peer (P2P) computing 5 Micro Services 6 Cloud computing Big Data IS 651: Distributed Systems 3

Common Mistakes for HW4 The Guardian html doesn't have required 2 additional variables from the guardian JSON The Flickr urls should be REST based, not static xml urls No actual content for the last Flickr url IS 651: Distributed Systems 4

Discussion #5 A weather Web service maintains the current weather information which gives different results for the same place when you invoke it at different times This Web service is a stateless service because the execution result doesn t rely on previous executions requested by the same client. Stateless service is about execution independence. The Web service could keep the weather state in database or external resources. So stateless doesn t mean no state at server side. IS 651: Distributed Systems 5

IS 651: Distributed Systems Chapter 8: Distributed Systems Basics Jianwu Wang Spring 2018

Learning Outcomes After learning this chapter, you should be able to Understand each technique (why we need it, how it works) Understand differences between similar techniques IS 651: Distributed Systems 7

Distributed Systems Basics Caching Load-balancing Distributed naming Database replication Processes and threads Push technology Microservice Server virtualization (see Chapter 13) IS 651: Distributed Systems 8

Caching Caching is an optimization for distributed systems that reduces latency and decreases network traffic There are two categories of this kind of network-based cache: Web caching Application caching IS 651: Distributed Systems 9

Web Caching There are four locations for web cache: Web browser cache (see figure) Forward proxy cache Open proxy cache Reverse proxy cache A proxy (server) acts as an intermediary for requests from clients seeking resources from other servers IS 651: Distributed Systems 10

Forward Proxy Cache A forward proxy cache is located at the organization (as at UMBC) or at the internet service provider (ISP) Two approaches used by forward proxy cache Configure the browser Interception caching IS 651: Distributed Systems 11

Reverse Proxy Cache A reverse proxy cache is on the internal network of the server Can reduce load on its origin servers Can distribute the load from incoming requests to several servers (see load-balancing) IS 651: Distributed Systems 12

Cache Info in HTTP Response Headers Date: response time Cache-Control: directives that MUST be obeyed Expires: how long the cache should be kept before the cache refreshes Last-Modified ETag (Entity Tags): a short unique identifier that the server generates for each object such as a web page IS 651: Distributed Systems 13

Content Delivery Network (CDN) This improves access to data by locating content in various places on the Internet in order to get it closer to clients Example: Netflix IS 651: Distributed Systems 14

Application Caching Application caching is caching that is managed by the application itself to improve performance rather than the web and Internet infrastructure IS 651: Distributed Systems 15

Load-balancing Load-balancing is a technique to make many servers appear as one server to clients and thereby getting performance increase Three major methods of load balancing Round-robin Domain Name System (RRDNS) Load-balancing switches Application servers IS 651: Distributed Systems 16

Round-Robin Domain Name System (RRDNS) RRDNS is a low cost (in fact, free!), low performance way to loadbalance Multiple IP addresses and servers for the same domain name A new request uses the next IP address, until it wraps around to the first IP address again IS 651: Distributed Systems 17

Load-Balancing Switches Load-balancing switches are the highest performance and most common method IS 651: Distributed Systems 18

Application Servers Application server load balancing approaches use the server to control the load-balancing. IS 651: Distributed Systems 19

Distributed Naming There are two major types of distributed naming systems: Structured naming (NFS, AFS, DNS) Attribute-based naming IS 651: Distributed Systems 20

Network File System (NFS) A network file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed AFS is a more modern version We use it at UMBC application presentation session transport network data link physical NFS XDR RPC UDP/TCP IP network interface IS 651: Distributed Systems 21

Domain Name System (DNS) It is the most widely used distributed naming system since it is used for looking up the addresses of hosts on the Internet Two roles Server: respond to requests to convert names to IP addresses or the reverse Client (resolver): ask other name servers to convert names to IP addresses Two optimizations: caching and replication Two resolver modes: recursive and iterative Resolver demo IS 651: Distributed Systems 22

Attribute-based Naming Attribute-based naming is known as directory services Allows searches by attributes ldap[s]://<hostname>:<port>/<base_dn>? <attributes>?<scope>?<filter> Implementations/standards X.500 Lightweight Directory Access Protocol (LDAP) Active Directory X.500 demo IS 651: Distributed Systems 23

Replication General optimization on any network for scalability and fault-tolerance making copies of information on different nodes on a network having consistency mechanism between the replicas Eventual consistency Partition and replication Read/write operation with master-slave replication IS 651: Distributed Systems 24

Processes and Threads Thread is often a component within a process Multiple threads can exist within one process, and share resources such as memory Benefits of Threads: Less time to create a new thread Less time to terminate a thread Less time to switch between two threads within the same process Less communication overheads IS 651: Distributed Systems 25

Processes and Threads (2) IS 651: Distributed Systems 26

Benefits of Multithreading Improve application responsiveness The user of a multithreaded GUI does not have to wait for one activity to complete before starting another. Use multiprocessors more efficiently Numerical algorithms and applications with a high degree of parallelism, such as matrix multiplications, can run much faster when implemented with threads on a multiprocessor. Improve program structure Many programs are more efficiently structured as multiple independent or semi-independent units of execution instead of as a single, monolithic thread. Use fewer system resources Each process has a full address space and operating systems state. The inherent separation between processes can require a major effort by the programmer to communicate between the threads in different processes, or to synchronize their actions. IS 651: Distributed Systems 27

Asynchronous Event Loop Web server normally uses multi-threading for different requests, which has one problem It creates a new thread for each synchronous request is that it is very memory intensive and Disk I/O (input/output) bound. Each thread cannot respond to client until the data is read from disk. Asynchronous event loop is a new model to deal with the problem These servers run as a single threaded process asynchronously. The server just runs an event loop that gets requests and passes them on to other processes. A callback mechanism informs the server process when data is ready. Node.js is an open-source runtime environment based on this model IS 651: Distributed Systems 28

Push Technology Push technology, or server push: a style of Internet-based communication where the request for a given transaction is initiated by the publisher or server Pull technology: the request for information transmission is initiated by the receiver/client (short) polling: the client periodically (every few seconds) makes a request to check for new data WebSockets provides for a bi-directional, full-duplex communications channel over a TCP socket IS 651: Distributed Systems 29

Microservice A system design pattern that follows Service-Oriented Architecture. Compared to monolithic applications where different functionalities are combined into a single program, microservice applications are easier to design, implement, deploy and maintain. From https://martinfowler.com/articles/microservices.html IS 651: Distributed Systems 30

Discussion #6 What are the commonalities and differences between caching and replication? Difference 1 Difference 2 Commonality 1 Commonality 2 Caching Replication IS 651: Distributed Systems 31