Carnegie Mellon. Proxy Recita-on. Jeffrey Liu Recita4on 13: November 23, 2015

Similar documents
Carnegie Mellon Concurrency & Proxy Lab

Recitation 13: Proxylab Network and Web

Networked Applications: Sockets. End System: Computer on the Net

Socket Programming. #In the name of Allah. Computer Engineering Department Sharif University of Technology CE443- Computer Networks

Tutorial on Socket Programming

Networked Applications: Sockets. Goals of Todayʼs Lecture. End System: Computer on the ʻNet. Client-server paradigm End systems Clients and servers

Socket Programming TCP UDP

UNIX Sockets. COS 461 Precept 1

UNIX Sockets. Developed for the Azera Group By: Joseph D. Fournier B.Sc.E.E., M.Sc.E.E.

CSMC 412. Computer Networks Prof. Ashok K Agrawala Ashok Agrawala Set 2. September 15 CMSC417 Set 2 1

Computer Networks Prof. Ashok K. Agrawala

TCP: Three-way handshake

A Client-Server Exchange

CSE 124 Discussion Section Sockets Programming 10/10/17

CSE 333 Section 8 - Client-Side Networking

Systems software design NETWORK COMMUNICATIONS & RPC SYSTEMS

Socket Programming for TCP and UDP

Sockets 15H2. Inshik Song

CSC209H Lecture 9. Dan Zingaro. March 11, 2015

MSc Integrated Electronics Networks Assignment. Investigation of TCP/IP Sockets and Ports. Gavin Cameron

CLIENT-SIDE PROGRAMMING

sottotitolo Socket Programming Milano, XX mese 20XX A.A. 2016/17 Federico Reghenzani

A Socket Example. Haris Andrianakis & Angelos Stavrou George Mason University

STUDY OF SOCKET PROGRAMMING

Overview. Administrative. * HW# 5 Due next week. * HW# 5 : Any Questions. Topics. * Client Server Communication. * 12.

Server-side Programming

Lecture 2. Outline. Layering and Protocols. Network Architecture. Layering and Protocols. Layering and Protocols. Chapter 1 - Foundation

CS 43: Computer Networks. 05: Socket Programming September 12-14, 2018

Network Communication

CSE 333 SECTION 8. Sockets, Network Programming

CSE 333 SECTION 7. C++ Virtual Functions and Client-Side Network Programming

Ports under 1024 are often considered special, and usually require special OS privileges to use.

HTTP, circa HTTP protocol. GET /foo/bar.html HTTP/1.1. Sviluppo App Web 2015/ Intro 3/3/2016. Marco Tarini, Uninsubria 1

CSE 333 SECTION 7. Client-Side Network Programming

Sockets and Parallel Computing. CS439: Principles of Computer Systems April 11, 2018

Recitation 12: ProxyLab Part 1

Introduc)on to Computer Networks

CS118 Discussion 1B, Week 1. Taqi Raza BUNCHE 1209B, Fridays 12:00pm to 1:50pm

Outline. Distributed Computer Systems. Socket Basics An end-point for a IP network connection. Ports. Sockets and the OS. Transport Layer.

UNIT IV- SOCKETS Part A

Web. Computer Organization 4/16/2015. CSC252 - Spring Web and HTTP. URLs. Kai Shen

Outline. Operating Systems. Socket Basics An end-point for a IP network connection. Ports. Network Communication. Sockets and the OS

Outline. Distributed Computing Systems. Socket Basics (1 of 2) Socket Basics (2 of 2) 3/28/2014

Processes communicating. Network Communication. Sockets. Addressing processes 4/15/2013

Hybrid of client-server and P2P. Pure P2P Architecture. App-layer Protocols. Communicating Processes. Transport Service Requirements

Chapter 6. The Transport Layer. Transport Layer 3-1

EEC-484/584 Computer Networks

Introduction to Lab 2 and Socket Programming. -Vengatanathan Krishnamoorthi

Client-server model The course that gives CMU its Zip! Network programming Nov 27, Using ports to identify services.

Server-side Programming

Dependent Types in the Idris Programming Language

Review. Preview. Closing a TCP Connection. Closing a TCP Connection. Port Numbers 11/27/2017. Packet Exchange for TCP Connection

Unix Network Programming

Basics. Once socket is configured, applica6ons. Socket is an interface between applica6on and network

Communication. Sockets (Haviland Ch. 10)

Networks. Practical Investigation of TCP/IP Ports and Sockets. Gavin Cameron

Web Client And Server

CSCI-1680 WWW Rodrigo Fonseca

Introduction to Socket Programming

Client software design

Network Programming November 3, 2008

Hyo-bong Son Computer Systems Laboratory Sungkyunkwan University

The TCP Protocol Stack

9/13/2007. Motivations for Sockets What s in a Socket? Working g with Sockets Concurrent Network Applications Software Engineering for Project 1

CSE 333 Lecture 16 - network programming intro

CSE 124 January 12, Winter 2016, UCSD Prof. George Porter

Project 1: Web Client and Server

SOCKET PROGRAMMING. What is a socket? Using sockets Types (Protocols) Associated functions Styles

Applications & Application-Layer Protocols: The Web & HTTP

Internet applications

NETWORK PROGRAMMING. Instructor: Junaid Tariq, Lecturer, Department of Computer Science

Introduction to Socket Programming

Computer Networks CS3516 B Term, 2013

Program Security and Vulnerabilities Class 2

CMPSC 311- Introduction to Systems Programming Module: Network Programming

Oral. Total. Dated Sign (2) (5) (3) (2)

Network Programming: Part II

CSSE 460 Computer Networks Group Projects: Implement a Simple HTTP Web Proxy

Socket Programming. CSIS0234A Computer and Communication Networks. Socket Programming in C

CS 640: Computer Networking


CPS 214: Computer Networks. Slides by Adolfo Rodriguez

Programming Internet with Socket API. Hui Chen, Ph.D. Dept. of Engineering & Computer Science Virginia State University Petersburg, VA 23806

CSE 333 Section 3. Thursday 12 April Thursday, April 12, 12

Client-side Networking

Session NM056. Programming TCP/IP with Sockets. Geoff Bryant Process software

CSCI-1680 WWW Rodrigo Fonseca

Outline. Option Types. Socket Options SWE 545. Socket Options. Out-of-Band Data. Advanced Socket. Many socket options are Boolean flags

Socket Programming. Dr. -Ing. Abdalkarim Awad. Informatik 7 Rechnernetze und Kommunikationssysteme

Socket Programming(2/2)

CS193i Handout #18. HTTP Part 5

CSE 333 Final Exam June 6, Name ID #

CSE 333 SECTION 7. Client-Side Network Programming

Network Programming: Part II

CS321: Computer Networks Introduction to Application Layer

PLEASE HAND IN UNIVERSITY OF TORONTO Faculty of Arts and Science

CS 428, Easter 2017 Proxy Lab: Writing a Proxy Server Due: Wed, May 3, 11:59 PM

Group-A Assignment No. 6

Network Programming /18-243: Introduc3on to Computer Systems 22 nd Lecture, 13 April Instructors: Bill Nace and Gregory Kesden

2- Application Level Protocols HTTP 1.0/1.1/2

Transcription:

Proxy Recita-on Jeffrey Liu Recita4on 13: November 23, 2015 1

Outline Ge3ng content on the web: Telnet/cURL Demo How the web really works Networking Basics Proxy Due Tuesday, December 8 th Grace days allowed String Manipula-on in C 2

The Web in a Textbook Client request page, server provides, transac-on done. Web client (browser) Web server A sequen-al server can handle this. We just need to serve one page at a -me. This works great for simple text pages with embedded styles. 3

Telnet/Curl Demo Telnet Interac4ve remote shell like ssh without security Must build HTTP request manually This can be useful if you want to test response to malformed headers [rjaganna@makoshark ~]% telnet www.cmu.edu 80 Trying 128.2.42.52... Connected to WWW-CMU-PROD-VIP.ANDREW.cmu.edu (128.2.42.52). Escape character is '^]'. GET http://www.cmu.edu/ HTTP/1.0 HTTP/1.1 301 Moved Permanently Date: Sat, 11 Apr 2015 06:54:39 GMT Server: Apache/1.3.42 (Unix) mod_gzip/1.3.26.1a mod_pubcookie/3.3.4a mod_ssl/2.8.31 OpenSSL/0.9.8efips-rhel5 Location: http://www.cmu.edu/index.shtml Connection: close Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HTML><HEAD> <TITLE>301 Moved Permanently</TITLE> </HEAD><BODY> <H1>Moved Permanently</H1> The document has moved <A HREF="http://www.cmu.edu/index.shtml">here</A>.<P> <HR> <ADDRESS>Apache/1.3.42 Server at <A HREF="mailto:webmaster@andrew.cmu.edu">www.cmu.edu</A> Port 80</ADDRESS> </BODY></HTML> Connection closed by foreign host. 4

Telnet/cURL Demo curl URL transfer library with a command line program Builds valid HTTP requests for you! [rjaganna@makoshark ~]% curl http://www.cmu.edu/ <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HTML><HEAD> <TITLE>301 Moved Permanently</TITLE> </HEAD><BODY> <H1>Moved Permanently</H1> The document has moved <A HREF="http://www.cmu.edu/index.shtml">here</A>.<P> <HR> <ADDRESS>Apache/1.3.42 Server at <A HREF="mailto:webmaster@andrew.cmu.edu">www.cmu.edu</A> Port 80</ADDRESS> </BODY></HTML> Can also be used to generate HTTP proxy requests: [rjaganna@makoshark ~]% curl --proxy lemonshark.ics.cs.cmu.edu:3092 http://www.cmu.edu/ <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HTML><HEAD> <TITLE>301 Moved Permanently</TITLE> </HEAD><BODY> <H1>Moved Permanently</H1> The document has moved <A HREF="http://www.cmu.edu/index.shtml">here</A>.<P> <HR> <ADDRESS>Apache/1.3.42 Server at <A HREF="mailto:webmaster@andrew.cmu.edu">www.cmu.edu</A> Port 80</ADDRESS> </BODY></HTML> 5

How the Web Really Works In reality, a single HTML page today may depend on 10s or 100s of support files (images, stylesheets, scripts, etc.) Builds a good argument for concurrent servers Just to load a single modern webpage, the client would have to wait for 10s of back-to-back request I/O is likely slower than processing, so back Caching is simpler if done in pieces rather than whole page If only part of the page changes, no need to fetch old parts again Each object (image, stylesheet, script) already has a unique URL that can be used as a key 6

How the Web Really Works Excerpt from www.cmu.edu/index.html: <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"> <head>... <link href="homecss/cmu.css" rel="stylesheet" type="text/css"/> <link href="homecss/cmu-new.css" rel="stylesheet" type="text/css"/> <link href="homecss/cmu-new-print.css" media="print" rel="stylesheet" type="text/ css"/> <link href="http://www.cmu.edu/rss/stories.rss" rel="alternate" title="carnegie Mellon Homepage Stories" type="application/rss+xml"/>... <script language="javascript" src="js/dojo.js" type="text/javascript"></script> <script language="javascript" src="js/scripts.js" type="text/javascript"></ script> <script language="javascript" src="js/jquery.js" type="text/javascript"></script> <script language="javascript" src="js/homepage.js" type="text/javascript"></ script> <script language="javascript" src="js/app_ad.js" type="text/javascript"></script>... <title>carnegie Mellon University CMU</title> </head> <body>... 7

Sequen-al Proxy 8

Sequen-al Proxy Note the sloped shape of when requests finish Although many requests are made at once, the proxy does not accept a new job un4l it finishes the current one Requests are made in batches. This results from how HTML is structured as files that reference other files. Compared to the concurrent example (next), this page takes a long -me to load with just sta-c content 9

Concurrent Proxy 10

Concurrent Proxy Now, we see much less purple (wai-ng), and less -me spent overall. No-ce how mul-ple green (receiving) blocks overlap in -me Our proxy has mul4ple connec4ons open to the browser to handle several tasks at once 11

How the Web Really Works A note on AJAX (and XMLHZpRequests) Normally, a browser will make the ini4al page request then request any suppor4ng files And XMLHapRequest is simply a request from the page once it has been loaded & the scripts are running The dis4nc4on does not maaer on the server side everything is an HTTP Request 12

Outline Ge3ng content on the web: Telnet/cURL Demo How the web really works Networking Basics Proxy Due Tuesday, December 8 th Grace days allowed String Manipula-on in C 13

Sockets What is a socket? To an applica4on, a socket is a file descriptor that lets the applica4on read/ write from/to the network (all Unix I/O devices, including networks, are modeled as files) Clients and servers communicate with each other by reading from and wri-ng to socket descriptors The main difference between regular file I/O and socket I/ O is how the applica-on opens the socket descriptors 14

Overview of the Sockets Interface Client Server getaddrinfo getaddrinfo open_clientfd socket socket bind open_listenfd connect Connec4on request listen accept Client / Server Session rio_writen rio_readlineb close EOF rio_readlineb rio_writen rio_readlineb Await connec4on request from next client close 15

Host and Service Conversion: getaddrinfo getaddrinfo is the modern way to convert string representa4ons of host, ports, and service names to socket address structures. Replaces obsolete gethostbyname - unsafe because it returns a pointer to a sta4c variable Advantages: Reentrant (can be safely used by threaded programs). Allows us to write portable protocol-independent code(ipv4 and IPv6) Given host and service, getaddrinfo returns result that points to a linked list of addrinfo structs, each poin4ng to socket address struct, which contains arguments for sockets APIs. getnameinfo is the inverse of getaddrinfo, conver-ng a socket address to the corresponding host and service. 16

Sockets API int socket(int domain, int type, int protocol); Create a file descriptor for network communica4on used by both clients and servers int sock_fd = socket(pf_inet, SOCK_STREAM, IPPROTO_TCP); One socket can be used for two-way communica4on int bind(int socket, const struct sockaddr *address, socklen_t address_len); Associate a socket with an IP address and port number used by servers struct sockaddr_in sockaddr family, address, port 17

Sockets API int listen(int socket, int backlog); socket: socket to listen on used by servers backlog: maximum number of wai4ng connec4ons err = listen(sock_fd, MAX_WAITING_CONNECTIONS); int accept(int socket, struct sockaddr *address, socklen_t *address_len); used by servers socket: socket to listen on address: pointer to sockaddr struct to hold client informa4on amer accept returns return: file descriptor 18

Sockets API int connect(int socket, struct sockaddr *address, socklen_t address_len); aaempt to connect to the specified IP address and port described in address used by clients int close(int fd); used by both clients and servers (also used for file I/O) fd: socket fd to close 19

Sockets API ssize_t read(int fd, void *buf, size_t nbyte); used by both clients and servers (also used for file I/O) fd: (socket) fd to read from buf: buffer to read into nbytes: buf length ssize_t write(int fd, void *buf, size_t nbyte); used by both clients and servers (also used for file I/O) fd: (socket) fd to write to buf: buffer to write nbytes: buf length 20

Outline Ge3ng content on the web: Telnet/cURL Demo How the web really works Networking Basics Proxy Due Tuesday, December 8th Grace days allowed String Manipula-on in C 21

Byte Ordering Reminder So, how are the bytes within a mul--byte word ordered in memory? Conven-ons Big Endian: Sun, PPC Mac, Internet Least significant byte has highest address Liale Endian: x86, ARM processors running Android, ios, and Windows Least significant byte has lowest address 22

Byte Ordering Reminder So, how are the bytes within a mul--byte word ordered in memory? Conven-ons Big Endian: Sun, PPC Mac, Internet Least significant byte has highest address Make sure to use correct endianness 23

Proxy - Func-onality Should work on vast majority of sites Twitch, CNN, NY Times, etc. Some features of sites which require the POST opera4on (sending data to the website), will not work - Logging in to websites, sending Facebook message HTTPS is not expected to work Google, YouTube (and some other popular websites) now try to push users to HTTPs by default; watch out for that Cache previous requests Use LRU evic4on policy Must allow for concurrent reads while maintaining consistency Details in write up 24

Proxy - Func-onality Why a mul--threaded cache? n Sequen4al cache would boaleneck parallel proxy n n Mul4ple threads can read cached content safely n n Search cache for the right data and return it Two threads can read from the same cache block But what about wri4ng content? n n Overwrite block while another thread reading? Two threads wri4ng to same cache block? 25

Proxy - How Proxies are a bit special - they are a server and a client at the same 4me. They take a request from one computer (ac4ng as the server), and make it on their behalf (as the client). Ul4mately, the control flow of your program will look like a server, but will have to act as a client to complete the request Start small Grab yourself a copy of the echo server (pg. 946) and client (pg. 947) in the book Also review the 4ny.c basic web server code to see how to deal with HTTP headers Note that 4ny.c ignores these; you may not 26

Proxy - How What you end up with will resemble: Client socket address 128.2.194.242:51213 Server socket address 208.216.181.15:80 Client Proxy Server (port 80) Proxy server socket address 128.2.194.34:15213 Proxy client socket address 128.2.194.34:52943 27

Summary Step 1: Sequen-al Proxy Works great for simple text pages with embedded styles Step 2: Concurrent Proxy mul4-threading Step 3 : Cache Web Objects Cache individual objects, not the whole page Use an LRU evic-on policy Your caching system must allow for concurrent reads while maintaining consistency. Concurrency? Shared Resource? 28

Proxy Tes-ng & Grading New: Autograder./driver.sh will run the same tests as autolab: Ability to pull basic web pages from a server Handle a (concurrent) request while another request is s4ll pending Fetch a web page again from your cache amer the server has been stopped This should help answer the ques4on is this what my proxy is supposed to do? Please don t use this grader to defini4vely test your proxy; there are many things not tested here 29

Proxy Tes-ng & Grading Test your proxy liberally The web is full of special cases that want to break your proxy Generate a port for yourself with./port-for-user.pl [andrewid] Generate more ports for web servers and such with./free-port.sh Consider using your andrew web space (~/www) to host test files You have to visit haps://www.andrew.cmu.edu/server/publish.html to publish your folder to the public server Create a handin file with make handin Will create a tar file for you with the contents of your proxylabhandin folder 30

Outline Ge3ng content on the web: Telnet/cURL Demo How the web really works Networking Basics Proxy Due Tuesday, December 8th Grace days allowed String Manipula-on in C 31

String manipula-on in C sscanf: Read input in specific format int sscanf(const char *str, const char *format, ); Example: buf = 213 is awesome // Read integer and string separated by white space from buffer buf // into passed variables ret = sscanf(buf, %d %s %s, &course, str1, str2); This results in: course = 213, str1 = is, str2 = awesome, ret = 3 32

String manipula-on (cont) sprine: Write input into buffer in specific format int sprini(char *str, const char *format, ); Example: buf[100]; str = 213 is awesome // Build the string in double quotes ( ) using the passed arguments // and write to buffer buf sprini(buf, String (%s) is of length %d, str, strlen(str)); This results in: buf = String (213 is awesome) is of length 14 33

String manipula-on (cont) Other useful string manipula-on func-ons: strcmp, strncmp, strncasecmp strstr strlen strcpy, strncpy 34

Aside: Se3ng up Firefox to use a proxy You may use any browser, but we ll be grading with Firefox Preferences > Advanced > Network > Se3ngs (under Connec-on) Check Use this proxy for all protocols or your proxy will appear to work for HTTPS traffic. 35

Acknowledgements Slides derived from recita-on slides of last 2 years by Shiva Hartaj Singh Dugal Ian Hartwig Rohith Jagannathan 36

Ques-ons? 37