Proxying. Why and How. Alon Altman. Haifa Linux Club. Proxying p.1/24

Similar documents
Caching. Caching Overview

CS 355. Computer Networking. Wei Lu, Ph.D., P.Eng.

Web, HTTP and Web Caching

EE 122: HyperText Transfer Protocol (HTTP)

1-1. Switching Networks (Fall 2010) EE 586 Communication and. September Lecture 10

Configure. Version: Copyright ImageStream Internet Solutions, Inc., All rights Reserved.

The HTTP protocol. Fulvio Corno, Dario Bonino. 08/10/09 http 1

Lecture 7b: HTTP. Feb. 24, Internet and Intranet Protocols and Applications

HyperText Transfer Protocol

Seminar on. By Sai Rahul Reddy P. 2/2/2005 Web Caching 1

Lecture 6 Application Layer. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

CS 43: Computer Networks. Layering & HTTP September 7, 2018

Produced by. Mobile Application Development. Higher Diploma in Science in Computer Science. Eamonn de Leastar

CMSC 332 Computer Networking Web and FTP

Lecture 9a: Sessions and Cookies

Application Protocols and HTTP

HTTP Reading: Section and COS 461: Computer Networks Spring 2013

Computer Networks. HTTP and more. Jianping Pan Spring /20/17 CSC361 1

Web Technology. COMP476 Networked Computer Systems. Hypertext and Hypermedia. Document Representation. Client-Server Paradigm.

static phlapa.east.verizon.net /

COSC 2206 Internet Tools. The HTTP Protocol

Web Engineering. Basic Technologies: Protocols and Web Servers. Husni

CSSE 460 Computer Networks Group Projects: Implement a Simple HTTP Web Proxy

Features of a proxy server: - Nowadays, by using TCP/IP within local area networks, the relaying role that the proxy

Project 2 Implementing a Simple HTTP Web Proxy

Software. Linux. Squid Windows

Internet Architecture. Web Programming - 2 (Ref: Chapter 2) IP Software. IP Addressing. TCP/IP Basics. Client Server Basics. URL and MIME Types HTTP

HTTP TRAFFIC CONSISTS OF REQUESTS AND RESPONSES. All HTTP traffic can be

Computer Networks. Wenzhong Li. Nanjing University

Review of Previous Lecture

Application Level Protocols

3. WWW and HTTP. Fig.3.1 Architecture of WWW

Application Layer. Applications and application-layer protocols. Goals:

INTERNET ENGINEERING. HTTP Protocol. Sadegh Aliakbary

Application Layer Introduction; HTTP; FTP

Snapt Accelerator Manual

ICS 351: Today's plan. IPv6 routing protocols (summary) HTML HTTP web scripting languages certificates (review) cookies

COMP6218: Content Caches. Prof Leslie Carr

Outline Computer Networking. HTTP Basics (Review) How to Mark End of Message? (Review)

Linux VPN Configuration

Project 2 Group Project Implementing a Simple HTTP Web Proxy

0 0& Basic Background. Now let s get into how things really work!

CSE 333 Lecture HTTP

Squirrel case-study. Decentralized peer-to-peer web cache. Traditional centralized web cache. Based on the Pastry peer-to-peer middleware system

CS4/MSc Computer Networking. Lecture 3: The Application Layer

Configuring Caching Services

Web Programming 4) PHP and the Web

CMPE 151: Network Administration. Servers

CSE/EE 461 HTTP and the Web

Hypertext Transport Protocol HTTP/1.1

CSCI-1680 WWW Rodrigo Fonseca

Announcements Fawzi Emad, Computer Science Department, UMCP

Configuring Authentication Proxy

The World Wide Web. EE 122: Intro to Communication Networks. Fall 2006 (MW 4-5:30 in Donner 155) Vern Paxson TAs: Dilip Antony Joseph and Sukun Kim

Announcements. The World Wide Web. Retrieving From the Server. Goals of Today s Lecture. The World Wide Web. POP3 Protocol

Internet Content Distribution

Cisco IOS Firewall Authentication Proxy

CONFIGURATION MANUAL. English version

HTTP Protocol and Server-Side Basics

Introduc)on to Computer Networks

Chapter 2. Application Layer

Introduction to Firewalls using IPTables


The HTTP Protocol HTTP

HTTP and Web Content Delivery

CSCI-1680 WWW Rodrigo Fonseca

World Wide Web, etc.

Browser behavior can be quite complex, using more HTTP features than the basic exchange, this trace will show us how much gets transferred.

Fireware-Essentials. Number: Fireware Essentials Passing Score: 800 Time Limit: 120 min File Version: 7.

Chapter 2: outline. 2.6 P2P applications 2.7 socket programming with UDP and TCP

WEB TECHNOLOGIES CHAPTER 1

Applications & Application-Layer Protocols: The Web & HTTP

Notes beforehand... For more details: See the (online) presentation program.

Applications & Application-Layer Protocols: The Web & HTTP

HashCookies A Simple Recipe

CSE 333 Lecture HTTP

How to work with HTTP requests and responses

UA-Tester.... or why Web-Application Penetration Testers are only getting half the story

HTTP Server Application

Table of Contents. Cisco How NAT Works

Homework 2 50 points. CSE422 Computer Networking Spring 2018

Configuring Authentication Proxy

Lecture 7 Application Layer. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

Networks and the Internet A Primer for Prosecutors and Investigators

ELEC / COMP 177 Fall Some slides from Kurose and Ross, Computer Networking, 5 th Edition

Real Life Web Development. Joseph Paul Cohen

Computer Network Midterm Explain Internet protocol stack (1% each layer s name, 1% each layer s functions, 10% total)

A taste of HTTP v1.1. additions. HTTP v1.1: introduces many complexities no longer an easy protocol to implement. G.Bianchi, G.Neglia, V.

HTTP (HyperText Transfer Protocol)

HTTP. Robert Grimm New York University

CORS Attacks. Author: Milad Khoshdel Blog: P a g e. CORS Attacks

Matt Terwilliger. Networking Crash Course

HTTP, circa HTTP protocol. GET /foo/bar.html HTTP/1.1. Sviluppo App Web 2015/ Intro 3/3/2016. Marco Tarini, Uninsubria 1

Content Gateway v7.x: Frequently Asked Questions

HTTP Review. Carey Williamson Department of Computer Science University of Calgary

Lab 2. All datagrams related to favicon.ico had been ignored. Diagram 1. Diagram 2

Announcements Fawzi Emad, Computer Science Department, UMCP

C22: Browser & Web Server Communication

Session 8. Reading and Reference. en.wikipedia.org/wiki/list_of_http_headers. en.wikipedia.org/wiki/http_status_codes

Remote User - Mediatrix SBC on the Edge

Transcription:

Proxying p.1/24 Proxying Why and How Alon Altman alon@haifux.org Haifa Linux Club

Proxying p.2/24 Definition proxy \Prox"y\, n.; pl. Proxies. The agency for another who acts through the agent; authority to act for another, esp. to vote in a legislative or corporate capacity. Webster s Revised Unabridged Dictionary

Proxying p.3/24 In computer networking... Proxy refers to a special kind of server that functions as an intermediate link between a client application (like a web browser) and a real server. The proxy server intercepts requests for information from the real server and whenever possible, fills the request. When it is unable to do so, the request is forwarded to the real server. http://www.oit.ohio-state.edu/glossary/gloss3.html

HTTP Proxying Proxying p.4/24

Proxying p.5/24 What is HTTP HTTP, or Hyper Text Transfer Protocol, is the standard protocol used to access web pages, and interact with web-based applications. It s newest version HTTP/1.1 is defined by RFC2616.

Proxying p.6/24 Why Proxy? Cache Keep frequently visited web pages available without need to re-download.

Proxying p.6/24 Why Proxy? Cache Keep frequently visited web pages available without need to re-download. Maintain privacy Strip or manipulate identifying information before it s sent to the server.

Proxying p.6/24 Why Proxy? Cache Keep frequently visited web pages available without need to re-download. Maintain privacy Strip or manipulate identifying information before it s sent to the server. Audit Keep a log of all web access from your network.

Proxying p.6/24 Why Proxy? Cache Keep frequently visited web pages available without need to re-download. Maintain privacy Strip or manipulate identifying information before it s sent to the server. Audit Keep a log of all web access from your network. Filter Prevent access to certain web sites from your network.

Proxying p.7/24 How to proxy? The most advanced and free proxy server is squid, downloadable from http://www.squid-cache.org/. squid is straightforward to set up: Install the RPM and then edit the configuration file /etc/squid/squid.conf. Initialize your cache directory using squid -z. Install the squid service to be run at startup

Proxying p.8/24 The HTTP Protocol To explain HTTP proxying we must first explain the standard HTTP protocol. The HTTP protocol usually works as follows: Client connects to server port 80.

Proxying p.8/24 The HTTP Protocol To explain HTTP proxying we must first explain the standard HTTP protocol. The HTTP protocol usually works as follows: Client connects to server port 80. Client sends request and headers. GET /lectures/084/ HTTP/1.1 Host: www.haifux.org User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-us; rv:1.4) Gecko/20030624... Referer: http://www.haifux.org/future.html

Proxying p.9/24 The HTTP Protocol (cont.) Server replies with status code, headers, and data. HTTP/1.1 200 OK Date: Thu, 08 Jan 2004 16:29:56 GMT Server: Apache/1.3.28 (Unix) PHP/4.3.3 Last-Modified: Sun, 04 Jan 2004 11:39:38 GMT... ETag: "1d8802-1d5-3ff7fb7a" Content-Type: text/html <HTML>...

Proxying p.9/24 The HTTP Protocol (cont.) Server replies with status code, headers, and data. HTTP/1.1 200 OK Date: Thu, 08 Jan 2004 16:29:56 GMT Server: Apache/1.3.28 (Unix) PHP/4.3.3 Last-Modified: Sun, 04 Jan 2004 11:39:38 GMT... ETag: "1d8802-1d5-3ff7fb7a" Content-Type: text/html <HTML>... Both client and server close the connection.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache. Cache hit

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache. Cache hit If needed, Proxy validates data from original host.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache. Cache hit If needed, Proxy validates data from original host. Proxy returns data to the client.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache. Cache hit If needed, Proxy validates data from original host. Proxy returns data to the client. Cache miss Proxy retrieves page from original host and returns data to the client.

Proxying p.10/24 HTTP Protocol with a proxy The HTTP protocol via proxy works as follows: Client sends request and headers to proxy including full URL of the resource. Proxy checks its cache. Cache hit If needed, Proxy validates data from original host. Proxy returns data to the client. Cache miss Proxy retrieves page from original host and returns data to the client. Proxy stores page in cache if possible, and depending on configuration.

Proxying p.11/24 When not to cache Caching is not always a good idea: Dynamic Sites You want to see current news. Query Results You want current search results. Pages with side effects You want your daily donation at http://www.hungersite.com/ to be counted each day. Sometimes, cached data should not be served: Expiration Even static sited are updated sometimes. User Request If the user pressed Reload, you d better fetch a fresh page.

Proxying p.12/24 Cache control The caching behavior of proxies (and also your browser s cache) is primarily controlled in HTTP 1.1 by the Cache-Control header, using information from the Date and Modification-Time headers. Values of this header include: max-age=n Return pages not older than n seconds, or cache response for up to n seconds. max-age=0 is used by Reload to request revalidation of the page from the source. no-cache Do not use a cache for satisfying the request, or do not cache the result. Used by the server to mark uncachable pages, or to strongly force reload when cache is corrupt.

Proxying p.13/24 Notes about caching A proxy server may return a stale (expired) response if it cannot contact the source of the data. Some proxy servers (including squid) cache errors in addition to normal responses. Web authors should take care to include the appropriate Cache-Control header in dynamic pages, and to use different URLs for different versions of cachable resources to allow for efficient caching.

Proxying p.14/24 Proxy headers Proxy servers add (if configured to do so) special headers about the nature of the server and the client: Via is a standard header which lists the name(s) of the proxy server(s) the request has passed through. X-Forwarded-For is a header squid adds to identify the client originating the request. These headers allow the web server to know about the proxy and the user behind it.

Proxying p.15/24 Header manipulation Every time you request a page, your browser sends the following information in the request headers: User-Agent Your browser and OS version. Referrer The page that linked to the requested page, or if using IE the last page visited in the window with the page. Cookie Small piece of information used to uniquely track visitors.... and cache information as seen before. A proxy server is in the unique position to manipulate request headers in order to protect your privacy or to skew server statistics.

Proxying p.16/24 Filtering and Auditing Proxies usually keep detailed logs of all requests. A proxy can also deny certain requests. This can be used to require authorization or to restrict users web access. To make these restrictions effective, direct connection to the web should be blocked. squidguard (from http://www.squidguard.org/) could be used to block access to questionable sites in conjunction with squid. Warning: sites such as Google s cache and open proxies may be used in certain occasions to bypass your proxy s filters.

Proxying p.17/24 Transparent Proxies A transparent HTTP proxy is a proxy server that simulates the actual web in a form transparent to the clients. Transparent proxies have the following advantages over standard proxies: No need to reconfigure each and every application with proxy information. Bypassing the proxy is blocked, but standard browsing still works. Users don t realize that there is a proxy at all.

Proxying p.18/24 Transparent proxy detection Many ISPs set up transparent HTTP proxies in order to reduce their bandwidth costs by caching requests made by their clients. To check if your ISP is running a transparent proxy, visit http://www.whatismyip.com/ and compare the result with your actual IP address from the output of /sbin/ifconfig ppp0. If the IP addresses are different, you are behind a NAT or a transparent proxy.

Proxying p.19/24 Transparent proxy config. To set up a transparent proxy, all web requests must be redirected to the proxy. This is done using iptables in the nat table. If the server is a NAT or a firewall, first we should setup the redirection for requests originating from the internal network, using the following command: iptables -t nat -A PREROUTING -p tcp --dport 80 -i eth1 -j REDIRECT --to-ports 8080 (assuming the internal network is on eth1)

Proxying p.20/24 Transparent proxy config. If you have web servers on your internal network routed by the firewall, you should specifically allow access to the server by issuing a command such as: iptables -t nat -A PREROUTING -d 192.168.0.0/16 -j ACCEPT before the former command.

Proxying p.21/24 Local transparent proxy If you wish to apply the transparent proxy to the local machine as well, use the following commands: iptables -t nat -N proxy iptables -t nat -A OUTPUT -p tcp --dport 80 -j proxy iptables -t nat -A proxy -m owner --owner-uid squid -j RETURN iptables -t nat -A proxy -p tcp -j REDIRECT --to-ports 8080

Proxying p.22/24 squid configuration The REDIRECT target in iptables only redirects the packets towards the proxy, but does not modify the request to include the entire URL. Therefore, the proxy must find the designated host name by looking at the Host header in the HTTP request. All browsers send the Host header as it is also used to allow several virtual hosts with the same IP address. To trigger this behavior, the squid configuration option httpd_accel_uses_host_header must be set to on.

Proxying p.23/24 Transparent proxies and DNS The behavior of a transparent proxy requires it to perform a DNS lookup for each missed request, which may slow down the transaction. It is suggested to run a caching DNS server for the network as well as the proxy server. Due to the redirection, the identity of actual target host is lost. Thus the client cannot override private DNS settings for testing purposes.

Proxying p.24/24 Summary HTTP proxies allow for caching, privacy, auditing and filtering of WWW requests. Cache control mechanisms determine the behavior of caches and proxies. Transparent proxies allow proxying without client configuration.