Introduction to Internet, Web, and TCP/IP Protocols SEEM 3460
Local-Area Networks
  A Local-Area Network (LAN) covers a small distance and a small number of computers.
  A LAN often connects the machines in a single room or building.
Wide-Area Networks
  A Wide-Area Network (WAN) connects two or more LANs, often over long distances.
  A LAN is usually owned by one organization, but a WAN often connects groups in different countries.
The Internet
  The Internet is a WAN that spans the entire planet.
  The word Internet comes from the term internetworking.
  It started as a United States government project, sponsored by the Advanced Research Projects Agency (ARPA); originally it was called the ARPANET.
  The Internet has grown very quickly since then.
IP Address
  Each computer and device on the Internet has a unique IP address.
  Currently an IP address has 32 bits, usually written as four decimal numbers separated by dots.
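A minimal Python sketch of the 32-bit view of an address, using the standard ipaddress module. The address 137.189.59.123 is borrowed from the nslookup example later in these slides; the helper name dotted_to_bits is made up for illustration.

```python
import ipaddress

def dotted_to_bits(dotted: str) -> str:
    """Return the 32-bit binary form of a dotted-decimal IPv4 address."""
    value = int(ipaddress.IPv4Address(dotted))  # the address as one integer
    return format(value, "032b")                # zero-padded to 32 bits

bits = dotted_to_bits("137.189.59.123")
# Print the bits in four groups of 8, one group per decimal number.
print(" ".join(bits[i:i + 8] for i in range(0, 32, 8)))
```

Each group of 8 bits corresponds to one of the four decimal numbers, which is why each number lies between 0 and 255.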
Internet Address
  Each computer or device on the Internet also has a unique Internet name, which is referred to as an Internet address:
    spencer.villanova.edu
    kant.gestalt-llc.com
  The first part indicates a particular computer (spencer).
  The rest is the domain name, indicating the organization (villanova.edu).
  An Internet address is also called a hostname.
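The split between computer name and domain name can be sketched in a few lines of Python. The function name split_hostname is hypothetical; it simply cuts the name at the first dot, as the slide describes.

```python
def split_hostname(hostname: str) -> tuple[str, str]:
    """Split a hostname into (local computer name, domain name)."""
    local, _, domain = hostname.partition(".")  # split at the first dot only
    return local, domain

for name in ("spencer.villanova.edu", "kant.gestalt-llc.com"):
    local, domain = split_hostname(name)
    print(f"{name}: computer={local}, domain={domain}")
```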
Domain Names
  The last part of a domain name, called a top-level domain (TLD), indicates the type of organization:
    edu - educational institution
    com - commercial entity
    org - non-profit organization
    net - network-based organization
  Sometimes the suffix indicates the country:
    uk - United Kingdom
    au - Australia
    ca - Canada
    se - Sweden
  New TLDs have recently been added: biz, info, tv, name
Domain Names
  A domain name can have several parts.
  Unique domain names mean that multiple sites can have individual computers with the same local name.
  In the end, each computer or device has its own unique Internet address (hostname).
IP Address and Internet Address
  Each computer or device on the Internet has its own unique IP address as well as its own Internet address (hostname).
  There is no one-to-one correspondence between the sections of an IP address and the sections of an Internet address.
  For example, the command nslookup shows the IP address that belongs to a hostname:
    > /usr/sbin/nslookup cuse123.se.cuhk.edu.hk
    Name: cuse123.se.cuhk.edu.hk
    Address: 137.189.59.123
  When an Internet address is used, it is translated to an IP address by software called the Domain Name System (DNS).
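A program can ask for the same translation. The sketch below wraps Python's socket.gethostbyname; the wrapper name resolve and the stand-in table SLIDE_ANSWERS are assumptions made so the example runs without network access. The table holds only the answer shown in the nslookup output above.

```python
import socket

def resolve(hostname: str, lookup=socket.gethostbyname) -> str:
    """Translate a hostname into a dotted-decimal IPv4 address.

    `lookup` defaults to the system resolver (which uses DNS); it is a
    parameter only so the function can be tried offline.
    """
    return lookup(hostname)

# Stand-in for the DNS database, holding only the answer shown on the slide.
SLIDE_ANSWERS = {"cuse123.se.cuhk.edu.hk": "137.189.59.123"}

print(resolve("cuse123.se.cuhk.edu.hk", lookup=SLIDE_ANSWERS.__getitem__))
```

Calling resolve(name) without the lookup argument asks the real resolver; that needs a working network, and the answer may differ from the one on the slide.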
The World Wide Web
  The World Wide Web allows many different types of information to be accessed using a common interface.
  A browser is a program which:
    accesses and presents information (text, graphics, video, sound, executable programs)
    enables Web navigation
  Popular browsers: Internet Explorer, Firefox, Camino (Mac OS X), Chrome (Google)
  A Web document usually contains links to other Web documents, creating a hypermedia environment.
  The term Web comes from the fact that information is not organized in a linear fashion.
Architecture of Web Application
  [Figure: Web browsers (clients) connected through the Internet to a Web server (server)]
HTML and URL
  Web documents are often defined using the HyperText Markup Language (HTML).
  Information on the Web is found using a Uniform Resource Locator (URL). For example:
    http://www.lycos.com
    http://www.villanova.edu/webinfo/domains.html
    ftp://java.sun.com/applets/animation.zip
Web - URL
  A URL indicates an application protocol (e.g. http), a server address, and possibly specific documents.
    Protocol: identifies the means of access
    Server address: contains the host and top-level domain
    Target resource: identifies the subdirectories within the Web site
  Example URL:
    http://www.nytimes.com/Pages/cartoons/
      protocol: http://
      server address: www.nytimes.com/
      target resource: Pages/cartoons/
  Examples of application protocols: http, mailto, ftp
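The three parts can be pulled out of a URL with Python's standard urllib.parse module. This is only a sketch using two of the example URLs from the earlier slide; the labels in the print statements follow the slide's terminology, not the library's.

```python
from urllib.parse import urlsplit

for url in ("http://www.villanova.edu/webinfo/domains.html",
            "ftp://java.sun.com/applets/animation.zip"):
    parts = urlsplit(url)
    print("protocol:       ", parts.scheme)    # e.g. http
    print("server address: ", parts.hostname)  # e.g. www.villanova.edu
    print("target resource:", parts.path)      # e.g. /webinfo/domains.html
```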
Web Client Server Model
  In a client computer, there is a client process running in a Web browser.
  A server computer may store data (e.g. Web pages).
  There is a server process running in the server computer.
  Typically a server process runs in an endless loop and waits for client requests.
  If a user enters a URL, the client process makes a request and sends it to the Web server.
  When a client request arrives at the Web server, the server process will:
    process the request, and
    return the result data (e.g. a Web page) to the client
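The request/response cycle described above can be sketched with plain sockets on a single machine. This is a toy model, not a real Web server: the page content, the request text, and the function names serve and request are all made up, the server handles a fixed number of requests instead of looping forever, and both processes are simulated as threads on the loopback address 127.0.0.1.

```python
import socket
import threading

PAGE = b"<HTML><BODY><H1>Hello</H1></BODY></HTML>"  # stand-in for a stored Web page

def serve(listener: socket.socket, max_requests: int) -> None:
    """Server process: wait for a request, process it, return the result."""
    for _ in range(max_requests):          # a real server would loop forever
        conn, _addr = listener.accept()    # blocks until a client connects
        with conn:
            conn.recv(1024)                # read the request (content ignored here)
            conn.sendall(PAGE)             # return the result data

def request(port: int) -> bytes:
    """Client process: send one request and collect the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(b"please send the page")
        chunks = []
        while chunk := sock.recv(1024):    # empty bytes means the server closed
            chunks.append(chunk)
        return b"".join(chunks)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
listener.listen()
port = listener.getsockname()[1]

server = threading.Thread(target=serve, args=(listener, 1), daemon=True)
server.start()
reply = request(port)
server.join()
listener.close()
print(reply.decode())
```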
Architecture of Web Application
  [Figure: Web browsers (clients, each running a client process in the browser) connected through the Internet to a Web server (server, running a server process)]
Network Communication
  To handle the software complexity of network communication and message transmission, we use a layered approach.
  The top layer is the application layer. For example, Web messages correspond to the application layer.
  The next layer is the connection layer. For the Internet, TCP/IP handles this layer.
  Each layer has its own protocol.
  A protocol is a set of rules that determine how messages are exchanged between the communicating parties.
Application Protocol and TCP/IP Protocols
  An application protocol is a set of rules and formats for the message communication required by an Internet application (e.g. the HTTP protocol for the Web).
  The Transmission Control Protocol (TCP) and Internet Protocol (IP) handle the complexity of sending messages (message transmission) needed for Internet applications.
Web Application - HTTP Protocol
  A user on host argon.tcpip-lab.edu ("Argon") makes a Web request via the URL http://neon.tcpip-lab.edu/index.html.
  [Figure: the Web client on argon.tcpip-lab.edu ("Argon") sends a Web request to the Web server on neon.tcpip-lab.edu ("Neon"), which returns a Web page]
  What actually happens in the network?
HTTP Request and HTTP Response
  The Web browser runs an HTTP client process.
  The Web server runs an HTTP server process.
  The HTTP client process at the source machine (i.e. Argon) sends an HTTP request message to the HTTP server process at the destination machine (i.e. Neon).
  The HTTP server process responds with an HTTP response message, which it sends back to the HTTP client process.
  The HTTP protocol is an example of an application protocol.
  [Figure: HTTP client process on Argon, HTTP server process on Neon; the HTTP request goes from Argon to Neon, the HTTP response from Neon to Argon]
An Example of HTTP Request Message
    GET /index.html HTTP/1.1
    Accept: image/gif, */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0
    Host: neon.tcpip-lab.edu
    Connection: Keep-Alive
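A request like the one above is just text with a fixed layout: a request line, header lines, and a blank line. The sketch below builds a reduced version of it from the URL used in this example. The function name build_get_request is hypothetical, and only two of the headers from the slide are included.

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> bytes:
    """Build a small HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    lines = [
        f"GET {parts.path or '/'} HTTP/1.1",  # request line: method, path, version
        f"Host: {parts.hostname}",            # required header in HTTP/1.1
        "Connection: Keep-Alive",
        "",                                   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_get_request("http://neon.tcpip-lab.edu/index.html").decode())
```

Lines are separated by a carriage return plus line feed ("\r\n"), which is what HTTP requires on the wire even though the slide shows ordinary line breaks.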
An Example of HTTP Response Message
    HTTP/1.1 200 OK
    Date: Sat, 25 May 2002 21:10:32 GMT
    Server: Apache/1.3.19 (Unix)
    Last-Modified: Sat, 25 May 2002 20:51:33 GMT
    ETag: "56497-51-3ceff955"
    Accept-Ranges: bytes
    Content-Length: 81
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html

    <HTML>
    <BODY>
    <H1>Internet Lab</H1>
    Click <a href="http://www.tcpip-lab.net/index.html">here</a> for the Internet Lab webpage.
    </BODY>
    </HTML>
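On the client side, the response is taken apart again: status line, headers, then the body after the blank line. The following is a simplified sketch, not a complete HTTP parser. The function name parse_response is made up, and the sample response is a shortened version of the one above.

```python
def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split an HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Server: Apache/1.3.19 (Unix)\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"<HTML><BODY><H1>Internet Lab</H1></BODY></HTML>")
status, headers, body = parse_response(raw)
print(status, headers["content-type"], len(body))
```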
Overview of TCP/IP Protocol
  A message (e.g. an HTTP response message) is divided into small pieces called packets.
  These are labeled with the IP addresses of the machine they came from and the one they are going to, and an order number (like 1 of 5).
  The protocol says how to route them to get to the destination.
  Not all packets take the same route!
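The division and labeling can be sketched as follows. This is a toy model under stated assumptions: the Packet class and packetize function are hypothetical, the 1500 limit is the figure quoted in these slides, and the two addresses are those of Neon and Argon from the later example. Real TCP/IP packets carry more header fields, and TCP numbers bytes rather than whole packets.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 1500  # size limit per piece, as quoted in these slides

@dataclass
class Packet:
    src: str        # IP address of the sending machine
    dst: str        # IP address of the receiving machine
    index: int      # order number, starting at 1
    total: int      # number of packets in the whole message
    payload: bytes

def packetize(message: bytes, src: str, dst: str, size: int = MAX_PAYLOAD) -> list[Packet]:
    """Cut a message into pieces of at most `size` bytes and label each one."""
    pieces = [message[i:i + size] for i in range(0, len(message), size)]
    return [Packet(src, dst, n, len(pieces), piece)
            for n, piece in enumerate(pieces, start=1)]

message = b"x" * 4000  # stand-in for an HTTP response message
packets = packetize(message, src="128.143.71.21", dst="128.143.137.144")
for p in packets:
    print(f"packet {p.index} of {p.total}: {len(p.payload)} bytes")
```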
How TCP/IP Works
  1) The Transmission Control Protocol (TCP) breaks data into small pieces of no more than about 1500 bytes each. These pieces are called packets.
  [Figure: a block of binary data split into several smaller packets]
How TCP/IP Works (II)
  2) Each packet is placed in its own Internet Protocol (IP) envelope. Each envelope contains the address of the intended recipient and has the same header format as all other envelopes.
  [Figure: each packet placed inside its own IP envelope]
How TCP/IP Works (III)
  In the IP layer, a router receives each packet and then determines the most efficient way to send the packet to the destination.
  After traveling along a series of routers, the packets arrive at their destination.
  Different packets of the same message may use different routes.
  The TCP layer is responsible for assembling packets into a complete message.
  [Figure: packets traveling from sender to receiver along different paths through Router 1, Router 2, Router 3, and Router 4]
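Because packets may take different routes, they can arrive out of order, and the receiver has to put them back in sequence. The sketch below illustrates that idea only: the function name reassemble, the 5-byte packet size, and the use of a shuffle to imitate different routes are all assumptions, and real TCP additionally handles lost and corrupted packets by retransmission.

```python
import random

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a message from (order number, payload) pairs in any order."""
    ordered = sorted(packets, key=lambda item: item[0])  # restore the original order
    expected = list(range(1, len(ordered) + 1))
    if [number for number, _ in ordered] != expected:
        raise ValueError("missing or duplicated packet")
    return b"".join(payload for _, payload in ordered)

message = b"GET /index.html HTTP/1.1"
sent = [(n, message[i:i + 5]) for n, i in enumerate(range(0, len(message), 5), start=1)]
arrived = sent[:]
random.shuffle(arrived)            # different routes: arrival order is not guaranteed
assert reassemble(arrived) == message
print(len(arrived), "packets reassembled")
```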
From HTTP to TCP
  A TCP server process is running on Neon.
  To send a request, the HTTP client process at the source machine (i.e. Argon) establishes a TCP connection to the HTTP server process at the destination machine (i.e. Neon).
  The HTTP client process invokes the TCP client process to handle the message transmission.
  The TCP client process will break the message into packets.
  [Figure: on Argon, the HTTP client process sits above the TCP client process; on Neon, the HTTP server process sits above the TCP server process; the HTTP request / HTTP response is carried over the TCP connection]
Resolving hostnames and port numbers
  TCP does not work with hostnames (e.g. neon.tcpip-lab.edu) directly, and it also needs a way to find the HTTP server process at Neon.
  The hostname neon.tcpip-lab.edu must be translated into a 32-bit IP address.
  The HTTP server process at Neon must be identified by a 16-bit port number.
Translating a hostname into an IP address
  The translation of the hostname neon.tcpip-lab.edu into an IP address is done via a distributed database lookup.
  The distributed database used is called the Domain Name System (DNS).
  [Figure: the HTTP client process on argon.tcpip-lab.edu sends the name neon.tcpip-lab.edu to a DNS server process (128.143.136.15), which answers with 128.143.71.21]
Finding the port number
  Note: Most services on the Internet are reachable via well-known ports, e.g. HTTP server processes on the Internet can normally be reached at port number 80.
  So: Argon simply knows the port number of the HTTP server process at a remote machine.
  On most Unix systems, the well-known ports are listed in a file named /etc/services.
  The well-known port numbers of some of the most popular services are:
    ftp     21
    telnet  23
    smtp    25
    finger  79
    http    80
    nntp   119
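Programs rarely hard-code these numbers; they can ask the system for them. The sketch below uses Python's socket.getservbyname, which consults the system's services database (e.g. /etc/services on Unix). The fallback table SLIDE_PORTS and the function name well_known_port are assumptions added so the example still works on a system without that database.

```python
import socket

# Fallback table copied from the slide, used if the system has no services database.
SLIDE_PORTS = {"ftp": 21, "telnet": 23, "smtp": 25, "finger": 79, "http": 80, "nntp": 119}

def well_known_port(service: str) -> int:
    """Look up the TCP port of a service, preferring the system's own list."""
    try:
        return socket.getservbyname(service, "tcp")  # reads e.g. /etc/services on Unix
    except OSError:
        return SLIDE_PORTS[service]

print("http ->", well_known_port("http"))
```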
Requesting a TCP Connection
  The HTTP client process at argon.tcpip-lab.edu requests the TCP client process to establish a connection to port 80 of the machine with address 128.143.71.21.
  [Figure: on argon.tcpip-lab.edu, the HTTP client process tells the TCP client process: "Establish a TCP connection to port 80 of 128.143.71.21"]
Invoking the IP Protocol
  The TCP client process at Argon sends a request to establish a connection to port 80 at Neon.
  This is done by asking its local IP module to send an IP datagram to 128.143.71.21.
  [Figure: on argon.tcpip-lab.edu, the TCP client process tells the IP module: "Send an IP datagram to 128.143.71.21"]
Sending the IP datagram to an IP router
  Suppose that the default gateway for Argon is an IP router, router137.tcpip-lab.edu (128.143.137.1).
  Also suppose that the local area network (LAN) hardware uses the common Ethernet technology.
  [Figure: two Ethernet networks joined by a router]
    "Argon"      argon.tcpip-lab.edu      128.143.137.144
    "Router137"  router137.tcpip-lab.edu  128.143.137.1
    "Router71"   router71.tcpip-lab.edu   128.143.71.1
    "Neon"       neon.tcpip-lab.edu       128.143.71.21
Sending the IP datagram to an IP router
  Argon sends the IP datagram to its default gateway.
  To send an IP datagram to Router137, Argon puts the IP datagram in an Ethernet frame and transmits the frame.
  The IP router receives the Ethernet frame at interface 128.143.137.1 (Router137) and recovers the IP datagram.
  The main job of the IP layer is to determine, using a routing algorithm, the next router to which the datagram should be sent.
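Why Argon hands the datagram to its default gateway rather than to Neon directly can be sketched as a simple check: is the destination on the sender's own network? The sketch assumes a 24-bit network prefix, which the slides do not state; the function name next_hop is also hypothetical. Actual routers use routing tables rather than this single test.

```python
import ipaddress

def next_hop(src: str, dst: str, gateway: str, prefix_len: int = 24) -> str:
    """Return the address the sender should hand the datagram to.

    prefix_len is an ASSUMPTION for illustration; the slides do not say
    how large Argon's network is.
    """
    local_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst) in local_net:
        return dst        # same network: deliver directly
    return gateway        # different network: send to the default gateway

# Argon -> Neon, with Router137 as Argon's default gateway (addresses from the slides).
print(next_hop("128.143.137.144", "128.143.71.21", gateway="128.143.137.1"))
```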
Forwarding the IP datagram
  In this example, the IP datagram should be forwarded to the interface with address 128.143.71.1 (Router71).
  Next, the IP protocol at Router71 tells its Ethernet device driver to send an Ethernet frame to Neon.
Data has arrived at Neon
  Neon receives the Ethernet frame.
  The Ethernet frame contains an IP datagram, which is passed to the IP protocol.
  The payload of the IP datagram, which is a TCP segment, is passed to the TCP server process.
  [Figure: on neon.tcpip-lab.edu, data moves up from Ethernet to the IP module to the TCP server process]
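The nesting of frame, datagram, and segment can be pictured as wrappers inside wrappers. The classes below are a deliberately reduced illustration with hypothetical names; real Ethernet, IP, and TCP headers contain many more fields (hardware addresses, checksums, sequence numbers) that are left out here.

```python
from dataclasses import dataclass

@dataclass
class TcpSegment:
    dst_port: int
    data: bytes

@dataclass
class IpDatagram:
    src: str
    dst: str
    payload: TcpSegment      # the datagram carries a TCP segment

@dataclass
class EthernetFrame:
    payload: IpDatagram      # the frame carries an IP datagram

def receive(frame: EthernetFrame) -> bytes:
    """Unwrap one layer at a time, as Neon does when a frame arrives."""
    datagram = frame.payload         # Ethernet hands the datagram to IP
    segment = datagram.payload       # IP hands the segment to TCP
    return segment.data              # TCP hands the data to the server process

frame = EthernetFrame(IpDatagram("128.143.137.144", "128.143.71.21",
                                 TcpSegment(80, b"GET /index.html HTTP/1.1")))
print(receive(frame))
```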
Wrapping-up the example
  So far, Neon has only obtained a single datagram.
  All datagrams of the message must be transmitted.
  The TCP server process at Neon will assemble all datagrams and recover the whole HTTP message.
  In reality, the following additional issues need to be handled:
    transmission errors
    long and complicated routes between Argon and Neon
    interaction with the DNS server