UNIT II - ELEMENTARY TCP SOCKETS

Introduction to Socket Programming: Introduction to Sockets, Socket Address Structures, Byte Ordering Functions, Address Conversion Functions. Elementary TCP Sockets: socket, connect, bind, listen, accept, read, write, close functions. Iterative Server. Concurrent Server.

Introduction to Socket Programming

Socket
- A socket is an abstraction provided to the application programmer for sending data to, or receiving data from, another process.
- The other process may be running on the same machine or on a different machine.
- A socket is like an endpoint of a connection; one exists on either side of the connection.
- It is identified by an IP address and a port number.
- E.g. Berkeley Sockets in C, released in 1983. Similar implementations exist in other languages.
- In Java, sockets are part of the java.net package: import java.net.*;
  - Two classes of sockets are provided for TCP:
    - Socket: client side of the socket
    - ServerSocket: server side of the socket
  - One socket type is provided for UDP: DatagramSocket

CCET
Java TCP Sockets
- ServerSocket performs the functions bind and listen:
  - Bind: fix the socket to a certain port number.
  - Listen: wait for incoming requests on that port.
- Socket performs the function connect:
  - Connect: begin a TCP session.

UDP Sockets
- UDP is packet-oriented: information is sent in packet format as needed by the application.
- Every packet requires address information.
- Lightweight: no connection is required.
- The cost is the overhead of adding the destination address to each packet.

Introduction to Sockets

A socket is an interface between the application and the network.
- The application creates a socket.
- The socket type dictates the style of communication:
  - reliable vs. best effort
  - connection-oriented vs. connectionless
- Once the socket is configured, the application can
  - pass data to the socket for network transmission
  - receive data from the socket (transmitted through the network by some other host)

Two essential types of sockets

SOCK_STREAM (a.k.a. TCP)
- reliable delivery
- in-order delivery guaranteed
- connection-oriented
- bidirectional
[Figure: App, socket, and Dest., with packets numbered 1, 2, 3]
SOCK_DGRAM (a.k.a. UDP)
- unreliable delivery
- no order guarantees
- no notion of a connection: the application indicates the destination for each packet
- can send or receive
[Figure: App, socket, and destinations D1, D2, D3, with packets numbered 1, 2, 3]

Socket Creation in C: socket

    int s = socket(domain, type, protocol);

- s: socket descriptor, an integer (like a file handle)
- domain: integer, the communication domain; e.g., PF_INET (IPv4 protocol) is typically used
- type: communication type
  - SOCK_STREAM: reliable, 2-way, connection-based service
  - SOCK_DGRAM: unreliable, connectionless
  - other values: need root permission, are rarely used, or are obsolete
- protocol: specifies the protocol (see the file /etc/protocols for a list of options); usually set to 0

NOTE: the socket call does not specify where data will be coming from, nor where it will be going to; it just creates the interface.

Socket Address Structures

Socket functions like connect(), accept(), and bind() require the use of specifically defined address structures to hold the IP address, port number, and protocol type. This can be one of the more confusing aspects of socket programming, so it is necessary to clearly understand how to use the socket address structures.

The difficulty is that sockets can be used to program network applications over different protocols, for example IPv4, IPv6, Unix local, etc. Here is the problem: each protocol uses a different address structure to hold its addressing information, yet they all use the same functions connect(), accept(), bind(), etc. So how do we pass these different structures to a socket function that requires an address structure? It may not be the way you would expect, because sockets were developed a long time ago, before things like a void pointer were features of C.

This is how it is done. There is a generic address structure: struct sockaddr. This is the address structure which must be passed to all of the socket functions requiring an address structure. This means that you must type cast your protocol-specific address structure to the generic address structure when passing it to these socket functions.

Protocol-specific address structures usually start with sockaddr_ and end with a suffix depending on the protocol. For example:

    struct sockaddr_in    (IPv4; think of "in" as internet)
    struct sockaddr_in6   (IPv6)
    struct sockaddr_un    (Unix local)
    struct sockaddr_dl    (Data link)

We will be using only the IPv4 address structure, struct sockaddr_in. Once we fill in this structure with the IP address, port number, etc., we pass it to one of our socket functions, type cast to the generic address structure.
For example:

    struct sockaddr_in myaddressstruct;
    // Fill in the address information into myaddressstruct here (will be explained in detail shortly)
    connect(socket_file_descriptor, (struct sockaddr *) &myaddressstruct, sizeof(myaddressstruct));

Here is how to fill in the sockaddr_in structure:

    struct sockaddr_in {
        sa_family_t    sin_family;   /* Address/Protocol Family (we'll use PF_INET) */
        uint16_t       sin_port;     /* 16-bit port number, network byte ordered */
        struct in_addr sin_addr;     /* A struct for the 32-bit IP address */
        unsigned char  sin_zero[8];  /* Just ignore this, it is just padding */
    };

    struct in_addr {
        uint32_t s_addr;             /* 32-bit IP address, network byte ordered */
    };
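As a minimal sketch of the steps above (one possible way to do it, not a definitive one), the helper below zeroes a sockaddr_in, then fills in the family, port, and address. The function name make_ipv4_addr is hypothetical. The calls it relies on, memset(), htons(), and inet_aton(), are the initialization, byte-ordering, and conversion functions these notes describe next.

```c
#include <arpa/inet.h>   /* inet_aton, htons */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: fill *addr from the dotted-decimal IPv4 string `ip`
 * and the host-order `port`. Returns 0 on success, -1 if `ip` is invalid. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);            /* always zero the structure first */
    addr->sin_family = AF_INET;               /* IPv4 */
    addr->sin_port = htons(port);             /* port in network byte order */
    if (inet_aton(ip, &addr->sin_addr) == 0)  /* 0 means the string was not valid */
        return -1;
    return 0;
}
```

A caller would then pass the result with the generic cast, for example connect(fd, (struct sockaddr *) &addr, sizeof addr).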
For the sa_family_t variable sin_family, always use the constant PF_INET or AF_INET.

*** Always initialize address structures with bzero() or memset() before filling them in. ***

*** Make sure you use the byte ordering functions when necessary for the port and IP address variables; otherwise strange things will happen to your packets. ***

To convert a dotted-decimal IPv4 address string to a NETWORK BYTE ORDERED 32-bit value, use one of the functions:

    inet_addr()
    inet_aton()

To convert a NETWORK BYTE ORDERED 32-bit value to a dotted-decimal IPv4 string, use:

    inet_ntoa()

Byte Ordering Functions

UNIX's byte-ordering functions:

    u_long  htonl(u_long x);
    u_short htons(u_short x);
    u_long  ntohl(u_long x);
    u_short ntohs(u_short x);

- On big-endian machines, these routines do nothing.
- On little-endian machines, they reverse the byte order.
- Code that uses them works regardless of the endianness of the two machines.

Address Conversion Functions
- gethostbyname function
- gethostbyaddr function
- gethostname function
- getservbyname and getservbyport functions

1. gethostbyname Function

    #include <netdb.h>
    struct hostent *gethostbyname(const char *hostname);

Returns: non-null pointer if OK, NULL on error with h_errno set

    struct hostent {
        char  *h_name;       /* official (canonical) name of host */
        char **h_aliases;    /* pointer to array of pointers to alias names */
        int    h_addrtype;   /* host address type: AF_INET */
        int    h_length;     /* length of address: 4 */
        char **h_addr_list;  /* ptr to array of ptrs with IPv4 addrs */
    };
    #define h_addr h_addr_list[0]   /* for backward compatibility */

[Figure: the hostent structure. h_name points to the official hostname; h_aliases points to a NULL-terminated array of alias names (Alias #1, Alias #2); h_addrtype is AF_INET; h_length is 4; h_addr_list points to a NULL-terminated array of in_addr structures (IP addr #1, #2, #3), each h_length bytes long.]

Example:

    struct hostent *hp = gethostbyname(argv[1]);
    bcopy(hp->h_addr, &server.sin_addr, hp->h_length);
    // see intro/daytimetcpcli_hostname.c

- gethostbyname retrieves only IPv4 addresses; it performs a query for an A record.
- Some versions of gethostbyname allow a dotted-decimal string as the argument, but this is not portable:

    hptr = gethostbyname("192.168.42.2");

- On error, it sets the global integer h_errno to one of:
  - HOST_NOT_FOUND
  - TRY_AGAIN
  - NO_RECOVERY
  - NO_DATA: the specified name is valid but does not have A records
- The hstrerror function can be used to get a description of the error (the value of h_errno).
- See names/hostent.c for an example.

Example usage:

    > hostent ap1
    > hostent cnn.com
    > hostent www

gethostbyaddr Function
- Takes a binary IPv4 address and tries to find the hostname corresponding to that address.
- Performs a query for a PTR record.

    #include <netdb.h>
    struct hostent *gethostbyaddr(const char *addr, socklen_t len, int family);
Returns: non-null pointer if OK, NULL on error with h_errno set

- The field of interest in the returned structure is h_name (the canonical host name).
- The addr argument is not really a char * but a pointer to an in_addr structure containing the IPv4 address.

gethostname Function

Obtains the host name.

    #include <unistd.h>
    int gethostname(char *name, size_t len);
    // On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Example:

    #define MAXHOSTNAME 80
    char ThisHost[MAXHOSTNAME];
    gethostname(ThisHost, MAXHOSTNAME);

getservbyname and getservbyport Functions

    #include <netdb.h>
    struct servent *getservbyname(const char *servname, const char *protoname);
    // returns non-null pointer if OK, NULL on error
    struct servent *getservbyport(int port, const char *protoname);
    // returns non-null pointer if OK, NULL on error
    // port value must be in network byte order

    struct servent {
        char  *s_name;     /* official service name */
        char **s_aliases;  /* alias list */
        int    s_port;     /* port number, network byte order */
        char  *s_proto;    /* protocol to use */
    };
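The byte-ordering and lookup functions above can be combined as in the following sketch. The helper names port_to_wire, resolve_ipv4, and service_port are hypothetical and exist only for illustration. The lookup results depend on the local resolver and services database, so no particular answer is assumed here.

```c
#include <arpa/inet.h>   /* htons, ntohs */
#include <netdb.h>       /* gethostbyname, getservbyname */
#include <netinet/in.h>  /* struct in_addr */
#include <stdint.h>
#include <string.h>      /* memcpy */
#include <sys/socket.h>  /* AF_INET */

/* Copy the network-byte-order form of `port` into wire[0..1].
 * wire[0] is the most significant byte on any host. */
void port_to_wire(uint16_t port, unsigned char wire[2])
{
    uint16_t net = htons(port);
    memcpy(wire, &net, sizeof net);
}

/* Resolve `hostname` to its first IPv4 address using gethostbyname.
 * Returns 0 on success, -1 on failure (h_errno holds the reason). */
int resolve_ipv4(const char *hostname, struct in_addr *out)
{
    struct hostent *hp = gethostbyname(hostname);
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_length != (int) sizeof *out)
        return -1;
    memcpy(out, hp->h_addr_list[0], sizeof *out);  /* already network order */
    return 0;
}

/* Look up a service such as ("http", "tcp").
 * Returns the port in host byte order, or -1 if the service is unknown. */
int service_port(const char *servname, const char *protoname)
{
    struct servent *sp = getservbyname(servname, protoname);
    if (sp == NULL)
        return -1;
    return ntohs((uint16_t) sp->s_port);  /* s_port is in network byte order */
}
```

Note that s_port and the addresses in h_addr_list are already in network byte order, so they can be copied into a sockaddr_in unchanged; conversion with ntohs is needed only when the value is to be printed or compared as an ordinary integer.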
Elementary TCP Sockets

Socket function

To perform network I/O, the first thing a process must do is call the socket function:

    #include <sys/socket.h>
    int socket(int family, int type, int protocol);

Returns: non-negative descriptor if OK, -1 on error
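A small sketch of the call, assuming IPv4 (AF_INET) and protocol 0 so that the system picks the default protocol for the requested type. The helper names open_socket and socket_type are hypothetical; socket_type uses getsockopt() with SO_TYPE, which is not covered in these notes, only to confirm what kind of socket was created.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Create an IPv4 socket of the given type (SOCK_STREAM or SOCK_DGRAM).
 * Returns a non-negative descriptor, or -1 on error. */
int open_socket(int type)
{
    return socket(AF_INET, type, 0);  /* 0: default protocol for this type */
}

/* Ask the kernel which type a socket descriptor has.
 * Returns SOCK_STREAM, SOCK_DGRAM, ..., or -1 on error. */
int socket_type(int fd)
{
    int type = -1;
    socklen_t len = sizeof type;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

At this point the descriptor has no local or remote address; those are supplied later by bind or connect.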
Connect function

The connect function is used by a TCP client to establish a connection with a TCP server:

    #include <sys/socket.h>
    int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);

Returns: 0 if OK, -1 on error

- sockfd is a socket descriptor returned by the socket function.
- The 2nd and 3rd arguments are a pointer to a socket address structure and its size; the structure must contain the address of the server to communicate with.
- The client does not have to call bind before calling connect: the kernel chooses both an ephemeral port and the source IP address if necessary.
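The client-side sequence described above might look like the following sketch. The helper name connect_to is hypothetical, and the server is assumed to be given as a dotted-decimal IPv4 string with a host-order port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to ip:port.
 * Returns a connected descriptor, or -1 on any error. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    memset(&servaddr, 0, sizeof servaddr);
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_aton(ip, &servaddr.sin_addr) == 0)
        return -1;                       /* not a dotted-decimal address */

    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* No bind here: the kernel picks the ephemeral port and source address. */
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof servaddr) < 0) {
        close(sockfd);                   /* do not leak the descriptor */
        return -1;
    }
    return sockfd;
}
```

On success the returned descriptor can be used with read and write, and should be released with close when the exchange is over.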
Bind function

The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol address is the combination of a 32-bit IPv4 address (or a 128-bit IPv6 address) and a 16-bit TCP or UDP port number.

    #include <sys/socket.h>
    int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);

- Servers bind to their well-known port when they start.
- A process can bind a specific IP address to its socket.
- Normally, however, a client does not bind an IP address; the kernel then chooses the source IP address when the socket is connected, based on the outgoing interface that is used.

Listen function

The listen function is called only by a TCP server, and it performs two actions:
1. It converts an unconnected (active) socket into a passive socket, indicating that the kernel should accept incoming connection requests directed to this socket.
2. Its 2nd argument specifies the maximum number of connections the kernel should queue for this socket.

    #include <sys/socket.h>
    int listen(int sockfd, int backlog);

listen is normally called after both the socket and bind functions, and only by the server.

Backlog: for a given listening socket, the kernel maintains two queues:
1. An incomplete connection queue, which contains an entry for each SYN that has arrived from a client for which the server is awaiting completion of the TCP 3-way handshake.
2. A completed connection queue, which contains an entry for each client with whom the 3-way handshake has completed.

Accept function
accept is called by a TCP server to return the next completed connection from the front of the completed connection queue. If the completed connection queue is empty, the process is put to sleep (assuming the default of a blocking socket).

    #include <sys/socket.h>
    int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);

Returns: non-negative descriptor if OK, -1 on error

The cliaddr and addrlen arguments are used to return the protocol address of the connected peer process (the client).

Fork and exec functions
- We will look at building a concurrent server, which needs to create a new child process to handle each incoming client request/transaction.
- The fork function is the only way in Unix to create a new process:

    #include <unistd.h>
    pid_t fork(void);

Returns: 0 in child, process ID of child in parent, -1 on error

- fork is called once but returns TWICE: once in the parent process (returning the child's process ID) and once in the child process (returning 0).

More Forking
- All descriptors open in the parent before the call to fork() are shared with the child after fork returns, including the connected socket descriptor returned by accept.

Exec function
- The only way in which an executable program file on disk can be executed in Unix is for an existing process to call one of the six exec functions.

Close function

The close() function is used to close a socket and terminate a TCP connection.

    #include <unistd.h>
    int close(int sockfd);

Returns: 0 if OK, -1 on error

- The default action of close with a TCP socket descriptor is to mark the socket as closed and return to the process immediately.
- The socket descriptor is no longer usable by the application process at this point.
- TCP will still try to send any data that is already queued and, once that data is flushed, begin the normal TCP termination sequence.

Iterative Server

Iterative, connection-oriented server

Algorithm
1. Create a socket and bind to the well-known address for the service being offered.
2. Place the socket in passive mode.
3. Accept the next connection request from the socket, and obtain a new socket for the connection.
4. Repeatedly read a request from the client, formulate a response, and send a reply back to the client according to the application protocol.
5. When finished with a particular client, close the connection and return to step 3 to accept a new connection.

- Servers should specify INADDR_ANY as the Internet address while binding.
- This is needed for hosts with multiple IP addresses.

Iterative, connection-less servers

Algorithm
1. Create a socket and bind to the well-known address for the service being offered.
2. Repeatedly read the next request from a client, formulate a response, and send a reply back to the client according to the application protocol.

- The server cannot use connect (unlike clients).
- Use sendto and recvfrom.

Concurrent Server

Concurrent, connection-less servers

Algorithm
Master 1. Create a socket and bind to the well-known address for the service being offered. Leave the socket unconnected.
Master 2. Repeatedly call recvfrom to receive the next request from a client, and create a new slave thread/process to handle the response.
Slave 1. Receive a specific request upon creation, as well as access to the socket.
Slave 2. Form a reply according to the application protocol and send it back to the client using sendto.
Slave 3. Exit.

- There is a cost of process/thread creation for each client request.
- While using threads, use thread-safe functions and be careful while passing arguments to threads.
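The connection-less algorithms above share the same first step and the same recvfrom/sendto exchange. A minimal sketch of the iterative variant follows. The helper names make_udp_server and serve_one_request are hypothetical, and echoing the request back is only an assumed stand-in for a real application protocol.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 1: create a UDP socket bound to INADDR_ANY on `port` (host order).
 * Returns the descriptor, or -1 on error. */
int make_udp_server(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 2, one iteration: read a request and send the reply to its sender.
 * The "reply" here is simply the request echoed back.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t serve_one_request(int fd)
{
    char buf[512];
    struct sockaddr_in cli;
    socklen_t clilen = sizeof cli;
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *) &cli, &clilen);
    if (n < 0)
        return -1;
    /* The destination must be given per packet: there is no connection. */
    return sendto(fd, buf, (size_t) n, 0, (struct sockaddr *) &cli, clilen);
}
```

An iterative server would call serve_one_request in an endless loop; a concurrent one would instead hand the received request and client address to a new thread or process.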
Concurrent, connection-oriented servers

Algorithm
Master 1. Create a socket and bind to the well-known address for the service being offered. Leave the socket unconnected.
Master 2. Place the socket in passive mode.
Master 3. Repeatedly call accept to receive the next request from a client, and create a new slave process/thread to handle the response.
Slave 1. Receive a connection request (i.e., the socket for the connection) upon creation.
Slave 2. Interact with the client using the connection: read request(s) and send back response(s).
Slave 3. Close the connection and exit.

- Processes are created using fork; the slave can also use execve.
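The master/slave steps above can be sketched with fork as follows. This is an illustration under stated assumptions rather than a complete server: the helper names make_listener, accept_and_fork, and reap_children are hypothetical, the application protocol is assumed to be a plain echo, and error handling is kept to a minimum.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Master 1-2: socket, bind to INADDR_ANY:port, listen. Returns fd or -1. */
int make_listener(uint16_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Slave 2: echo everything back until the client closes its side. */
static void echo_until_eof(int connfd)
{
    char buf[512];
    ssize_t n;
    while ((n = read(connfd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write may be partial */
            ssize_t w = write(connfd, buf + off, (size_t) (n - off));
            if (w <= 0)
                return;
            off += w;
        }
    }
}

/* Master 3, one iteration: accept a connection and fork a slave for it.
 * Returns the child's pid in the parent, or -1 on error. */
pid_t accept_and_fork(int listenfd)
{
    int connfd = accept(listenfd, NULL, NULL);
    if (connfd < 0)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {                       /* child (slave) */
        close(listenfd);                  /* slave does not accept */
        echo_until_eof(connfd);
        close(connfd);                    /* Slave 3: close and exit */
        _exit(0);
    }
    close(connfd);                        /* parent keeps only listenfd */
    return pid;
}

/* Collect exited children without blocking, so they do not stay zombies. */
void reap_children(void)
{
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
}
```

A master loop would call accept_and_fork repeatedly and reap_children from time to time. The parent closes its copy of the connected descriptor because, after fork, the child holds its own copy and the connection stays open until both are closed.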