A client-server transaction is simply a sequence of steps carried out by a client and a server. It is not a database transaction and shares none of a database transaction's properties, such as atomicity.
The Internet model (also known as the Internet protocol suite or TCP/IP) can be viewed as a condensed form of the Open Systems Interconnection (OSI) model. It provides end-to-end connectivity by specifying how data should be packetized, addressed, transmitted, routed and received at the destination. Its functionality is organized into four abstraction layers.
The application layer corresponds to the application, presentation and session layers of the OSI model. It provides process-to-process data exchange for applications. The Internet protocols that interact directly with Internet applications, e.g., HTTP, SSH and DNS, are implemented in this layer.
The link layer corresponds to the datalink and physical layers of the OSI stack. It contains communication methods for data that remains within a single network segment (link). Network drivers are implemented here to send packets over physical network media such as Ethernet, PPP and ADSL.
Clients and servers often run on separate hosts and communicate using the hardware and software resources of a computer network.
To a host, a network is just another I/O device that serves as a source and sink for data. An adapter plugged into an expansion slot on the I/O bus provides the physical interface to the network. Data received from the network is copied from the adapter across the I/O and memory buses into memory, typically by a DMA transfer. Similarly, data can also be copied from memory to the network.
Physically, a network is a hierarchical system organized by geographical proximity. At the lowest level is a LAN (local area network) that spans a building or a campus. The most popular LAN technology is Ethernet.
An Ethernet segment consists of some wires and a small box called a hub. Ethernet segments typically span small areas, such as a room or a floor in a building. Each wire has the same maximum bit bandwidth, with one end attached to an adapter on a host and the other end attached to a port on the hub. A hub slavishly copies every bit that it receives on each port to every other port. Thus, every host sees every bit.
Each Ethernet adapter has a globally unique 48-bit address that is stored in nonvolatile memory on the adapter. A host can send a chunk of bits called a frame to any other host on the segment. Each frame includes some fixed number of header bits that identify the source and destination of the frame and the frame length, followed by a payload of data bits. Every host adapter sees the frame, but only the destination host actually reads it.
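The frame description above can be sketched as a C structure. This is only an illustration: the names frame_header, ADAPTER_ADDR_LEN and frame_is_for are hypothetical, the field order and the 16-bit length field are assumptions, and a real Ethernet frame carries more than the three header fields mentioned in the text.

```c
#include <stdint.h>
#include <string.h>

#define ADAPTER_ADDR_LEN 6 /* 48-bit adapter address = 6 bytes */

/* Illustrative header: destination, source and payload length (layout assumed). */
struct frame_header {
    uint8_t  dest[ADAPTER_ADDR_LEN]; /* address of the receiving adapter */
    uint8_t  src[ADAPTER_ADDR_LEN];  /* address of the sending adapter */
    uint16_t length;                 /* number of payload bytes that follow */
};

/* Returns 1 if the adapter whose address is `mine` should read the frame,
   0 if it should ignore it. Every adapter sees the frame, but only the
   one whose address matches the destination field reads it. */
int frame_is_for(const struct frame_header *h,
                 const uint8_t mine[ADAPTER_ADDR_LEN]) {
    return memcmp(h->dest, mine, ADAPTER_ADDR_LEN) == 0;
}
```

Each adapter on the segment would apply a check like frame_is_for to every frame the hub copies to it.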
Multiple Ethernet segments can be connected into larger LANs, called bridged Ethernets, using a set of wires and small boxes called bridges. Bridged Ethernets can span entire buildings or campuses.
Multiple incompatible LANs can be connected by specialized computers called routers to form an internet (interconnected network). Each router has an adapter (port) for each network that it is connected to. Routers can also connect high-speed point-to-point phone connections, which are examples of networks known as WANs (wide area networks).
The crucial property of an internet is that it can consist of different LANs and WANs with radically different and incompatible technologies. Each host is physically connected to every other host, but how is it possible for a source host to send data bits to a destination host across all of these incompatible networks?
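The solution is a layer of protocol software running on each host and router that smooths out the differences between the different networks. This software implements a protocol that governs how hosts and routers cooperate in order to transfer data. The protocol must provide two basic capabilities: a naming scheme that gives every host a uniform address, and a delivery mechanism that bundles data into uniform units (packets) that routers can forward from network to network.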
Each Internet host runs software that implements the TCP/IP protocol. Internet clients and servers communicate using a mix of sockets interface functions and Unix I/O functions. The sockets functions are typically implemented as system calls that trap into the kernel and call various kernel-mode functions in TCP/IP.
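TCP/IP is actually a family of protocols, each of which contributes different capabilities: IP provides the basic naming scheme and an unreliable host-to-host delivery mechanism for packets (datagrams); UDP extends IP slightly so that packets can be transferred from process to process; and TCP builds on IP to provide reliable full-duplex connections between processes.
From a programmer's perspective, we can think of the Internet as a worldwide collection of hosts with the following properties: the set of hosts is mapped to a set of 32-bit IP addresses; the set of IP addresses is mapped to a set of identifiers called Internet domain names; and a process on one Internet host can communicate with a process on any other Internet host over a connection.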
An IP address is an unsigned 32-bit integer. Network programs store IP addresses in the IP address structure.
//netinet/in.h
/* Internet address structure */
struct in_addr {
    unsigned int s_addr; //network byte order (big-endian)
};
Because Internet hosts can have different host byte orders, TCP/IP defines a uniform network byte order (big-endian byte order) for any integer data item. Addresses in IP address structures are always stored in (big-endian) network byte order, even if the host byte order is little-endian. Unix provides the following functions for converting between network and host byte order:
#include <arpa/inet.h>
//convert a 32-bit (htonl) or 16-bit (htons) integer from host to network byte order
uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
//convert a 32-bit (ntohl) or 16-bit (ntohs) integer from network to host byte order
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);
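As a minimal sketch of what these conversions do, the helpers below look at the bytes that end up in memory. The names host_is_little_endian and network_bytes are hypothetical and exist only for this illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the least significant byte first. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1); /* look at the lowest-addressed byte */
    return first == 1;
}

/* Copies the in-memory bytes of htonl(value) into out[0..3].
   On any host, out[0] holds the most significant byte of value,
   because network byte order is big-endian. */
void network_bytes(uint32_t value, unsigned char out[4]) {
    uint32_t net = htonl(value);
    memcpy(out, &net, sizeof(net));
}
```

On a big-endian host htonl and ntohl leave the value unchanged; on a little-endian host they swap the bytes. Either way, converting to network order and back yields the original value.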
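IP addresses are typically presented to humans in dotted-decimal notation, where each byte is written as its decimal value and the bytes are separated by periods. For example, 128.2.194.242 is the dotted-decimal representation of the address 0x8002c2f2. Internet programs convert back and forth between IP addresses and dotted-decimal strings using the functions inet_aton and inet_ntoa: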
#include <arpa/inet.h>
//convert a dotted-decimal string to an IP address in network byte order
// returns 1 if OK, 0 on error
int inet_aton(const char *cp, struct in_addr *inp);
//convert an IP address in network byte order to its dotted-decimal string
// returns a pointer to a statically allocated dotted-decimal string
char *inet_ntoa(struct in_addr in);
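The following sketch wraps the two conversion functions so that callers work with host-order integers. The helper names dotted_to_host and host_to_dotted are hypothetical; the only library calls used are inet_aton, inet_ntoa, htonl and ntohl.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/* Parses a dotted-decimal string and stores the address in *out
   in host byte order. Returns 1 on success, 0 on a malformed string. */
int dotted_to_host(const char *s, uint32_t *out) {
    struct in_addr a;
    if (inet_aton(s, &a) == 0)
        return 0;
    *out = ntohl(a.s_addr); /* s_addr is in network byte order */
    return 1;
}

/* Formats a host-order address as a dotted-decimal string in buf.
   inet_ntoa returns a pointer to a static buffer, so the result is
   copied out before another call can overwrite it. */
void host_to_dotted(uint32_t addr, char *buf, size_t len) {
    struct in_addr a;
    a.s_addr = htonl(addr);
    snprintf(buf, len, "%s", inet_ntoa(a));
}
```

With these helpers, the string 128.2.194.242 converts to the integer 0x8002c2f2 and back, matching the example in the text.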
The Internet defines a mapping between the set of domain names and the set of IP addresses. The mapping is maintained in a distributed worldwide database known as DNS (Domain Name System). Conceptually, the DNS database consists of millions of host entry structures, each of which defines the mapping between a set of domain names (an official name and a list of aliases) and a set of IP addresses.
//netdb.h
/* DNS host entry structure */
struct hostent {
    char *h_name; //official domain name of host
    char **h_aliases; //NULL-terminated array of domain names
    int h_addrtype; //host address type (AF_INET)
    int h_length; //length of an address, in bytes
    char **h_addr_list; //NULL-terminated array of in_addr structs
};
Internet applications retrieve arbitrary host entries from the DNS database by calling the gethostbyname and gethostbyaddr functions:
#include <netdb.h>
//returns: non-NULL pointer if OK, NULL pointer on error with h_errno set
struct hostent *gethostbyname(const char *name);
//returns: non-NULL pointer if OK, NULL pointer on error with h_errno set
//type is the address family of addr (AF_INET for IPv4 addresses)
struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);
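A host entry returned by either function can be inspected as sketched below. The helpers first_address and count_entries are hypothetical names introduced for this illustration; they only read fields of the hostent structure shown earlier.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Writes the first IPv4 address of a host entry into buf as a
   dotted-decimal string. Returns 0 on success, -1 if the entry
   is missing, is not an AF_INET entry, or has no addresses. */
int first_address(const struct hostent *hp, char *buf, size_t len) {
    struct in_addr a;
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_length != (int)sizeof(a) || hp->h_addr_list[0] == NULL)
        return -1;
    memcpy(&a, hp->h_addr_list[0], sizeof(a)); /* already network order */
    snprintf(buf, len, "%s", inet_ntoa(a));
    return 0;
}

/* Counts the entries of a NULL-terminated pointer array such as
   h_aliases or h_addr_list. */
int count_entries(char **list) {
    int n = 0;
    while (list != NULL && list[n] != NULL)
        n++;
    return n;
}
```

A caller might pass the result of gethostbyname directly, for example first_address(gethostbyname("localhost"), buf, sizeof buf); the NULL check covers a failed lookup.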
Internet clients and servers communicate by sending and receiving streams of bytes over connections.
A socket is an end point of a connection. Each socket has a corresponding socket address that consists of an Internet address and a 16-bit integer port, and is denoted by address:port. The port in the client's socket address is assigned automatically by the kernel when the client makes a connection request, and is known as an ephemeral port. The port in the server's socket address, however, is typically a well-known port that is associated with the service. (See /etc/services for a comprehensive list of the services and ports provided on that machine.)
A connection is uniquely identified by the socket pair:
(cliaddr:cliport, servaddr:servport)
The sockets interface is a set of functions that are used in conjunction with the Unix I/O functions to build network applications.
From the perspective of the Unix kernel, a socket is an end point for communication. From the perspective of a Unix program, a socket is an open file with a corresponding descriptor.
Internet socket addresses are stored in 16-byte structures of type sockaddr_in. For Internet applications, the sin_family member is AF_INET, the sin_port member is a 16-bit port number, and the sin_addr member is a 32-bit IP address. The IP address and port number are always stored in network (big-endian) byte order.
//sockaddr: socketbits.h (included by socket.h)
//sockaddr_in: netinet/in.h
/* Generic socket address structure (for connect, bind and accept) */
struct sockaddr {
    unsigned short sa_family; //protocol family
    char sa_data[14]; //address data
};
/* Internet-style socket address structure */
struct sockaddr_in {
    unsigned short sin_family; //address family (always AF_INET)
    unsigned short sin_port; //port number in network byte order
    struct in_addr sin_addr; //IP address in network byte order
    unsigned char sin_zero[8]; //pad to sizeof(struct sockaddr)
};
The connect, bind and accept functions require a pointer to a protocol-specific socket address structure. The sockets interface had to accept any kind of socket address structure, but it was designed before C had the generic void * pointer. The solution was to define the sockets functions to expect a pointer to a generic sockaddr structure, and then require applications to cast pointers to protocol-specific structures to this generic structure. To simplify the casts in the code examples, we define the following type:
typedef struct sockaddr SA;
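The sketch below fills in an Internet socket address and then views it through the generic pointer type. The helper names make_sockaddr and family_of are hypothetical; the structure members and the SA typedef are the ones described above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

typedef struct sockaddr SA;

/* Fills *sa for the IPv4 address in `dotted` and the host-order `port`.
   Returns 0 on success, -1 if the address string is malformed. */
int make_sockaddr(struct sockaddr_in *sa, const char *dotted,
                  unsigned short port) {
    memset(sa, 0, sizeof(*sa));        /* also clears the sin_zero padding */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);        /* port stored in network byte order */
    if (inet_aton(dotted, &sa->sin_addr) == 0)
        return -1;
    return 0;
}

/* Reads the address family through the generic pointer type that
   connect, bind and accept expect. */
int family_of(const SA *generic) {
    return generic->sa_family;
}
```

A call such as family_of((SA *)&addr) shows the cast that every call to connect, bind and accept needs.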
socket function
Clients and servers use the socket function to create a socket descriptor.
#include <sys/types.h>
#include <sys/socket.h>
//returns: nonnegative descriptor if OK, -1 on error
int socket(int domain, int type, int protocol);
In our code, we will always call the socket function with the following arguments:
//AF_INET: indicates that we are using the Internet
//SOCK_STREAM: indicates that the socket will be an end point for an Internet connection
//clientfd: the returned descriptor is only partially opened and cannot yet be used for reading and writing
clientfd = socket(AF_INET, SOCK_STREAM, 0);
connect function
A client establishes a connection with a server by calling the connect function.
#include <sys/socket.h>
//returns: 0 if OK, -1 on error
int connect(int sockfd, struct sockaddr *serv_addr,
int addrlen);
The connect function attempts to establish an Internet connection with the server at socket address serv_addr, where addrlen is sizeof(struct sockaddr_in). The connect function blocks until either the connection is successfully established or an error occurs. If successful, the sockfd descriptor is ready for reading and writing, and the resulting connection is characterized by the socket pair
//x: client's IP address
//y: ephemeral port that uniquely identifies the client process on the client host
(x:y, serv_addr.sin_addr:serv_addr.sin_port)
open_clientfd function
open_clientfd is a helper function that wraps socket and connect. A client can call open_clientfd to establish a connection with a server.
The open_clientfd function establishes a connection with a server that runs on the host named by the hostname argument and listens for connection requests on the well-known port given by the port argument. It returns an open socket descriptor that is ready for input and output using the Unix I/O functions.
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int open_clientfd(char *hostname, int port) {
    int clientfd;
    struct hostent *hp;
    struct sockaddr_in serveraddr;

    /* Create the socket descriptor */
    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1; /* check errno for cause of error */

    /* Fill in the server's IP address and port */
    if ((hp = gethostbyname(hostname)) == NULL) {
        close(clientfd); /* do not leak the descriptor */
        return -2; /* check h_errno for cause of error */
    }
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    memcpy(&serveraddr.sin_addr.s_addr, hp->h_addr_list[0], hp->h_length);
    serveraddr.sin_port = htons((unsigned short)port);

    /* Establish a connection with the server */
    if (connect(clientfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0) {
        close(clientfd); /* do not leak the descriptor */
        return -1; /* check errno for cause of error */
    }
    return clientfd;
}
The remaining sockets functions (bind, listen and accept) are used by servers to establish connections with clients.
bind function
The bind function tells the kernel to associate the server's socket address in my_addr with the socket descriptor sockfd. The addrlen argument is sizeof(struct sockaddr_in).
#include <sys/socket.h>
//returns: 0 if OK, -1 on error
int bind(int sockfd, struct sockaddr *my_addr, int addrlen);
listen function
Clients are active entities that initiate connection requests. Servers are passive entities that wait for connection requests from clients. By default, the kernel assumes that a descriptor created by the socket function corresponds to an active socket that will live on the client end of a connection. A server calls the listen function to tell the kernel that the descriptor will be used by a server instead of a client.
#include <sys/socket.h>
//returns: 0 if OK, -1 on error
int listen(int sockfd, int backlog);
The listen function converts sockfd from an active socket to a listening socket that can accept connection requests from clients. The backlog argument is a hint about the number of outstanding connection requests that the kernel should queue up before it starts to refuse requests.
open_listenfd function
It is helpful to combine the socket, bind and listen functions into a helper function called open_listenfd that a server can use to create a listening descriptor.
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define LISTENQ 1024 /* backlog hint for listen; this value is an assumption */

int open_listenfd(int port) {
    int listenfd, optval = 1;
    struct sockaddr_in serveraddr;

    /* Create a socket descriptor */
    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    /* Eliminate "Address already in use" error from bind */
    if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   (const void *)&optval, sizeof(int)) < 0) {
        close(listenfd);
        return -1;
    }

    /* listenfd will be an end point for all requests to
       port on any IP address for this host */
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons((unsigned short)port);
    if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0) {
        close(listenfd);
        return -1;
    }

    /* Make it a listening socket ready to accept connection requests */
    if (listen(listenfd, LISTENQ) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
The open_listenfd function opens and returns a listening descriptor that is ready to receive connection requests on the well-known port given by the port argument.
accept function
Servers wait for connection requests from clients by calling the accept function:
#include <sys/socket.h>
//returns: nonnegative connected descriptor if OK, -1 on error
int accept(int listenfd, struct sockaddr *addr, int *addrlen);
The accept function waits for a connection request from a client to arrive on the listening descriptor listenfd, then fills in the client's socket address in addr, and returns a connected descriptor that can be used to communicate with the client using Unix I/O functions.
The listening descriptor serves as an end point for client connection requests. It is typically created once and exists for the lifetime of the server. The connected descriptor is the end point of the connection that is established between the client and the server. It is created each time the server accepts a connection request and exists only as long as it takes the server to service a client.
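To see the listening and connected descriptors side by side, the sketch below plays both roles inside one process over the loopback interface. The function name loopback_exchange is hypothetical, and running client and server in a single process is a simplification for illustration; it assumes the kernel queues the connection once listen has been called, so that connect can return before accept runs.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends msg from a client socket to a server socket in the same process
   and copies what the server side reads into reply (at most len-1 bytes).
   Returns 0 on success, -1 on error. */
int loopback_exchange(const char *msg, char *reply, size_t len) {
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int listenfd = -1, clientfd = -1, connfd = -1, rc = -1;
    size_t got = 0;
    ssize_t n = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* 0 asks the kernel to choose a free port */

    /* Server side: socket, bind, listen, then learn the chosen port. */
    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (listen(listenfd, 1) < 0) goto done;
    if (getsockname(listenfd, (struct sockaddr *)&addr, &alen) < 0) goto done;

    /* Client side: socket and connect; the kernel queues the connection. */
    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Server side: accept returns a new connected descriptor. */
    if ((connfd = accept(listenfd, NULL, NULL)) < 0) goto done;

    /* Client writes and signals end of data; server reads until EOF. */
    if (write(clientfd, msg, strlen(msg)) < 0) goto done;
    shutdown(clientfd, SHUT_WR);
    while (got < len - 1 &&
           (n = read(connfd, reply + got, len - 1 - got)) > 0)
        got += (size_t)n;
    if (n < 0) goto done;
    reply[got] = '\0';
    rc = 0;
done:
    if (connfd >= 0) close(connfd);
    if (clientfd >= 0) close(clientfd);
    if (listenfd >= 0) close(listenfd);
    return rc;
}
```

Note that listenfd stays open for the whole exchange while connfd exists only for the one client, which mirrors the lifetimes described above. A real server would normally run in a separate process from its clients.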
HTTP is a simple protocol. A Web client (known as a browser) opens an Internet connection to a server and requests some content. The server responds with the requested content and then closes the connection. The browser reads the content and displays it on the screen.
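As a rough sketch of the request step, the helper below formats a minimal HTTP/1.0 GET request that a client could write to a connected descriptor. The function name format_get_request is hypothetical, and the choice of HTTP/1.0 with a Host header is an assumption made for illustration; the text above does not specify a request format.

```c
#include <stdio.h>

/* Formats a minimal GET request for `path` on `host` into buf.
   Returns the request length in bytes, or -1 if buf is too small. */
int format_get_request(char *buf, size_t len,
                       const char *host, const char *path) {
    /* Request line, one header, then an empty line that ends the request. */
    int n = snprintf(buf, len, "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

A client would typically obtain a descriptor with open_clientfd, write the formatted request to it, and then read the response until the server closes the connection.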
Secure Shell (SSH) is an encrypted network protocol that allows remote login and other network services to operate securely over an unsecured network.
File Transfer Protocol (FTP) is a standard network protocol used to transfer computer files from one host to another over a TCP-based network. FTP is built on a client-server architecture and uses separate control and data connections between the client and the server.
Trivial File Transfer Protocol (TFTP) is a simple, lock-step file transfer protocol that allows a client to get a file from or put a file onto a remote host. TFTP lacks security and most of the advanced features offered by more robust file transfer protocols such as FTP.
Simple File Transfer Protocol is an unsecured file transfer protocol with a level of complexity intermediate between TFTP and FTP.
The Dynamic Host Configuration Protocol (DHCP) is a standardized network protocol used on IP networks for dynamically distributing network configuration parameters, such as IP addresses for interfaces and services. With DHCP, computers request IP addresses and networking parameters automatically from a DHCP server, reducing the need for a network administrator or a user to configure these settings manually.
The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry-standard application protocol for accessing and maintaining distributed directory information services over an IP network.
The Line Printer Daemon protocol (or Line Printer Remote protocol) is a network protocol for submitting print jobs to a remote printer.