16. Introduction to the Web and HTTP

Network Applications

Network applications are programs that run on different end-systems and communicate with each other over the network (e.g. web browser and server).

Application protocols are a small part of a network application, and different network applications may share the same application protocol.

Services applications need:

TCP provides a reliable transport service, flow control and congestion control, but no guarantees for latency or transmission rate.

UDP provides none of these guarantees.
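As a minimal sketch of how an application picks one of these two transport services, the Python standard library exposes them as socket types. The helper name open_transport_socket is hypothetical and exists only for illustration; no connection is made.

```python
import socket


def open_transport_socket(reliable: bool) -> socket.socket:
    """Return an unconnected IPv4 socket for the requested transport service."""
    # SOCK_STREAM selects TCP: reliable delivery, flow control, congestion control.
    # SOCK_DGRAM selects UDP: none of those guarantees.
    kind = socket.SOCK_STREAM if reliable else socket.SOCK_DGRAM
    return socket.socket(socket.AF_INET, kind)


tcp_sock = open_transport_socket(reliable=True)
udp_sock = open_transport_socket(reliable=False)
print(tcp_sock.type, udp_sock.type)
tcp_sock.close()
udp_sock.close()
```

Neither socket type offers a latency or transmission-rate guarantee, so an application that needs one has to cope with that itself.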

Application Structure

Client-server:

Peer-to-peer:

Hybrid. Napster is an example:

The Web

Hypertext Transfer Protocol (HTTP)

A web page consists of a base HTML-file and several referenced objects:

HTTP uses the client-server model:

Uses TCP:

HTTP is stateless: the server can work without maintaining any information about past client requests.

HTTP Connections

Non-persistent HTTP: at most one object is sent over a TCP connection; used by HTTP/1.0.

Persistent HTTP: multiple objects can be sent over a single TCP connection; used by HTTP/1.1 by default.

The client initiates the TCP connection and sends an HTTP request message (containing a URL); the server receives this request and forms a response message containing the requested object.

HTTP/1.0 will close the TCP connection after the object is delivered, while persistent HTTP will leave the connection open and use the same connection for subsequent HTTP requests.

Response Time Modelling: the round-trip time (RTT) is the time for a small packet to travel from the client to the server and back.

For HTTP/1.0:

Hence, the total time per object is 2 RTTs plus the transmission time for the file. After the file is received, the TCP connection must also be closed.
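The 2 RTT figure can be sketched as a small calculation. This assumes the usual breakdown of one RTT to set up the TCP connection and one RTT for the request and the first bytes of the response, and it assumes objects are fetched one after another rather than in parallel. The function names and the millisecond values are illustrative only.

```python
def non_persistent_fetch_time(rtt_ms, transmit_ms):
    """Time to fetch one object over its own TCP connection (HTTP/1.0 model)."""
    tcp_setup = rtt_ms          # one RTT to establish the TCP connection
    request_response = rtt_ms   # one RTT for the request and start of the response
    return tcp_setup + request_response + transmit_ms


def non_persistent_page_time(rtt_ms, transmit_times_ms):
    """Serial fetch of a base file and its referenced objects, one connection each."""
    return sum(non_persistent_fetch_time(rtt_ms, t) for t in transmit_times_ms)


# Base HTML file (20 ms to transmit) plus three referenced objects (10 ms each).
print(non_persistent_fetch_time(100, 20))               # 220
print(non_persistent_page_time(100, [20, 10, 10, 10]))  # 850
```

The page total grows by a full 2 RTTs for every referenced object, which is the cost that parallel connections and persistent HTTP try to reduce.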

In addition to this, the OS must allocate resources for each TCP connection; browsers will often use parallel TCP connections to fetch referenced objects.

HTTP/1.1 uses persistent connections by default and supports pipelining: the client sends a request as soon as it encounters a referenced object, without waiting for earlier responses.
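A rough sketch of what persistence and pipelining save, under simplifying assumptions that are not from these notes: the base file costs 2 RTTs (connection setup plus request), each later request on the open connection costs one RTT without pipelining, and all pipelined requests together cost about one RTT. The function name and values are illustrative.

```python
def persistent_page_time(rtt_ms, base_transmit_ms, object_transmits_ms, pipelined):
    """Estimated time to load a page over a single persistent TCP connection."""
    # Connection setup (1 RTT) plus request/response for the base file (1 RTT).
    total = 2 * rtt_ms + base_transmit_ms
    if not object_transmits_ms:
        return total
    # Without pipelining each referenced object waits one RTT for its own request;
    # with pipelining the requests go out back to back, costing roughly one RTT.
    request_rtts = 1 if pipelined else len(object_transmits_ms)
    return total + request_rtts * rtt_ms + sum(object_transmits_ms)


objects = [10, 10, 10]
print(persistent_page_time(100, 20, objects, pipelined=False))  # 550
print(persistent_page_time(100, 20, objects, pipelined=True))   # 350
```

With the same inputs, fetching each of the four files over its own connection at 2 RTTs apiece would take 850 ms, so the saving comes entirely from avoided round trips.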

HTTP Messages

The HTTP request consists of the following ASCII-encoded text:

Some methods:

Some headers:
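To make the message layout concrete, the sketch below assembles a request as ASCII text: a request line, one line per header, a blank line, then an optional body. The function name build_request, the path, and the host www.example.com are placeholders chosen for illustration.

```python
def build_request(method, path, headers, body=b""):
    """Assemble an HTTP/1.1 request message as bytes."""
    request_line = f"{method} {path} HTTP/1.1\r\n"
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    # An empty line (CRLF) separates the header section from the body.
    return (request_line + header_lines + "\r\n").encode("ascii") + body


message = build_request(
    "GET",
    "/index.html",
    {"Host": "www.example.com", "Connection": "close"},
)
print(message.decode("ascii"))
```

Every line ends with a carriage return and line feed, and the blank line is required even when there is no body.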

The response message follows this format:

Some HTTP status codes:
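A response can be taken apart the same way it is laid out: a status line, header lines, a blank line, then the body. The following is a minimal sketch, assuming a well-formed response whose status line includes a reason phrase; parse_response is a hypothetical helper, not a library function.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # Status line, e.g. "HTTP/1.1 404 Not Found": split at most twice so the
    # reason phrase may contain spaces.
    version, status_code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(status_code), reason, headers, body


raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
version, status, reason, headers, body = parse_response(raw)
print(status, reason, headers["content-length"], body)
```

A real client would also use Content-Length (or chunked encoding) to know where the body ends; this sketch simply treats everything after the blank line as the body.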

Cookies: User-Server State

HTTP is stateless, so cookies are used to identify users. There are four components:
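A minimal sketch of the cookie exchange from the server's side, assuming the usual arrangement: a Set-Cookie header in the response, a Cookie header in later requests, a cookie store kept by the browser, and a back-end database at the site. The dictionary user_db, the cookie name id, and the function handle_request are all illustrative stand-ins.

```python
import uuid
from http.cookies import SimpleCookie

user_db = {}  # stands in for the site's back-end database: user id -> visit count


def handle_request(cookie_header):
    """Return (user_id, set_cookie_value or None) for one incoming request."""
    jar = SimpleCookie(cookie_header or "")
    if "id" in jar and jar["id"].value in user_db:
        # Returning user: the Cookie request header carried a known id.
        user_id = jar["id"].value
        user_db[user_id] += 1
        return user_id, None
    # New user: create an id and ask the browser to store it via Set-Cookie.
    user_id = uuid.uuid4().hex
    user_db[user_id] = 1
    return user_id, f"id={user_id}"


first_id, set_cookie = handle_request(None)         # first visit, no Cookie header
second_id, nothing = handle_request(set_cookie)     # browser echoes the cookie back
print(first_id == second_id, user_db[first_id])
```

The protocol itself stays stateless; the state lives in the browser's cookie store and the server's database, linked by the id.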

Web Caches (Proxy Servers)

Proxy servers sit between the origin server and client. There can be multiple proxy servers, one of which will hopefully be closer to the client than the origin server.

The proxy server/cache acts as both a client and server. It helps to:

If the requested object is in the browser's own cache, the browser will use it. Otherwise, it will make a request to the proxy server (and if the proxy does not have the resource, it will in turn make a request to the origin server).
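The lookup order can be sketched with plain dictionaries standing in for the browser cache, the proxy cache, and the origin server. This assumes that the proxy and the browser each keep a copy of whatever they fetch, and it ignores expiry; the function name fetch and the example URL path are illustrative.

```python
def fetch(url, browser_cache, proxy_cache, origin):
    """Return (object, source), checking browser cache, then proxy, then origin."""
    if url in browser_cache:
        return browser_cache[url], "browser cache"
    if url in proxy_cache:
        obj, source = proxy_cache[url], "proxy cache"
    else:
        # Proxy miss: the proxy acts as a client towards the origin server.
        obj, source = origin[url], "origin server"
        proxy_cache[url] = obj  # keep a copy for later clients
    browser_cache[url] = obj
    return obj, source


origin = {"/logo.png": b"image-bytes"}
proxy = {}
alice, bob = {}, {}
print(fetch("/logo.png", alice, proxy, origin)[1])  # origin server
print(fetch("/logo.png", bob, proxy, origin)[1])    # proxy cache
print(fetch("/logo.png", alice, proxy, origin)[1])  # browser cache
```

The second client never reaches the origin server, which is where the reduction in response time and upstream traffic comes from.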

Conditional GET

The server will not send the object if the cache has an up-to-date copy of the object.

Request header: If-Modified-Since: some_date.

If the object has not been modified since the given date, a 304 Not Modified response will be sent and there will be no body. Otherwise, it will generate a normal 200 OK response.
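The server-side decision can be sketched as follows. The function conditional_get is a hypothetical helper, the dates are made up, and the sketch assumes the header value is a well-formed HTTP date in GMT and that last_modified is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get(last_modified, if_modified_since, body):
    """Return (status_code, body) for a GET that may carry If-Modified-Since."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds first.
        if last_modified.replace(microsecond=0) <= since:
            return 304, b""  # Not Modified: no body is sent
    return 200, body


last_modified = datetime(2015, 10, 20, 12, 0, tzinfo=timezone.utc)
print(conditional_get(last_modified, "Wed, 21 Oct 2015 07:28:00 GMT", b"data"))  # (304, b'')
print(conditional_get(last_modified, "Mon, 19 Oct 2015 07:28:00 GMT", b"data"))  # (200, b'data')
```

A 304 response still costs a round trip, but it avoids retransmitting an object the cache already holds.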