06. IP and Related Protocols

The Internet

A packet-switched network of networks. The networks and links follow the end-to-end principle:

IPv4

Terminology:

Best-Effort, Datagram Delivery:

Packet Format

Big endian byte ordering.

Bytes Bits Name
0-3 4 Version
0-3 4 Header Length
0-3 6 TOS/DSCP
0-3 2 Unused
0-3 16 Total Length
4-7 16 Identification
4-7 3 Flags
4-7 13 Fragment Offset
8-11 8 Time-To-Live
8-11 8 Protocol Type
8-11 16 Header Checksum
12-15 32 Source Address
16-19 32 Destination Address
20+ 32n Options + Padding

The data (payload) follows the header.
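As a rough illustration of the layout above, the fixed 20-byte part of the header can be decoded with Python's `struct` module. This is a minimal sketch, not a full parser: the names `IPv4Header`, `parse_ipv4_header` and `SAMPLE` are made up for this example, options are ignored, and the sample bytes are a hand-built header rather than a captured packet.

```python
import struct
from dataclasses import dataclass


@dataclass
class IPv4Header:
    version: int
    header_len_bytes: int
    dscp: int
    total_length: int
    identification: int
    dont_fragment: bool
    more_fragments: bool
    fragment_offset_bytes: int
    ttl: int
    protocol: int
    checksum: int
    src: str
    dst: str


def parse_ipv4_header(raw: bytes) -> IPv4Header:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # "!" = network (big endian) order; B = 8 bits, H = 16 bits, 4s = 4 raw bytes.
    (ver_ihl, tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return IPv4Header(
        version=ver_ihl >> 4,                       # upper 4 bits of byte 0
        header_len_bytes=(ver_ihl & 0x0F) * 4,      # stored in multiples of 4 bytes
        dscp=tos >> 2,                              # upper 6 bits; lower 2 are unused
        total_length=total_length,
        identification=identification,
        dont_fragment=bool(flags_frag & 0x4000),    # DF flag
        more_fragments=bool(flags_frag & 0x2000),   # MF flag
        fragment_offset_bytes=(flags_frag & 0x1FFF) * 8,  # stored in multiples of 8 bytes
        ttl=ttl,
        protocol=protocol,
        checksum=checksum,
        src=".".join(str(b) for b in src),
        dst=".".join(str(b) for b in dst),
    )


# Hand-built sample: a UDP datagram header from 192.168.0.1 to 192.168.0.199.
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
header = parse_ipv4_header(SAMPLE)
print(header)
```

The `!` prefix in the format string is what enforces the big endian byte ordering mentioned above.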

Version (4)

Version of IP protocol. 4 for IPv4.

Header Length (4)

Length of header in multiples of 4 bytes. Without options, the header is 20 bytes long so the value would be 5.

TOS/DSCP (6)

Stands for Type of Service/DiffServ Code Point. Allows a priority to be defined for each packet (usually by the ISP, depending on how much you pay) - mostly ignored.

Fill with zero.

Unused (2)

Fill with zero.

Total Length (16)

Total length (in bytes) of the datagram - required since some MAC layers may add padding to meet minimum length requirements.

This may be modified during fragmentation and reassembly.

Identification (16)

Identifies each IP payload accepted from higher layers for a given interface - a sequence number.

Flags (3)

Contains two flags for fragmentation and reassembly (DF = don’t fragment; MF = more fragments).

Fragment Offset (13)

Gives the offset of the current fragment within the entire datagram, in multiples of 8 bytes.

Time-to-Live (8)

Last resort to deal with loops - IP does not specify routing protocols and so there is a possibility that loops can occur.

This value is the upper limit on the number of routers a packet can traverse, and is decremented by each router the packet passes through (hence routers must recalculate the checksum). Once the TTL value reaches 0, the packet is discarded and the sender is notified via an ICMP message.

TTL is typically 32 or 64.

Protocol Type (8)

Determines which higher-level protocol generated the payload - provides different SAPs to allow protocol multiplexing.

Value Protocol
0x01 ICMP
0x02 IGMP
0x04 IP-in-IP Encapsulation
0x06 TCP
0x11 UDP

Header Checksum (16)

Checksum for the IP header (NOT the data) - for some types of data, a few bad bits may not matter.
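The notes do not spell out how the checksum is computed. RFC 791 defines it as the 16-bit one's complement of the one's complement sum of all 16-bit words in the header, with the checksum field itself taken as zero. A minimal sketch under that definition follows; the function name and the sample header bytes are made up for this example.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """One's complement of the one's complement sum of the header's 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be a multiple of 2 bytes")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]   # big endian 16-bit word
    while total >> 16:                              # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hand-built 20-byte header; bytes 10-11 hold the checksum 0xb861.
full_header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
# Sender side: compute over the header with the checksum field set to zero.
zeroed = full_header[:10] + b"\x00\x00" + full_header[12:]
print(hex(ipv4_header_checksum(zeroed)))       # 0xb861
# Receiver side: summing the header including the checksum yields 0 if intact.
print(ipv4_header_checksum(full_header))       # 0
```

A router that decrements the TTL changes one header word, so it has to run this computation (or an incremental equivalent) again before forwarding.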

Source/Destination Address (32)

Indicates the initial sender and final receiver of the datagram. Internally structured into two sections: the network and host ID. The network ID refers to some local network, and the host ID must be unique within that network.

Routing and Forwarding

Routing/Forwarding Tables

IP routers have several network interfaces - these are called ports. When a router receives a packet on some input port, it looks at the DestinationAddress field to determine the output port using the forwarding table.

The forwarding table is a table of all networks the router knows about, identified by their network ID, and the output port to send the packet through to reach that network. A table lookup is used to then select the output port for every incoming packet.

Notes:

Important: a host address is tied to its location in the network; there is no mobility support - if a host switches to another network, it obtains another address, breaking ongoing TCP connections.

Classless Inter-Domain Routing

How many bits should be allocated to the network ID? Originally, classful addressing was used, allocating exactly 8, 16 or 24 bits to the network ID.

CIDR was introduced in 1993 and is now mandatory, and allows an arbitrary number of bits to be allocated to the network.

Netmask

The netmask specifies which bits belong to the network ID. It is a 32-bit value, with the leftmost k bits being 1s and the remaining bits being 0s. A /k netmask will have k bits allocated to the network ID. The network ID is calculated as the bitwise AND of the netmask and the IP address.
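The netmask arithmetic can be sketched with plain integer operations. The helper names below (`netmask_from_prefix`, `to_int`, `to_dotted`, `split_address`) and the example address are illustrative assumptions, not part of any standard API.

```python
def netmask_from_prefix(k: int) -> int:
    """Return the /k netmask as a 32-bit integer: k leading 1 bits, then 0s."""
    if not 0 <= k <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    return (0xFFFFFFFF << (32 - k)) & 0xFFFFFFFF


def to_int(dotted: str) -> int:
    """Convert dotted-decimal notation to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def split_address(address: str, k: int):
    """Return (network ID in dotted form, host ID as an integer)."""
    mask = netmask_from_prefix(k)
    ip = to_int(address)
    network_id = ip & mask                 # bitwise AND keeps the network bits
    host_id = ip & ~mask & 0xFFFFFFFF      # the remaining bits are the host ID
    return to_dotted(network_id), host_id


print(to_dotted(netmask_from_prefix(26)))   # 255.255.255.192
print(split_address("192.168.5.130", 26))   # ('192.168.5.128', 2)
```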

Private host addresses

There are two special host addresses:

There are also some reserved IP blocks:

Block Usage
10.0.0.0/8 Private-use IP networks
127.0.0.0/8 Host loopback network
169.254.0.0/16 Link-local for point-to-point links
172.16.0.0/12 Private-use IP networks
192.168.0.0/16 Private-use IP networks

The private-use addresses can be used within the provider network; packets with private addresses are not routed in the public Internet.

The loopback address is usually denoted as 127.0.0.1, but any valid address in 127.0.0.0/8 can be used.
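Checking whether an address falls into one of the reserved blocks listed above is a prefix-membership test. The following sketch uses Python's standard `ipaddress` module; the dictionary `RESERVED_BLOCKS` and the function `classify` are names chosen for this example, and the table only covers the blocks mentioned in these notes.

```python
import ipaddress
from typing import Optional

# Only the blocks listed in the table above; the real registry has more entries.
RESERVED_BLOCKS = {
    "10.0.0.0/8": "Private-use IP networks",
    "127.0.0.0/8": "Host loopback network",
    "169.254.0.0/16": "Link-local",
    "172.16.0.0/12": "Private-use IP networks",
    "192.168.0.0/16": "Private-use IP networks",
}


def classify(address: str) -> Optional[str]:
    """Return the usage of the reserved block containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for block, usage in RESERVED_BLOCKS.items():
        if ip in ipaddress.ip_network(block):
            return usage
    return None


for candidate in ("127.0.0.53", "172.31.255.1", "172.32.0.1", "8.8.8.8"):
    print(candidate, "->", classify(candidate))
```

Note that 172.16.0.0/12 ends at 172.31.255.255, so 172.32.0.1 is an ordinary public address.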

Simplified Packet Processing

IP input:

IP output:

Forwarding Table Contents and Lookup (rough approximation)

Each entry contains:

A router only knows the next hop, not the entire remaining route. The forwarding table lookup has three stages:

In End-Hosts

Forwarding tables in end hosts generally contain two or more entries:

In Routers

Most routers at the ‘border’ of the Internet (close to the customer) usually have a small forwarding table, often filled with other networks belonging to the same owner. For all other networks, they rely on the default router.
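A small forwarding table with a default route can be sketched as follows. This is an assumption-laden illustration: it uses longest-prefix matching over CIDR prefixes and models the default router as the prefix 0.0.0.0/0, which is a common way to implement the lookup but is not necessarily the exact staged procedure these notes have in mind. The class name, port names and addresses are all hypothetical.

```python
import ipaddress
from typing import List, Optional, Tuple


class ForwardingTable:
    """Toy forwarding table: each entry maps a prefix to (next hop, output port)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[ipaddress.IPv4Network, str, str]] = []

    def add(self, prefix: str, next_hop: str, port: str) -> None:
        self._entries.append((ipaddress.ip_network(prefix), next_hop, port))

    def lookup(self, destination: str) -> Optional[Tuple[str, str]]:
        """Return (next hop, port) of the most specific matching entry, if any."""
        address = ipaddress.ip_address(destination)
        matches = [entry for entry in self._entries if address in entry[0]]
        if not matches:
            return None  # no route: a real router would drop the packet
        network, next_hop, port = max(matches, key=lambda entry: entry[0].prefixlen)
        return next_hop, port


table = ForwardingTable()
table.add("10.1.0.0/16", "direct", "eth0")       # locally attached network
table.add("10.1.2.0/24", "10.1.0.2", "eth1")     # more specific route via a neighbour
table.add("0.0.0.0/0", "203.0.113.1", "eth2")    # default router

print(table.lookup("10.1.2.7"))   # ('10.1.0.2', 'eth1')
print(table.lookup("10.1.9.9"))   # ('direct', 'eth0')
print(table.lookup("8.8.8.8"))    # ('203.0.113.1', 'eth2')
```

The table only ever yields the next hop, never the remaining route, which matches the hop-by-hop forwarding model described above.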

Core routers:

Fragmentation and Reassembly

The link-layer technologies used by IP have different maximum packet sizes, called the maximum transmission unit (MTU). For several reasons, IP abstracts these concerns:

Hence, IP offers its own maximum message length of 65515 bytes: (2^{16} - 1) - 20, where 20 is the minimum IP header size.

The overall process is as follows:

All fragment datagrams belonging to the same message have:

The sender can set the DF (don’t fragment) bit in the IP header to forbid fragmentation by intermediate routers. If the outgoing link’s MTU is too small, the intermediate router drops the datagram and may return an ICMP message (type 3, code 4: fragmentation required but DF set).
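The arithmetic behind fragmentation can be sketched as below. This is a simplified model, assuming a 20-byte header without options on every fragment; the function name `fragment` and its return format are invented for this example. Since the Fragment Offset field counts in multiples of 8 bytes, every fragment except the last must carry a multiple of 8 payload bytes.

```python
from typing import List, Tuple


def fragment(payload_len: int, mtu: int, header_len: int = 20) -> List[Tuple[int, int, bool]]:
    """Split a payload into fragments that fit the MTU.

    Returns one (fragment_offset_field, data_bytes, more_fragments) tuple per
    fragment, where fragment_offset_field is in units of 8 bytes.
    """
    if payload_len <= 0:
        raise ValueError("payload_len must be positive")
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    position = 0
    while position < payload_len:
        size = min(max_data, payload_len - position)
        more_fragments = position + size < payload_len   # MF flag
        fragments.append((position // 8, size, more_fragments))
        position += size
    return fragments


# A 4000-byte payload sent over a link with an MTU of 1500 bytes.
for offset_field, size, mf in fragment(4000, 1500):
    print(f"offset field={offset_field:3d}  data bytes={size:4d}  MF={int(mf)}")
```

For the example call this yields three fragments with offset fields 0, 185 and 370, carrying 1480, 1480 and 1040 bytes; only the last one has MF cleared.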

Address Resolution Protocol

IP datagrams are encapsulated into Ethernet frames, and Ethernet stations pick up a packet only if the destination MAC address matches its own. Hence, stations must know which MAC address a given IP address refers to.

ARP allows a host to map a given IP address to a MAC address. ARP is dynamic:

This trades the downsides of static configuration for the complexity of running a separate protocol.

Basic Operation

ARP makes no retransmissions if a request is not answered.

If a sender wants to send an IP packet to a local destination, it first checks its ARP cache for a binding for that IP address. If there is none, it starts the address resolution procedure. Once this is finished, it encapsulates the IP packet in an Ethernet frame and directs it to the resolved MAC address.

Entries in the ARP cache are soft-state; each entry is discarded a fixed time (usually 20 minutes) after it was inserted into the cache. Some implementations restart the timer after using the entry.

This ensures that stale information gets flushed out (this is much more reliable than the device of interest notifying everyone that its binding has changed).
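The soft-state behaviour of the cache can be sketched as a dictionary with expiry times. This is a minimal model, not how any particular operating system implements it: the class `ArpCache`, its parameters and the injectable `clock` (used so the timeout can be exercised without waiting) are assumptions made for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpCache:
    """Toy ARP cache: IP-to-MAC bindings that expire after a fixed timeout."""

    def __init__(self, timeout_s: float = 20 * 60,
                 clock: Callable[[], float] = time.monotonic,
                 refresh_on_use: bool = False) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._refresh_on_use = refresh_on_use
        self._entries: Dict[str, Tuple[str, float]] = {}   # ip -> (mac, expiry time)

    def insert(self, ip: str, mac: str) -> None:
        self._entries[ip] = (mac, self._clock() + self._timeout_s)

    def lookup(self, ip: str) -> Optional[str]:
        """Return the cached MAC address, or None if absent or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]          # stale binding: flush it
            return None
        if self._refresh_on_use:           # some implementations restart the timer
            self._entries[ip] = (mac, self._clock() + self._timeout_s)
        return mac


# Drive the cache with a fake clock to show the expiry.
now = [0.0]
cache = ArpCache(timeout_s=1200, clock=lambda: now[0])
cache.insert("192.168.1.20", "02:00:00:00:00:02")
now[0] = 1199.0
print(cache.lookup("192.168.1.20"))   # 02:00:00:00:00:02
now[0] = 1200.0
print(cache.lookup("192.168.1.20"))   # None
```

A `None` result is the point at which a sender would start the address resolution procedure.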

Frame format

ARP is a protocol working at a higher level than Ethernet, so although the source and destination MAC addresses are already included in the Ethernet frame header, they need to be included again in the Ethernet payload.
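To make the duplication of addresses concrete, here is a sketch that builds an ARP request inside an Ethernet frame. The field layout is taken from RFC 826 for the Ethernet/IPv4 case rather than from these notes, and the function names and example addresses are hypothetical. The frame is only constructed, not sent.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame (without FCS) carrying an ARP request."""
    sender = mac_to_bytes(sender_mac)
    # Ethernet header: broadcast destination, our source MAC, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", b"\xff" * 6, sender, 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                              # hardware type: Ethernet
        0x0800,                         # protocol type: IPv4
        6,                              # hardware address length in bytes
        4,                              # protocol address length in bytes
        1,                              # operation: 1 = request, 2 = reply
        sender,                         # sender MAC, repeated from the Ethernet header
        socket.inet_aton(sender_ip),    # sender IP
        b"\x00" * 6,                    # target MAC: unknown, that is the question
        socket.inet_aton(target_ip),    # target IP
    )
    return ethernet_header + arp_payload


frame = build_arp_request("02:00:00:00:00:01", "192.168.1.10", "192.168.1.20")
print(len(frame))                       # 42: 14-byte Ethernet header + 28-byte ARP payload
print(frame[6:12] == frame[22:28])      # True: sender MAC appears in both places
```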

Internet Control Message Protocol

Allows routers or the destination host to inform the sender of error conditions such as when:

ICMP messages are encapsulated into regular IP datagrams and, like IP, ICMP has no error control mechanisms.

ICMP messages are not required, and are often filtered out by firewalls even if they are sent. Thus, the sending host must not rely on ICMP messages.

Message Format

Some Common Type/Code Combinations

Type Code Meaning
0 0 Echo reply (ping)
3 0 Destination network unreachable
3 1 Destination host unreachable
3 2 Destination protocol unreachable
3 3 Destination port unreachable
3 4 Fragmentation required but DF set
3 6 Destination network unknown
3 7 Destination host unknown
4 0 Source-quench
8 0 Echo request (ping)
11 0 TTL expired in transit
11 1 Fragment reassembly time exceeded
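The table above can be turned into a small decoder. The sketch assumes the ICMP header starts with an 8-bit type, an 8-bit code and a 16-bit checksum, as defined in RFC 792 (the message format is not reproduced in these notes); `ICMP_MEANINGS` and `describe_icmp` are names invented for this example, and the checksum is read but not verified.

```python
import struct

# (type, code) -> meaning, copied from the table above.
ICMP_MEANINGS = {
    (0, 0): "Echo reply (ping)",
    (3, 0): "Destination network unreachable",
    (3, 1): "Destination host unreachable",
    (3, 2): "Destination protocol unreachable",
    (3, 3): "Destination port unreachable",
    (3, 4): "Fragmentation required but DF set",
    (3, 6): "Destination network unknown",
    (3, 7): "Destination host unknown",
    (4, 0): "Source-quench",
    (8, 0): "Echo request (ping)",
    (11, 0): "TTL expired in transit",
    (11, 1): "Fragment reassembly time exceeded",
}


def describe_icmp(message: bytes) -> str:
    """Describe an ICMP message from its first four bytes."""
    if len(message) < 4:
        raise ValueError("an ICMP message has at least a 4-byte header")
    icmp_type, code, _checksum = struct.unpack("!BBH", message[:4])
    return ICMP_MEANINGS.get((icmp_type, code),
                             f"Unknown (type={icmp_type}, code={code})")


print(describe_icmp(bytes([3, 4, 0, 0])))    # Fragmentation required but DF set
print(describe_icmp(bytes([11, 0, 0, 0])))   # TTL expired in transit
```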

Destination Unreachable (type=3):

Source-quench (type=4, code=0): sent if an IP router drops a packet due to congestion. The intention is to let the source host throttle itself, but the act of sending the message adds more work to the already congested router.

TTL expiration (type=11, code=0): sent if an IP router drops a packet because its TTL reached zero. tracert (traceroute) uses this, incrementing the TTL each time it sends a packet.
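The idea behind tracert can be shown with a simulation that needs no network access. Everything here is a toy model: the path is a hard-coded list of hypothetical router names, and `simulated_probe` stands in for sending a real packet and waiting for the ICMP reply.

```python
from typing import List, Tuple


def simulated_probe(path: List[str], ttl: int) -> Tuple[str, bool]:
    """Send one probe with the given TTL along `path` (routers, then destination).

    Returns (responder, reached_destination).
    """
    for router in path[:-1]:
        ttl -= 1                   # each router decrements the TTL
        if ttl == 0:
            return router, False   # router drops the packet: ICMP type 11, code 0
    return path[-1], True          # TTL was large enough to reach the destination


def trace_route(path: List[str], max_ttl: int = 30) -> List[str]:
    """Discover the hops one by one by sending probes with TTL = 1, 2, 3, ..."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, reached = simulated_probe(path, ttl)
        hops.append(responder)
        if reached:
            break
    return hops


print(trace_route(["router-a", "router-b", "router-c", "destination"]))
# ['router-a', 'router-b', 'router-c', 'destination']
```

Each probe reveals exactly one more hop, because the router at which the TTL runs out identifies itself as the source of the ICMP message.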

Fragment reassembly timeout (type=11, code=1): sent if not all fragments of a message are received within a timeout. This invites the higher-level protocol to re-transmit the message.