10. NAT, IPv6, RIP, OSPF, BGP

NAT

Motivations:

ISP only needs to allocate IP per customer
Can change addresses of devices in the local network without the outside world needing to know
Can change ISP without changing local addresses
Devices in local network not explicitly addressable by outside world

Each device has an internal IP address, but packets exiting the network have their source address transformed into a single public IP address.

Question: how does the router know which host to forward returning traffic if all devices have the same public IP address?

A table mapping a (external IP, external port) tuple to an (internal IP, internal port) tuple. When the device sends a packet, the router will randomly choose an external port to use, and when the response packet arrives, it can simply map this to the device’s internal IP and port.

The port number is a 16 bit field, so it theoretically allows over 60,000 simultaneous connections.

Controversy:

Port numbers address processes, not hosts
Port numbers are layer 4; routers should only process up to layer 3
Violates end-to-end agreement; hosts no longer talk directly to each other
IPv6 is a better solution

IPv6

Motivations:

IPv4 address space getting full
Header format helps speed up processing/forwarding
- Facilitates QoS

Datagram Format

IPv6 is simpler; no fragmentation, no checksums, fixed length (40 bytes) header:

Version (4), value 6
Traffic class (8): facilitates QoS
Flow label (20): packet identifier
Payload length (16): length of payload in bytes
Next header (8): type of next (upper-layer) header (e.g. TCP, UDP)
Hop limit (8): TTL
Source address (128)
Destination address (128)
Payload

Transition

Old devices only support IPv4, and some routers cannot be upgraded. Hence, ‘flag days’, where the world switches protocol on some specified day, will not work.

With a dual-stack approach, all IPv6 nodes also have complete IPv4 implementations, allowing packets to be transformed in both directions as it is routed. This loses IPv6 or IPv4 specific fields.

Another approach is tunnelling: stuff the whole IPv6 datagram as the payload of a IPv4 datagram. If the packet is small enough and DF is set, a IPv6 router can then parse the payload and forward it as a IPv6 packet.

Due to DHCP, CIDRised addresses and NAT partially solving the IP address shortage problem (in the short term), IPv6 adoption has been slow.

Routing in the Internet

Flat networks don’t scale: if you have millions of hosts, storing routing information for each requires large amounts of memory. If link-state routing is used, broadcasting LS updates takes too much bandwidth, and if distance-vector routing is used, it will never converge.

Hierarchical Routing

Hierarchical routing aggregates routers into autonomous regions, where routers belong to the same organization and run the same intra-AS routing protocol.

Gateway routers have a direct link to a router in another AS.

The forwarding table is configured by both intra- and inter-AS routing algorithms:

The intra-AS algorithm sets entries for internal and external destinations
The inter-AS algorithm sets entries for only external destinations

Routing between ASes

Case 1: a single gateway router to only one other AS:

Just forward all external traffic to that gateway router

Case 2: two or more physical links to other ASes (typically a transit AS):

Learn from the inter-AS protocol that subnet X is reachable via multiple gateways
Use routing information from intra-AS protocol to determine costs of path to each of the gateways
Hot potato routing; choose the closest gateway
Determine interface I that connects to first router on the path to the nearest gateway; insert (X, I) into the forwarding table

RIP - Routing Information Protocol

Uses the distance vector algorithm.

Distance metric:

Number of hops, max 15
Number of subnets traversed along the shortest path from the source router to the destination subnet, including the destination subset (hence it is always at least 1)

To determine subnets, detach each interface from its host/router to create islands of isolated networks; the interfaces are the endpoints of the isolated networks, called subnets.

RIP advertisements:

Distance vectors exchanged among neighbors every 30 seconds via Response Message
- They are also called advertisements, and are sent in UDP packets (layer-4)
- An application-level process (daemon) called route-d manages the RIP routing tables - UDP is level 4 so it cannot be managed at the router level
Each advertisement contains a list of up to 25 destination nets within the AS

Example:

w-- A --x-- D --y-- B-- ... --`z`
    |
    C

The routing table in D would have:

Dest. net	Next router	Num. hops
`w`	`A`	2
`y`	`B`	2
`z`	`B`	7
`x`	NA	1

If A then advertises its own routing table where the cost to z (through C) is 4, D would use the Bellman-Ford equation and update its routing table to route requests to z through A with a cost of 5.

Link Failure and Recovery

After 180 seconds of no advertisements, the neighbor/link is declared dead; routes going through that neighbor are invalidated and new advertisements are sent to neighbors. Through this, the link failure information propagates through the entire network.

Differences between RIP and the DV Algorithm

Can advertise a maximum of 25 subsets
The link cost is always 1 in RIP
Nodes in the DV algorithm are subsets
Instead of distance vectors, it sends the routing table
Routers do not store their neighbors’ DVs
Routers exchange advertisements at fixed intervals, not when there are updates
Does not have the count-to-infinity problem - has the count-to-16 problem

OSPF - Open Shortest Path First

Uses the link state algorithm
Algorithms are disseminated to the entire AS via broadcast messages (flooding)
- OSPF messages sent directly over IP (layer 3), not TCP/UDP
Typically deployed in upper-tier ISPs

Advanced features not in RIP:

Security: all OSPF messages are authenticated
Multiple same-cost paths allowed
Link cost can be set by admins (traffic engineering)
Unicast and multicast support (one-to-one and one-to-group)
Two-level hierarchy

OSPF ASes are configured into areas. Within an area:

Each area runs its own OSPF algorithm
Link-state advertisements do not exit the area
Each node has detailed area topology
Area border routers send summarized distance information to nets in its own areas and advertise to other area border routers

There is one backbone area which links areas together. In the backbone:

Routers that run OSPF routing within the backbone area called backbone routers
Boundary routers connect to other AS’s, typically through BGP

BGP - Border Gateway Protocol

The de facto standard. Provides each AS with the ability to:

Obtain subnet reachability information
Propagate the reachability information
Determines ‘good’ routes to subsets

Pairs of routers - BGP peers, exchange routing information over TCP connections (called BGP sessions).

eBGP sessions allow routers in two different AS’s to exchange reachability information - when it advertises a prefix, it promises that it can forward any datagrams destined to that prefix. The receiver can then create an entry for that prefix in its forwarding table and re-advertise the same information to others via iBGP (between routers in the same AS) or eBGP sessions.

The advertisement is a combination of the prefix and attributes. Two important attributes are:

AS_PATH: not updated in iBGP connections. Contains a list of ASes through which the advertisement has passed - reject if already in the path
NEXT_HOP: not updated in iBGP connections. Advertisements flow from the destination outwards, so it contains the address of the boundary router for the next-hop AS
An import policy is used to accept or decline adverts

Route selection

Routers may learn about more than one route to a given prefix, so it uses elimination rules to pick one:

Local preference value attribute (policy decision)
Shortest AS-PATH (DV algorithm)
Closest NEXT-HOP router (hot potato routing)
Additional criteria

Routing policy can affect routing. For example, a keep silent policy could occur if X is a dual-homed (e.g. for redundancy) customer network connected to B and C provider networks; it will not want to route traffic from B to C and hence hence, it will not advertise to B a route to C, or vice versa.