NAT
Motivations:
- ISP only needs to allocate one IP address per customer
- Can change addresses of devices in the local network without the outside world needing to know
- Can change ISP without changing local addresses
- Devices in local network not explicitly addressable by outside world
Each device has an internal IP address, but packets exiting the network have their source address transformed into a single public IP address.
Question: how does the router know which host to forward returning traffic to if all devices share the same public IP address?
The router keeps a NAT translation table mapping an (external IP, external port) tuple to an (internal IP, internal port) tuple. When a device sends a packet, the router chooses an unused external port (e.g. at random) and records the mapping; when the response packet arrives, it looks up the destination port and rewrites the destination to the device's internal IP and port.
The port number is a 16-bit field (2^16 = 65,536 values), so a single public address can theoretically support over 60,000 simultaneous connections.
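The translation table described above can be sketched as a pair of dictionaries. This is a minimal illustration, not how any particular router implements NAT; the class name `NatTable`, its method names, the port range, and the example addresses are all assumptions made for the sketch.

```python
import random


class NatTable:
    """Toy NAT translation table (hypothetical names, not a real router API)."""

    def __init__(self, public_ip, seed=None):
        self.public_ip = public_ip
        self.out = {}   # (internal_ip, internal_port) -> external_port
        self.back = {}  # external_port -> (internal_ip, internal_port)
        self.rng = random.Random(seed)

    def translate_outgoing(self, internal_ip, internal_port):
        """Rewrite the source of an outgoing packet, reusing any existing mapping."""
        key = (internal_ip, internal_port)
        if key not in self.out:
            # Pick a random unused external port (range 1024-65535 is an assumption).
            port = self.rng.randint(1024, 65535)
            while port in self.back:
                port = self.rng.randint(1024, 65535)
            self.out[key] = port
            self.back[port] = key
        return (self.public_ip, self.out[key])

    def translate_incoming(self, external_port):
        """Map a reply's destination port back to the internal host, or None."""
        return self.back.get(external_port)


nat = NatTable("203.0.113.7", seed=1)
src = nat.translate_outgoing("10.0.0.5", 3345)
print(src)                              # public IP plus the chosen external port
print(nat.translate_incoming(src[1]))   # back to the internal (IP, port)
```

A reply addressed to a port with no table entry returns `None`, which is why devices behind the NAT are not directly addressable from outside.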
Controversy:
- Port numbers address processes, not hosts
- Port numbers are layer 4; routers should only process up to layer 3
- Violates the end-to-end argument; hosts no longer talk directly to each other
- IPv6 is a better solution
IPv6
Motivations:
- IPv4 address space getting full
- Header format helps speed up processing/forwarding
- Facilitates QoS
Datagram Format
The IPv6 header is simpler than IPv4's: no fragmentation fields, no checksum, and a fixed length of 40 bytes. Fields (widths in bits):
- Version (4), value 6
- Traffic class (8): facilitates QoS
- Flow label (20): identifies packets belonging to the same flow
- Payload length (16): length of payload in bytes
- Next header (8): type of next (upper-layer) header (e.g. TCP, UDP)
- Hop limit (8): TTL
- Source address (128)
- Destination address (128)
- Payload
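The field widths listed above add up to exactly 40 bytes, which can be checked by packing a header by hand. The sketch below uses only the standard `struct` and `ipaddress` modules; the function names and the example addresses (from the documentation prefix `2001:db8::/32`) are illustrative assumptions.

```python
import ipaddress
import struct


def pack_ipv6_header(traffic_class, flow_label, payload_len,
                     next_header, hop_limit, src, dst):
    """Build the fixed 40-byte IPv6 header from its fields."""
    # First 32 bits: version (4) | traffic class (8) | flow label (20)
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return (struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def unpack_ipv6_header(data):
    """Parse the first 40 bytes of an IPv6 datagram into a dict of fields."""
    first_word, payload_len, next_header, hop_limit = struct.unpack("!IHBB", data[:8])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_len": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(data[8:24])),
        "dst": str(ipaddress.IPv6Address(data[24:40])),
    }


# next_header=6 means the payload is TCP
header = pack_ipv6_header(0, 0x12345, 20, 6, 64, "2001:db8::1", "2001:db8::2")
fields = unpack_ipv6_header(header)
print(len(header), fields)
```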
Transition
Old devices only support IPv4, and some routers cannot be upgraded. Hence, ‘flag days’, where the world switches protocol on some specified day, will not work.
With a dual-stack approach, all IPv6 nodes also have complete IPv4 implementations, so a datagram can be translated between the two formats, in either direction, as it is routed. Translation loses any fields that are specific to IPv6 or IPv4.
Another approach is tunnelling: carry the whole IPv6 datagram as the payload of an IPv4 datagram. If the packet is small enough and DF is set, an IPv6 router can then parse the payload and forward it as an IPv6 packet.
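As a rough sketch of tunnelling, the code below wraps an IPv6 datagram in a minimal 20-byte IPv4 header and unwraps it again. Protocol number 41 is the IANA value for IPv6 encapsulated in IPv4; the function names, the TTL default, the dummy inner datagram, and the example addresses are assumptions for illustration, and the sketch ignores fragmentation and IPv4 options on the sending side.

```python
import socket
import struct

IPPROTO_IPV6_IN_IPV4 = 41  # IANA protocol number for encapsulated IPv6


def ipv4_checksum(header):
    """One's-complement checksum over a 20-byte IPv4 header."""
    total = sum(struct.unpack("!10H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(ipv6_datagram, tunnel_src, tunnel_dst, ttl=64):
    """Prefix the IPv6 datagram with an IPv4 header addressed to the tunnel end."""
    total_len = 20 + len(ipv6_datagram)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len,        # version 4, header length 5 words, no TOS
        0, 0,                      # identification, flags/fragment offset
        ttl, IPPROTO_IPV6_IN_IPV4,
        0,                         # checksum placeholder
        socket.inet_aton(tunnel_src), socket.inet_aton(tunnel_dst),
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:] + ipv6_datagram


def decapsulate(ipv4_datagram):
    """Strip the IPv4 header and return the inner IPv6 datagram."""
    if ipv4_datagram[9] != IPPROTO_IPV6_IN_IPV4:
        raise ValueError("payload is not an encapsulated IPv6 datagram")
    header_len = (ipv4_datagram[0] & 0x0F) * 4
    return ipv4_datagram[header_len:]


inner = b"\x60" + bytes(39)  # dummy 40-byte IPv6 header (version nibble = 6)
packet = encapsulate(inner, "192.0.2.1", "198.51.100.1")
print(len(packet), decapsulate(packet) == inner)
```

The IPv4 routers along the tunnel only ever look at the outer header, so they need no IPv6 support.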
Due to DHCP, CIDRised addresses and NAT partially solving the IP address shortage problem (in the short term), IPv6 adoption has been slow.
Routing in the Internet
Flat networks don’t scale: if you have millions of hosts, storing routing information for each requires large amounts of memory. If link-state routing is used, broadcasting LS updates takes too much bandwidth, and if distance-vector routing is used, it will never converge.
Hierarchical Routing
Hierarchical routing aggregates routers into autonomous systems (ASes), where the routers in an AS belong to the same organization and run the same intra-AS routing protocol.
Gateway routers have a direct link to a router in another AS.
The forwarding table is configured by both intra- and inter-AS routing algorithms:
- The intra-AS algorithm sets entries for internal and external destinations
- The inter-AS algorithm sets entries for only external destinations
Routing between ASes
Case 1: a single gateway router to only one other AS:
- Just forward all external traffic to that gateway router
Case 2: two or more physical links to other ASes (typically a transit AS):
- Learn from the inter-AS protocol that subnet X is reachable via multiple gateways
- Use routing information from intra-AS protocol to determine costs of path to each of the gateways
- Hot potato routing; choose the closest gateway
- Determine interface I that connects to the first router on the path to the nearest gateway; insert `(X, I)` into the forwarding table
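The Case 2 steps above can be sketched in a few lines. The inputs stand in for what the inter-AS protocol (which gateways reach X) and the intra-AS protocol (cost and first-hop interface per gateway) would provide; all names and values here are hypothetical.

```python
def hot_potato_entry(subnet, gateways, intra_as_cost, first_hop_interface):
    """Return the (subnet, interface) forwarding entry for the closest gateway."""
    closest = min(gateways, key=lambda g: intra_as_cost[g])
    return (subnet, first_hop_interface[closest])


# Learned from the inter-AS protocol: X is reachable via gateways G1 and G2.
gateways_for_x = ["G1", "G2"]
# Learned from the intra-AS protocol: least cost and outgoing interface per gateway.
intra_as_cost = {"G1": 4, "G2": 2}
first_hop_interface = {"G1": "eth0", "G2": "eth1"}

entry = hot_potato_entry("X", gateways_for_x, intra_as_cost, first_hop_interface)
print(entry)
```

Note that the choice ignores how far X is beyond each gateway; the AS simply hands the packet off as cheaply as it can.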
RIP - Routing Information Protocol
Uses the distance vector algorithm.
Distance metric:
- Number of hops, max 15
- Number of subnets traversed along the shortest path from the source router to the destination subnet, including the destination subnet (hence it is always at least 1)
To determine subnets, detach each interface from its host/router to create islands of isolated networks; the interfaces are the endpoints of the isolated networks, called subnets.
RIP advertisements:
- Distance vectors are exchanged among neighbors every 30 seconds via RIP Response Messages
- These are also called advertisements, and are sent in UDP packets (layer 4)
- An application-level process (daemon) called `route-d` manages the RIP routing tables; because UDP is a layer-4 protocol, advertisements are handled by this process rather than at the network layer of the router
- Each advertisement contains a list of up to 25 destination nets within the AS
Example:
    w -- A -- x -- D -- y -- B -- ... -- z
         |
         C
The routing table in D would have:
| Dest. net | Next router | Num. hops |
|---|---|---|
| w | A | 2 |
| y | B | 2 |
| z | B | 7 |
| x | NA | 1 |
If A then advertises its own routing table where the cost to z (through C) is 4, D would use the Bellman-Ford equation and update its routing table to route requests to z through A with a cost of 5.
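The update in this example can be reproduced with a small sketch of the Bellman-Ford step. The table layout (destination mapped to a `(next_router, hops)` pair) and the function name `rip_update` are assumptions for illustration; the costs A advertises for `w` and `x` are also assumed, since the example only gives A's cost to `z`.

```python
INFINITY = 16  # in RIP a hop count of 16 means "unreachable"


def rip_update(table, neighbor, advertised):
    """Apply one advertisement: adopt a route only if going via `neighbor` is cheaper.

    table:      dest -> (next_router, hops)
    advertised: dest -> hops as seen from the neighbor
    """
    for dest, hops in advertised.items():
        candidate = min(hops + 1, INFINITY)  # one extra hop to reach the neighbor
        current = table.get(dest)
        if current is None or candidate < current[1]:
            table[dest] = (neighbor, candidate)
    return table


# D's routing table from the example (None = directly attached subnet).
d_table = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}

# A's advertisement: z costs 4 (through C); w and x are assumed to cost 1.
rip_update(d_table, "A", {"z": 4, "w": 1, "x": 1})
print(d_table["z"])  # z is now routed through A
```

Only the entry for `z` changes, because 4 + 1 = 5 is less than the existing cost of 7; the other advertised routes are no better than what D already has.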
Link Failure and Recovery
After 180 seconds of no advertisements, the neighbor/link is declared dead; routes going through that neighbor are invalidated and new advertisements are sent to neighbors. Through this, the link failure information propagates through the entire network.
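A minimal sketch of the timeout check, assuming the router records when it last heard from each neighbor. Representing an invalidated route by setting its hop count to 16 is an assumption of this sketch, as are the function and variable names.

```python
TIMEOUT = 180   # seconds without an advertisement before a neighbor is declared dead
INFINITY = 16   # hop count used here to mark a route as unreachable


def expire_routes(table, last_heard, now):
    """Invalidate routes through silent neighbors and return the dead neighbors."""
    dead = {n for n, t in last_heard.items() if now - t > TIMEOUT}
    for dest, (next_router, hops) in list(table.items()):
        if next_router in dead:
            table[dest] = (next_router, INFINITY)
    return dead


table = {"z": ("A", 5), "y": ("B", 2)}
last_heard = {"A": 0, "B": 100}   # timestamps in seconds
dead = expire_routes(table, last_heard, now=200)
print(dead, table)
```

After this step the router would send fresh advertisements containing the invalidated routes, which is how news of the failure spreads.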
Differences between RIP and the DV Algorithm
- Can advertise a maximum of 25 subnets
- The link cost is always 1 in RIP
- Destinations (nodes) in the DV computation are subnets
- Instead of distance vectors, it sends the routing table
- Routers do not store their neighbors’ DVs
- Routers exchange advertisements at fixed intervals, not when there are updates
- The count-to-infinity problem is bounded: since a cost of 16 means unreachable, it becomes a count-to-16 problem
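To illustrate the last point, here is a toy simulation of two routers A and B that each believe the other has a route to a lost destination. It assumes both routers exchange advertisements simultaneously each round and ignores techniques such as poisoned reverse; the function name and starting costs are made up for the sketch.

```python
INFINITY = 16  # RIP treats 16 hops as unreachable


def count_to_16(cost_a, cost_b):
    """Count exchange rounds until both routers give up on the destination."""
    rounds = 0
    while cost_a < INFINITY or cost_b < INFINITY:
        # Each router adopts its neighbor's advertised cost plus one hop.
        cost_a, cost_b = min(cost_b + 1, INFINITY), min(cost_a + 1, INFINITY)
        rounds += 1
    return rounds


# A lost its direct link and fell back to B's stale route (cost 2), giving 3.
rounds = count_to_16(cost_a=3, cost_b=2)
print(rounds, "rounds, about", rounds * 30, "seconds at one advertisement per 30 s")
```

The loop always terminates because the costs are capped at 16, but with 30-second advertisement intervals convergence can still take several minutes.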
OSPF - Open Shortest Path First
- Uses the link state algorithm
- Link-state advertisements are disseminated to the entire AS via broadcast messages (flooding)
- OSPF messages sent directly over IP (layer 3), not TCP/UDP
- Typically deployed in upper-tier ISPs
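Once flooding has given every router the same view of the topology, each router can compute least-cost paths locally. The sketch below uses Dijkstra's algorithm over a small, made-up link-state database; the graph, the router names, and the function name are assumptions, and real OSPF also builds next-hop entries rather than just distances.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm: least cost from `source` to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical link-state database: router -> {neighbor: link cost}
link_state_db = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2},
}

dist = shortest_paths(link_state_db, "A")
print(dist)
```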
Advanced features not in RIP:
- Security: all OSPF messages are authenticated
- Multiple same-cost paths allowed
- Link cost can be set by admins (traffic engineering)
- Unicast and multicast support (one-to-one and one-to-group)
- Two-level hierarchy
OSPF ASes are configured into areas. Within an area:
- Each area runs its own OSPF algorithm
- Link-state advertisements do not exit the area
- Each node has detailed area topology
- Area border routers summarize the distances to nets in their own area and advertise the summaries to other area border routers
There is one backbone area which links areas together. In the backbone:
- Routers that run OSPF routing within the backbone area are called backbone routers
- Boundary routers connect to other ASes, typically through BGP
BGP - Border Gateway Protocol
The de facto standard. Provides each AS with the ability to:
- Obtain subnet reachability information
- Propagate the reachability information
- Determine 'good' routes to subnets
Pairs of routers (BGP peers) exchange routing information over TCP connections, called BGP sessions.
eBGP sessions allow routers in two different ASes to exchange reachability information. When an AS advertises a prefix, it promises that it can forward any datagrams destined to that prefix. The receiver can then create an entry for that prefix in its forwarding table and re-advertise the same information to others via iBGP (between routers in the same AS) or eBGP sessions.
The advertisement is a combination of the prefix and attributes. Two important attributes are:
- `AS_PATH`: not updated in iBGP connections. Contains the list of ASes through which the advertisement has passed; a router rejects the advertisement if its own AS is already in the path
- `NEXT_HOP`: not updated in iBGP connections. Advertisements flow from the destination outwards, so it contains the address of the boundary router for the next-hop AS

An import policy is used to accept or decline adverts.
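A rough sketch of how these two attributes behave when an advertisement crosses an AS boundary. Advertisements are modelled as plain dictionaries; the function names, the private-use AS numbers, and the documentation addresses are all illustrative assumptions, and the import policy here contains only the loop check.

```python
def accept_advert(my_as, advert):
    """Import check: reject the advert if our own AS already appears in AS_PATH."""
    return my_as not in advert["as_path"]


def export_over_ebgp(my_as, my_boundary_router, advert):
    """Re-advertise to another AS: prepend our AS and set NEXT_HOP to our router."""
    return {
        "prefix": advert["prefix"],
        "as_path": [my_as] + advert["as_path"],
        "next_hop": my_boundary_router,
    }


# Advert originated by AS 64512 for one of its prefixes.
advert = {"prefix": "198.51.100.0/24", "as_path": [64512], "next_hop": "192.0.2.1"}

# AS 64513 accepts it and passes it on to its own neighbors.
assert accept_advert(64513, advert)
forwarded = export_over_ebgp(64513, "192.0.2.9", advert)
print(forwarded)

# If the advert ever comes back to AS 64512, it is rejected as a loop.
print(accept_advert(64512, forwarded))
```

Inside an AS (iBGP), the advert would be passed along unchanged, which is why both attributes are described as not updated in iBGP connections.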
Route selection
A router may learn about more than one route to a given prefix, so it applies elimination rules, in order, until one route remains:
- Local preference value attribute (policy decision)
- Shortest `AS_PATH` (as in the DV algorithm)
- Closest `NEXT_HOP` router (hot potato routing)
- Additional criteria
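The first three elimination rules can be expressed as a single sort key. This sketch assumes that a higher local preference wins and that the intra-AS cost to each `NEXT_HOP` router is already known; the route dictionaries, AS numbers, and addresses are hypothetical, and the additional tie-breaking criteria are left out.

```python
def select_route(routes, cost_to_next_hop):
    """Pick one route: highest local preference, then shortest AS_PATH,
    then the NEXT_HOP router that is closest inside our own AS."""
    return min(
        routes,
        key=lambda r: (
            -r["local_pref"],               # higher preference first
            len(r["as_path"]),              # fewer ASes on the path
            cost_to_next_hop[r["next_hop"]],  # hot potato tie-break
        ),
    )


routes = [
    {"local_pref": 100, "as_path": [64513, 64512], "next_hop": "192.0.2.1"},
    {"local_pref": 100, "as_path": [64514, 64512], "next_hop": "192.0.2.2"},
    {"local_pref": 100, "as_path": [64515, 64516, 64512], "next_hop": "192.0.2.3"},
]
cost_to_next_hop = {"192.0.2.1": 5, "192.0.2.2": 3, "192.0.2.3": 1}

best = select_route(routes, cost_to_next_hop)
print(best["next_hop"])
```

Here all three routes tie on local preference, the third is eliminated for its longer path, and the remaining tie is broken by the cheaper intra-AS cost.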
Routing policy can affect routing. For example, suppose X is a dual-homed customer network (e.g. for redundancy) connected to provider networks B and C. X will not want to carry transit traffic between B and C, so it keeps silent: it does not advertise to B that it has a route to C, or vice versa.