Networking II

In Networking I, we concentrated on the very basic topics of private IPv4 networks to get our VMs connected to the backbone network. In this second part, we would like to explore how IPv4 is used in the public Internet and how network address translation and firewalls work.

Exercise: motivation for dynamic routing

Assume the following network topology:

A truly artistic rendition of multiple networks connected by several routers.

Assign the routers some IP addresses from the respective IP ranges (for each network a router is connected to, pick one IP address and assign it to the respective router’s network interface). Usually the addresses from the beginning of the IP range are reserved for routers but you can theoretically choose a random IP as long as the IPs are uniquely assigned.
Write down the routing table of each router. The goal is to be able to reach everywhere from anywhere; i.e., if I pick a random network (imagine there’s a host connected to that network), I should be able to ping another host in any other network.
Try to aggregate some routes in the routing tables. Can I substitute two or more routes with just one and reduce the size of the routing table this way? Why would that be useful?

That was quite a lot of work, wasn’t it? (Still, it’s a good idea to do the exercise.) Imagine this would have to be done manually, as we did in the last networking lecture, but at the Internet scale!
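The aggregation step of the exercise can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the four prefixes are made up for illustration and are not taken from the topology above. It also assumes that all four routes share the same next hop, since only then may they be replaced by a single route.

```python
import ipaddress

# Hypothetical routes that all point to the same next hop
# (the prefixes are assumptions, not part of the exercise topology).
routes = [
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("10.0.1.0/24"),
    ipaddress.ip_network("10.0.2.0/24"),
    ipaddress.ip_network("10.0.3.0/24"),
]

# collapse_addresses merges adjacent and overlapping networks
# into the smallest equivalent list of prefixes.
aggregated = list(ipaddress.collapse_addresses(routes))
print(aggregated)  # [IPv4Network('10.0.0.0/22')]
```

Four routing table entries shrink to one, which is the effect the exercise asks you to look for by hand.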

Static and dynamic routing, routing protocols

Filling in the routing table manually is called static routing. There is another technique called dynamic routing, which uses routing protocols to discover the network topology dynamically and fill in the routing tables accordingly.

Do not confuse routing information exchange with IP forwarding, which we discussed in the last networking lecture. IP forwarding uses the routing tables to decide where to send each IP packet. How the routing tables get filled in is another story.
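To make the distinction concrete, here is a minimal sketch of the forwarding side only: given an already filled-in routing table, pick the next hop by longest prefix match. The table contents and the names `ROUTING_TABLE` and `lookup` are assumptions for illustration; whether the entries were typed in by hand or learned from a routing protocol makes no difference to this code.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
# How these entries got here (static or dynamic routing) is irrelevant
# to the forwarding decision below.
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "10.255.0.1"),
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.0.1"),
]

def lookup(destination):
    """Return the next hop of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTING_TABLE if addr in net]
    if not matches:
        return None
    # Longest prefix match: the route with the largest prefix length wins.
    _, next_hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return next_hop

print(lookup("10.1.2.3"))  # 10.1.0.1
print(lookup("10.2.0.1"))  # 10.255.0.1
print(lookup("8.8.8.8"))   # 192.0.2.1
```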

In practice, BGP and OSPF are the most widely used protocols nowadays (RIP is often taught for its simplicity). You have probably heard about them in other networking classes.

We’ll focus on BGP. If you need a reminder of how it works, here’s a nice and detailed explanation.

Try to answer the following questions:

What is an autonomous system (AS) and how does it relate to BGP? What is iBGP and eBGP? What is the Autonomous System Number (ASN)? Do they have to be uniquely assigned?
What are stub, multihomed, transit, and IXP types of autonomous systems?
If a BGP router receives multiple routes to the same destination network, how does it choose the best one? What is an AS path? Does this decision depend only on the AS path?
How does BGP prevent loops in the network topology?

Routing daemons, BIRD

Software implementations of the routing protocols are called routing daemons. Usually a routing daemon is a complex piece of software capable of running multiple routing protocols on multiple network interfaces according to its configuration.

One of the most popular routing daemons is the BIRD Internet Routing Daemon (or BIRD for short). It was initially developed in 1998 as a software project at our faculty; the original team was Ondřej Filip, Martin Machek and Martin Mareš, led by Libor Forst. This software is responsible for routing much of the traffic on the Internet.

Please go through BIRD’s user’s manual. Focus mainly on the following:

Read the Introduction.
Architecture, namely 2.1 Routing tables, and 2.3 Protocols and channels.
Configuration, namely 3.1 Introduction, some 3.2 Global options might be useful (pick only those you need), 3.5 Channel options (make sure you understand the export and import options, this is a key concept).
6.6 Device, 6.7 Direct, and 6.8 Kernel protocols.
6.4 BGP protocol (only read the introduction and pick the options you really need).

One of the lab tasks will be to configure BIRD to interconnect our private networks.

IPv4 address space exhaustion

The IPv4 protocol, conceived in the late 1970s, saw its first standard published in 1980. As the Internet witnessed unprecedented growth in the latter half of the 1980s, fueled by the widespread availability of connectivity through dial-up connections over telephone wires, the number of end users surged.

In an IP network, each node must be uniquely identified by its IP address. By the end of the 1980s, it became evident that the global IPv4 address range, totaling 2³² (4,294,967,296) addresses, would not be sufficient to identify every individual node in the near future, and the exhaustion of the global IPv4 address range was inevitable.

To mitigate this impending exhaustion, several mechanisms were introduced to at least delay the issue:

CIDR Subnetting Scheme (1993): The Classless Inter-Domain Routing (CIDR) subnetting scheme replaced the earlier class-based subnetting scheme in 1993. The class-based scheme was rigid and led to inefficient IP assignments. CIDR brought flexibility by allowing more efficient allocation of IP addresses based on actual needs. Unlike the class-based scheme, where companies were assigned the nearest larger block, CIDR facilitated more precise allocations, reducing unused IP addresses.
Private Address Ranges (1994 and 1996): In 1994, RFC 1597 introduced private address ranges. In 1996, RFC 1918 replaced RFC 1597, describing the same ranges in CIDR notation. The introduction of private address ranges allowed organizations to use non-routable IP addresses within their private networks. While CIDR made the allocation of the address space more efficient, private IP ranges helped alleviate the strain on the public IP address pool.
Network Address Translation (NAT) (1994): NAT, proposed in 1994, addresses the challenge of “hiding” multiple hosts in a private network behind one device with a public IP. It allows multiple devices in a private network to share a single public IP address, helping to conserve the limited pool of public IP addresses.
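The difference between class-based and CIDR allocation can be illustrated with a little arithmetic. The sketch below is a simplification under stated assumptions: the function names are hypothetical, reserved network and broadcast addresses are ignored, and the figure of 2,000 hosts is made up.

```python
def classful_block_size(hosts_needed):
    """Size of the smallest class C, B, or A block that fits the request."""
    for size in (2**8, 2**16, 2**24):  # class C, class B, class A
        if hosts_needed <= size:
            return size
    raise ValueError("request does not fit into a single classful block")

def cidr_block(hosts_needed):
    """Return (prefix length, block size) of the smallest CIDR block that fits."""
    host_bits = (hosts_needed - 1).bit_length()  # bits needed to number the hosts
    return 32 - host_bits, 2**host_bits

needed = 2000  # hypothetical company size
print(classful_block_size(needed))  # 65536
print(cidr_block(needed))           # (21, 2048)
```

Under the class-based scheme this company would receive a class B block and leave more than 63,000 addresses unused; with CIDR a /21 is enough.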
These measures worked to some extent. However, on January 31, 2011, the IANA central authority assigned two of its last blocks to the APNIC RIR, which led to the exhaustion of IANA’s central pool within days. A few months later, APNIC became the first Regional Internet Registry (RIR) to run out of IP addresses.
Subsequently, on November 25, 2019, RIPE NCC announced that it, too, had exhausted its pool of available IP addresses. ISPs still hold IPv4 addresses and redistribute them among their customers, but the scarcity persists. Over time, as customers leave or ISPs dissolve, IP addresses are returned to the corresponding RIRs. At the RIR level, there exists a waiting list of applicants seeking IP address blocks. While it is still possible to request public IPv4 addresses or blocks, they are now considered valuable and relatively expensive commodities.

These events played a crucial role in shaping the evolution of the Internet, influencing its current structure. And while the transition from the class-based scheme to CIDR is today considered a move in the right direction, the same cannot be said of private address ranges and network address translation. Let’s look at them in greater detail.

Public vs. Private IP Addresses, NAT

If you are operating an isolated network where devices are interconnected without any external connection, you have the freedom to use any IP address range, and it will function seamlessly. No one is required to allocate specific blocks, granting you the flexibility to choose an arbitrary range. However, potential issues may arise if, at a later stage, you decide to connect to the public Internet or integrate with another corporate network, especially in the case of mergers. The challenge lies in the necessity for unique IP addresses, and using an arbitrary range is considered bad practice. Instead, it is advisable to employ an IP range assigned by the corresponding authority (RIR or ISP) or utilize a specially designated private range.

Before RFC 1597 was introduced in 1994, there was no dedicated private range. All IPs were considered global (or public) IPs, without the distinction present today. RIRs and ISPs assigned globally unique IPs to everyone without differentiation, which contributed significantly to the problem of IP exhaustion.

Originally designed to route the Internet, the IP protocol gained popularity and found application in “private” setups. Institutions, such as universities, were assigned blocks of public IP addresses for widespread use, even on devices like workstations in university libraries and printers. It became evident that these devices did not require a globally routable (public) IP address, as it was unlikely that anyone outside the university would need to access these devices over the Internet. In response, private IP ranges and Network Address Translation (NAT) were introduced.

RFC 1918 reserved three blocks of the global IP address range for use in private networks:

10.0.0.0 - 10.255.255.255 (10/8 prefix)
172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
192.168.0.0 - 192.168.255.255 (192.168/16 prefix)

These reserved blocks are not assigned to anyone else on the Internet. The key concept here is that IP addresses from these ranges can be reused by any number of private networks that do not require “global” connectivity, thereby conserving global address space. However, this reuse seemingly contradicts the requirement for unique IP addresses across the Internet.
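Whether an address falls into one of the three RFC 1918 blocks can be checked with a few lines of Python. This is a minimal sketch; the helper name `is_rfc1918` is an assumption. The blocks are listed explicitly because the standard library's `is_private` attribute also covers other special-purpose ranges (loopback, link-local, and so on), not just these three.

```python
import ipaddress

# The three blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address):
    """Return True if the address lies in one of the RFC 1918 private blocks."""
    addr = ipaddress.ip_address(address)
    return any(addr in block for block in RFC1918_BLOCKS)

print(is_rfc1918("172.31.255.255"))  # True, last address of 172.16/12
print(is_rfc1918("172.32.0.1"))      # False, just outside 172.16/12
print(is_rfc1918("1.1.1.1"))         # False, a public address
```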

In a world with private ranges, the Internet is no longer a single, massive network. It becomes a global network, with addresses assigned as usual, plus numerous private stub networks somehow connected to it. To maintain functionality, a thick boundary must be established between each private network and the rest of the Internet. Inside a private network, IP addresses are assigned uniquely, avoiding clashes with other devices in the same private network. Clashes with someone else’s private network are possible, but because of the thick boundary they do no harm. The price is that direct connectivity between private networks over the global Internet is lost, a trade-off that must be accepted.

It’s worth noting that subnetting within the private range is possible. For instance, using the 10.0.0.0/8 prefix for addressing a private network allows further subdivision into several subnets (e.g., 10.1.0.0/16 and 10.2.0.0/16), enabling the use of different subnets for various departments within a company.
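The subdivision described above can be reproduced with the standard `ipaddress` module. The department names in this sketch are hypothetical.

```python
import ipaddress

private_range = ipaddress.ip_network("10.0.0.0/8")

# Split the /8 into /16 subnets: 2^(16-8) = 256 of them.
subnets = list(private_range.subnets(new_prefix=16))
print(len(subnets))            # 256
print(subnets[1], subnets[2])  # 10.1.0.0/16 10.2.0.0/16

# Hypothetical assignment of subnets to departments.
departments = {
    "engineering": subnets[1],
    "accounting": subnets[2],
}

# Subnets produced this way never overlap, so addresses stay unique
# across departments.
print(departments["engineering"].overlaps(departments["accounting"]))  # False
```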

Ensuring that these private ranges are reserved and not used by anyone else on the Internet is crucial. Using an unreserved arbitrary range for numbering a private network may work if you are fortunate, but it could also lead to clashes with other entities on the Internet. In such a scenario, the Internet would still function, but you would be unable to reach the intended target. For example, choosing to use the 1.1.1.0/24 range for a private network could result in conflicts, such as a user attempting to visit the Cloudflare website https://one.one.one.one/ (with an IP of 1.1.1.1). The user’s PC would believe that 1.1.1.1 is reachable in the local network, so the packets would never leave it and the connection would fail.

The concept of the “thick boundary” raises the question of how private networks maintain connectivity with the global Internet. This is where NAT comes into play.

NAT, or Network Address Translation, is a technique for altering network headers, primarily IP, TCP, or UDP headers, of packets in transit. Commonly, the altered parameters include src/dst IP addresses and src/dst port numbers. Altering source parameters is known as SNAT or SRCNAT; altering destination parameters is referred to as DNAT or DSTNAT.

NAT is typically deployed on a boundary router in a private network, which also has a public IP assigned to one of its interfaces. If SNAT is deployed on such a gateway, it can “hide” the network behind its public IP address, a process known as masquerading. When a packet arrives at the gateway, the original source address of the packet is replaced with the gateway’s public IP. The packet is then routed to the destination as usual. When a response is sent back, the original source IP becomes the destination. The response is sent to the globally routable IP address of the router, where the NAT mechanism translates the destination IP back to the original sender’s address. However, this introduces a challenge: if multiple PCs in the private network contact the same server and two responses arrive at the router, the router needs to determine which host each response should be forwarded to. This is why NAT also changes port numbers in the TCP/UDP headers.

The workflow of this situation is as follows:

When a packet is sent by the sender, a random free port is used as the source port, and the sender’s IP address is used as the source address.
When the packet reaches a NAT gateway, SNAT changes the source address as described. Simultaneously, it changes the original source port to another random free port and maintains a NAT table. An entry is added to the table stating: “The original source IP address and source port correspond to this changed port.”
The packet is routed to the destination.
The destination sends a reply back to the sender; the source IP and port of the request become the destination IP and port of the reply.
The packet is routed back to the NAT gateway.
On the gateway, the NAT table is looked up, and based on the destination port (which was the altered source port before), the destination address and port are translated back to the original.
The packet is routed to the original sender in the private network.
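The steps above can be sketched as a small simulation. This is an illustrative model under stated assumptions, not how any real NAT implementation is written: the class and field names are hypothetical, ports are handed out sequentially instead of randomly so that the run is reproducible, and the table is keyed only by the translated port, whereas real implementations also track the protocol and the remote endpoint.

```python
import itertools
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

class SnatGateway:
    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._free_ports = itertools.count(first_port)  # sequential, not random
        self.nat_table = {}   # translated port -> (private IP, private port)
        self._by_source = {}  # (private IP, private port) -> translated port

    def outbound(self, packet):
        """Rewrite the source address and port; remember the mapping."""
        key = (packet.src_ip, packet.src_port)
        if key not in self._by_source:
            port = next(self._free_ports)
            self._by_source[key] = port
            self.nat_table[port] = key
        return replace(packet, src_ip=self.public_ip,
                       src_port=self._by_source[key])

    def inbound(self, packet):
        """Translate a reply back using the destination port, or drop it."""
        original = self.nat_table.get(packet.dst_port)
        if original is None:
            return None  # no NAT table entry: nothing to forward to
        private_ip, private_port = original
        return replace(packet, dst_ip=private_ip, dst_port=private_port)

gateway = SnatGateway("203.0.113.5")
# Two private hosts contact the same server from the same source port.
out_a = gateway.outbound(Packet("192.168.1.10", 51000, "198.51.100.7", 443))
out_b = gateway.outbound(Packet("192.168.1.11", 51000, "198.51.100.7", 443))
# The server replies to the second host's translated address and port.
reply_b = Packet("198.51.100.7", 443, gateway.public_ip, out_b.src_port)
delivered_b = gateway.inbound(reply_b)
print(delivered_b.dst_ip, delivered_b.dst_port)  # 192.168.1.11 51000
```

Both outgoing packets leave with the same public source IP but different source ports, which is what lets the gateway tell the replies apart.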

It’s important to note that the example above illustrates SNAT. Even if the flow in the opposite direction could be perceived as DNAT, it is not DNAT; it’s the second complementary part of SNAT. The initiator determines whether it is SNAT or DNAT.

Also, masquerading is just one of many NAT usage examples. For instance, DNAT can be used for port forwarding from a public IP to a private IP, enabling a host inside a private network to be reached through the NAT gateway. Further tasks related to this will be discussed.
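Port forwarding with DNAT can be sketched in the same spirit. Unlike the SNAT table, which is built on the fly from outgoing traffic, the rules here are configured in advance. All addresses, ports, and names in this sketch are assumptions for illustration.

```python
# Hypothetical static port-forwarding rules on the gateway:
# public port -> (private IP, private port)
PORT_FORWARDS = {
    8080: ("192.168.1.20", 80),
    2222: ("192.168.1.30", 22),
}

def dnat(dst_ip, dst_port, public_ip="203.0.113.5"):
    """Return the (IP, port) a packet arriving from the Internet is sent to."""
    if dst_ip == public_ip and dst_port in PORT_FORWARDS:
        return PORT_FORWARDS[dst_port]
    return dst_ip, dst_port  # no matching rule: destination stays unchanged

print(dnat("203.0.113.5", 8080))  # ('192.168.1.20', 80)
print(dnat("203.0.113.5", 443))   # ('203.0.113.5', 443)
```

Here the connection is initiated from the outside, which is what makes it DNAT rather than the return half of SNAT.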


In summary, the complex interplay between private and global networks, along with the introduction of NAT, has significantly shaped the contemporary landscape of network addressing and connectivity.

nftables (filter and NAT)

Here are some useful links that should help you grasp the topic, mostly its practical aspects:

Firewall in general – what it is and why it is useful. For more reasoning, see the task description in evalweb.
What is nftables? – In short: a firewall and NAT implementation for Linux, the successor of iptables.
Quick reference-nftables in 10 minutes – This should help you find out how to configure nftables. Only the filter and nat chain types and the ip family should be relevant for you.
This diagram can be useful for understanding how hooks work and what the packet flow on Linux looks like. Only the top layers should be relevant for you (orange and green rectangles).

Acknowledgements

This chapter was written by Ondra Hrubý.