A short article that lays some theory on the concepts of switching and forwarding in switches and routers.
- Backplane: a shared bus over which all packets pass through
- On a packet switch, each egress interface has a queue (the output queue)
- Recall from the “Principle of Packet Switching” article that:
- On an egress link, only one packet is sent at a time. If there are many ingress packets, only one traverses the link at a time and the remaining packets are stored in queue memory.
- Packet switches are Switches and Routers
- A Switch builds the Forwarding Table by inspecting the SA of incoming frames. It creates an entry in the Forwarding Table that associates that SA with the port on which the frame came.
How an Ethernet switch forwards packets
- When a packet arrives at an Ethernet switch:
- the header is examined
- it inspects the Ethernet DA (Destination Address). Let’s call it DA_recv.
- it looks into the Forwarding Table for a match
- if there is a match, the frame is forwarded to the destination port
- if not, then it broadcasts the frame over all ports, except the port on which the frame came. When a frame with SA = DA_recv arrives, the Switch updates the Forwarding Table.
How an Internet router forwards packets
- When a packet arrives at a Router:
- if TTL=0 (after decrementing it), the router discards the packet
- the IP checksum is updated
- to determine the next hop, it performs Longest Prefix Match on the Forwarding Table. if an entry is found, the packet is prepared to be sent to the next hop over the adequate port
- it examines the Ethernet DA of the packet to verify if the packet is destined to it. If the packet is not destined to the router, the router drops it.
- it checks the version of IP
- the router reads the TTL value and decrements it
- it inspects the IP DA
- a new Link Layer checksum and a new Link Layer header is built
- the packet is queued
So at a high level, switches and routers have similar behaviours in terms of forwarding:
- There are two known ways to perform Longest Prefix Match operations:
- Binary Tries
- Ternary Content Addressable Memory TCAM
- Nowadays, a popular method for address lookup is Generic Lookup
Packet switching and queues
- When studying the output queue on a packet switch, we should memorize that the queue has one writing rate (the rate at which packet arrive) and one reading rate (the rate at which packets are put on the link). The sum of both rates give the rate of the queue (or queue speed)
- For a packet switch with N ingress interfaces and one egress interface (one queue), and
- if each interface has a rate of R,
- then in the worst case, the writing rate is N*R and the reading rate is R. So the total rate of the queue is (N+1)*R. The queue can not sustain this situation, unless there are congestion control mechanisms that shape the input rate.
-> a solution for this problem is to have one queue per ingress interface, instead of the egress queue. The queue of each ingress interface holds packets for all flows. This way, we will have at worst case a writing rate R and a reading rate R per queue. And the total rate per queue becomes 2R, which is easier on the packet switch than the (N+1)*R However, the input queues lead to the Head of Line Blocking problem. The Head of Line Blocking problem is when, for a queue, there are packet in the head of the line that are blocking the next packets in the queue, simply because they can not get on the link.
Take a look at the following diagram. In each of the queues, the head of line is a packet that belongs to the orange flow. Recall that only one packet at a time uses the link. So the orange packet of the second queue is chosen to be put on the link, while the orange packets from the first and the third queues will wait. And they wait at the expense of the rest of the packets.
To solve this problem we implement Virtual Output queues. In Virtual Output queuing, each ingress interface has more than one queue, and each queue holds the packets of only one flow. That’s the difference with the precedent example.
With Virtual Output Queues, each incoming flow at an ingress interface will land on one queue and not share room with other flows. So even of the head of the line packet is stuck in waiting, it will not impact packets from other flows on that same ingress interface.
- There are packet switches with input queues and others with output queues.
- Output-queued packet switches are cheaper than input-queued packet switches.
- Output queuing maximizes throughput and minimizes delay.
- In Input queueing with Vertical output queues, the throughput is also maximized.
- To sum up, this is a comparison between queueing strategies in terms of performance, from worst to best:
- Output Queued packet switches < Input Queued packet switches < Input Queued packet switches with Vertical output queues.
A Simple Deterministic Queueing Model
Queues can be studied with a simple deterministic model. A Deterministic queueing model helps to learn about the dynamics of a packet in a network. This model defines these functions:
– A(t): the cumulative number of bits (or bytes) that arrived at instant “t”. It’s the arrival rate.
– D(t): the cumulative number of bits (or bytes) that departed from the queue, at instant “t”. It’s the departure rate.
– Q(t): the cumulative number of bits or bytes in a queue, at instant “t”. Q(t) is also called the occupancy of the queue.
– d(t): the time spent by a bit (or a byte) in the queue. It’s the queueing delay of one bit
Q(t) = A(t) – D(t)
– Sending smaller packets instead of big ones helps reduce end-to-end delay, because with smaller packets we can send many consecutively (pipelining)
– when sending a packet of size M over i links with rates ri and length Li (and queueing delays are null) the end-to-end delay is the sum of the propagation delays and the packetization delays
end-to-end delay = sum(M/ri + Li/c)
– suppose the packet is divided into smaller chunks of size p. The end-to-end delay becomes the sum of the propagation delays and the packetization delays of the first small packet, plus the packetization delay of the remaining small packets
→ end-to-end delay= sum (p/ri + Li/c) + (M-p)/rmin,
where rmin is the rate of the slowest link
– Statistical multiplexing helps reduce the rate of the egress link:
suppose we have N ingress flows at C rate each, and the egress link is at C rate also. Without statistical multiplexing, N*C > C, which means if all ingress flows arrive at full capacity, there will be packet loss at the egress interface. Even if we have a buffer, there still are lost packets because the size of the buffer is finite.
→ However, with statistical multiplexing, the egress rate R is smaller than N*C, with the following assumptions:
- the ingress average rate is sufficiently low
- the ingress rates are bursty
The reduction of the egress rate compared to the initial N*C rate is called Statistical Multiplexing Gain and this definition is true for systems with or without buffers