We study in this network engineering article how queueing mechanisms work and we take one example: Weighted Fair Queueing.
 In a queue, we can model an arrival process with a deterministic process. However, when aggregated, arrival processes are random events. So we model them with a random process.
 The study of random arrival processes is part of a discipline called Queueing theory. This study shows that random arrival processes have interesting properties:
 Traffic burstiness increases delay.
 Determinism reduces delay
 Poisson process: models an aggregation of random events, eg the incoming phone calls at a telecom switch.
 Generally we can not use the Poisson process with queue arrivals. However, we can use the Poisson process with new flows in some types of events such as Web requests or new user connections
 Packet arrival on the network is not a Poisson process.
 Little’s Result, is a simplistic queueing model
 given:
 λ the arrival rate,
 L: the number of packets in the queue waiting and not served,
 d: the average delay spent by packets waiting in the queue and not being served
 then L = λ * d
 given:
note:
L can also be defined as the number of packets in the queue and being served, and d as the average delay spent by packets in the queue and being served.
 M/M/1 queue model gives a good intuition about the arrival rate, queue occupancy and the service rate. However, it is not an accurate measure of the queue occupancy.
 The Poisson process is used in the M/M/1 queue model
 given:
 λ: the arrival rate,
 µ: the service rate
 then d = 1 / (µ – λ)
 given:
and having L = λ* d, in the Little’s Result,
then L = λ / (µ – λ) = (λ / µ) / (1 – λ/ µ)
EndtoEnd Delay Guarantee in Weighted Fair Queueing WFQ
 Using the simple deterministic queueing model, if the arrival process A(t) grows very large, then we can have at some time Q(t) >= B (where B is the size of the queue). This leads to packet drops. We can not talk about delay guarantees if we have packet drops.
 To avoid the situation of buffer overflow, we need a way to guarantee that, at any time:
A(t+T) – A(t) <= B + R1*T, where:

 B is the size of the queue,
 R1 is the service rate of the queue (i.e. the rate at which the queue is drained)
 T is any time interval.
 In more general terms, A(t) must be regulated by the (σ,ϼ) function, which is σ + ϼ*t. This means:
 A(t) will never exceed (σ,ϼ)(t): A(t) < σ + ϼ*t , and
 at any time t, the arriving bits have an upper bound of σ + ϼ*t. Taking σ = B and ϼ = R1, we get B + R1*t
 The (σ,ϼ) function gives us these interesting values:
 The maximum queue size Bmax is the vertical distance between (σ,ϼ) function and D(t).
 the maximum delay a bit takes in the queue dmax is the horizontal distance between (σ,ϼ) function and D(t)
 How can we accomplish this in practice? How can we guarantee no packet drops, thus no buffer overflow, thus an endtoend delay guarantee? This is done by having a WFQ router, applying the (σ,ϼ) constraint and a Leaky Bucket Regulator at the source host, where:
 σ is the size of the token bucket
 ϼ is the rate at which the token bucket is filled,
 each queue delivers packets on the wire only if the size of tokens in the token bucket equals the size of the packet.
 This (σ,ϼ) constraint is implemented in RSVP. This protocol is not used a lot in networking.
So, in addition to providing rate guarantee, WFQ can provide an endtoend delay guarantee if at the source flows are leaky bucketconstrained.
Rate Guarantee In Weighted Fair Queueing WFQ
FIFO Qeueues
 All traffic is treated equally
 If B is the size of the queue and R is the output rate (the departure rate), then queue delay <= B / R
Strict Priority Queueing
 There is one strict priority queue and one low priority queue
 The low priority queue is examined only when there the high priority queue is empty
 Strict Priority Queueing leads to a starvation phenomena
 A better alternative is weighted priority queueing
Weighted Fair Queueing
 Packets are processed packet by packet, not in bit by bit
 Each queue has an outgoing rate (service rate) Ri and a weight wi. The egress link has a rate R. The service rate of each queue is calculated as follows: Ri = (wi / ∑wj ) * R
 If packets are of the same length, queues will be visited in “Rounds”. At each round wi bits are taken from queuei.
 WFQ calculates the Finishing time of each packet.
 Knowing the size of the queue L and the Starting time of the packet in the queue Sk, then the Finishing time of the packet Fk = Sk + (L / wi)
 Knowing the size of the queuei and the Finishing time of the last packet in the queue Fk1, then we can calculate the Starting time of the next packet arriving to the queue Sk = max (Fk1, now)
 At the end of all queues, there is a scheduler right before the egress link that examines the head of the line of each queue, takes the packet with the lowest Finishing time and transfer it to the egress link.
 It gives a “fair share” of the outgoing link to all flows. And it gives a rate guarantee to each incoming flow (Ri).
 If packets were to be treated bit by bit, there will be a variance of Δ between the Finishing time in the bitbybit scheme and the Finishing time in the packetbypacket scheme. Δ = LPmax / R, where LPmax is the maximum size of a packet.
Playback Buffers
 most applications (web browsing, file transfer…) don’t care about the variation of the endtoend delay. Realtime applications are impacted by variations of the endtoend delay
 packet switches have buffers. End host realtime applications have playback buffers. These applications build playback buffers to have a reserve of packets. Playback buffers are maintained in memory.
 The purpose of the playback buffers is to absorb the variation of the endtoend delay, not the delay itself.
 in the case of a realtime application (such as Youtube), the server streaming rate is fixed. However, the number of bytes received by the client is variable in time, because between the server streaming and the transmission of bytes there is a variable delay.
 The playback buffer (at the client side) streams at a fixed rate.
 If the size of the playback buffer is very big, then it leads to a big delay between the time the server streams the file to the time the client plays it back. If the size of the playback buffer is not big enough, then there may be received bytes but with no available playback buffer space, which leads to a “buffering” phenomena.
 the variable queueing delays in a network path are experienced at the packet switches and not at the end host.