Chapter 25. Linux traffic control
Linux offers tools for managing and manipulating the transmission of packets. The Linux Traffic Control (TC) subsystem helps in policing, classifying, shaping, and scheduling network traffic. TC also mangles the packet content during classification by using filters and actions. The TC subsystem achieves this by using queuing disciplines (`qdisc`), a fundamental element of the TC architecture.
The scheduling mechanism arranges or rearranges the packets before they enter or exit different queues. The most common scheduler is the First-In-First-Out (FIFO) scheduler. You can perform `qdisc` operations temporarily by using the `tc` utility or permanently by using NetworkManager.
In Red Hat Enterprise Linux, you can configure default queuing disciplines in various ways to manage the traffic on a network interface.
25.1. Overview of queuing disciplines
Queuing disciplines (`qdiscs`) help with queuing up and, later, scheduling of traffic transmission by a network interface. A `qdisc` has two operations:
- enqueue requests, so that a packet can be queued up for later transmission, and
- dequeue requests, so that one of the queued-up packets can be chosen for immediate transmission.
Every `qdisc` has a 16-bit hexadecimal identification number called a handle, with an attached colon, such as `1:` or `abcd:`. This number is called the `qdisc` major number. If a `qdisc` has classes, then the identifiers are formed as a pair of two numbers with the major number before the minor, `<major>:<minor>`, for example `abcd:1`. The numbering scheme for the minor numbers depends on the `qdisc` type. Sometimes the numbering is systematic, where the first class has the ID `<major>:1`, the second one `<major>:2`, and so on. Some `qdiscs` allow the user to set class minor numbers arbitrarily when creating the class.
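As an illustration of the handle format described above, the following shell sketch splits a handle into its major and minor parts and prints the major number in decimal. The function name `split_handle` is hypothetical and is not part of the `tc` utility; the sketch only assumes a POSIX shell.

```shell
#!/bin/sh
# split_handle is a hypothetical helper, not part of tc.
# It takes a handle such as "abcd:1" or "1:" and prints its parts.
split_handle() {
    handle=$1
    major=${handle%%:*}      # text before the colon (hexadecimal)
    minor=${handle#*:}       # text after the colon (empty for a qdisc handle)
    # Major numbers are hexadecimal, so convert them with a 0x prefix.
    printf 'major=0x%s (%d) minor=%s\n' "$major" "$((0x$major))" "${minor:-none}"
}

split_handle abcd:1
split_handle 1:
```

A handle without a minor part, such as `1:`, identifies the `qdisc` itself, while `abcd:1` identifies a class inside the `qdisc` with major number `abcd`.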
- Classful `qdiscs`
  Different types of `qdiscs` exist and help in the transfer of packets to and from a networking interface. You can configure `qdiscs` with root, parent, or child classes. The points where children can be attached are called classes. Classes in a `qdisc` are flexible and can always contain either multiple child classes or a single child `qdisc`. There is no prohibition against a class containing a classful `qdisc` itself, which facilitates complex traffic control scenarios.
  Classful `qdiscs` do not store any packets themselves. Instead, they pass enqueue and dequeue requests down to one of their children according to criteria specific to the `qdisc`. Eventually, this recursive packet passing ends up where the packets are stored (or picked up from, in the case of dequeuing).
- Classless `qdiscs`
  Some `qdiscs` contain no child classes and are called classless `qdiscs`. Classless `qdiscs` require less customization compared to classful `qdiscs`. It is usually enough to attach them to an interface.
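A minimal sketch of the classful and classless relationship described above: an `htb` root `qdisc` with two classes, each holding a classless leaf `qdisc`. The interface name, the handles, and the rates are illustrative assumptions, and the `run` helper is hypothetical. By default the script only prints the commands, because applying them requires root privileges and changes the interface configuration.

```shell
#!/bin/sh
# Illustrative values, not taken from the text above.
DEV=${DEV:-enp0s1}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_tree() {
    # Classful root qdisc with major number 1.
    run tc qdisc add dev "$DEV" root handle 1: htb default 20
    # Two classes, 1:10 and 1:20, attached to the root qdisc.
    run tc class add dev "$DEV" parent 1: classid 1:10 htb rate 5mbit
    run tc class add dev "$DEV" parent 1: classid 1:20 htb rate 1mbit
    # Each class holds a single classless child qdisc that stores the packets.
    run tc qdisc add dev "$DEV" parent 1:10 handle 10: sfq
    run tc qdisc add dev "$DEV" parent 1:20 handle 20: fq_codel
}

build_tree
```

Setting `DRY_RUN=0` would execute the commands instead of printing them.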
25.2. Inspecting qdiscs of a network interface by using the tc utility
By default, Red Hat Enterprise Linux systems use the `fq_codel` `qdisc`. You can inspect the `qdisc` counters by using the `tc` utility.
Procedure
Optional: View your current `qdisc`:
    # tc qdisc show dev enp0s1
Inspect the current `qdisc` counters:
    # tc -s qdisc show dev enp0s1
    qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
    Sent 1008193 bytes 5559 pkt (dropped 233, overlimits 55 requeues 77)
    backlog 0b 0p requeues 0
- `dropped`: the number of times a packet is dropped because all queues are full
- `overlimits`: the number of times the configured link capacity is filled
- `sent`: the number of dequeues
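To watch these counters over time, the statistics line can be extracted from the `tc -s` output. The sketch below assumes the line layout shown in the example output above; `qdisc_counters` is a hypothetical helper name, not a `tc` feature.

```shell
#!/bin/sh
# Hypothetical helper: reads `tc -s qdisc show` output on stdin and prints
# the dropped, overlimits, and requeues counters of each statistics line.
qdisc_counters() {
    sed -n 's/.*(dropped \([0-9][0-9]*\), overlimits \([0-9][0-9]*\) requeues \([0-9][0-9]*\)).*/dropped=\1 overlimits=\2 requeues=\3/p'
}

# Sample lines in the layout of the example output. On a live system, use:
#   tc -s qdisc show dev enp0s1 | qdisc_counters
printf '%s\n' \
    'qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514' \
    ' Sent 1008193 bytes 5559 pkt (dropped 233, overlimits 55 requeues 77)' \
    ' backlog 0b 0p requeues 0' | qdisc_counters
```

Lines that do not contain the parenthesized counters, such as the `backlog` line, are ignored.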
25.3. Updating the default qdisc
If you observe networking packet losses with the current `qdisc`, you can change the `qdisc` based on your network requirements.
Procedure
View the current default `qdisc`:
    # sysctl -a | grep qdisc
    net.core.default_qdisc = fq_codel
View the `qdisc` of the current Ethernet connection.
Update the existing `qdisc`:
    # sysctl -w net.core.default_qdisc=pfifo_fast
To apply the changes, reload the network driver:
    # modprobe -r NETWORKDRIVERNAME
    # modprobe NETWORKDRIVERNAME
Start the network interface:
    # ip link set enp0s1 up
Verification
View the `qdisc` of the Ethernet connection:
    # tc -s qdisc show dev enp0s1
    qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    Sent 373186 bytes 5333 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
    ...
25.4. Temporarily setting the current qdisc of a network interface by using the tc utility
You can update the current `qdisc` without changing the default one.
Procedure
Optional: View the current `qdisc`:
    # tc -s qdisc show dev enp0s1
Update the current `qdisc`:
    # tc qdisc replace dev enp0s1 root htb
Verification
View the updated current `qdisc`:
    # tc -s qdisc show dev enp0s1
    qdisc htb 8001: root refcnt 2 r2q 10 default 0 direct_packets_stat 0 direct_qlen 1000
    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
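Because a change made with `tc` is temporary, it can also be undone by hand: deleting the root `qdisc` makes the kernel attach the default one again. The following sketch pairs the replace command from this procedure with that removal. The `run`, `try_qdisc`, and `restore_default_qdisc` names are hypothetical, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
DEV=${DEV:-enp0s1}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Replace the root qdisc of the interface with the given qdisc and options.
try_qdisc() {
    run tc qdisc replace dev "$DEV" root "$@"
}

# Delete the root qdisc so that the default qdisc is used again.
restore_default_qdisc() {
    run tc qdisc del dev "$DEV" root
}

try_qdisc htb
restore_default_qdisc
```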
25.5. Permanently setting the current qdisc of a network interface by using NetworkManager
You can update the current `qdisc` value of a NetworkManager connection.
Procedure
Optional: View the current `qdisc`:
    # tc qdisc show dev enp0s1
    qdisc fq_codel 0: root refcnt 2
Update the current `qdisc`:
    # nmcli connection modify enp0s1 tc.qdiscs 'root pfifo_fast'
Optional: To add another `qdisc` over the existing `qdisc`, use the `+tc.qdisc` option:
    # nmcli connection modify enp0s1 +tc.qdisc 'ingress handle ffff:'
Activate the changes:
    # nmcli connection up enp0s1
Verification
View the current `qdisc` of the network interface:
    # tc qdisc show dev enp0s1
    qdisc pfifo_fast 8001: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc ingress ffff: parent ffff:fff1 ................
25.6. Configuring the rate limiting of packets by using the tc-ctinfo utility
You can limit network traffic and prevent the exhaustion of resources in the network by using rate limiting. With rate limiting, you can also reduce the load on servers by limiting repetitive packet requests in a specific time frame. In addition, you can manage bandwidth rate by configuring traffic control in the kernel with the `tc-ctinfo` utility.
The connection tracking entry stores the Netfilter mark and connection information. When a router forwards a packet from the firewall, the router either removes or modifies the connection tracking entry from the packet. The connection tracking information (`ctinfo`) module retrieves data from connection tracking marks into various fields. This kernel module preserves the Netfilter mark by copying it into a socket buffer (`skb`) mark metadata field.
Prerequisites
- The `iperf3` utility is installed on a server and a client.
Procedure
Perform the following steps on the server:
Add a virtual link to the network interface:
    # ip link add name ifb4eth0 numtxqueues 48 numrxqueues 48 type ifb
This command has the following parameters:
- `name ifb4eth0`: Sets a new virtual device interface.
- `numtxqueues 48`: Sets the number of transmit queues.
- `numrxqueues 48`: Sets the number of receive queues.
- `type ifb`: Sets the type of the new device.
Change the state of the interface:
    # ip link set dev ifb4eth0 up
Add the `qdisc` attribute on the physical network interface and apply it to the incoming traffic:
    # tc qdisc add dev enp1s0 handle ffff: ingress
In the `handle ffff:` option, the `handle` parameter assigns the major number `ffff:` as a default value to a classful `qdisc` on the `enp1s0` physical network interface, where `qdisc` is a queuing discipline parameter to analyze traffic control.
Add a filter on the physical interface of the `ip` protocol to classify packets:
    # tc filter add dev enp1s0 parent ffff: protocol ip u32 match u32 0 0 action ctinfo cpmark 100 action mirred egress redirect dev ifb4eth0
This command has the following attributes:
- `parent ffff:`: Sets major number `ffff:` for the parent `qdisc`.
- `u32 match u32 0 0`: Sets the `u32` filter to match a 32-bit pattern in the packet. The first `0` is the pattern value and the second `0` is the mask that tells the filter which bits to compare. Because the mask is `0`, no bits are compared and the filter matches every IP packet.
- `action ctinfo`: Sets action to retrieve data from the connection tracking mark into various fields.
- `cpmark 100`: Copies the connection tracking mark (connmark) into the packet mark (`skb` mark) field, with `100` as the mask applied to the mark.
- `action mirred egress redirect dev ifb4eth0`: Sets the `action` to `mirred` to redirect the received packets to the `ifb4eth0` destination interface.
Add a classful `qdisc` to the interface:
    # tc qdisc add dev ifb4eth0 root handle 1: htb default 1000
This command sets the major number `1` to the root `qdisc` and uses the `htb` (hierarchy token bucket) classful `qdisc`, with unclassified traffic sent to the class with minor-id `1000`.
Limit the traffic on the interface to 1 Mbit/s with an upper limit of 2 Mbit/s:
    # tc class add dev ifb4eth0 parent 1:1 classid 1:100 htb ceil 2mbit rate 1mbit prio 100
This command has the following parameters:
- `parent 1:1`: Sets `parent` with `classid` as `1` and `root` as `1`.
- `classid 1:100`: Sets `classid` as `1:100`, where `1` is the number of the parent `qdisc` and `100` is the minor number of the class within the parent `qdisc`.
- `htb ceil 2mbit`: The `htb` classful `qdisc` allows an upper limit bandwidth of 2 Mbit/s as the `ceil` rate limit.
Apply the Stochastic Fairness Queuing (`sfq`) classless `qdisc` to the interface, with the queue algorithm perturbed at intervals of `60` seconds:
    # tc qdisc add dev ifb4eth0 parent 1:100 sfq perturb 60
Add the firewall mark (`fw`) filter to the interface:
    # tc filter add dev ifb4eth0 parent 1:0 protocol ip prio 100 handle 100 fw classid 1:100
Restore the packet meta mark from the connection mark (`CONNMARK`):
    # nft add rule ip mangle PREROUTING counter meta mark set ct mark
In this command, the `nft` utility has a `mangle` table with the `PREROUTING` chain rule specification that alters incoming packets before routing to replace the packet mark with `CONNMARK`.
If no `nft` table and chain exist, create a table and add a chain rule:
    # nft add table ip mangle
    # nft add chain ip mangle PREROUTING {type filter hook prerouting priority mangle \;}
Set the meta mark on `tcp` packets that are received on the specified destination address `192.0.2.3`:
    # nft add rule ip mangle PREROUTING ip daddr 192.0.2.3 counter meta mark set 0x64
Save the packet mark into the connection mark:
    # nft add rule ip mangle PREROUTING counter ct mark set mark
Run the `iperf3` utility as the server on a system by using the `-s` parameter. The server then waits for the client connection:
    # iperf3 -s
On the client, run `iperf3` as a client, connect to the server that listens on IP address `192.0.2.3`, and save the reported results in the `rate` file:
    # iperf3 -c 192.0.2.3 | tee rate
`192.0.2.3` is the IP address of the server while `192.0.2.4` is the IP address of the client.
Terminate the `iperf3` utility on the server by pressing Ctrl+C.
Terminate the `iperf3` utility on the client by pressing Ctrl+C.
Verification
Display the statistics about packet counts of the `htb` and `sfq` classes on the interface.
Display the statistics of packet counts for the `mirred` and `ctinfo` actions.
Display the statistics of the `htb` rate-limiter and its configuration.
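The server-side steps of this procedure can be collected into one script. The following is a sketch under stated assumptions, not a definitive implementation: the `run`, `setup_rate_limit`, and `show_stats` names are hypothetical, the table and chain are created before the rules are added, and the commands in `show_stats` are ordinary `tc` statistics commands chosen for illustration, because the text does not show the commands of the verification steps. By default the script only prints the commands; running them requires root privileges.

```shell
#!/bin/sh
# Interface names, address, and mark follow the procedure above.
PHYS=${PHYS:-enp1s0}
IFB=${IFB:-ifb4eth0}
SERVER_IP=${SERVER_IP:-192.0.2.3}
MARK=${MARK:-100}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_rate_limit() {
    # The nft rule writes the mark in hexadecimal: decimal 100 is 0x64.
    mark_hex=$(printf '0x%x' "$MARK")

    # Virtual IFB device that receives the redirected ingress traffic.
    run ip link add name "$IFB" numtxqueues 48 numrxqueues 48 type ifb
    run ip link set dev "$IFB" up

    # Ingress qdisc and redirecting filter on the physical interface.
    run tc qdisc add dev "$PHYS" handle ffff: ingress
    run tc filter add dev "$PHYS" parent ffff: protocol ip u32 match u32 0 0 \
        action ctinfo cpmark "$MARK" action mirred egress redirect dev "$IFB"

    # HTB shaping on the IFB device: 1 Mbit/s rate, 2 Mbit/s ceiling.
    run tc qdisc add dev "$IFB" root handle 1: htb default 1000
    run tc class add dev "$IFB" parent 1:1 classid 1:100 htb ceil 2mbit rate 1mbit prio 100
    run tc qdisc add dev "$IFB" parent 1:100 sfq perturb 60
    run tc filter add dev "$IFB" parent 1:0 protocol ip prio 100 handle "$MARK" fw classid 1:100

    # nftables: create the table and chain, then restore, set, and save the mark.
    # The chain definition is passed as one quoted argument.
    run nft add table ip mangle
    run nft add chain ip mangle PREROUTING '{ type filter hook prerouting priority mangle ; }'
    run nft add rule ip mangle PREROUTING counter meta mark set ct mark
    run nft add rule ip mangle PREROUTING ip daddr "$SERVER_IP" counter meta mark set "$mark_hex"
    run nft add rule ip mangle PREROUTING counter ct mark set mark
}

show_stats() {
    # Assumption: generic tc statistics commands, not the commands of the
    # original verification steps, which are not shown in the text.
    run tc -s qdisc show dev "$IFB"
    run tc -s filter show dev "$PHYS" ingress
    run tc -s class show dev "$IFB"
}

setup_rate_limit
show_stats
```

Keeping the mark in a single `MARK` variable makes the relationship between `cpmark 100`, `handle 100 fw`, and `meta mark set 0x64` explicit: all three refer to the same value.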
25.7. Available qdiscs in RHEL
Each `qdisc` addresses unique networking-related issues. The following is the list of `qdiscs` available in RHEL. You can use any of the following `qdiscs` to shape network traffic based on your networking requirements.
qdisc name | Included in | Offload support |
---|---|---|
Credit-Based Shaper | | Yes |
Enhanced Transmission Selection (ETS) | | Yes |
Earliest TxTime First (ETF) | | |
Fair Queue (FQ) | | |
Fair Queuing Controlled Delay (FQ_CODel) | | |
Generalized Random Early Detection (GRED) | | |
Hierarchical Fair Service Curve (HFSC) | | |
Hierarchy Token Bucket (HTB) | | Yes |
INGRESS | | Yes |
Multi Queue Priority (MQPRIO) | | Yes |
Multiqueue (MULTIQ) | | Yes |
Network Emulator (NETEM) | | |
Random Early Detection (RED) | | Yes |
Stochastic Fairness Queueing (SFQ) | | |
Time-aware Priority Shaper (TAPRIO) | | |
Token Bucket Filter (TBF) | | Yes |
The `qdisc` offload requires hardware and driver support on the NIC.