This is a background article on throughput calculation on layered packet-based networks. For information about throughput values in ByteBlower GUI reports, see here.
Throughput in packet-based networks
Calculating throughput is very straightforward. We simply have to calculate how much of something is handled in a second. In the case of networking, this boils down to how much data is transmitted or received in a second:
throughput = data / time [s]
To calculate the throughput of something, we have to define that something. In other words, we should define what traffic is part of the data variable in the throughput calculation. This depends on the task at hand:
- All outgoing data on an interface.
- All incoming data on an interface.
- All incoming data on an interface destined for that interface.
- All incoming or outgoing data within a data flow.
- All incoming data originating from a specific source.
Furthermore, within the context of packet-based networks, the relevant data may be measured in two different units:
- number of packets
- number of bits
The resulting throughput will thus be measured in packets per second (pps) or bits per second (bps). These two measurements are obviously closely related.
throughput_packets [pps] = data [frames] / time [s]
throughput_bits [bps] = data [bits] / time [s]
= throughput_packets [pps] * packet_size [bits/packet]
The throughput in packets is typically called the packet rate or, depending on the protocol in question, the frame rate, segment rate, and so on.
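The two relations above can be sketched in a few lines of Python (function and variable names are illustrative):

```python
def packet_rate(packet_count, duration_s):
    """Throughput in packets per second (pps)."""
    return packet_count / duration_s

def bit_rate(packet_count, packet_size_bits, duration_s):
    """Throughput in bits per second (bps), assuming equally sized packets."""
    return packet_rate(packet_count, duration_s) * packet_size_bits
```

For example, 1000 packets of 1000 bytes (8000 bits) in one second give a packet rate of 1000 pps and a bit rate of 8,000,000 bps.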
After deciding what data is counted and in what unit (packets or bits) it is measured, there is only one thing left to decide: what does it mean to transmit or receive data. In layered networks such as TCP/IP, this differs from layer to layer.
Throughput in layered packet-based networks
In layered networks, the networking functionality is divided in multiple layers. For example, the TCP/IP stack consists of the following layers:
- Layer 5 or application layer: offers communication logic to an application and defines network messages (syntax and semantics)
- Layer 4 or transport layer: reliable or unreliable end-to-end connectivity
- Layer 3 or network layer: identifying network hosts and sending data in their direction
- Layer 2 or data link layer: passing traffic over a link between two layer 3 hops
- Layer 1 or physical layer: transmitting bits over the physical link
Network protocols implement the functionality of such a layer. Examples are:
- Layer 5: HTTP, FTP, Telnet, SMTP, DHCP, ...
- Layer 4: TCP (connection-oriented) and UDP (connectionless, best effort)
- Layer 3: IPv4 and IPv6
- Layer 2: Ethernet, WLAN, ATM, PPP, DOCSIS, ...
- Layer 1: Ethernet physical layers (e.g. 1000BASE-TX), Wi-Fi physical layers (e.g. 802.11n), SONET/SDH, DSL, ...
Each layer provides a service to the layer above. For example, layer 3 offers unreliable delivery over a network to the end-to-end layer 4 transport protocol. On the other hand, each protocol uses a combination of internal logic and services of the layer below to operate. For example, layer 3 delivers messages over a network by determining the direction of the destination (layer 3 logic) and sending the packet to the next hop in that direction (through a layer 2 service). This is clarified in the figure below:
Due to the different scope and functionality of network layers, the definition of transmitting (TX) and receiving (RX) changes as well. For example, from a layer 5 point of view, transmission of a network message may be reliable end-to-end delivery of that message, while layer 2 sees transmission as taking data across a single network link.
To define a measure of throughput, we still need to define what TX and RX mean. In the context of a layered network stack, it comes down to this rule:
The layer N throughput is the amount of data flowing across the interface between layer N and layer N-1 below it in one second: down the stack at the TX side and up the stack at the RX side.
This means that we can calculate both the throughput in packets per second and the throughput in bits per second for each of the layers. This is put in practice for the TCP/IP stack in the next section.
TCP/IP network stack
The figure below shows a typical TCP/IP network stack when used over Ethernet and a typical network packet that comes through that stack. It consists of the following protocols:
- Physical layer: Ethernet II
- Data link layer: Ethernet II
- Network layer: IPv4
- Transport layer: TCP
- Application layer: HTTP
When a layer uses a service of the layer below, this service takes up time and resources. In a stacked setup, bottlenecks throughout the stack may limit the throughput. Possible limiting factors are:
- physical bandwidth of the TX or RX network link (layer 1)
- retransmission due to collisions on the link (layer 2)
- fragmentation and reassembly of packets (layer 3)
- retransmission due to lost packets (layer 4)
- limited buffers (layer 4)
Furthermore, much of this functionality itself impacts the throughput. This is exemplified in the following subsections.
Layer 2 throughput: Ethernet
The Ethernet protocol spans both the data-link layer (layer 2) and the physical layer (layer 1), and there is no clear interface between those two parts. Because there is no single interface, there is no single definition of throughput, since the definition depends on that interface (see above).
Frames per second (framerate)
To calculate the throughput in Ethernet frames per second, we simply have to count the number of frames in the relevant period of time.
Bits per second
On the other hand, the throughput in bits per second depends on whether we consider none, some or all of the following functionality as part of the layer 2 protocol (data link transfer) or the layer 1 protocol (physical transmission):
- interleaving consecutive Ethernet frames with pause bytes (interframe gap or IFG, 12 bytes)
- prepending Ethernet frames with a preamble for physical synchronization (preamble, 7 bytes)
- announcing the start of a frame with a specific sequence of bits (start frame delimiter or SFD, 1 byte)
- adding a CRC checksum field to detect transmission errors (frame check sequence or FCS, 4 bytes)
Functionality that is considered part of layer 2 is located above the inter-layer interface used in the throughput definition. Therefore, the corresponding bytes in the network packet are also part of the layer 2 header and thus part of the data moving across the interface to and from layer 1. This is clarified in picture X.
By placing more and more functions at layer 2, we get the following configurations:
| Functionality at layer 2 | Layer 2 frame size |
| --- | --- |
| MAC header | size of layer 2 payload + 14 bytes |
| MAC header + FCS | size of layer 2 payload + 18 bytes (14+4) |
| MAC header + FCS + preamble + SFD | size of layer 2 payload + 26 bytes (14+4+7+1) |
| MAC header + FCS + preamble + SFD + interframe gap | size of layer 2 payload + 38 bytes (14+4+7+1+12) |
The effect on the throughput becomes clear when looking at an example. Consider a layer 2 payload of 1000 bytes and a frame rate (i.e. throughput in pps) of 1000 frames per second. The throughput in bits per second is then calculated as follows:
| Functionality at layer 2 | Throughput in bits per second |
| --- | --- |
| MAC header | 1014 byte/frame * 8 bit/byte * 1000 frames/second = 8,112,000 bps |
| MAC header + FCS | 1018 byte/frame * 8 bit/byte * 1000 frames/second = 8,144,000 bps |
| MAC header + FCS + preamble + SFD | 1026 byte/frame * 8 bit/byte * 1000 frames/second = 8,208,000 bps |
| MAC header + FCS + preamble + SFD + interframe gap | 1038 byte/frame * 8 bit/byte * 1000 frames/second = 8,304,000 bps |
Notice that the effect of the configuration is limited: the results are only 2.3 percent apart. However, this effect becomes much larger when packets are smaller. Consider the same example with the minimal layer 2 payload size for Ethernet, which is 46 bytes:
| Functionality at layer 2 | Throughput in bits per second |
| --- | --- |
| MAC header | 60 byte/frame * 8 bit/byte * 1000 frames/second = 480,000 bps |
| MAC header + FCS | 64 byte/frame * 8 bit/byte * 1000 frames/second = 512,000 bps |
| MAC header + FCS + preamble + SFD | 72 byte/frame * 8 bit/byte * 1000 frames/second = 576,000 bps |
| MAC header + FCS + preamble + SFD + interframe gap | 84 byte/frame * 8 bit/byte * 1000 frames/second = 672,000 bps |
In this case the difference in throughput between the two extremes has risen to 40 percent!
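Both examples can be reproduced with a short script. The dictionary below mirrors the accounting choices in the tables above (names are illustrative):

```python
# Per-frame overhead in bytes for each layer 2 accounting choice
OVERHEAD_BYTES = {
    "MAC header": 14,
    "MAC header + FCS": 14 + 4,
    "MAC header + FCS + preamble + SFD": 14 + 4 + 7 + 1,
    "MAC header + FCS + preamble + SFD + IFG": 14 + 4 + 7 + 1 + 12,
}

def layer2_bps(payload_bytes, frames_per_second):
    """Throughput in bits per second for each accounting choice."""
    return {
        name: (payload_bytes + overhead) * 8 * frames_per_second
        for name, overhead in OVERHEAD_BYTES.items()
    }
```

Running `layer2_bps(1000, 1000)` and `layer2_bps(46, 1000)` reproduces the two tables, showing the gap between the extremes growing from roughly 2.3 percent to 40 percent.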
Note: minimum size and padding
The Ethernet specification determines that the minimal size for a layer 2 Ethernet frame (including the MAC header but excluding the FCS, preamble, SFD and interframe gap) is 60 bytes. This means that the minimal payload size is 46 bytes. When the layer 2 payload (typically an IP datagram) is less than 46 bytes, Ethernet will fill up the remaining bytes with padding to create a 60-byte frame.
This padding is therefore part of the layer 2 payload, but not of the layer 3 packet.
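The padding rule can be sketched as follows (assuming an Ethernet II frame with a 14-byte MAC header; function names are illustrative):

```python
MIN_PAYLOAD = 46  # bytes; yields the 60-byte minimum frame with a 14-byte MAC header

def layer2_payload_size(ip_datagram_bytes):
    """Layer 2 payload size after Ethernet padding is applied."""
    return max(ip_datagram_bytes, MIN_PAYLOAD)

def layer2_frame_size(ip_datagram_bytes):
    """Frame size counting only the MAC header (no FCS, preamble, SFD or IFG)."""
    return 14 + layer2_payload_size(ip_datagram_bytes)
```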
Layer 3 throughput: IP
Packets per second (packet rate)
When IP runs on top of Ethernet, calculating the layer 3 throughput in packets per second is typically quite straightforward. An IP datagram corresponds exactly to the payload of a single Ethernet frame.
If the IP protocol receives a network segment (TCP) or datagram (UDP) from layer 4 that is too large for the interface to layer 2, it will either:
- Fragment it into multiple IP datagrams. This is the case for IPv4 routers and IPv4/IPv6 end-hosts.
- Drop it. This is the case for IPv6 routers and for IPv4 datagrams with the don't fragment flag set.
In either case, the number of layer 3 IP datagrams on the wire is the same as the number of layer 2 Ethernet frames.
Bits per second
To calculate the throughput in bits per second, simply strip all layer 2 information from the Ethernet frame. This includes all Ethernet II header information and the padding. Use the resulting IP datagram size to calculate the throughput in bits per second.
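Because padding bytes are indistinguishable from payload at layer 2, the IPv4 Total Length header field is what actually tells how large the layer 3 datagram is. A minimal sketch, assuming an untagged Ethernet II frame carrying IPv4:

```python
import struct

def ipv4_datagram_length(frame: bytes) -> int:
    """Return the IPv4 datagram size carried in an Ethernet II frame.

    The Total Length field (at offset 2 in the IPv4 header) excludes any
    Ethernet padding, so it is the size to use for layer 3 throughput.
    """
    ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType
    (total_length,) = struct.unpack_from("!H", frame, ETHERNET_HEADER + 2)
    return total_length
```

For a minimal 60-byte frame carrying a 40-byte datagram, this returns 40 rather than the padded payload size of 46.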
Note: other protocols
The calculation for Ethernet is straightforward because a single IP datagram always matches a single Ethernet frame. Notice that for other layer 2 protocols, things may not be that easy.
For example, running IP over ATM causes a single IP datagram to be chopped up to fit the fixed 48-byte payload size of ATM cells. At the receiving side of the link, ATM reassembles the IP datagram before throwing it over the inter-layer interface towards layer 3. This may influence the throughput in a number of ways:
- The number of packets is different at layer 2 and layer 3. The number of ATM cells per second will be much higher than the number of IP datagrams per second. The difference grows with the size of the IP datagram.
- The layer 2 overhead also depends on the size of the IP datagram. For each 48-byte layer 2 payload, 5 bytes of overhead are introduced at layer 2. This means that the data throughput at layer 3 will be significantly lower than at layer 2, and even more so for long datagrams.
- If chopping up the IP datagram into cells or reassembling those cells into IP datagrams is a performance bottleneck, the number of layer 3 datagrams that can be processed by layer 2 in a second may limit the throughput.
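The cell arithmetic can be illustrated as follows. This is a simplification that ignores AAL5 encapsulation overhead and only counts the 48-byte cell payload and 5-byte cell header:

```python
import math

ATM_CELL_PAYLOAD = 48  # bytes of payload per cell
ATM_CELL_HEADER = 5    # bytes of header per cell

def atm_cells(ip_datagram_bytes):
    """Number of ATM cells needed to carry one IP datagram."""
    return math.ceil(ip_datagram_bytes / ATM_CELL_PAYLOAD)

def atm_wire_bytes(ip_datagram_bytes):
    """Layer 2 bytes on the wire, including cell headers and padding
    of the last, partially filled cell."""
    return atm_cells(ip_datagram_bytes) * (ATM_CELL_PAYLOAD + ATM_CELL_HEADER)
```

A 1500-byte IP datagram, for instance, needs 32 cells and occupies 1696 bytes at layer 2.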
Layer 4 throughput: TCP or UDP
The throughput definition is just as valid on layer 4: the amount of relevant data per second that is passed up or down the interface to layer 3 (IP). The packets passed down to or up from layer 3 are typically called segments in the case of TCP and segments or datagrams in the case of UDP. The layer 4 protocol may decide on their size.
The fragmentation and reassembly of both IPv4 and IPv6 (described above) is transparent to the layer 4 protocol running at the end-hosts. From the layer 4 point of view, the TCP segment or UDP datagram passed down at the sending host is received unchanged at the destination host.
However, the fragmentation at layer 3 may have impact on the layer 4 throughput:
- If IP fragmentation or reassembly is a performance bottleneck, the speed at which TCP or UDP can send packets may be slowed down.
- If no IP fragmentation is possible (due to the don't fragment flag or because IPv6 routers drop datagrams that are too large), nothing may get through!
Therefore, layer 4 will typically avoid layer 3 fragmentation by providing packets of the correct size. This may be done through end-to-end path Maximum Transmission Unit (MTU) discovery.
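For example, with the common Ethernet MTU of 1500 bytes and minimal 20-byte IPv4 and TCP headers, the largest segment that avoids fragmentation (the maximum segment size or MSS) works out to 1460 bytes:

```python
def max_segment_size(path_mtu, ip_header=20, tcp_header=20):
    """Largest TCP payload that fits in one IP datagram without triggering
    layer 3 fragmentation (minimal header sizes assumed; options add more)."""
    return path_mtu - ip_header - tcp_header
```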
Segments or datagrams per second
For the connectionless UDP, defining the number of packets per second is trivial. The packet throughput is only limited by the performance of the underlying layers.
For the connection-oriented TCP, all transferred segments, such as ACK messages and possible retransmissions, are included. The segment throughput may be influenced by:
- The performance of the underlying layers.
- Congestion of the network (TCP will stop sending when the congestion window is full).
- Processing speed at the receiver (TCP will stop sending when the receiver window is full).
- The congestion avoidance algorithm (e.g. may influence send rate and the number of retransmissions).
Due to this complexity, layer 4 throughput is not very interesting. If we want to know the end-user throughput, it is better to calculate the layer 5 throughput, which takes an end-user application point of view.
Bits per second
Once the number of TCP segments or UDP datagrams per second is defined, the throughput in bits per second can be easily calculated.
Layer 5 throughput: HTTP, FTP, ...
From the standpoint of an application protocol, such as HTTP, all functionality of the network stack is abstracted away.
- Dividing in segments (layer 4)
- Retransmissions (layer 4)
- Flow control and congestion control policy (layer 4)
- Packet-based networking (layer 3)
- Possible fragmentation and reassembly (layer 3)
- Possible data collisions on data links along the path (layer 2)
- Data link bandwidths along the path (layer 1)
The interface between layer 4 and layer 5 is typically called a socket. The socket interface completely hides the packet-based network. Instead, it acts as a buffered input or output stream:
- An application may only write data to a socket when there is space in the layer 4 (send) buffer. When no data can be pushed to layer 4, the application protocol may decide to:
- wait for buffer space, e.g. to send the rest of a file (TCP)
- drop the data, e.g. some samples in a voice call (UDP)
- A TCP socket may transparently buffer small messages into a single segment before transmitting. However, the application may force TCP to immediately send the buffered content. A UDP socket will handle application messages immediately.
- An application is in control of pulling data from the socket. It decides on the frequency of checking a socket and the amount of data that is pulled at once. This means that the layer 4 (receive) buffer may also get filled up when traffic comes in faster than the application can or will handle. When it is full, the layer 4 protocol may:
- drop traffic (UDP)
- inform the sender to stop sending (TCP)
The layer 5 throughput thus boils down to the amount of data moving through the socket and may depend on many factors:
- Bandwidth of the data links on the network path.
- Loss on the network and possible retransmission (either end-to-end or on a single link).
- Fragmentation and reassembly.
- Congestion and QoS on the network.
- How fast the transmitting application generates network traffic.
- How fast the receiving application reads incoming traffic. This may be limited by either performance or application logic.
Since the notion of packets is gone at layer 5, only the throughput in bits per second is defined.
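A layer 5 measurement therefore simply counts the bytes that cross the socket interface in a time window. A minimal sketch (a blocking, TCP-style socket is assumed; names are illustrative):

```python
import socket
import time

def layer5_throughput_bps(sock, duration):
    """Bits per second read from a socket over a fixed time window."""
    sock.settimeout(0.1)
    deadline = time.monotonic() + duration
    total_bits = 0
    while time.monotonic() < deadline:
        try:
            data = sock.recv(65536)
        except socket.timeout:
            continue  # no data available right now; keep waiting
        if not data:
            break  # peer closed the connection
        total_bits += len(data) * 8
    return total_bits / duration
```

Note that this measures exactly what the application sees: retransmissions, headers and padding at the lower layers never appear in the byte count.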