TCP Graphs Explained
Posted by Francis Rammeloo, Last modified by Dries Decock on 17 April 2019 04:37 PM
Components of the TCP graph
Goodput indicates the rate at which data is arriving at the application layer. It's what a web browser refers to as "download speed" when downloading a file.
Throughput indicates the rate at which data is transferred by the TCP protocol. This includes the application payload, TCP headers and TCP retransmissions. In practice throughput is very close to goodput, because the TCP header is only a small fraction of the payload and retransmissions are infrequent.
A large gap between goodput and throughput indicates a high retransmission rate.
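As a rough sanity check, the retransmitted fraction can be estimated from the gap between the two curves (ignoring header overhead, which is small). A minimal sketch in Python, using illustrative numbers rather than values from any real test:

```python
# Rough estimate: the goodput/throughput gap approximates the
# retransmitted fraction of traffic (header overhead ignored).
# The 950/900 Mbit/s figures below are made up for illustration.

def retransmission_rate(throughput_mbps: float, goodput_mbps: float) -> float:
    """Fraction of transmitted bytes that were retransmissions."""
    return (throughput_mbps - goodput_mbps) / throughput_mbps

print(round(retransmission_rate(950.0, 900.0), 3))  # 0.053 -> ~5% retransmitted
```

Even a gap of a few percent like this is worth investigating, since TCP reacts strongly to loss.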
The round-trip time indicates how long it takes on average for a transmitted TCP segment to be acknowledged.
Looking at the evolution of the round-trip time is more useful than focusing on its absolute value. For example, a steadily rising RTT usually indicates that network buffers are filling up, resulting in longer queuing delays. A sudden rise in RTT may indicate that a competing flow has started transmitting data.
The transmit window determines how much data can be in flight at a given point in time. Its value is derived from the congestion window and the receive window used internally by TCP. A high transmit window is needed for high throughput.
A drop in the transmit window usually indicates that packet loss has occurred.
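The link between transmit window and throughput is the bandwidth-delay product: TCP can never move more than one window's worth of data per round trip. A minimal sketch, using illustrative numbers:

```python
# Sketch: the transmit window caps throughput at window / RTT.
# A 64 KiB window over a 10 ms RTT path is used as an example;
# these are illustrative values, not from any specific test.

def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on TCP throughput imposed by the transmit window."""
    return (window_bytes * 8) / (rtt_ms / 1000) / 1e6

print(round(max_throughput_mbps(64 * 1024, 10), 1))  # 52.4 Mbit/s
```

This is why a drop in the transmit window shows up almost immediately as a drop in throughput.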
Diagnosing common performance problems with the help of TCP graphs
Throughput graph grows too slowly
In the graph below you can see that the throughput takes a long time to reach its maximum:
The reason turns out to be that the slow-start threshold is set too low. If we increase the slow-start threshold to infinite and re-run the test, we get the result we want:
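Why a low threshold hurts can be seen with a toy model of TCP window growth (a deliberate simplification, not ByteBlower's actual implementation): below the slow-start threshold the congestion window doubles every RTT, while above it the window grows by only one segment per RTT:

```python
# Toy model: count the RTTs needed for the congestion window to
# reach a target size, starting from 1 MSS. Below ssthresh the
# window doubles each RTT (slow start); above it, it grows by one
# MSS per RTT (congestion avoidance). Numbers are illustrative.

def rtts_to_reach(target_mss: int, ssthresh_mss: float) -> int:
    cwnd, rtts = 1, 0
    while cwnd < target_mss:
        cwnd = cwnd * 2 if cwnd < ssthresh_mss else cwnd + 1
        rtts += 1
    return rtts

print(rtts_to_reach(1000, ssthresh_mss=16))            # 988 RTTs
print(rtts_to_reach(1000, ssthresh_mss=float("inf")))  # 10 RTTs
```

With a low threshold the flow leaves slow start almost immediately and then crawls toward the target window, which is exactly the slow ramp-up visible in the graph.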
Throughput is stable but lower than expected
In this example we expected a throughput near 1 Gbit/s; however, it stabilized at around 400 Mbit/s:
If you see a stable low throughput without retransmissions then this typically indicates that the window scale factor is set too low.
The solution is to increase the window scale factor in the TCP configuration. We find that a value of 4-6 is normally sufficient; in a high-latency environment a higher value may be needed. In the example above the window scale factor was set to 2. If we increase it to 4, we achieve a much higher average throughput:
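The arithmetic behind this: the TCP window field is 16 bits, so without scaling the advertised window is capped at 65535 bytes; the window scale option multiplies that cap by 2^scale. A sketch of the resulting throughput ceiling, assuming an illustrative 5 ms RTT (not a measured value from this example):

```python
# The 16-bit TCP window field caps the window at 65535 bytes;
# the window scale option raises that cap to 65535 * 2**scale.
# The 5 ms RTT below is an assumed value for illustration.

def window_limit_mbps(scale: int, rtt_ms: float) -> float:
    """Throughput ceiling imposed by the scaled maximum window."""
    window_bytes = 65535 * (2 ** scale)
    return (window_bytes * 8) / (rtt_ms / 1000) / 1e6

print(round(window_limit_mbps(2, rtt_ms=5), 1))  # 419.4 Mbit/s
print(round(window_limit_mbps(4, rtt_ms=5), 1))  # 1677.7 Mbit/s
```

Under these assumed conditions a scale factor of 2 caps the flow near 400 Mbit/s, while a factor of 4 lifts the ceiling well above 1 Gbit/s so the window is no longer the bottleneck.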
Very large round-trip times
In this graph the throughput is fine, but the round-trip time of more than 100 ms seems way too high:
This usually means that the window scale factor was set too high. A high window scale value leads to a very large transmit window and makes TCP send out packets faster than the network card can transmit them. This causes internal buffers to fill up, resulting in queuing delay. This delay is what causes the large round-trip times.
In our example the window scale factor was set to the maximum value of 8. Reducing it to 4-6 should greatly improve the round-trip times.
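The size of the queuing delay can be estimated from how far the transmit window exceeds the bandwidth-delay product: the excess bytes sit in buffers, and each byte takes 1/link-rate seconds to drain. A sketch with assumed values (1 Gbit/s link, 5 ms base RTT; not measurements from this example):

```python
# Sketch: data in flight beyond the bandwidth-delay product (BDP)
# queues up in buffers; the extra delay is roughly excess / rate.
# Link speed and base RTT below are illustrative assumptions.

def queuing_delay_ms(window_bytes: int, link_mbps: float,
                     base_rtt_ms: float) -> float:
    rate_bytes_per_s = link_mbps * 1e6 / 8
    bdp_bytes = rate_bytes_per_s * (base_rtt_ms / 1000)
    excess = max(0.0, window_bytes - bdp_bytes)
    return excess / rate_bytes_per_s * 1000

# Scale factor 8 window (65535 * 2**8, ~16 MiB) on 1 Gbit/s, 5 ms:
print(round(queuing_delay_ms(65535 * 256, 1000, 5), 1))  # 129.2 ms
```

Under these assumptions the oversized window alone adds well over 100 ms of queuing delay, matching the symptom in the graph.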
Unstable throughput with many re-transmissions
Unstable throughput is typically caused by packet loss. TCP is designed to slow down when packet loss is detected, and even a small loss rate can severely impact performance. Here are a few things you can try to improve throughput in this situation:
Try setting a rate limit to reduce the transmission rate
Sometimes packet loss is caused by a device that is unable to keep up with the ByteBlower's transmission rate. A quick way to fix this is by introducing a rate limit. You can do this in the TCP configuration tab.
Try using a different congestion avoidance algorithm
The congestion avoidance algorithm determines how TCP deals with packet loss. TCP interprets packet loss as a sign of network congestion (competition with other flows) and actively slows down so that other flows can get their fair share of the available bandwidth. Different congestion avoidance algorithms work better in different types of network environments. This is a complex topic that is still actively researched today.
Here are a few simple guidelines that can help you to select the best congestion avoidance algorithm:
Capture the network traffic and perform in-depth analysis using a tool like Wireshark
Sometimes this is the only way to figure out what’s really going on.