Tutorial: Interpreting aggregate throughput values in GUI reports
Posted by Tim De Backer, Last modified by Dries Decock on 03 April 2018 02:32 PM
This tutorial expands on the aggregate measurements found in the test report generated by the ByteBlower GUI. These measurements are generated automatically for any scenario with TCP flows. They were included to provide insight into the behaviour of a TCP flow in a particular setup. In particular, most TCP connections attempt to throttle their throughput in order to fill the available bandwidth fairly. Multiple sessions using the same network links need to cooperate blindly with each other. The number of different implementations already hints that this is no simple feat. It is not unusual for multiple data flows arriving at the same destination to have significantly different throughput. Especially in such situations, the aggregate measurements help in understanding the whole picture.
The basis of this text is the report generated from a small scenario. Both the ByteBlower project and the test result are attached to this tutorial. You are invited to repeat the test; nonetheless, we will explain the relevant sections explicitly throughout this text. For the first section in particular, this will be the sole focus. The next two sections provide a first look at the test results. In the same order as found in the report, we start from the summary table. It will be shown that these values can change significantly over time. These sections offer a gentle setup to introduce the aggregate results over time. From there, we retrace our steps and generate the aggregate summary. Finally, the text ends with a word of warning about attempting to recalculate the aggregate values manually.
In this text we work solely with the ByteBlower GUI, and especially with the report it generates for TCP flows. This GUI is built on top of the ByteBlower API. Any user is thus able to collect similar or even more detailed measurements through this same interface. To keep this text brief, this is left to other texts. Similarly, we will not expand on the particulars of various TCP configurations.
Test scenarios are run from the ByteBlower GUI. During each test, measurement results are collected and stored. After a test, these values are gathered into the report. This document contains two types of data.
Throughout this text we show small parts of the same report. As mentioned above, the report and the original ByteBlower project are attached to this tutorial.
The first table below lists the configuration of the test. The scenario uses four TCP flows. No two flows start and end at the same time, but there is significant overlap between all of them. As will be shown below, all flows target the same port. In addition, ByteBlower throttles FLOW_3 to 10 Mbit/s. This is a reasonably simple scenario, intended predominantly to display the aggregate measurements. Many other tests will naturally resemble the one used here.
Per flow test results
The previous section briefly touched on the configuration; the majority of the report displays the various results of the scenario run. As mentioned in the introduction, before detailing the aggregate results, we first display the per-flow measurements. Even in these separate measurements, the aggregate behaviour is already visible.
The TCP Throughput table is found below. It offers a general overview of each flow and summarizes its results over its whole lifetime in the scenario run. As listed, the destination of all flows is PORT_1; in other words, most of the TCP traffic flowed towards this ByteBlower port. Except for the last two flows, all have a different source ByteBlower port. The average throughput of the flow is one of the listed values. It is calculated as the received payload size divided by the duration. As displayed, the averages differ greatly. As expected, 'FLOW_3' has the lowest throughput. This is intended: a rate limit was applied to this flow. The others achieve a much higher average throughput. Given these differences, it is difficult to guess the average throughput arriving at 'PORT_1'. We demonstrate this even more strongly in the paragraphs that follow.
The average throughput of the flow is measured over the duration of the flow. Depending on the configured congestion algorithm and other parameters, the TCP protocol at the source ByteBlower port will increase or lower the amount of data it attempts to transmit. In addition, network devices along the path may drop some of the packets. The actual bandwidth might thus differ from moment to moment. To visualize this clearly, two graphs were selected below. We use these to highlight the dynamic behaviour of the throughput. In both, the name of the displayed flow is listed in blue on the left edge of the graph.
The first graph shows the results of 'FLOW_3', also configured third in the tables above. This flow was rate-limited to 10 Mbit/s. Although it is not our goal in this text to fully explain this graph, we use the opportunity to detail some of the concepts. The two shades of blue show the Throughput and the Goodput. Both are strongly correlated with each other. The Throughput line displays the traffic received at the TCP layer. These packets are subsequently processed by the protocol. This includes removing the small header, but also potentially reordering the frames and occasionally dropping duplicated packets. The resulting data stream from the TCP protocol is 'plain' HTTP traffic. The rate at which this data is streamed out is called the Goodput. As an example, in a web browser this Goodput is what the download view displays. In this setup almost no traffic was lost, so Throughput and Goodput follow each other with only a minimal separation. Returning to the rate-limited 'FLOW_3', it is not unexpected for the Throughput to be near the imposed limit. As we remarked before, so was the average throughput. Especially in more complex examples, most flows will instead show dynamic changes, as in the next example.
The second graph is generated from 'FLOW_4'. No rate limit is applied to this flow; the TCP protocol was allowed to choose its throughput based on the perceived available bandwidth. As can be traced back in the previous samples, this flow has the same source and destination port as the flow in the first graph, 'FLOW_3'. As displayed, the throughput of this flow changed significantly over time. Initially it had to share the network with other flows. Between the 5 minute and 6 minute marks, this flow could utilize a significant amount of extra bandwidth. As shown, the throughput rose clearly in two distinct steps. Comparing the first and second graphs shows that the last increment of 'FLOW_4' occurs neatly at the same moment as the end of 'FLOW_3'.
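The calculation of the average throughput can be sketched in a few lines of Python. This is only an illustration of "received payload size divided by the duration": the function name and the example numbers are assumptions and are not taken from the attached report or from the ByteBlower API.

```python
def average_throughput_mbps(payload_bytes: int, duration_s: float) -> float:
    """Average throughput in Mbit/s: received payload divided by the flow duration."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    bits = payload_bytes * 8          # payload size is counted in bytes
    return bits / duration_s / 1e6    # bit/s -> Mbit/s


# Hypothetical flow: 75 000 000 bytes received over 60 seconds.
example = average_throughput_mbps(payload_bytes=75_000_000, duration_s=60.0)
print(f"{example:.1f} Mbit/s")  # prints "10.0 Mbit/s"
```

Because the duration is that of the flow itself, two flows with the same average can still have been active during very different parts of the test.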
Aggregate results over time
Above we picked two results-over-time graphs from the total of four found in the report. These were displayed in full, and we noted that some of their changes neatly fit together. A drop in the bandwidth used by 'FLOW_3' resulted in a similar increase for 'FLOW_4'. All configured TCP flows target 'PORT_1', so this exercise can be repeated using all throughput graphs. Adding them all together offers a good estimate of the throughput received at the destination. The ByteBlower GUI generates such a visualization for each ByteBlower port with a TCP flow. The images below are picked from the attached report.
The section 'Aggregate Rx Throughput Over Time' in the report contains a number of graphs. The ByteBlower port for which the throughput is calculated is listed on the left axis. As mentioned above, all configured TCP flows target 'PORT_1'. It is thus particularly interesting to look at the Aggregate Rx Throughput of this port. In the centre of the figure below, we show the stacked graph as found in the report. This representation is created from the individual Rx throughput measurements of the four TCP flows. Each of the coloured regions can be traced back to one such throughput-over-time measurement. For some, like 'FLOW_1', the shape of the filled region and the graph in the annotation are mostly the same. In all cases, the order of the flows is the same as configured in the scenario. Unavoidably, this makes some flows, such as 'FLOW_3', shift down every time a TCP flow ends.
In the explanation above, we reused the TCP throughput graphs from the attached report. As briefly touched on before, these graphs contain two lines in different shades of blue: the goodput and the throughput. The aggregate graphs are constructed from the TCP throughput (dark blue). For the aggregate graph found above, the difference between both measurements is small enough to make little visual difference. In the three additional aggregate throughput graphs found in the report, this difference between goodput and throughput is immense. As one can see, each of these graphs is generated for the source ByteBlower port of a TCP flow. While most data is transmitted to the configured target 'PORT_1', a small amount of TCP overhead is transmitted back to the source ByteBlower port. These are predominantly acknowledgement messages, and they close the loop. They offer feedback to the transmitting end, which allows TCP to guarantee a reliable connection. The changes in both throughput and goodput are strongly influenced by this return traffic. We will not delve deeper into the mechanics of this mechanism. In more complex scenarios, this trickle of data will be found together with TCP flows targeted at this ByteBlower port. In addition, as the figure below shows, multiple TCP flows each have their own return path.
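Adding the per-flow graphs together can be sketched as follows. This is a minimal Python illustration, not the method used by the GUI: the data layout (one throughput sample per flow per interval) and all sample values are assumptions chosen to mirror the 'FLOW_3' and 'FLOW_4' behaviour described above.

```python
from collections import defaultdict


def aggregate_over_time(flows):
    """Sum per-flow throughput samples for each interval.

    flows maps a flow name to {interval_start_s: throughput_mbps}.
    Returns a list of (interval_start_s, total_mbps), sorted by time.
    """
    totals = defaultdict(float)
    for samples in flows.values():
        for start_s, mbps in samples.items():
            totals[start_s] += mbps
    return sorted(totals.items())


# Hypothetical samples: FLOW_3 ends at t=2, FLOW_4 takes over its bandwidth.
flows = {
    "FLOW_3": {0: 10.0, 1: 10.0, 2: 0.0},
    "FLOW_4": {0: 40.0, 1: 40.0, 2: 50.0},
}
print(aggregate_over_time(flows))  # [(0, 50.0), (1, 50.0), (2, 50.0)]
```

In this made-up example the aggregate stays flat even though both individual flows change, which is the kind of relationship the stacked graph makes visible.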
Summary Aggregate results
The sections above detailed the Aggregate Results Over Time. As remarked at the beginning of this text, such results-over-time graphs tend to be noisy. To this end, the report contains a summary of the aggregate traffic in tabular form. The generated summary for this report is listed immediately below. The next paragraphs detail the information found in the table.
To calculate this table, the two parts of the throughput formula must be defined:
The amount of data, i.e.
Frame blasting flows are ignored for now, because the server does not offer accurate timing on them.
The duration, i.e.
This is clarified in the picture below.
However, the information provided by this report is too limited for several real-life use cases. For example, when simulating multiple concurrent data flows through a device under test, the relevant result is typically the aggregate throughput of that device.
More importantly, the flow based results may lead to wrong conclusions. The typical example is calculating the aggregate throughput simply as the sum of throughputs of all flows running over that path.
Danger of aggregating flow-based throughput results
While aggregating the throughput results of multiple concurrent flows by simply adding them up may seem intuitive, this typically leads to wrong results.
To see why, consider the following example setup:
When running the traffic, something like the following may happen. Note this is strongly simplified.
This situation is shown in the graph below.
The average combined throughput should obviously be 90 Mbps. However, when summing up the flow throughput results, we get this result:
throughput_a = 360 Mbit / 8s = 45 Mbps
throughput_b = (360 Mbit + 360 Mbit) / (8s + 4s) = 60 Mbps
throughput_sum = throughput_a + throughput_b = 105 Mbps
Note this summed value of 105 Mbps is different from the expected 90 Mbps, and is even higher than is physically possible (100 Mbps)!
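The mismatch can be reproduced with a few lines of Python. This is only a sketch: the flow names and the dictionary layout are assumptions made for illustration, while the sizes, durations and the 100 Mbps limit are the ones used in the text above.

```python
# (size in Mbit, duration in seconds) per flow, as in the calculation above.
flows = {
    "a": (360.0, 8.0),
    "b": (720.0, 12.0),
}
link_capacity_mbps = 100.0

# Each flow's throughput is averaged over its OWN duration.
per_flow_mbps = {name: size / duration for name, (size, duration) in flows.items()}

# Naive aggregation: simply add the per-flow averages.
naive_sum_mbps = sum(per_flow_mbps.values())

print(per_flow_mbps)                        # {'a': 45.0, 'b': 60.0}
print(naive_sum_mbps)                       # 105.0
print(naive_sum_mbps > link_capacity_mbps)  # True: more than the link can carry
```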
The main problem is that flow throughputs are calculated over their own period, while the aggregated throughput of a channel must be calculated over the complete testing period. For our example, this becomes:
throughput_aggr = sum(flowsize) / total_duration = (360 Mbit + 720 Mbit) / 12s = 90 Mbps
When we look at the contributions of the various flows to the aggregate throughput, we get the same result:
throughput_aggr_a = 360 Mbit / 12s = 30 Mbps
throughput_aggr_b = 720 Mbit / 12s = 60 Mbps
throughput_aggr = throughput_aggr_a + throughput_aggr_b = 90 Mbps
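As a small Python sketch of the same idea, the function below divides every flow's size by the complete test duration instead of by the flow's own duration. The function name and data layout are hypothetical and are not part of the ByteBlower GUI or API; the numbers are those of the example above.

```python
def aggregate_throughput(flow_sizes_mbit, total_duration_s):
    """Return (per-flow contributions, aggregate) in Mbps.

    Every flow is averaged over the complete test duration,
    not over its own active period.
    """
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    contributions = {
        name: size / total_duration_s for name, size in flow_sizes_mbit.items()
    }
    return contributions, sum(contributions.values())


contributions, aggregate = aggregate_throughput({"a": 360.0, "b": 720.0}, 12.0)
print(contributions)  # {'a': 30.0, 'b': 60.0}
print(aggregate)      # 90.0
```

Because all contributions share the same denominator, they can be added safely, and their sum equals the total amount of data divided by the total duration.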
To avoid this issue, the report now includes precalculated aggregated throughput values for all ByteBlower ports. How they are calculated is shown in the next section.