Benchmark of serializers
As we have previously seen, throughput is essential. One major factor in throughput is how data is serialized. With the experiments in this post, I motivate my choice of capnproto for this project.
Serialization is the process of translating in-memory objects into a format that can be stored or transmitted and reconstructed later. There are several serialization protocols implemented for C++. Konstantin Sorokin has provided a GitHub repository that benchmarks a collection of these serializers, so they can be compared. I have updated the serializers to their latest versions in this commit, and performed the benchmark on my own laptop (as a baseline) and on the UDOO, as this will be the data collector.
As I am writing this, the pull requests have not yet been made. Therefore I will refer to my own fork of the cpp-serializers repository.
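To make this concrete, here is a minimal hand-rolled sketch of the idea (not one of the benchmarked libraries; the struct, field names and byte layout are my own invention): it flattens a small struct into a byte buffer and reconstructs it later. The libraries compared below automate exactly this, while also taking care of details such as portability and versioning.
// Hand-rolled serialization sketch, for illustration only.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Sample {          // an in-memory object we want to store or transmit
  int32_t id;
  std::string name;
};

// Flatten the object into a byte sequence (length-prefixed string).
std::vector<uint8_t> serialize(const Sample& s) {
  std::vector<uint8_t> out(sizeof(s.id) + sizeof(uint32_t) + s.name.size());
  uint8_t* p = out.data();
  std::memcpy(p, &s.id, sizeof(s.id));
  p += sizeof(s.id);
  const uint32_t len = static_cast<uint32_t>(s.name.size());
  std::memcpy(p, &len, sizeof(len));
  p += sizeof(len);
  std::memcpy(p, s.name.data(), s.name.size());
  return out;
}

// Reconstruct the in-memory object from the byte sequence.
Sample deserialize(const std::vector<uint8_t>& in) {
  Sample s;
  const uint8_t* p = in.data();
  std::memcpy(&s.id, p, sizeof(s.id));
  p += sizeof(s.id);
  uint32_t len = 0;
  std::memcpy(&len, p, sizeof(len));
  p += sizeof(len);
  s.name.assign(reinterpret_cast<const char*>(p), len);
  return s;
}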
Experiment setup
To determine which serialization tool to use, I have compared the following serializers (at the listed versions):
Serializer | Version |
---|---|
thrift | 0.13.0 |
protobuf | 3.11.2 |
boost | 1.72.0 |
msgpack | 3.1.1 |
cereal | 1.3.0 |
avro | 1.9.1 |
capnproto | 0.7.0 |
flatbuffers | 1.11.0 |
YAS | 7.0.5 |
The benchmarks are performed on the UDOO, with my laptop as the baseline. The GitHub repository pkok/cpp-serializers has been compiled on the UDOO:
cd ~
git clone git@github.com:pkok/cpp-serializers.git
cd cpp-serializers
git checkout thesis-experiments
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
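A successful build leaves the benchmark binary in the build directory; the scripts below run it as ./benchmark, selecting a serializer with -s and the number of iterations with -i.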
Afterwards, the UDOO has been rebooted to make sure no unnecessary background processes are running. We can then run two small experiments.
The tiny scripts presented below are stored in the repository’s ./images/generate_graph_data.sh.
Size
To determine the message sizes, the following bash script is executed:
cd ~/cpp-serializers/build/
for t in thrift-binary thrift-compact protobuf boost msgpack cereal avro capnproto flatbuffers yas yas-compact;
do
./benchmark -i 1 -s $t | grep Size | awk "{printf \"%d, # %s\n\", \$3, \"$t\"}";
done
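Each iteration prints a single line of the form <size in bytes>, # <serializer name>; for example, the yas run produces 17416, # yas, matching the object sizes in the results below.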
Time
As in the original experiment, the time experiment performs 1,000,000 serialize-deserialize operations, repeated 50 times. The presented results are averaged over the 50 runs.
While connected through SSH (inside a GNU screen session), the following bash script was run from the same build directory:
for t in thrift-binary thrift-compact protobuf boost msgpack cereal avro yas yas-compact capnproto flatbuffers;
do
rm -f /tmp/$t.time
for i in `seq 1 50`
do
./benchmark -i 1000000 -s $t | grep Time | awk '{print $3}' >>/tmp/$t.time
done;
awk "{ sum += \$1 } END {printf \"%f, # %s\n\", sum/50, \"$t\"}" /tmp/$t.time
done
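For each serializer this leaves the 50 raw timings in /tmp/<serializer>.time and prints a single averaged line in the same <value>, # <name> format as the size script.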
As capnproto and flatbuffers store the direct in-memory representation of the data, we measure the full build-serialize-deserialize cycle of the data structure. For the other libraries, only the serialize-deserialize cycle of the already built data structures is measured.
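To illustrate what “building” means here, the sketch below shows roughly what a capnproto cycle looks like: the data first has to be copied into the library’s message builder before it can be flattened and read back. It assumes a schema similar to the benchmark’s record (a list of integer ids); the schema, the field name ids and the generated header record.capnp.h are illustrative assumptions, not taken from the repository.
// Illustrative capnproto build-serialize-deserialize cycle.
// Assumes a schema roughly like:
//   struct Record { ids @0 :List(Int64); }
// compiled by the capnp tool into record.capnp.h (hypothetical header name).
#include <capnp/message.h>
#include <capnp/serialize.h>
#include <cstddef>
#include <cstdint>
#include <vector>
#include "record.capnp.h"  // hypothetical generated header

void cycle(const std::vector<int64_t>& ids) {
  // Build: copy the source data into the message builder (timed for capnproto).
  capnp::MallocMessageBuilder builder;
  Record::Builder record = builder.initRoot<Record>();
  auto idList = record.initIds(ids.size());
  for (size_t i = 0; i < ids.size(); ++i) {
    idList.set(i, ids[i]);
  }

  // Serialize: flatten the message into a contiguous array of words.
  kj::Array<capnp::word> words = capnp::messageToFlatArray(builder);

  // Deserialize: wrap the words in a reader; the payload is accessed in place.
  capnp::FlatArrayMessageReader reader(words.asPtr());
  Record::Reader view = reader.getRoot<Record>();
  (void)view;  // in the benchmark the fields would be read back here
}
For the other libraries, the equivalent in-memory structure already exists before the timer starts, so only the serialize and deserialize calls are measured.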
Results
The results of the above experiments are reported in the tables below, sorted by time on the UDOO, fastest first. The two experiments are reported in separate tables.
Serializer | Object size (bytes) | Time on laptop (ms) | Time on UDOO (ms) | \(T_{U} / T_{l}\) |
---|---|---|---|---|
yas | 17416 | 3307.36 | 12526.64 | 3.78750 |
cereal | 17416 | 14203.78 | 43540.38 | 3.06541 |
thrift-binary | 17017 | 13795.10 | 45265.94 | 3.28131 |
boost | 17470 | 14231.32 | 52358.60 | 3.67911 |
msgpack | 13402 | 31668.70 | 74930.82 | 2.36608 |
protobuf | 16116 | 30041.42 | 89314.20 | 2.97304 |
yas-compact | 13321 | 28981.90 | 92681.32 | 3.19790 |
thrift-compact | 13378 | 41831.28 | 128487.80 | 3.07157 |
avro | 16384 | 55352.74 | 139369.58 | 2.51784 |
Results of running the serialize-deserialize experiments.
Serializer | Object size (bytes) | Time on laptop (ms) | Time on UDOO (ms) | \(T_{U} / T_{l}\) |
---|---|---|---|---|
capnproto | 17768 | 4512.32 | 15054.64 | 2.70578 |
flatbuffers | 17632 | 5821.92 | 15752.86 | 3.33634 |
Results of running the build-serialize-deserialize experiments.
Discussion
When comparing the performance on my laptop with that on the UDOO, it is interesting to look at differences in ranking. Such differences seldom occur and, where present, do not exceed one position. These changes seem negligible.
The final column in both tables presents the factor by which the laptop is faster than the UDOO (\(T_{U} / T_{l}\)). We see that this lies between 2.4 (msgpack) and 3.8 (yas). This spread might be explained by unexpected background tasks running on either platform, or by CPU-specific optimizations performed by the compiler or serialization library. A proper comparison between the platforms would require further investigation; for my purposes this is not necessary, but it might be interesting for future projects.
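As a worked example, for yas the ratio is \(T_{U} / T_{l} = 12526.64 / 3307.36 \approx 3.79\), the value in the last column of the first table.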
yas is the fastest on both platforms in the serialize-deserialize category; it is a factor 3.5 (UDOO) to 4.3 (laptop) faster than the number two in that category. However, it does not differ that much from capnproto (factor 1.2 (UDOO) to 1.4 (laptop)) or flatbuffers (factor 1.3 (UDOO) to 1.8 (laptop)), even though these two also build the data structure during the experiments. Including the build step for yas would almost certainly increase the duration of each run, probably making it slower than capnproto and flatbuffers.
Differences in speed between capnproto and flatbuffers are present but relatively small.
Object size ranges from 13321 bytes (yas-compact) to 17768 bytes (capnproto). The object sizes of yas, capnproto and flatbuffers also barely differ, with a maximum size difference of 352 bytes between capnproto and yas.
Conclusion
The differences between the top-tier serializers – yas, capnproto and flatbuffers – are small, but present. A faster serializer comes at the price of larger objects. For this project, speed is more limiting than object size, and capnproto and flatbuffers stay close to yas even though their measurements include building the data structure; of those two, capnproto is slightly faster on both platforms. Therefore I will use capnproto for this project.