Analyzing ORB-SLAM3's published results against the public repo's defaults
ORB-SLAM3 was released recently. I plan to do an ablation study of this system, but first I want to reproduce the published results.
The system as publicly released does not produce results comparable to those published in the article. I’ve run some basic tests on the EuRoC datasets, using the settings files exactly as provided in the repository. The results (RMS ATE in meters) are reported below, next to those from the article. In each group of three rows, the third row gives the factor by which the result generated by the public code differs from the published one (default-settings value divided by paper value).
| Task | Result source | MH01 | MH02 | MH03 | MH04 | MH05 | V101 | V102 | V103 | V201 | V202 | V203 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Monocular | Paper | 0.017 | 0.014 | 0.031 | 0.066 | 0.044 | 0.033 | 0.016 | 0.037 | 0.021 | 0.022 | - |
| Monocular | Default settings | 1.152 | 6.040 | 2.532 | 0.278 | 2.080 | 0.999 | 10.211 | 0.773 | 1.463 | 2.250 | 0.952 |
| Monocular | Diff. factor | 67.8× | 431.4× | 81.7× | 4.2× | 47.3× | 30.3× | 638.2× | 20.9× | 70.0× | 102.3× | - |
| Stereo | Paper | 0.025 | 0.022 | 0.027 | 0.089 | 0.058 | 0.035 | 0.021 | 0.049 | 0.032 | 0.027 | 0.361 |
| Stereo | Default settings | 0.044 | 0.026 | 0.026 | 0.141 | 0.044 | 0.034 | 0.020 | 0.046 | 0.128 | 0.028 | 0.264 |
| Stereo | Diff. factor | 1.8× | 1.2× | 1.0× | 1.6× | 0.8× | 1.0× | 1.0× | 0.9× | 4.0× | 1.0× | 0.7× |
| Monocular Inertial | Paper | 0.032 | 0.053 | 0.033 | 0.099 | 0.071 | 0.043 | 0.016 | 0.025 | 0.041 | 0.015 | 0.037 |
| Monocular Inertial | Default settings | 0.036 | 0.082 | 0.030 | 0.151 | 0.106 | 0.041 | 0.016 | 0.024 | 0.042 | 0.016 | 0.020 |
| Monocular Inertial | Diff. factor | 1.1× | 1.5× | 0.9× | 1.5× | 1.5× | 1.0× | 1.0× | 1.0× | 1.0× | 1.1× | 0.5× |
| Stereo Inertial | Paper | 0.037 | 0.031 | 0.026 | 0.059 | 0.086 | 0.037 | 0.014 | 0.023 | 0.037 | 0.014 | 0.029 |
| Stereo Inertial | Default settings | 0.038 | 0.288 | 0.027 | 0.052 | 0.098 | 0.036 | 0.013 | 0.020 | 0.040 | 0.014 | 0.060 |
| Stereo Inertial | Diff. factor | 1.0× | 9.3× | 1.0× | 0.9× | 1.1× | 1.0× | 0.9× | 0.9× | 1.1× | 1.0× | 2.1× |
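For clarity, the "Diff. factor" rows are simply the default-settings score divided by the paper's score. A small sketch that recomputes three of the monocular cells from the table (the dictionary and function names are mine, chosen only for illustration):

```python
# RMS ATE values (meters) copied from the monocular rows of the table.
paper = {"MH01": 0.017, "MH04": 0.066, "V102": 0.016}
default_settings = {"MH01": 1.152, "MH04": 0.278, "V102": 10.211}


def diff_factor(generated: float, published: float) -> float:
    """Ratio of the generated score to the published score (>1 means worse)."""
    return generated / published


for sequence in paper:
    factor = diff_factor(default_settings[sequence], paper[sequence])
    print(f"{sequence}: {factor:.1f}x")
# MH01: 67.8x
# MH04: 4.2x
# V102: 638.2x
```

A factor below 1.0 means the public code with default settings scored a lower error than the paper reports.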
Some noteworthy data from this table:
- The generated results for the monocular scenario differ strongly from those published, underperforming by a factor between 4.2 and 638.2! The generated monocular RMS ATE scores are also a lot higher than in all other scenarios. While some difference between the monocular scenario and, for example, the stereo or monocular-inertial scenario is to be expected, these inter-scenario differences are a lot bigger than reported.
- Most other generated RMS ATE scores are close to the published ones, within a factor of roughly 0.5 to 2.0 of the published value. The clearest exceptions are stereo on V201 (4.0×) and stereo-inertial on MH02 (9.3×).
- Also noteworthy is that the published results are not always the best results.
There might be several causes for some of the results:
- The authors have used per-scenario or per-dataset (or both!) settings. This would explain why the monocular scenario performs so badly when run with the default settings, but not why some of the generated scores are lower than reported.
- The publicly available algorithm is different from the one used in the publication. This would be unfitting for a paper which claims to provide “an accurate open-source library for visual, visual-inertial and multi-map SLAM”.
- The evaluation method used for the RMS ATE is different from the one provided in the repository. Again, that would go against the open-source spirit of the paper, as implied by its title.
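To illustrate why the evaluation method matters, here is a minimal sketch of an RMS ATE computation with and without a scale correction. This is not the repository's evaluation script: it assumes the estimated and ground-truth positions are already associated by timestamp and already rotated into a common frame, and only removes the mean offset (and optionally a least-squares scale). The function name `rms_ate` and the `fit_scale` flag are my own.

```python
import numpy as np


def rms_ate(ground_truth, estimate, fit_scale=False):
    """RMS of the per-pose translation error, in the units of the input.

    Simplifying assumption: both trajectories are N x 3 arrays of matched
    positions expressed in the same orientation, so no rotation is estimated.
    """
    gt = np.asarray(ground_truth, dtype=float)
    est = np.asarray(estimate, dtype=float)

    # Aligning the centroids is the least-squares optimal translation.
    gt_centered = gt - gt.mean(axis=0)
    est_centered = est - est.mean(axis=0)

    scale = 1.0
    if fit_scale:
        # Least-squares scale that maps the estimate onto the ground truth.
        scale = np.sum(gt_centered * est_centered) / np.sum(est_centered ** 2)

    errors = gt_centered - scale * est_centered
    return float(np.sqrt(np.mean(np.sum(errors ** 2, axis=1))))


# A trajectory estimated at half the true scale, as can happen in monocular SLAM.
truth = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
half_scale = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(rms_ate(truth, half_scale))                  # about 0.408
print(rms_ate(truth, half_scale, fit_scale=True))  # 0.0
```

The same trajectory scores very differently depending on whether scale is corrected, so any mismatch in alignment between my evaluation and the authors' could in principle shift the numbers.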
I assume the authors have used some optimal set of parameters for their paper, which they haven’t released. Because the parameter space is large, I will use an evolutionary search to look for a good set (which will hopefully approach the published results).
I follow the steps defined in §2.3 of Eiben’s Introduction to Evolutionary Computing:
- Representation: internally, a list of numbers, each representing a parameter. These will be translated to a YAML settings file that ORB-SLAM3 accepts.
- Evaluation function: ORB-SLAM3’s `evaluation/evaluate_ate_scale.py` after a single run on dataset MH05. This computes the RMS ATE (root mean square absolute trajectory error) in meters. Lower values are better.
- Population: 20. Because each evaluation will take a long time, I’ll first try out how well this all works with a small population size.
- Parent selection mechanism: TBA
- Variation operators (recombination and mutation): TBA
- Survivor selection mechanism: TBA
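A rough sketch of the representation and the initial population, under several assumptions of mine: genes are normalised to [0, 1], only the ORB extractor keys from the repository's settings files are tuned, and the bounds are illustrative guesses rather than values from the paper. The helper names (`random_genome`, `decode`, `to_yaml_fragment`) are hypothetical, and the output is only a fragment that would still have to be merged with the camera calibration and other entries of a full settings file.

```python
import random

# Hypothetical search space: (settings key, lower bound, upper bound, integer?).
# The bounds are placeholders for illustration, not tuned or published values.
SEARCH_SPACE = [
    ("ORBextractor.nFeatures", 500, 3000, True),
    ("ORBextractor.scaleFactor", 1.05, 1.5, False),
    ("ORBextractor.nLevels", 4, 12, True),
    ("ORBextractor.iniThFAST", 10, 40, True),
    ("ORBextractor.minThFAST", 3, 15, True),
]


def random_genome(rng):
    """Draw one genome: a list of floats in [0, 1), one gene per parameter."""
    return [rng.random() for _ in SEARCH_SPACE]


def decode(genome):
    """Map normalised genes onto the parameter ranges of SEARCH_SPACE."""
    params = {}
    for gene, (key, low, high, is_integer) in zip(genome, SEARCH_SPACE):
        value = low + gene * (high - low)
        params[key] = int(round(value)) if is_integer else round(value, 3)
    return params


def to_yaml_fragment(params):
    """Render the parameters as 'key: value' lines for a settings file."""
    return "\n".join(f"{key}: {value}" for key, value in params.items()) + "\n"


rng = random.Random(0)  # fixed seed so runs are repeatable
population = [random_genome(rng) for _ in range(20)]

print(to_yaml_fragment(decode(population[0])))
```

Keeping the genes normalised means the variation operators, once chosen, can treat every parameter the same way and leave range handling to `decode`.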
References
- Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. Martínez Montiel, and Juan Domingo Tardós. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM. CoRR, abs/2007.11898, 2020. https://arxiv.org/abs/2007.11898. DOI: 10.1109/TRO.2021.3075644.