Analyzing ORB-SLAM3's published results against the public repo's defaults

ORB-SLAM3 was released recently. I plan to focus on an ablation study of this system, but first I want to reproduce the published results.

The system as publicly posted online does not produce results comparable to those published in the article. I ran some basic tests on the EuRoC datasets, using the settings files exactly as provided in the repository. The results (RMS ATE in meters) are reported below alongside those from the article. For each task, the third row gives the factor by which the result generated by the public code differs from the one published in the paper (default settings divided by paper).

Task                Result source     MH01    MH02    MH03    MH04    MH05    V101    V102    V103    V201    V202    V203
Monocular           Paper             0.017   0.014   0.031   0.066   0.044   0.033   0.016   0.037   0.021   0.022   -
                    Default settings  1.152   6.040   2.532   0.278   2.080   0.999   10.211  0.773   1.463   2.250   0.952
                    Diff. factor      67.8×   431.4×  81.7×   4.2×    47.3×   30.3×   638.2×  20.9×   70.0×   102.3×  -
Stereo              Paper             0.025   0.022   0.027   0.089   0.058   0.035   0.021   0.049   0.032   0.027   0.361
                    Default settings  0.044   0.026   0.026   0.141   0.044   0.034   0.020   0.046   0.128   0.028   0.264
                    Diff. factor      1.8×    1.2×    1.0×    1.6×    0.8×    1.0×    1.0×    0.9×    4.0×    1.0×    0.7×
Monocular Inertial  Paper             0.032   0.053   0.033   0.099   0.071   0.043   0.016   0.025   0.041   0.015   0.037
                    Default settings  0.036   0.082   0.030   0.151   0.106   0.041   0.016   0.024   0.042   0.016   0.020
                    Diff. factor      1.1×    1.5×    0.9×    1.5×    1.5×    1.0×    1.0×    1.0×    1.0×    1.1×    0.5×
Stereo Inertial     Paper             0.037   0.031   0.026   0.059   0.086   0.037   0.014   0.023   0.037   0.014   0.029
                    Default settings  0.038   0.288   0.027   0.052   0.098   0.036   0.013   0.020   0.040   0.014   0.060
                    Diff. factor      1.0×    9.3×    1.0×    0.9×    1.1×    1.0×    0.9×    0.9×    1.1×    1.0×    2.1×
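
To make the "Diff. factor" rows concrete, here is a small sketch of how they can be recomputed from the table above. The four sequences are just a sample of the monocular rows, and the function name `diff_factor` is my own, not something from the repository.

```python
# RMS ATE in meters, copied from the monocular rows of the table above
# (only a sample of the sequences, to keep the example short).
paper = {"MH01": 0.017, "MH02": 0.014, "MH04": 0.066, "V102": 0.016}
default = {"MH01": 1.152, "MH02": 6.040, "MH04": 0.278, "V102": 10.211}


def diff_factor(measured, published):
    """Ratio of measured to published error; above 1 means the public code did worse."""
    return measured / published


# One factor per sequence, rounded to one decimal as in the table.
factors = {seq: round(diff_factor(default[seq], paper[seq]), 1) for seq in paper}
print(factors)
```

A factor below 1 means the default settings gave a lower error than the paper reports, as happens for several stereo and inertial sequences.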

Some noteworthy data from this table:

There are several possible causes for these discrepancies:

  1. The authors used per-scenario or per-dataset (or both!) settings. This would explain why the monocular scenario performs so badly when run with the default settings, but not why some of the generated errors are lower (that is, better) than reported.
  2. The publicly available algorithm differs from the one used in the publication. This would be unfitting for a paper which claims to provide “an accurate open-source library for visual, visual-inertial and multi-map SLAM”.
  3. The evaluation method used to compute the RMS ATE differs from the one provided in the repository. Again, that would not fit the open-source spirit that the paper's title implies.
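
Regarding the third point, it helps to spell out what an RMS ATE computation usually involves. The sketch below is a common formulation, not necessarily what the authors or the repository's evaluation script do: it assumes the estimated and ground-truth positions are already associated by timestamp, aligns them with a least-squares fit (Umeyama's method), and takes the root mean square of the remaining position errors. The function names `align` and `rms_ate` are mine. Monocular trajectories have no metric scale, so they are typically aligned with a scale factor as well, and that choice alone can change the reported number.

```python
import numpy as np


def align(est, gt, with_scale=False):
    """Least-squares alignment of est onto gt (Umeyama, 1991).

    est, gt: (N, 3) arrays of positions already matched by timestamp.
    Returns scale s, rotation R and translation t so that
    s * R @ est[i] + t approximates gt[i].
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    cov = g.T @ e / len(est)  # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # avoid returning a reflection
    R = U @ S @ Vt
    if with_scale:
        s = np.trace(np.diag(D) @ S) / e.var(axis=0).sum()
    else:
        s = 1.0
    t = mu_g - s * R @ mu_e
    return s, R, t


def rms_ate(est, gt, with_scale=False):
    """Root mean square of the position errors after alignment."""
    s, R, t = align(est, gt, with_scale)
    aligned = s * est @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))


# Synthetic example: a noisy, rotated and shifted copy of a random trajectory.
rng = np.random.default_rng(0)
gt_demo = rng.normal(size=(100, 3))
angle = 0.3
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
est_demo = gt_demo @ Rz.T + np.array([1.0, -2.0, 3.0]) + rng.normal(scale=0.01, size=(100, 3))
print(rms_ate(est_demo, gt_demo))
```

Differences in timestamp association, in whether scale is estimated, or in which frames are included would all shift the result, so two correct-looking evaluation scripts need not agree.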

I assume the authors used a tuned set of parameters for their paper that they have not released. Because the parameter space is large, I will use an evolutionary search to find a good set, which will hopefully approach the published results.
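
As a rough illustration of what such a search looks like, here is a minimal generic evolutionary loop with uniform crossover, Gaussian mutation and elitist survivor selection. It is only a sketch under assumptions: the parameter names are loosely modeled on ORB extractor settings, the bounds are illustrative guesses, and `toy_fitness` is a synthetic stand-in. A real fitness function would run the SLAM system with the candidate settings and return the RMS ATE.

```python
import random

# Hypothetical search space; the bounds are illustrative guesses,
# not values taken from the paper or the repository.
BOUNDS = {
    "nFeatures": (500.0, 3000.0),
    "scaleFactor": (1.05, 1.5),
    "iniThFAST": (5.0, 40.0),
}


def random_individual(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    """Uniform crossover: each parameter is taken from either parent."""
    return {k: (a if rng.random() < 0.5 else b)[k] for k in BOUNDS}


def mutate(ind, rng, sigma=0.1):
    """Gaussian perturbation scaled to each parameter's range, clipped to bounds."""
    child = {}
    for k, (lo, hi) in BOUNDS.items():
        value = ind[k] + rng.gauss(0.0, sigma * (hi - lo))
        child[k] = min(hi, max(lo, value))
    return child


def evolve(fitness, pop_size=20, generations=30, seed=0):
    """Minimize fitness; returns (best_fitness, best_parameters)."""
    rng = random.Random(seed)
    scored = [(fitness(p), p) for p in (random_individual(rng) for _ in range(pop_size))]
    scored.sort(key=lambda fp: fp[0])
    for _ in range(generations):
        parents = [p for _, p in scored]
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        scored += [(fitness(c), c) for c in children]
        scored.sort(key=lambda fp: fp[0])
        scored = scored[:pop_size]  # survivor selection: keep the best
    return scored[0]


def toy_fitness(params):
    """Synthetic stand-in objective with an arbitrary optimum.

    A real objective would run the system on a sequence with these
    settings and return the RMS ATE in meters.
    """
    target = {"nFeatures": 1500.0, "scaleFactor": 1.2, "iniThFAST": 20.0}
    return sum(((params[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2 for k in BOUNDS)


best_error, best_params = evolve(toy_fitness)
print(best_error, best_params)
```

Since each real evaluation means a full run over a sequence, the population size and number of generations would have to be chosen with the available compute budget in mind, and integer-valued settings would need rounding before use.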

I follow the steps defined in §2.3 of Eiben’s Introduction to Evolutionary Computing:

References

  • Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. Martínez Montiel, and Juan Domingo Tardós. ORB-SLAM3: an accurate open-source library for visual, visual-inertial and multi-map SLAM. CoRR, abs/2007.11898, 2020.
    
    @article{campos2020orbslam3,
      author = {Carlos Campos and
                   Richard Elvira and
                   Juan J. Gómez Rodríguez and
                   José M. Martínez Montiel and
                   Juan Domingo Tardós},
      title = { {ORB-SLAM3:} An Accurate Open-Source Library for Visual, Visual-Inertial
                   and Multi-Map {SLAM}},
      journal = {CoRR},
      volume = {abs/2007.11898},
      year = {2020},
      url = {https://arxiv.org/abs/2007.11898},
      archiveprefix = {arXiv},
      eprint = {2007.11898},
      timestamp = {Wed, 29 Jul 2020 15:36:39 +0200},
      biburl = {https://dblp.org/rec/journals/corr/abs-2007-11898.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org},
      doi = {10.1109/TRO.2021.3075644}
    }