Analyzing ORB-SLAM3's published results against the public repo's defaults

6 May 2021

thesis

ORB-SLAM3 was released recently. I will focus on an ablation study of this system, but first I need to reproduce the published results.

The system as publicly released does not reproduce the results published in the article, most notably in the monocular case. I ran some basic tests on the EuRoC datasets, using the settings files exactly as provided in the repository. The results (RMS ATE in meters) are reported below, alongside those from the article. Every third row gives the factor by which the result generated by the public code differs from the published one (default-settings value divided by paper value).

Task	Result source	MH01	MH02	MH03	MH04	MH05	V101	V102	V103	V201	V202	V203
Monocular	Paper	0.017	0.014	0.031	0.066	0.044	0.033	0.016	0.037	0.021	0.022	-
	Default settings	1.152	6.040	2.532	0.278	2.080	0.999	10.211	0.773	1.463	2.250	0.952
	Diff. factor	67.8×	431.4×	81.7×	4.2×	47.3×	30.3×	638.2×	20.9×	70.0×	102.3×	-
Stereo	Paper	0.025	0.022	0.027	0.089	0.058	0.035	0.021	0.049	0.032	0.027	0.361
	Default settings	0.044	0.026	0.026	0.141	0.044	0.034	0.020	0.046	0.128	0.028	0.264
	Diff. factor	1.8×	1.2×	1.0×	1.6×	0.8×	1.0×	1.0×	0.9×	4.0×	1.0×	0.7×
Monocular Inertial	Paper	0.032	0.053	0.033	0.099	0.071	0.043	0.016	0.025	0.041	0.015	0.037
	Default settings	0.036	0.082	0.030	0.151	0.106	0.041	0.016	0.024	0.042	0.016	0.020
	Diff. factor	1.1×	1.5×	0.9×	1.5×	1.5×	1.0×	1.0×	1.0×	1.0×	1.1×	0.5×
Stereo Inertial	Paper	0.037	0.031	0.026	0.059	0.086	0.037	0.014	0.023	0.037	0.014	0.029
	Default settings	0.038	0.288	0.027	0.052	0.098	0.036	0.013	0.020	0.040	0.014	0.060
	Diff. factor	1.0×	9.3×	1.0×	0.9×	1.1×	1.0×	0.9×	0.9×	1.1×	1.0×	2.1×
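As a quick check on how the "Diff. factor" rows are obtained, here is a minimal sketch that recomputes three monocular cells from the table. The function name `diff_factor` and the use of `None` for the missing V203 paper value are illustrative choices, not part of ORB-SLAM3 or its evaluation scripts.

```python
def diff_factor(default_ate, paper_ate):
    """Ratio of the default-settings RMS ATE to the published RMS ATE.

    Returns None when the paper reports no value (shown as "-" in the table).
    """
    if paper_ate is None:
        return None
    return round(default_ate / paper_ate, 1)


# Monocular RMS ATE values (meters) copied from the table above.
paper = {"MH04": 0.066, "V102": 0.016, "V203": None}
default = {"MH04": 0.278, "V102": 10.211, "V203": 0.952}

factors = {seq: diff_factor(default[seq], paper[seq]) for seq in paper}

for seq, factor in factors.items():
    print(seq, "-" if factor is None else f"{factor}x")
```

A factor above 1 means the public code with default settings performs worse than the paper reports; a factor below 1 means it performs better.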

Some noteworthy data from this table:

The generated results for the monocular scenario differ strongly from those published, with errors larger by a factor of between 4.2 and 638.2! The generated monocular RMS ATE scores are also far higher than in all other scenarios. While some difference between the monocular scenario and, for example, the stereo or monocular-inertial scenario is to be expected, these inter-scenario differences are much bigger than reported.
For the other three scenarios, most generated scores lie within a factor of 0.5 to 2.0 of the published value. The exceptions are stereo V201 (4.0×), stereo-inertial MH02 (9.3×) and, marginally, stereo-inertial V203 (2.1×).
Also noteworthy is that the published results are not always the best results. In several cases the default settings yield a lower error than the paper, for example monocular-inertial V203 (0.020 m versus 0.037 m).

Several causes could explain these discrepancies:

The authors have used per-scenario or per-dataset (or both!) settings. This would explain why the monocular scenario performs so badly when run with the default settings, but not why some of the generated scores are lower than reported.
The publicly available algorithm is different from the one used in the publication. This would be unfitting for a paper which claims to provide “an accurate open-source library for visual, visual-inertial and multi-map SLAM”.
The evaluation method used for the RMS ATE is different from the one provided. Again, that would be at odds with the open-source spirit of the paper, as implied by its title.
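To make the last point concrete, here is a minimal sketch of the metric itself: the root mean square of the per-pose position errors. It assumes the estimated and ground-truth trajectories are already time-associated and aligned (including scale, which matters in the monocular case); that alignment step is exactly where two evaluation methods can differ, and it is left out here. The function name and the toy data are made up for illustration.

```python
import math


def rms_ate(estimated, ground_truth):
    """RMS of the Euclidean position errors between paired 3-D positions (meters).

    Assumes both trajectories are already time-associated and aligned.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy trajectories (made-up numbers): only the middle pose is off, by 0.5 m.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 0.0, 0.0)]
error = rms_ate(est, gt)
print(f"RMS ATE: {error:.3f} m")
```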

I assume the authors have used some optimal set of parameters for their paper, which they haven’t released. Because the parameter space is large, I will use an evolutionary search to find a good set (which will hopefully approach the published results).

I follow the steps defined in §2.3 of Eiben’s Introduction to Evolutionary Computing:

Representation: internally, a list of numbers, each representing one parameter. This list will be translated into a YAML settings file that ORB-SLAM3 accepts.
Evaluation function: ORB-SLAM3’s evaluation/evaluate_ate_scale.py after a single run on dataset MH05. This computes the RMS ATE (root mean square absolute trajectory error) in meters. Lower values are better.
Population: 20. Because evaluation will take a long time, I’ll first try out how well this all works with a small population size.
Parent selection mechanism: TBA
Variation operators (recombination and mutation): TBA
Survivor selection mechanism: TBA
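The representation and population steps above could be sketched roughly as follows. The `ORBextractor.*` keys follow the naming used in the repository's settings files, but which parameters to tune, their bounds, and the names `SEARCH_SPACE`, `random_genome` and `genome_to_yaml` are all assumptions made for illustration. The selection and variation operators are still TBA and are not shown.

```python
import random

# Hypothetical search space: (settings key, lower bound, upper bound, is_integer).
# The choice of parameters and the bounds are illustrative assumptions.
SEARCH_SPACE = [
    ("ORBextractor.nFeatures", 500, 3000, True),
    ("ORBextractor.scaleFactor", 1.05, 1.5, False),
    ("ORBextractor.nLevels", 4, 12, True),
    ("ORBextractor.iniThFAST", 15, 40, True),
    ("ORBextractor.minThFAST", 3, 14, True),
]


def random_genome(rng):
    """Draw one individual: a list of numbers, one per tuned parameter."""
    return [
        rng.randint(low, high) if is_int else rng.uniform(low, high)
        for _, low, high, is_int in SEARCH_SPACE
    ]


def genome_to_yaml(genome):
    """Render the tuned parameters as YAML text.

    Only the tuned keys are written. A usable settings file also needs the
    camera (and IMU) calibration entries, which are omitted in this sketch.
    """
    lines = ["%YAML:1.0", ""]
    for (key, _, _, is_int), value in zip(SEARCH_SPACE, genome):
        lines.append(f"{key}: {value}" if is_int else f"{key}: {value:.3f}")
    return "\n".join(lines) + "\n"


rng = random.Random(0)  # fixed seed so the initial population is repeatable
population = [random_genome(rng) for _ in range(20)]
yaml_text = genome_to_yaml(population[0])
print(yaml_text)
```

Keeping the genome as a flat list of numbers leaves the variation operators free to treat every parameter uniformly; the bounds and integer flags live only in the search-space table.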

References

Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. Martínez Montiel, and Juan Domingo Tardós. ORB-SLAM3: an accurate open-source library for visual, visual-inertial and multi-map SLAM. CoRR, abs/2007.11898, 2020.


@article{campos2020orbslam3,
  author = {Carlos Campos and
               Richard Elvira and
               Juan J. Gómez Rodríguez and
               José M. Martínez Montiel and
               Juan Domingo Tardós},
  title = { {ORB-SLAM3:} An Accurate Open-Source Library for Visual, Visual-Inertial
               and Multi-Map {SLAM}},
  journal = {CoRR},
  volume = {abs/2007.11898},
  year = {2020},
  url = {https://arxiv.org/abs/2007.11898},
  archiveprefix = {arXiv},
  eprint = {2007.11898},
  timestamp = {Wed, 29 Jul 2020 15:36:39 +0200},
  biburl = {https://dblp.org/rec/journals/corr/abs-2007-11898.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org},
  doi = {10.1109/TRO.2021.3075644}
}