ORB-SLAM datasets review

Together with Arnoud, I decided on a clearer, more up-to-date topic for my thesis. This post also contains an investigation of the datasets available for this topic.

Motivation for pre-made datasets

During last week’s meeting with Arnoud, we decided to renew the thesis topic. I will investigate merging Visual-Inertial ORB-SLAM into ORB-SLAM2. One of the interesting things to see here is whether the accuracy of ORB-SLAM2 increases as a result. To test my models, I want to use a pre-made dataset. This is preferable for multiple reasons:

  1. Setting up synchronized sensors can be time-consuming.
  2. Extrinsic calibration is error-prone and has to be done precisely.
  3. Designing a path through a large state space that provides a robust test for the system is hard, and can be publication-worthy on its own (see links below).
  4. In simulated, physics-engine-driven experiments, many real-world noise sources (for example motion blur and IMU noise) are either not implemented at all or only in a very simplified way.
  5. All of the above only distracts from the topic I am actually trying to investigate.

Overview of used datasets

Mur-Artal and Tardós used several datasets across their series of papers. Below is an investigation of how usable those datasets are for my setting.

Datasets per paper:

  • ORB-SLAM (Mur-Artal et al., 2015): NewCollege, TUM RGB-D, KITTI
  • Visual-Inertial ORB-SLAM (Mur-Artal and Tardós, 2017): EuRoC
  • ORB-SLAM2 (Mur-Artal and Tardós, 2017): KITTI, EuRoC, TUM RGB-D

Technical review of datasets

The requirements of a dataset for my experiments, and how each candidate meets them, are summarized in the columns of the table below:

Dataset | Optical sensors | Optical frame rate | Image resolution | Shutter type | Inertial sensor | Inertial frame rate | Calibrated system | Ground-truth provider | GT: frame rate | GT: position? | GT: orientation? | GT: lin. vel.? | GT: lin. acc.? | GT: ang. vel.? | GT: error models?
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
NewCollege | Point Grey Bumblebee (stereo, grey) and LadyBug 2 (spherical, RGB) | 20 Hz, 3 Hz | 512×384 px | ? | N/A | N/A | y | GPS? | 5 Hz | y | n | ? | ? | ? | n
TUM RGB-D | Microsoft Kinect (RGB-D) | 30 Hz | 640×480 px | ? | Microsoft Kinect (3D acceleration only) | ? | y | “high-accuracy motion-capture system” | 100 Hz | y | ? | ? | ? | ? | ?
KITTI | 2× Point Grey Flea 2 FL2-14S3M-C (grey), 2× Flea 2 FL2-14S3C-C (color) | 15 Hz, 15 Hz | 1384×1032 px, 1384×1032 px | global, global | OXTS RT 3003 (GPS/IMU) | 100 Hz | y | GPS? | 100 Hz? | y | ? | ? | ? | ? | ?
EuRoC | 2× Aptina MT9V034 (grey) | 2× 20 Hz | 752×480 px | global | ADIS16448 (MEMS IMU, acceleration and angular rate) | 200 Hz | y | Vicon motion capture, Leica MS50 | ? | y | y | ? | ? | ? | ?
Own setup | Intel RealSense D435i | RGB: 30 Hz, D: 90 Hz | RGB: 1920×1080 px, D: 1280×720 px | global | Intel RealSense D435i | ? | ? | ABB IRB-4600-60/2.05 | ? | y | y | y? | y? | ? | ?
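
To make the EuRoC row concrete (and to fill in the “?” under GT frame rate), the sample rates can simply be measured from the timestamps. The following is only a minimal sketch in Python/NumPy, assuming a sequence is unpacked in the standard ASL folder layout (mav0/cam0/data.csv, mav0/imu0/data.csv, mav0/state_groundtruth_estimate0/data.csv); the sequence name MH_01_easy is just an example:

    import numpy as np

    def mean_rate_hz(csv_path):
        # The ASL data.csv files start with a '#'-prefixed header line and carry
        # nanosecond timestamps in the first column.
        t_ns = np.loadtxt(csv_path, delimiter=",", usecols=0)
        return 1e9 / np.mean(np.diff(t_ns))

    seq = "MH_01_easy/mav0"
    print("cam0 rate [Hz]:", mean_rate_hz(seq + "/cam0/data.csv"))   # table says 20 Hz per camera
    print("imu0 rate [Hz]:", mean_rate_hz(seq + "/imu0/data.csv"))   # table says 200 Hz
    print("GT rate   [Hz]:", mean_rate_hz(seq + "/state_groundtruth_estimate0/data.csv"))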

One interesting conclusion from all this data is that only EuRoC fully meets my requirements: it provides camera images, IMU data and ground-truth information. However:

  1. Is one dataset with two scenes, each with multiple runs, enough for this thesis? Arnoud: “Only if you can show your method is better than the authors’.”
  2. Is it valid to treat GPS data as ground-truth information? If so, KITTI is also of interest. Arnoud: “No, rather use OptiTrack.”
  3. Can I do without angular velocity? If so, TUM RGB-D is also of interest (the evaluation sketch below only needs ground-truth positions).
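
For completeness, this is roughly how the ground truth would be used in the evaluation: the absolute trajectory error (ATE) of Sturm et al., i.e. the RMSE of the position residuals after aligning the estimated trajectory to the ground truth. The sketch below is a minimal Python/NumPy version; it assumes the two trajectories have already been time-associated into N×3 position arrays and uses a rigid (Umeyama/Horn-style) least-squares alignment:

    import numpy as np

    def align_rigid(gt, est):
        # Least-squares rigid alignment (rotation R, translation t) mapping est
        # onto gt; both are (N, 3) arrays of time-associated positions.
        mu_gt, mu_est = gt.mean(axis=0), est.mean(axis=0)
        cov = (gt - mu_gt).T @ (est - mu_est)
        U, _, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:
            S[2, 2] = -1.0  # avoid a reflection
        R = U @ S @ Vt
        t = mu_gt - R @ mu_est
        return R, t

    def ate_rmse(gt, est):
        # Absolute trajectory error: RMSE of position residuals after alignment.
        R, t = align_rigid(gt, est)
        residuals = gt - (est @ R.T + t)
        return np.sqrt(np.mean(np.sum(residuals ** 2, axis=1)))

Note that only ground-truth positions enter this metric; the other ground-truth channels in the table would mainly matter for evaluating the inertial part.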

Arnoud suggests looking at the Intel RealSense D435i, which will soon be released. It is an RGB-D sensor with integrated IMU. What I can see from the API is that it will most probably have no issues with time synchronization, but I haven’t found whether there is any extrinsic calibration between camera(s) and IMU. As a ground truth provider of pose, I could use the ABB robotic arm of the HvA.
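
Once the sensor is available, both open points (time synchronization and camera-IMU extrinsics) could be checked directly through the SDK. The following is only a sketch with pyrealsense2; it assumes the D435i will expose its accelerometer and gyroscope as regular motion streams, in which case get_extrinsics_to should reveal whether a factory camera-to-IMU calibration is stored on the device:

    import pyrealsense2 as rs

    # Enable a colour stream plus the IMU streams; if these streams can be
    # started at all, the accelerometer and gyroscope are exposed by the SDK.
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
    config.enable_stream(rs.stream.accel)
    config.enable_stream(rs.stream.gyro)
    profile = pipeline.start(config)

    # If a camera-to-IMU extrinsic calibration is stored on the device,
    # get_extrinsics_to returns its rotation (9 floats, row-major) and
    # translation (in metres) between the two stream frames.
    color = profile.get_stream(rs.stream.color)
    accel = profile.get_stream(rs.stream.accel)
    extrinsics = color.get_extrinsics_to(accel)
    print("R:", extrinsics.rotation)
    print("t:", extrinsics.translation)

    # Timestamps: the timestamp domain tells whether samples are stamped by the
    # device's hardware clock or by the host, which answers the question about
    # time synchronization between camera and IMU.
    frames = pipeline.wait_for_frames()
    for f in frames:
        print(f.get_profile().stream_type(), f.get_timestamp(),
              f.get_frame_timestamp_domain())

    pipeline.stop()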

Updated with Arnoud’s answers on 2018-11-19

References

  • Mike Smith, Ian Baldwin, Winston Churchill, Rohan Paul, and Paul Newman. The new college vision and laser data set. The International Journal of Robotics Research, 28(5):595–599, May 2009. [ bib | DOI ]
    
    @article{smith2009newcollege,
      title = {The new college vision and laser data set},
      author = {Smith, Mike and Baldwin, Ian and Churchill, Winston and Paul, Rohan and Newman, Paul},
      journal = {The International Journal of Robotics Research},
      volume = {28},
      number = {5},
      pages = {595--599},
      issn = {0278-3649},
      year = {2009},
      month = may,
      publisher = {SAGE Publications, Inc.},
      doi = {10.1177/0278364909103911},
      url = {http://www.robots.ox.ac.uk/NewCollegeData/}
    }
    
    
  • Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard, and Daniel Cremers. A benchmark for the evaluation of RGB-D SLAM systems. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 573–580, October 2012. [ bib | DOI ]
    
    @inproceedings{sturm2012benchmark,
      author = {Sturm, Jürgen and Engelhard, Nikolas and Endres, Felix and Burgard, Wolfram and Cremers, Daniel},
      booktitle = {2012 IEEE/RSJ International Conference on Intelligent Robots and Systems},
      title = {A benchmark for the evaluation of {RGB-D} {SLAM} systems},
      year = {2012},
      pages = {573-580},
      keywords = {cameras;distance measurement;image colour analysis;image resolution;image sequences;object tracking;pose estimation;SLAM (robots);RGB-D SLAM systems;image sequences;Microsoft Kinect;ground truth camera poses;motion capture system;color image;depth image;full sensor resolution;video frame rate;ground-truth trajectory;motion-capture system;high-speed tracking cameras;office environment;industrial hall;camera motions;slow motions;loop closures;handheld Kinect;unconstrained 6DOF motions;Pioneer 3 robot;cluttered indoor environment;automatic evaluation tools;visual odometry systems;global pose error;Cameras;Simultaneous localization and mapping;Calibration;Trajectory;Visualization},
      doi = {10.1109/IROS.2012.6385773},
      issn = {2153-0866},
      month = oct
    }
    
    
  • Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research, 32(11):1231–1237, 2013. [ bib | DOI ]
    
    @article{geiger2013kitti,
      author = {Geiger, Andreas and Lenz, Philip and Stiller, Christoph and Urtasun, Raquel},
      title = {Vision meets robotics: The {KITTI} dataset},
      journal = {The International Journal of Robotics Research},
      volume = {32},
      number = {11},
      pages = {1231-1237},
      year = {2013},
      doi = {10.1177/0278364913491297},
      abstract = { We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10–100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse, capturing real-world traffic situations, and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets, and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide. }
    }
    
    
  • Raúl Mur-Artal, José M. Martínez Montiel, and Juan Domingo Tardós. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5):1147–1163, October 2015. [ bib | DOI ]
    
    @article{murartal2015orbslam,
      author = {Mur-Artal, Raúl and Montiel, José M. Martínez and Tardós, Juan Domingo},
      journal = {IEEE Transactions on Robotics},
      title = { {ORB-SLAM}: A Versatile and Accurate Monocular {SLAM} System},
      year = {2015},
      month = oct,
      volume = {31},
      number = {5},
      pages = {1147-1163},
      keywords = {SLAM (robots);ORB-SLAM system;feature-based monocular simultaneous localization and mapping system;survival of the fittest strategy;Simultaneous localization and mapping;Cameras;Optimization;Feature extraction;Visualization;Real-time systems;Computational modeling;Lifelong mapping;localization;monocular vision;recognition;simultaneous localization and mapping (SLAM);Lifelong mapping;localization;monocular vision;recognition;simultaneous localization and mapping (SLAM)},
      doi = {10.1109/TRO.2015.2463671},
      issn = {1552-3098}
    }
    
    
  • Michael Burri, Janosch Nikolic, Pascal Gohl, Thomas Schneider, Joern Rehder, Sammy Omari, Markus W Achtelik, and Roland Siegwart. The EuRoC micro aerial vehicle datasets. The International Journal of Robotics Research, 35(10):1157–1163, January 2016. [ bib | DOI ]
    
    @article{burri2016euroc,
      author = {Michael Burri and Janosch Nikolic and Pascal Gohl and Thomas Schneider and Joern Rehder and Sammy Omari and Markus W Achtelik and Roland Siegwart},
      title = {The {EuRoC} micro aerial vehicle datasets},
      journal = {The International Journal of Robotics Research},
      volume = {35},
      number = {10},
      pages = {1157-1163},
      year = {2016},
      month = jan,
      doi = {10.1177/0278364915620033},
      abstract = { This paper presents visual-inertial datasets collected on-board a micro aerial vehicle. The datasets contain synchronized stereo images, IMU measurements and accurate ground truth. The first batch of datasets facilitates the design and evaluation of visual-inertial localization algorithms on real flight data. It was collected in an industrial environment and contains millimeter accurate position ground truth from a laser tracking system. The second batch of datasets is aimed at precise 3D environment reconstruction and was recorded in a room equipped with a motion capture system. The datasets contain 6D pose ground truth and a detailed 3D scan of the environment. Eleven datasets are provided in total, ranging from slow flights under good visual conditions to dynamic flights with motion blur and poor illumination, enabling researchers to thoroughly test and evaluate their algorithms. All datasets contain raw sensor measurements, spatio-temporally aligned sensor data and ground truth, extrinsic and intrinsic calibrations and datasets for custom calibrations. }
    }
    
    
  • Raúl Mur-Artal and Juan Domingo Tardós. Visual-inertial monocular SLAM with map reuse. IEEE Robotics and Automation Letters, 2(2):796–803, April 2017. [ bib | DOI ]
    
    @article{murartal2017visual,
      author = {Mur-Artal, Raúl and Tardós, Juan Domingo},
      journal = {IEEE Robotics and Automation Letters},
      title = {Visual-Inertial Monocular {SLAM} With Map Reuse},
      year = {2017},
      month = apr,
      volume = {2},
      number = {2},
      pages = {796-803},
      keywords = {cameras;robot vision;SLAM (robots);visual-inertial monocular SLAM;map reuse;visual-inertial odometry;sensor incremental motion;trajectory estimation;visual-inertial simultaneous localization;zero-drift localization;camera configuration;IMU initialization method;gyroscope;accelerometer;micro-aerial vehicle public dataset;monocular camera;Optimization;Cameras;Gravity;Accelerometers;Simultaneous localization and mapping;Tracking loops;Sensor fusion;SLAM;visual-based navigation},
      doi = {10.1109/LRA.2017.2653359},
      issn = {2377-3766}
    }
    
    
  • Raúl Mur-Artal and Juan Domingo Tardós. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 33(5):1255–1262, October 2017. [ bib | DOI ]
    
    @article{murartal2017orbslam2,
      title = { {ORB-SLAM2}: An Open-Source {SLAM} System for Monocular, Stereo, and {RGB-D} Cameras},
      author = {Mur-Artal, Raúl and Tardós, Juan Domingo},
      journal = {IEEE Transactions on Robotics},
      year = {2017},
      month = oct,
      volume = {33},
      number = {5},
      pages = {1255-1262},
      keywords = {cameras;distance measurement;Kalman filters;mobile robots;motion estimation;path planning;robot vision;SLAM (robots);ORB-SLAM;open-source SLAM system;lightweight localization mode;map points;zero-drift localization;SLAM community;monocular cameras;stereo cameras;simultaneous localization and mapping system;RGB-D cameras;Simultaneous localization and mapping;Cameras;Optimization;Feature extraction;Tracking loops;Trajectory;Localization;mapping;RGB-D;simultaneous localization and mapping (SLAM);stereo},
      doi = {10.1109/TRO.2017.2705103},
      issn = {1552-3098}
    }