Decided to focus on a simulated implementation for my thesis. Main reason: could not obtain extrinsic camera-IMU calibration.

Reason for switch to simulation

Toby and I decided that, for now, I will focus on implementing the system in a simulator. The main reason is that I could not obtain an extrinsic camera-IMU calibration. These are the things I have tried:

  1. I tried Furgale et al.'s Kalibr, but it did not work (see this and this post; setting the exposure to 400 microseconds did not resolve the issue).
  2. The implementation of Lobo and Dias's method is named InerVis, but its download section is down.
  3. When I contacted Kelly and Sukhatme, they advised me to use Kalibr instead, as their own toolkit is slow.
  4. Mirzaei and Roumeliotis's and Hol's implementations cannot be shared, as their suites, or parts thereof, are under commercial licenses. Hol told me that, by following his dissertation, it should not be too hard to implement his Kalman-filter-based approach myself.
  5. Toby and I decided that I have spent enough time on this issue and should move on.

Simulator choice

There are basically two options: Gazebo and USARsim.

Pros of Gazebo vs. USARsim:

Pros of USARsim vs. Gazebo:

Cons of simulating vs. the real world:

I will try to talk with Arnoud tomorrow to make a decision on which simulator to use.

Motion blur model

Motion blur will be simulated in the following way. Let \(F_t\) be the camera frame at the current time \(t\). The blurred frame \(F^B_t\) is obtained as the linearly weighted combination of the previously blurred frame \(F^B_{t-1}\) and \(F_t\): \(F^B_t = (1 - w) F^B_{t-1} + w F_t\). For the initial blurred image \(F^B_0\), take \(w = 1\); equivalently, take \(F^B_{t-1} = F_0\) for \(t \leq 0\).
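
As a quick sanity check of this model, here is a minimal Python/NumPy sketch of the recursion; the random frames and the value of \(w\) in the usage example are illustrative assumptions, not simulator output:

    import numpy as np

    def blur_frames(frames, w):
        """Apply the recursive blur F^B_t = (1 - w) F^B_{t-1} + w F_t."""
        blurred = []
        prev = None
        for frame in frames:
            frame = np.asarray(frame, dtype=np.float64)
            if prev is None:
                prev = frame  # initial frame: take w = 1, so F^B_0 = F_0
            else:
                prev = (1.0 - w) * prev + w * frame
            blurred.append(prev)
        return blurred

    # Illustrative usage with random frames standing in for simulator output.
    frames = [np.random.rand(48, 64) for _ in range(10)]
    blurred = blur_frames(frames, w=0.3)  # smaller w, longer simulated exposure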

By expanding this recursive expression once for an arbitrary time \(t\), we get \(F^B_t = \left(1 - w\right) \left( \left(1 - w\right) F^B_{t-2} + w F_{t-1}\right) + w F_t = (1 - w)^2 F^B_{t-2} + w(1 - w) F_{t-1} + w F_t\). Does this imply some stronger or deeper connection? Didn't Sutton and Barto have something to say about this in their chapter on TD(λ)?
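
Unrolling the recursion all the way down (with \(F^B_0 = F_0\)) gives \(F^B_t = w \sum_{k=0}^{t-1} (1 - w)^k F_{t-k} + (1 - w)^t F_0\), i.e. a geometrically decaying average over all past frames. That exponentially weighted averaging looks a lot like the way the λ-return in Sutton and Barto's TD(λ) chapter weighs the n-step returns, which is probably the connection hinted at above.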

Bigger values of the weight \(w\) reduce the blur, which corresponds to a shorter exposure time in a real camera; smaller values of \(w\) increase the blur, as if the exposure time were longer. The difference with real motion blur is that this model only samples the motion at discrete frame times, so the blur is built from a discrete path rather than a continuous one. A higher framerate therefore makes the model more realistic.

More complex and realistic models exist. Some need the 3D structure of the observed scene (Potmesil and Chakravarty), while others are designed with stop-motion animations in mind (Brostow and Essa). For a full overview, see Navarro et al.

These models have not been implemented due to time constraints; the proposed model seems sufficient for the current purpose.

Ideas about experiments

The GP-BayesFilter framework, and especially GPBF-Learn, is designed as an alternative to GP Latent Variable Models (GPLVMs), one that exploits the information provided by a system's control vector (the input that drives state changes).

To see whether the control vector in the localisation problem of AR is informative, we can compare the error of GPBF-Learn and GPLVM on the same Kalman filter \(KF^\mathbf{u}\) and data set \(D\). This "difference in errors" \(\mathbf{e}_\mathbf{u} = \mathbf{e}^\text{GPBF-Learn}_\text{GPLVM}(KF^\mathbf{u})\) (which still needs to be defined more precisely) should be compared with the same quantity for a similar Kalman filter that does not use any control vector, \(KF^\emptyset\), on the same data set \(D\): \(\mathbf{e}_\emptyset = \mathbf{e}^\text{GPBF-Learn}_\text{GPLVM}(KF^\emptyset)\). Comparing these two "differences in errors" can lead us to several conclusions (a sketch of the comparison follows the list below):

  1. If \(\mathbf{e}_\mathbf{u} < \mathbf{e}_\emptyset\), the control vector is informative.
  2. If \(\mathbf{e}_\mathbf{u} \approx \mathbf{e}_\emptyset\), the control vector is not informative.
  3. If \(\mathbf{e}_\mathbf{u} > \mathbf{e}_\emptyset\), the control vector contains misleading information.
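
A minimal Python sketch of this comparison, assuming for now that the "difference in errors" is a plain subtraction of tracking errors; the helpers train_gpbf_learn, train_gplvm and tracking_error are hypothetical placeholders for the actual training and evaluation code:

    def error_difference(kf, dataset, train_gpbf_learn, train_gplvm, tracking_error):
        """Return e = error(GPBF-Learn) - error(GPLVM) for one Kalman filter setup."""
        gpbf = train_gpbf_learn(kf, dataset)  # may use the control vector of kf
        gplvm = train_gplvm(dataset)          # never uses a control vector
        return tracking_error(gpbf, dataset) - tracking_error(gplvm, dataset)

    def control_vector_verdict(e_u, e_empty, tol=1e-3):
        """Map the two error differences onto the three conclusions above."""
        if abs(e_u - e_empty) <= tol:
            return "the control vector is not informative"
        return ("the control vector is informative" if e_u < e_empty
                else "the control vector contains misleading information")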

Bleser and Stricker define several Kalman filters for the localisation problem described earlier. For these experiments, their first/third? and fourth models will be used. The former does not use any control vector, whereas the latter uses the accelerometer readings as the control vector. Are these models comparable?

References

  • Michael Potmesil and Indranil Chakravarty. Modeling motion blur in computer-generated images. In Proceedings of the 10th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '83, pages 389–399, New York, NY, USA, 1983. ACM. [ bib | DOI ]
    
    @inproceedings{potmesil1983modeling,
      author = {Potmesil, Michael and Chakravarty, Indranil},
      title = {Modeling Motion Blur in Computer-generated Images},
      booktitle = {Proceedings of the 10th Annual Conference on Computer Graphics and Interactive Techniques},
      series = { {SIGGRAPH} '83},
      year = {1983},
      isbn = {0-89791-109-1},
      location = {Detroit, Michigan, USA},
      pages = {389--399},
      numpages = {11},
      doi = {10.1145/800059.801169},
      acmid = {801169},
      publisher = {ACM},
      address = {New York, NY, USA},
      keywords = {Camera model, Digital optics, Image restoration, Motion blur, Point-spread function}
    }
    
    
  • Gabriel J. Brostow and Irfan Essa. Image-based motion blur for stop motion animation. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '01, pages 561–566, New York, NY, USA, 2001. ACM. [ bib | DOI ]
    
    @inproceedings{brostow2001image,
      author = {Brostow, Gabriel J. and Essa, Irfan},
      title = {Image-based Motion Blur for Stop Motion Animation},
      booktitle = {Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques},
      series = {SIGGRAPH '01},
      year = {2001},
      isbn = {1-58113-374-X},
      pages = {561--566},
      numpages = {6},
      doi = {10.1145/383259.383325},
      acmid = {383325},
      publisher = {ACM},
      address = {New York, NY, USA},
      keywords = {animation, computer vision, image-based rendering, motion blur, stop motion animation, temporal antialiasing, video post-processing}
    }
    
    
  • Jorge Lobo and Jorge Dias. Relative pose calibration between visual and inertial sensors. The International Journal of Robotics Research, 26(6):561–575, June 2007. [ bib | .pdf | DOI ]
    
    @article{lobo2007relative,
      title = {Relative pose calibration between visual and inertial sensors},
      author = {Lobo, Jorge and Dias, Jorge},
      journal = {The International Journal of Robotics Research},
      volume = {26},
      number = {6},
      pages = {561--575},
      month = jun,
      year = {2007},
      publisher = {SAGE Publications},
      url = {http://ijr.sagepub.com/content/26/6/561.full.pdf},
      doi = {10.1177/0278364907079276}
    }
    
    
  • Faraz M. Mirzaei and Stergios I. Roumeliotis. A Kalman filter-based algorithm for IMU-camera calibration: Observability analysis and performance evaluation. IEEE Transactions on Robotics, 24(5):1143–1156, October 2008. [ bib | DOI ]
    
    @article{mirzaei2008kfbased,
      author = {Mirzaei, Faraz M. and Roumeliotis, Stergios I.},
      journal = {Robotics, IEEE Transactions on},
      title = {A Kalman Filter-Based Algorithm for IMU-Camera Calibration: Observability Analysis and Performance Evaluation},
      year = {2008},
      month = oct,
      volume = {24},
      number = {5},
      pages = {1143-1156},
      abstract = {Vision-aided inertial navigation systems (V-INSs) can provide precise state estimates for the 3-D motion of a vehicle when no external references (e.g., GPS) are available. This is achieved by combining inertial measurements from an inertial measurement unit (IMU) with visual observations from a camera under the assumption that the rigid transformation between the two sensors is known. Errors in the IMU-camera extrinsic calibration process cause biases that reduce the estimation accuracy and can even lead to divergence of any estimator processing the measurements from both sensors. In this paper, we present an extended Kalman filter for precisely determining the unknown transformation between a camera and an IMU. Contrary to previous approaches, we explicitly account for the time correlation of the IMU measurements and provide a figure of merit (covariance) for the estimated transformation. The proposed method does not require any special hardware (such as spin table or 3-D laser scanner) except a calibration target. Furthermore, we employ the observability rank criterion based on Lie derivatives and prove that the nonlinear system describing the IMU-camera calibration process is observable. Simulation and experimental results are presented that validate the proposed method and quantify its accuracy.},
      keywords = {Kalman filters;calibration;computer vision;image sensors;inertial navigation;observability;state estimation;Kalman filter-based algorithm;inertial measurement unit;observability analysis;observability rank criterion;performance evaluation;vision-aided inertial navigation systems;Extended Kalman filter;Lie derivatives;inertial measurement unit (IMU)-camera calibration;observability of nonlinear systems;vision-aided inertial navigation},
      doi = {10.1109/TRO.2008.2004486},
      issn = {1552-3098}
    }
    
    
  • Gabriele Bleser and Didier Stricker. Advanced tracking through efficient image processing and visual-inertial sensor fusion. In Virtual Reality Conference, pages 137–144. IEEE, 2008. [ bib | DOI ]
    
    @inproceedings{bleser2008advanced,
      author = {Bleser, Gabriele and Stricker, Didier},
      booktitle = {Virtual Reality Conference},
      title = {Advanced tracking through efficient image processing and visual-inertial sensor fusion},
      year = {2008},
      pages = {137--144},
      organization = {IEEE},
      doi = {10.1109/VR.2008.4480765}
    }
    
    
  • Jonathan Ko and Dieter Fox. GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Autonomous Robots, 27(1):75–90, July 2009. [ bib | DOI ]
    
    @article{ko2009gp-bayesfilter,
      year = {2009},
      month = jul,
      issn = {0929-5593},
      journal = {Autonomous Robots},
      volume = {27},
      number = {1},
      doi = {10.1007/s10514-009-9119-x},
      title = { {GP-B}ayes{F}ilters: {B}ayesian filtering using {G}aussian process prediction and observation models},
      publisher = {Springer US},
      keywords = {Gaussian process; Bayesian filtering; Dynamic modelling; Machine learning; Regression},
      author = {Ko, Jonathan and Fox, Dieter},
      pages = {75-90},
      language = {English}
    }
    
    
  • Jonathan Ko and Dieter Fox. Learning GP-BayesFilters via Gaussian process latent variable models. Autonomous Robots, 30(1):3–23, January 2011. [ bib | DOI ]
    
    @article{ko2011gpbf-learn,
      year = {2011},
      month = jan,
      issn = {0929-5593},
      journal = {Autonomous Robots},
      volume = {30},
      number = {1},
      doi = {10.1007/s10514-010-9213-0},
      title = {Learning {GP-B}ayes{F}ilters via {G}aussian process latent variable models},
      publisher = {Springer US},
      keywords = {Gaussian process; System identification; Bayesian filtering; Time alignment; System control; Machine learning},
      author = {Ko, Jonathan and Fox, Dieter},
      pages = {3-23},
      language = {English}
    }
    
    
  • Jeroen D. Hol. Sensor fusion and calibration of inertial sensors, vision, ultrawideband and GPS. PhD thesis, Linköping University, Institute of Technology, 2011. [ bib | .pdf ]
    
    @phdthesis{hol2011sensor,
      title = {Sensor fusion and calibration of inertial sensors, vision, ultrawideband and GPS},
      author = {Hol, Jeroen D.},
      year = {2011},
      school = {Link{\"o}ping University, Institute of Technology},
      url = {http://user.it.uu.se/~thosc112/team/hol2011.pdf}
    }
    
    
  • Jonathan Kelly and Gaurav S. Sukhatme. Visual-inertial sensor fusion: Localization, mapping and sensor-to-sensor self-calibration. The International Journal of Robotics Research, 30(1):56–79, 2011. [ bib | DOI ]
    
    @article{kelly2011visualinertial,
      author = {Kelly, Jonathan and Sukhatme, Gaurav S},
      title = {Visual-Inertial Sensor Fusion: Localization, Mapping and Sensor-to-Sensor Self-calibration},
      volume = {30},
      number = {1},
      pages = {56-79},
      year = {2011},
      doi = {10.1177/0278364910382802},
      abstract = {
        Visual and inertial sensors, in combination, are able to provide accurate motion estimates and are well suited for use in many robot navigation tasks. However, correct data fusion, and hence overall performance, depends on careful calibration of the rigid body transform between the sensors. Obtaining this calibration information is typically difficult and time-consuming, and normally requires additional equipment. In this paper we describe an algorithm, based on the unscented Kalman filter, for self-calibration of the transform between a camera and an inertial measurement unit (IMU). Our formulation rests on a differential geometric analysis of the observability of the camera—IMU system; this analysis shows that the sensor-to-sensor transform, the IMU gyroscope and accelerometer biases, the local gravity vector, and the metric scene structure can be recovered from camera and IMU measurements alone. While calibrating the transform we simultaneously localize the IMU and build a map of the surroundings, all without additional hardware or prior knowledge about the environment in which a robot is operating. We present results from simulation studies and from experiments with a monocular camera and a low-cost IMU, which demonstrate accurate estimation of both the calibration parameters and the local scene structure.
      },
      url = {http://ijr.sagepub.com/content/30/1/56},
      eprint = {http://ijr.sagepub.com/content/30/1/56.full.pdf+html},
      journal = {The International Journal of Robotics Research}
    }
    
    
  • Paul Furgale, Joern Rehder, and Roland Siegwart. Unified temporal and spatial calibration for multi-sensor systems. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1280–1286, November 2013. [ bib | DOI ]
    
    @inproceedings{furgale2013unified,
      title = {Unified Temporal and Spatial Calibration for Multi-Sensor Systems},
      author = {Furgale, Paul and Rehder, Joern and Siegwart, Roland},
      booktitle = {Proceedings of the {IEEE/RSJ} International Conference on Intelligent Robots and Systems ({IROS})},
      year = 2013,
      location = {Tokyo, Japan},
      month = nov,
      pages = {1280-1286},
      keywords = {maximum likelihood estimation;robots;sensor fusion;state estimation;IMU calibration;continuous-time batch estimation;inertial measurement unit;maximum likelihood estimation;multisensor systems;robotics;sensor fusion;spatial calibration;spatial displacement;spatial transformation;state estimation;time offsets;unified temporal calibration;Calibration;Cameras;Estimation;Measurement uncertainty;Sensors;Splines (mathematics);Time measurement},
      doi = {10.1109/IROS.2013.6696514},
      issn = {2153-0858}
    }