Fusion of GPS and Visual SLAM to improve localization of autonomous vehicles in urban environments.
by Adam Kalisz
| Date | Artifacts |
|---|---|
| 18.12.2017 | Basic Qt Application with OpenCV lib; DSO Paper fully read |
| 25.12.2017 | Basic Qt Application with NetworkDiscovery; GPS + VSLAM Fusion Papers read |
| 01.01.2018 | Basic Qt Application with 3D Viewer; Own Paper introduction section written |
| 08.01.2018 | Qt Application with DSO support; First tests with own application |
| 15.01.2018 | Qt Application with libMV support; LibMV tests with own application |
| 22.01.2018 | Qt Application with Import; GPS and Video file import with own application |
| 29.01.2018 | Qt Application with object detection; Object detection via OpenCV |
Master of Applied Science (124 pages)
Download: Get Handout (pdf)
Credit: http://eprofits.com/article/japanese-scientists-created-a-facial-recognition-software-with-99-per-cent-accuracy
Credit: http://www.hapari.com/blog/wp-content/uploads/2016/04/snapchat-filters.jpg
Credit: http://karantza.org/projects/featurepoints.png
https://www.youtube.com/watch?v=wQN9pUORxxE
Credit: http://blogs-images.forbes.com/insertcoin/files/2016/07/pokemon-go-new5-1200x711.jpg
Credit: http://vision.princeton.edu/courses/SFMedu/teaser.jpg
Credit: https://www.youtube.com/watch?time_continue=370&v=OMENy0ptoyM
Unreal Engine Real-Time Cinematography (Game: Hellblade: Senua’s Sacrifice)
Credit: http://il5.picdn.net/shutterstock/videos/15264310/thumb/1.jpg
Credit: S. Seitz 2005, http://graphics.cs.cmu.edu/courses/15-463/2005_fall/www/Lectures/convolution.pdf
Feature Comparison via Feature Descriptors.
An example of a SIFT feature descriptor:
Use thresholding and a minimum distance between keypoints to select features:
There is no scale-invariance here!
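The thresholding-plus-minimum-distance selection can be sketched in a few lines. This is a toy numpy version with my own function names; the slides do not prescribe an implementation (OpenCV's `goodFeaturesToTrack` applies the same quality-threshold + `minDistance` idea):

```python
import numpy as np

# Threshold a corner-response map, then greedily keep the strongest responses
# while suppressing any candidate closer than min_dist to an accepted one.
def select_features(response, threshold, min_dist):
    ys, xs = np.nonzero(response > threshold)      # candidates above threshold
    order = np.argsort(response[ys, xs])[::-1]     # strongest first
    picked = []
    for i in order:
        y, x = int(ys[i]), int(xs[i])
        if all((y - py) ** 2 + (x - px) ** 2 >= min_dist ** 2 for py, px in picked):
            picked.append((y, x))
    return picked

# Two strong responses close together and one far away: the weaker of the
# close pair is suppressed by the minimum-distance constraint.
r = np.zeros((10, 10))
r[2, 2] = 1.0
r[2, 3] = 0.9
r[8, 8] = 0.8
print(select_features(r, threshold=0.5, min_dist=3))   # [(2, 2), (8, 8)]
```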
Scale-Space:
Credit: David G. Lowe, 2004 ( [6] on Handout )
Local extrema detection over scale and space
Orientation assignment
Credit: Jan E. Solem, 2012 ( [3] on Handout )
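The scale-space extrema detection from the slides (Lowe 2004) can be sketched with a difference-of-Gaussians stack. This is a minimal numpy-only toy with my own function names: a plain 3x3x3 neighbourhood test, no subpixel refinement, no edge rejection:

```python
import numpy as np

def gauss_blur(img, sigma):
    """Separable Gaussian blur, numpy only, output same size as input."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x * x / (2.0 * sigma * sigma))
    k /= k.sum()
    pad = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def dog_extrema(img, sigmas, thresh=1e-3):
    """Local extrema of the difference-of-Gaussians stack over scale and space."""
    blurred = [gauss_blur(img, s) for s in sigmas]
    dogs = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    S, H, W = dogs.shape
    extrema = []
    for s in range(1, S - 1):           # interior scales only
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                v = dogs[s, y, x]
                cube = dogs[s-1:s+2, y-1:y+2, x-1:x+2]
                if abs(v) > thresh and (v == cube.max() or v == cube.min()):
                    extrema.append((s, y, x))
    return extrema

# A Gaussian blob of width ~3 px centred on a 41x41 grid: the DoG stack
# should have an extremum at the blob centre, at the matching scale.
g = np.exp(-((np.arange(41) - 20) ** 2) / (2 * 3.0 ** 2))
blob = np.outer(g, g) / (g.sum() ** 2)
ext = dog_extrema(blob, [1.0, 1.6, 2.56, 4.1, 6.55])
print((20, 20) in [(y, x) for s, y, x in ext])   # True
```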
Harris vs. SIFT:
Credit: http://www.josefbsharah.net/wp-content/uploads/2013/11/shutter-speed-example1.jpg
Credit: http://3njm962ijlia1o1mlt12pbfb.wpengine.netdna-cdn.com/wp-content/uploads/2015/08/Dark-Tunnel-Exit-Needs-LED-Lighting-1024x683.jpg
Credit: http://2.bp.blogspot.com/-rKDfJNUkTfM/T0ulpZFsOII/AAAAAAAAADY/WwvWvGogWmY/s1600/aim.jpg
Direct Sparse Odometry (DSO) is a visual odometry method based on a novel, highly accurate sparse and direct structure-and-motion formulation.
Direct vs. Indirect:
Basis: A probabilistic model that takes noisy measurements $\mathbf{Y}$ as input and computes an estimator $\mathbf{X}$ for the unknown, hidden model parameters (3D world model and camera motion).
Typically a Maximum Likelihood approach: $\mathbf{X}^* := \operatorname{argmax}_{\mathbf{X}} P(\mathbf{Y} \mid \mathbf{X})$.
Goal: Find model parameters that maximize the probability of obtaining the actual measurements.
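A hypothetical one-parameter illustration of this goal (all numbers made up): under Gaussian measurement noise, the $\mathbf{X}$ maximizing $P(\mathbf{Y} \mid \mathbf{X})$ coincides with the least-squares estimate, here recovered by brute-force grid search:

```python
import numpy as np

# Measurements y_i = x + Gaussian noise; the ML estimate of x is the sample mean.
rng = np.random.default_rng(0)
x_true = 2.5
y = x_true + 0.1 * rng.standard_normal(1000)

def log_likelihood(x, y, sigma=0.1):
    # Gaussian log-likelihood up to an additive constant.
    return -0.5 * np.sum((y - x) ** 2) / sigma ** 2

# Maximize the likelihood over a fine grid of candidate parameters.
grid = np.linspace(0, 5, 5001)
x_ml = grid[np.argmax([log_likelihood(x, y) for x in grid])]
print(abs(x_ml - y.mean()) < 1e-3)   # True: grid-search MLE matches the mean
```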
Indirect:
First pre-process the raw sensor measurements into an intermediate representation (e.g., keypoint positions and matches), which then serves as the measurements Y.
Direct:
Skip pre-computation and directly use actual sensor values – light measured over time – as measurements Y in probabilistic model.
Passive vision:
Indirect methods optimize a geometric error, since the pre-computed values (keypoint positions) are geometric quantities.
Direct methods thus optimize a photometric error, since the raw sensor measurements are light intensities.
Note: Direct formulations for depth cameras or laser scanners may also optimize a geometric error (as they measure geometry).
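A toy contrast between the two error types (my own construction, not code from DSO): a global brightness change leaves the geometric error untouched but shows up directly in the photometric error:

```python
import numpy as np

img_ref = np.arange(9, dtype=float).reshape(3, 3)
img_cur = img_ref + 0.5          # global brightness change between frames

# Geometric error: squared distance between a detected keypoint and its
# predicted reprojection; keypoint positions ignore the brightness change.
kp_detected  = np.array([1.0, 1.0])
kp_projected = np.array([1.2, 0.9])
e_geo = np.sum((kp_detected - kp_projected) ** 2)

# Photometric error: squared intensity difference at corresponding pixels;
# the brightness change enters the error directly.
e_photo = np.sum((img_cur - img_ref) ** 2)

print(round(float(e_geo), 2), float(e_photo))   # 0.05 2.25
```

This is also why direct methods gain from photometric camera calibration, mentioned later in the contributions: unmodelled exposure changes would otherwise be absorbed into the error.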
Sparse vs. Dense:
Difference: Quantity of reconstructed points / pixels in 2D image domain.
More fundamental difference: Usage of a geometry prior (log-likelihood energy term).
Sparse:
Reconstruct a selected set of independent points (traditionally corners).
No notion of neighborhood; geometry parameters (keypoint positions) are conditionally independent given the camera poses and intrinsics.
Dense:
Reconstruct all pixels in 2D image domain.
Exploit connectedness of the used image region to formulate a geometry prior, typically favouring smoothness
(In fact necessarily required to make a dense world model observable from passive vision alone).
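The role of the geometry prior can be made concrete with a toy energy (my own construction): a dense formulation adds a smoothness term coupling neighbouring depth values to the per-pixel data term, and it is exactly this coupling that correlates the geometry parameters:

```python
import numpy as np

def energy(d, d_obs, lam):
    """Data term plus a smoothness prior over 4-neighbour depth differences."""
    data = np.sum((d - d_obs) ** 2)                  # per-pixel data term
    smooth = (np.sum((d[:, 1:] - d[:, :-1]) ** 2)    # horizontal neighbours
              + np.sum((d[1:, :] - d[:-1, :]) ** 2)) # vertical neighbours
    return data + lam * smooth

# With lam = 0 (sparse-style: data term only) the observed depths are optimal;
# with lam > 0 the prior penalises the jump at the outlying pixel.
d_obs = np.array([[1.0, 1.0],
                  [1.0, 5.0]])
print(energy(d_obs, d_obs, lam=0.0))   # 0.0
print(energy(d_obs, d_obs, lam=1.0))   # 32.0
```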
Motivation:
Direct: Keypoint-based methods are robust to photometric variations (auto exposure, gamma / white balancing, rolling shutter, vignetting), but this robustness comes at the cost of discarding potentially valuable information contained in those variations. A direct formulation instead benefits from a more precise sensor model, allows a more finely grained geometry representation (pixelwise inverse depth), yields a more complete model, and lends more robustness in sparsely textured environments.
Motivation:
Sparse: A geometry prior introduces correlations between geometry parameters, which renders statistically consistent joint optimization in real time infeasible. Although denser 3D reconstructions are locally more accurate and more visually appealing, such priors can introduce a bias and thereby reduce, rather than increase, long-term, large-scale accuracy.
Contribution:
The only fully direct method that jointly optimizes the full likelihood for all involved model parameters, including camera poses, camera intrinsics, and geometry parameters (inverse depth values, Gaussian for EKF).
Old camera poses and points leaving the field of view (FOV) are marginalized.
Method takes full advantage of photometric camera calibration to increase accuracy and robustness.
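Marginalizing old variables out of a linearized least-squares system is commonly done with the Schur complement; here is a minimal numpy sketch under that assumption (my own function names; the slides do not give DSO's actual implementation):

```python
import numpy as np

# System H x = b, with x split into variables to keep (k) and to marginalise (m):
#   [H_kk H_km] [x_k]   [b_k]
#   [H_mk H_mm] [x_m] = [b_m]
def marginalize(H, b, keep, marg):
    """Eliminate the `marg` variables via the Schur complement."""
    H_kk = H[np.ix_(keep, keep)]
    H_km = H[np.ix_(keep, marg)]
    H_mk = H[np.ix_(marg, keep)]
    H_mm = H[np.ix_(marg, marg)]
    H_new = H_kk - H_km @ np.linalg.solve(H_mm, H_mk)
    b_new = b[keep] - H_km @ np.linalg.solve(H_mm, b[marg])
    return H_new, b_new

# The reduced system yields the same solution for the kept variables.
H = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
b = np.array([1., 2., 3.])
x_full = np.linalg.solve(H, b)
H_r, b_r = marginalize(H, b, keep=[0, 1], marg=[2])
print(np.allclose(x_full[:2], np.linalg.solve(H_r, b_r)))   # True
```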
Contribution:
The CPU-based implementation runs in real time on a laptop computer. It outperforms other state-of-the-art approaches, both direct and indirect; even at 5x real-time speed it still outperforms state-of-the-art indirect methods.