Master Thesis

Fusion of GPS and Visual SLAM
to improve localization of autonomous vehicles
in urban environments.

by Adam Kalisz



  • Last time (Recap)
  • This time
  • Comparison of various VSLAM Algorithms
  • Papers on VSLAM and GPS Fusion
  • Interesting 3D Reconstruction ideas
  • Conclusion / Feeling

Last time

Demo: DSO (Direct Sparse Odometry)

This time

Comparison time!

Blender 3D Motion Tracker:
Libmv multiview reconstruction
and tracking library.

Pros and Cons:

  • Compact, field-tested and robust solution
  • MIT license
  • Research benefits a huge community
  • Maintained by Blender Devs (updates!)
  • Personal contact with Keir Mierle (Dev from BCon17)
  • Feature-based (e.g. SURF)
  • Sparse reconstruction
  • No ROS-integration

Clustering Views for Multi-view Stereo (CMVS)
Patch-based Multi-view Stereo Software (PMVS)

Pros and Cons:

  • Dense reconstruction
  • GPL license
  • Used by ILM, Weta and Google
  • Feature-based (Based on SfM output via SIFT)
  • No ROS-integration
  • Crashed during my tests on Windows
  • Last update: 7 years ago

PTAM (Parallel Tracking and Mapping)
re-released under GPLv3.

Pros and Cons:

  • GPL license
  • Feature-based (FAST-10, machine-learned decision tree)
  • Sparse reconstruction
  • Not robust in every environment
  • No ROS-integration
  • Last update: 3 years ago

ORB-SLAM2: Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Pros and Cons:

  • Feature-rich solution (Mono, Stereo, RGB-D)
  • GPLv3 license
  • Last PR commit: 1 month ago
  • ROS-integration available (optional)
  • Feature-based (ORB, combined FAST+BRIEF)
  • Sparse reconstruction
  • 292 issues on GitHub!

LSD-SLAM:
Large-Scale Direct Monocular SLAM

Pros and Cons:

  • Mature and feature-rich solution
  • GPLv3 license
  • ROS-integration available (optional, rosmake)
  • Direct (no features!)
  • Semi-dense reconstruction in real-time
  • 176 issues on GitHub!
  • Last PR commit: 3 years ago

Direct Sparse Odometry

Pros and Cons:

  • Feature-rich solution
  • GPLv3 license
  • Last PR commit: 1 month ago
  • ROS-integration available (optional)
  • Direct (no features!)
  • A lot of great documentation (YouTube!)
  • Semi-dense reconstruction
  • Stereo-DSO (dense) not yet published

Papers on VSLAM and GPS Fusion

Interesting 3D Reconstruction ideas

Variational methods for dense 3D reconstruction

Do not tell the algorithm how to do reconstruction, but rather what result it should deliver.

Class of optimization methods using mathematical models instead of a series of processing steps to produce a result.

Mathematical analysis of cost function. Example: Perturbation of a sphere to get a complex object (Blender demo)

Variational methods for 3D reconstruction

$ E(u) = \int\limits_{\Omega} (u - f)^2 + \lambda | \nabla u | \, dx $
$ E(u) = \int\limits_{\Omega} I_0(x) - I_i(\pi(g_\xi(u \cdot x))) + \lambda | \nabla u | \, dx $

$\pi$ = projection; $g_\xi$ = translation, rotation;
u = distance; x = 2D image point (homogeneous).
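The warp $\pi(g_\xi(u \cdot x))$ inside the data term can be sketched in a few lines. This is only an illustration under assumptions not stated on the slide: pixels are given in normalized image coordinates (camera intrinsics omitted), u is treated as the depth along the optical axis, and the rotation is restricted to the z axis to keep the example short. All function names are hypothetical.

```python
import math

def rotation_z(angle):
    """3x3 rotation about the optical (z) axis; stands in for a general rotation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rigid_transform(R, t, p):
    """g_xi: rotate the 3D point p by R, then translate it by t."""
    return [sum(R[r][k] * p[k] for k in range(3)) + t[r] for r in range(3)]

def project(p):
    """pi: pinhole projection onto the normalized image plane z = 1."""
    return (p[0] / p[2], p[1] / p[2])

def warp(x, u, R, t):
    """pi(g_xi(u * x)): map pixel x of image 0, at distance u, into image i."""
    # Back-project the homogeneous pixel (x, y, 1) scaled by the distance u.
    point_3d = [u * x[0], u * x[1], u * 1.0]
    return project(rigid_transform(R, t, point_3d))

# Example: no rotation, camera i shifted so that points move by 0.5 along x.
warped = warp((0.2, -0.1), 2.0, rotation_z(0.0), [0.5, 0.0, 0.0])
print(warped)  # approximately (0.45, -0.1)
```

The data term would then compare the intensity $I_0(x)$ with $I_i$ sampled at the warped position.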

Data term: local assignment costs (photometric error)

Regularization term: length of interface

Basic idea: Integrate the cost over the whole image domain $\Omega$, subject to some constraints, and look for the u that minimizes it.
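A minimal sketch of the first energy, $(u - f)^2 + \lambda | \nabla u |$, in discrete form. The assumptions are mine: a 1D signal replaces the image, the integral becomes a sum, and the minimizer is plain gradient descent on a slightly smoothed absolute value (parameter `eps`). It is a stand-in to show the principle, not the solver used in the cited work.

```python
import math

def energy(u, f, lam):
    """Discrete E(u): squared data term plus lam times the total variation."""
    data = sum((uk - fk) ** 2 for uk, fk in zip(u, f))
    tv = sum(abs(u[k + 1] - u[k]) for k in range(len(u) - 1))
    return data + lam * tv

def minimize(f, lam, steps=500, step_size=0.01, eps=1e-3):
    """Gradient descent on a smoothed energy, using |d| ~ sqrt(d*d + eps)."""
    u = list(f)
    n = len(u)
    for _ in range(steps):
        # Gradient of the data term.
        grad = [2.0 * (u[k] - f[k]) for k in range(n)]
        # Gradient of the smoothed regularization term, edge by edge.
        for k in range(n - 1):
            d = u[k + 1] - u[k]
            w = lam * d / math.sqrt(d * d + eps)
            grad[k] -= w
            grad[k + 1] += w
        u = [u[k] - step_size * grad[k] for k in range(n)]
    return u

f = [0.0, 0.1, -0.1, 1.0, 0.9, 1.1]          # noisy step signal
candidate = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # clean step
lam = 1.0

print(energy(f, f, lam))          # data term is zero, all cost is regularization
print(energy(candidate, f, lam))  # lower: small data cost, much shorter "interface"
print(energy(minimize(f, lam), f, lam))
```

Note that the algorithm is never told how to remove the noise; only the energy that a good result should have is specified.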

Imposing Silhouette Consistency
[Cremers, Kolev, PAMI 2011]

Constrained optimization problem

Ray casting through each image pixel into the voxel grid to find the intersection with the geometry

Integral of voxel occupancy along the ray: >= 1 if the ray hits the surface, 0 otherwise

Implicit or explicit representation of mesh surface

Problem: Silhouette! It is difficult to extract against a busy background
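The silhouette constraint can be illustrated with a toy check. The sketch assumes a binary occupancy grid indexed as `occupancy[y][x][z]` and an orthographic camera looking along z, so that each pixel's ray is simply one column of voxels. Both are simplifications made for this example, and the function names are hypothetical.

```python
def ray_sums(occupancy):
    """Sum the voxel occupancy along the z axis, one ray per pixel (y, x)."""
    return [[sum(column) for column in row] for row in occupancy]

def is_silhouette_consistent(occupancy, silhouette):
    """True if every ray sum is >= 1 inside the silhouette and 0 outside it."""
    sums = ray_sums(occupancy)
    for y, row in enumerate(sums):
        for x, total in enumerate(row):
            if silhouette[y][x] == 1 and total < 1:
                return False   # ray inside the silhouette misses the object
            if silhouette[y][x] == 0 and total != 0:
                return False   # ray outside the silhouette hits the object
    return True

# 2 x 2 pixels, 3 voxels along each ray.
occupancy = [
    [[0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 0]],
]
good_silhouette = [[1, 0], [0, 1]]
bad_silhouette = [[1, 1], [0, 1]]

print(is_silhouette_consistent(occupancy, good_silhouette))  # True
print(is_silhouette_consistent(occupancy, bad_silhouette))   # False
```

In a real reconstruction the rays follow the perspective projection of each camera, and the constraint is imposed during the optimization rather than checked afterwards.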


  • Try out direct method with own calibration and Langwasser dataset to get rather dense result?
  • Do tests with point cloud for 3D Segmentation in order to detect street signs?
  • Begin reading suitable literature thoroughly?


Thank you!