Over the last few days I have been reading the literature on VO / SLAM algorithms, and ORB-SLAM(2) and DSO seem to be the two best openly available algorithms at the moment.
But as the authors write in their paper on DSO: "Note that DSO is designed as a pure visual odometry while ORB-SLAM constitutes a full SLAM system, including loop-closure detection & correction and re-localization [...]."
I am still struggling a bit to fully understand the direct method used in DSO and its consequences for SLAM tasks like large-scale loop closing, relocalization and map reuse. So I am asking myself how difficult it would be to develop DSO into a full SLAM algorithm by adding the tasks mentioned above:
1. Large-scale loop closing and relocalization both rely on place recognition by bag-of-words (at least in ORB-SLAM). Is there a way one could do fast place recognition without using features?
2. Large-scale loop closing also needs bundle adjustment of the poses and map points of that loop. Would this be possible in DSO? Or is the necessary information lost once frames leave the optimization window?
3. By map reuse I mean that once a map of a place has been built, the algorithm can localize the camera in it without necessarily having to find new map points. Is this possible without using features?
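To make question 1 concrete, here is a minimal sketch of what I understand by bag-of-words place recognition. It is not ORB-SLAM's actual implementation: I assume each keyframe has already been reduced to a list of integer "visual word" IDs (quantized feature descriptors), and I use TF-IDF weighting with cosine similarity as a stand-in for whatever scoring the real system uses. All function names are made up for illustration.

```python
import math
from collections import Counter

def compute_idf(keyframes):
    """Inverse document frequency of each visual word over all keyframes."""
    n = len(keyframes)
    df = Counter()
    for words in keyframes:
        df.update(set(words))  # count each word once per keyframe
    return {w: math.log(n / d) for w, d in df.items()}

def bow_vector(word_ids, idf):
    """TF-IDF weighted, L2-normalized bag-of-words vector (sparse dict)."""
    counts = Counter(word_ids)
    total = sum(counts.values())
    vec = {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm > 0 else vec

def similarity(a, b):
    """Cosine similarity of two normalized sparse vectors."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def best_match(query_words, keyframes):
    """Index and score of the stored keyframe most similar to the query."""
    idf = compute_idf(keyframes)
    q = bow_vector(query_words, idf)
    scores = [similarity(q, bow_vector(kf, idf)) for kf in keyframes]
    best = max(range(len(keyframes)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical word IDs for three stored keyframes and one query frame.
keyframes = [[1, 2, 3, 3], [4, 5, 6], [7, 8, 9, 9]]
index, score = best_match([4, 5, 5, 6], keyframes)
```

The point is that the lookup depends entirely on having descriptors that can be quantized into words, which is exactly what a direct method like DSO does not compute.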
Thanks a lot in advance for your answers.