Depth Estimation and Dense Reconstruction with a Monocular Camera (Tutor: Tianjun Zhang, WeChat: z619850002)
3D reconstruction from monocular vision is a classic task in computer vision. This project focuses on three tasks: pose recovery, depth estimation, and dense map construction.
Fig. 1: Depth Estimation
Fig. 2: Dense Scene Reconstruction
Fig. 3: Incremental scene structure reconstruction using a single agent
1) Collect at least one video sequence and recover the camera pose of each image using any SfM or SLAM method. Then recover depth maps for the key frames using any suitable algorithm. Finally, construct the dense map of the scene incrementally.
2) Given the video stream and the corresponding poses as input, your system should construct the dense map in real time or near real time, rather than offline. Using a GPU is allowed.
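As a rough illustration of the last step (fusing key-frame depth maps into a growing map), the sketch below back-projects each depth map into the world frame and merges the points into a sparse voxel grid. It is only one possible approach, not a required design. The names `backproject` and `IncrementalMap`, the pinhole intrinsics `K`, the camera-to-world pose convention `T_wc`, and the voxel size are all assumptions made for this example.

```python
import numpy as np


def backproject(depth, K, T_wc):
    """Lift a depth map (H, W) to world-frame 3D points (N, 3).

    K    : 3x3 pinhole intrinsics (assumed known from calibration).
    T_wc : 4x4 camera-to-world pose (assumed convention), e.g. from SfM/SLAM.
    Pixels with non-positive or non-finite depth are skipped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z], axis=1)
    # Rotate and translate camera-frame points into the world frame.
    return pts_cam @ T_wc[:3, :3].T + T_wc[:3, 3]


class IncrementalMap:
    """Hypothetical map container: keeps one point per occupied voxel."""

    def __init__(self, voxel_size=0.05):
        self.voxel_size = voxel_size
        self.voxels = {}  # voxel index (i, j, k) -> representative point

    def integrate(self, depth, K, T_wc):
        """Add one key frame; voxels that are already occupied are kept."""
        pts = backproject(depth, K, T_wc)
        keys = np.floor(pts / self.voxel_size).astype(np.int64)
        # Reduce to one point per voxel within this frame first.
        _, first = np.unique(keys, axis=0, return_index=True)
        for i in first:
            self.voxels.setdefault(tuple(keys[i]), pts[i])

    def points(self):
        """Return the current map as an (N, 3) array."""
        return np.array(list(self.voxels.values())).reshape(-1, 3)
```

Calling `integrate` once per key frame as poses and depth maps arrive keeps the cost of each update proportional to the size of the new frame, which is what makes an online (rather than offline) map feasible. A real system would more likely use a TSDF or surfel representation and run the fusion on the GPU.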
An SfM pipeline, COLMAP: https://github.com/colmap/colmap
A SLAM pipeline, ORB-SLAM2: https://github.com/raulmur/ORB_SLAM2
A traditional depth estimation method, SGM: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4359315
A deep-learning-based method, DPT: https://www.sciencedirect.com/science/article/abs/pii/S0950705122007821
Demo pipeline of SGM: https://github.com/z619850002/DepthEstimation-SGM
Official implementation of DPT: https://github.com/isl-org/DPT
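One practical point when combining a learned depth network with SfM or SLAM poses: monocular networks often predict relative inverse depth, defined only up to an unknown scale and shift, so the prediction has to be aligned to the scale of the recovered poses before fusion. The sketch below fits that scale and shift by least squares against sparse depths, such as those of triangulated feature points. Whether this step is needed depends on the model you choose. The function name `align_scale_shift` and the input layout (zero meaning "no measurement") are assumptions made for this example.

```python
import numpy as np


def align_scale_shift(pred_inv_depth, sparse_depth):
    """Align a relative inverse-depth prediction to sparse depths.

    pred_inv_depth : (H, W) network output, assumed to be inverse depth
                     up to an unknown scale s and shift t.
    sparse_depth   : (H, W) depths in the SfM/SLAM scale, 0 where unknown.
    Returns (depth, s, t), with NaN where the aligned inverse depth is
    not positive.
    """
    mask = sparse_depth > 0
    if mask.sum() < 2:
        raise ValueError("need at least two sparse depth measurements")
    # Solve min over (s, t) of || s * pred + t - 1 / depth ||^2.
    A = np.stack([pred_inv_depth[mask], np.ones(mask.sum())], axis=1)
    b = 1.0 / sparse_depth[mask]
    (s, t), *_ = np.linalg.lstsq(A, b, rcond=None)
    inv = s * pred_inv_depth + t
    depth = np.full(pred_inv_depth.shape, np.nan)
    ok = inv > 1e-6
    depth[ok] = 1.0 / inv[ok]
    return depth, s, t
```

A plain least-squares fit is sensitive to outliers in the sparse points, so a robust variant (for example RANSAC over the same two-parameter model) may be preferable on real data.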
Created on: Nov. 09, 2023