Self-supervised Correspondence Estimation via Multiview Registration


Mohamed El Banani
Ignacio Rocco
David Novotny
Andrea Vedaldi
Natalia Neverova
Justin Johnson
Benjamin Graham

University of Michigan
Meta AI
WACV 2023

[arxiv]
[code]
[bibtex]



Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by relying only on close-by frame pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization algorithm. Our key insight is that self-supervised multiview registration allows us to obtain correspondences over longer time frames, increasing both the diversity and difficulty of sampled pairs. We evaluate our approach on indoor scenes for correspondence estimation and RGB-D pointcloud registration and find that we perform on par with supervised approaches.
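To make the registration step concrete, here is a minimal sketch of estimating an SE(3) transform from a set of 3D point correspondences. It uses the standard weighted Kabsch (Procrustes) solution, which is an assumption made for illustration: this page does not describe the paper's registration or synchronization procedure in detail, and the function name `kabsch_se3` and the synthetic data are hypothetical.

```python
import numpy as np


def kabsch_se3(src, dst, weights=None):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t.

    Illustrative only: a standard weighted Kabsch solve, not the
    paper's registration or synchronization algorithm.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if weights is None:
        weights = np.ones(len(src))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids of both point sets.
    src_mean = w @ src
    dst_mean = w @ dst

    # Weighted 3x3 cross-covariance between the centered point sets.
    H = (src - src_mean).T @ (w[:, None] * (dst - dst_mean))

    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


# Synthetic check: recover a known rotation about z plus a translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
angle = np.pi / 6
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.5, -0.2, 1.0])
dst = src @ R_true.T + t_true

R_est, t_est = kabsch_se3(src, dst)
```

With noise-free correspondences, the estimate matches the ground-truth transform up to numerical precision. The optional `weights` argument is where per-correspondence confidences could enter, if a pipeline provides them.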






Paper

El Banani, M., Rocco, I., Novotny, D., Vedaldi, A., Neverova, N., Johnson, J., Graham, B. Self-supervised Correspondence Estimation via Multiview Registration. WACV 2023.



Acknowledgements

We thank Karan Desai, Mahmoud Azab, David Fouhey, Richard Higgins, Daniel Geng, and Menna El Banani for feedback and edits to early drafts of this work. This webpage template was borrowed from some colorful folks.