OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

1UT Austin     2Google Research
CVPR 2024
arXiv | Code

Abstract

The image matching field has been witnessing a continuous emergence of novel learnable feature matching techniques, with ever-improving performance on conventional benchmarks. However, our investigation shows that despite these gains, their potential for real-world applications is restricted by their limited generalization capabilities to novel image domains. In this paper, we introduce OmniGlue, the first learnable image matcher that is designed with generalization as a core principle. OmniGlue leverages broad knowledge from a vision foundation model to guide the feature matching process, boosting generalization to domains not seen at training time. Additionally, we propose a novel keypoint position-guided attention mechanism which disentangles spatial and appearance information, leading to enhanced matching descriptors. We perform comprehensive experiments on a suite of 6 datasets with varied image domains, including scene-level, object-centric and aerial images. OmniGlue's novel components lead to relative gains on unseen domains of 20.9% with respect to a directly comparable reference model SuperGlue, while also outperforming the recent LightGlue method by 9.5% relatively.

OmniGlue Framework

OmniGlue is the first learnable image matcher designed with generalization as a core principle. OmniGlue benefits from two designs: foundation model guidance and keypoint-position attention guidance. The visual foundation model, trained on large-scale data, provides coarse but generalizable correspondence cues and guides the inter-image feature propagation process. The keypoint-position attention guidance disentangles positional information from the keypoint features, which prevents the model from specializing too strongly to the training distribution of keypoints and relative pose transformations.
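The two designs above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the gating of cross-attention by foundation-model similarities and the exclusion of positions from the propagated values are hedged interpretations of the description, and all function names and the temperature parameter are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def guided_cross_attention(desc_a, desc_b, found_a, found_b, temp=0.1):
    """Foundation model guidance (sketch, not the paper's exact formulation):
    coarse similarities between foundation-model features of the two images
    softly bias which keypoints in image B each keypoint in image A attends to."""
    d = desc_a.shape[-1]
    scores = desc_a @ desc_b.T / np.sqrt(d)
    # Coarse but generalizable guidance: cosine similarity of foundation features.
    fa = found_a / np.linalg.norm(found_a, axis=-1, keepdims=True)
    fb = found_b / np.linalg.norm(found_b, axis=-1, keepdims=True)
    guide = (fa @ fb.T) / temp  # 'temp' is a hypothetical sharpness knob
    return softmax(scores + guide) @ desc_b

def position_disentangled_attention(desc, pos_enc):
    """Keypoint-position guidance (sketch): positional encodings shape the
    attention pattern via queries and keys, but only appearance descriptors
    are propagated as values, so positions never leak into the output features."""
    d = desc.shape[-1]
    q = desc + pos_enc
    k = desc + pos_enc
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ desc  # values = appearance only
```

The key design point this sketch tries to capture is that positional and foundation-model cues only influence *where* information flows, not *what* is propagated, which plausibly limits overfitting to the training distribution of keypoint layouts.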

In-domain Data Results

Out-of-domain Data Results

BibTeX

@inproceedings{jiang2024Omniglue,
   title={OmniGlue: Generalizable Feature Matching with Foundation Model Guidance},
   author={Jiang, Hanwen and Karpur, Arjun and Cao, Bingyi and Huang, Qixing and Araujo, Andre},
   booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
   year={2024},
}