Abstract
We present the deep self-correlation (DSC) descriptor for establishing dense correspondences between images taken under different imaging modalities, such as different spectral ranges or lighting conditions. We encode local self-similar structure in a pyramidal manner that yields both more precise localization and greater robustness to non-rigid image deformations. Specifically, DSC first computes multiple self-correlation surfaces with randomly sampled patches over a local support window, and then builds pyramidal self-correlation surfaces through average pooling on the surfaces. The feature responses on the self-correlation surfaces are then encoded through spatial pyramid pooling in a log-polar configuration. To better handle geometric variations such as scale and rotation, we additionally propose the geometry-invariant DSC (GI-DSC), which leverages multi-scale self-correlation computation and canonical orientation estimation. In contrast to descriptors based on deep convolutional neural networks (CNNs), DSC and GI-DSC are training-free (i.e., handcrafted descriptors), are robust to cross-modal differences, and generalize well to various modality variations. Extensive experiments demonstrate the state-of-the-art performance of DSC and GI-DSC on challenging cases of cross-modal image pairs having photometric and/or geometric variations.
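The first two steps named in the abstract (self-correlation surfaces from randomly sampled patches, followed by average pooling into a pyramid) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the window and patch sizes, the use of normalized cross-correlation as the similarity measure, and square pooling blocks are all hypothetical choices, and the log-polar spatial pyramid pooling stage is omitted.

```python
import numpy as np


def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equally sized patches (assumed similarity)."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def self_correlation_surfaces(image, cy, cx, window=8, patch=3, num_samples=4, rng=None):
    """Correlate randomly sampled patches with every patch in a support window.

    Returns an array of shape (num_samples, window, window) and the sampled
    patch positions as (row, col) offsets inside the window.
    The window plus patch radius must fit inside `image`.
    """
    rng = np.random.default_rng(rng)
    r = patch // 2
    y0, x0 = cy - window // 2, cx - window // 2  # top-left corner of the window
    # Randomly sample reference patch centres inside the support window.
    sy = rng.integers(y0, y0 + window, size=num_samples)
    sx = rng.integers(x0, x0 + window, size=num_samples)
    surfaces = np.zeros((num_samples, window, window))
    for k in range(num_samples):
        ref = image[sy[k] - r:sy[k] + r + 1, sx[k] - r:sx[k] + r + 1]
        for i in range(window):
            for j in range(window):
                y, x = y0 + i, x0 + j
                cand = image[y - r:y + r + 1, x - r:x + r + 1]
                surfaces[k, i, j] = ncc(ref, cand)
    return surfaces, (sy - y0, sx - x0)


def average_pool(surfaces, factor):
    """Average-pool each surface over non-overlapping factor x factor blocks."""
    k, h, w = surfaces.shape
    if h % factor or w % factor:
        raise ValueError("surface size must be divisible by the pooling factor")
    return surfaces.reshape(k, h // factor, factor, w // factor, factor).mean(axis=(2, 4))


def pyramid(surfaces, factors=(1, 2, 4)):
    """Build a coarse-to-fine stack of pooled self-correlation surfaces."""
    return [average_pool(surfaces, f) for f in factors]
```

For example, calling `self_correlation_surfaces(img, 16, 16)` on a 32 x 32 grayscale array and passing the result to `pyramid` yields surfaces of size 8 x 8, 4 x 4, and 2 x 2 per sampled patch. Because the surfaces depend only on similarities measured within a single image, a sketch like this needs no training data, which is the sense in which the abstract calls the descriptor training-free.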
Field | Value |
---|---|
Original language | English |
Article number | 8955799 |
Pages (from-to) | 2345-2359 |
Number of pages | 15 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 43 |
Issue number | 7 |
State | Published - 1 Jul 2021 |
Bibliographical note
Publisher Copyright: © 1979-2012 IEEE.
Keywords
- Cross-modal correspondence
- local self-similarity
- non-rigid deformation
- pyramidal structure
- self-correlation