TY - JOUR
T1 - Correlation Verification for Image Retrieval and Its Memory Footprint Optimization
AU - Lee, Seongwon
AU - Seong, Hongje
AU - Lee, Suhyeon
AU - Kim, Euntai
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In this paper, we propose a novel image retrieval network named Correlation Verification Network (CVNet) to replace the conventional geometric re-ranking with a 4D convolutional neural network that learns diverse geometric matching possibilities. To enable efficient cross-scale matching, we construct feature pyramids and establish cross-scale feature correlations in a single inference, thereby replacing the costly multi-scale inference. Additionally, we employ curriculum learning with the Hide-and-Seek strategy to handle challenging samples. Our proposed CVNet demonstrates state-of-the-art performance on several image retrieval benchmarks by a large margin. From an implementation perspective, however, CVNet has one drawback: it requires high memory usage because it needs to store dense features of all database images. This high memory requirement can be a significant limitation in practical applications. To address this issue, we introduce an extension of CVNet called Dense-to-Sparse CVNet (CVNetDS), which can significantly reduce memory usage by sparsifying the features of the database images. The sparsification module in CVNetDS learns to select the relevant parts of image features end-to-end using a Gumbel estimator. Since the sparsification is performed offline, CVNetDS does not increase online extraction and matching times. CVNetDS dramatically reduces the memory footprint while preserving performance levels nearly identical to CVNet.
KW - correlation verification
KW - dense-to-sparse
KW - feature sparsification
KW - image retrieval
KW - re-ranking
UR - http://www.scopus.com/inward/record.url?scp=85210302408&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2024.3504274
DO - 10.1109/TPAMI.2024.3504274
M3 - Article
AN - SCOPUS:85210302408
SN - 0162-8828
VL - 47
SP - 1514
EP - 1529
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
ER -