ELViS: Efficient Visual Similarity from Local Descriptors that Generalizes Across Domains
Overview
Overall Novelty Assessment
The paper introduces ELViS, a similarity-based re-ranking model for cross-domain image retrieval that operates in similarity space rather than representation space. It sits within the 'Cross-Domain Correspondence via Deep Features' leaf of the taxonomy, which contains four papers total including ELViS. This leaf is part of the broader 'Correspondence Establishment and Matching' branch, indicating a moderately populated research direction focused on leveraging deep neural network features to establish correspondences across visually distinct domains. The taxonomy shows this is an active but not overcrowded area, with sibling papers exploring related correspondence mechanisms using deep features.
The taxonomy reveals neighboring research directions that contextualize ELViS's position. Adjacent leaves include 'Semantic-Guided Correspondence' (2 papers) and 'Geometric and Appearance-Based Matching' (1 paper), both addressing correspondence establishment through different mechanisms. The broader 'Domain Adaptation and Transfer Learning' branch (7 papers across three leaves) tackles domain shift through feature alignment rather than correspondence reasoning. ELViS diverges from these by emphasizing similarity-space operations and optimal transport refinement, connecting conceptually to correspondence-based methods while introducing a distinct architectural approach that prioritizes interpretability and efficiency over pure feature alignment.
Across the 30 candidates examined (10 per contribution), the contribution-level analysis gives mixed novelty signals. For the core ELViS re-ranking model (Contribution 1), none of the 10 candidates was found to refute the claim, suggesting relative novelty in its similarity-space formulation. For the optimal transport refinement with descriptor-dependent gains (Contribution 2), 2 of the 10 candidates were judged able to refute the claim, indicating some overlap with prior work on correspondence refinement techniques. For the cross-domain generalization benchmark (Contribution 3), none of the 10 candidates refuted the claim, suggesting that this evaluation framework addresses a gap in existing benchmarks. Because the search scope was limited, these findings reflect the top-30 semantic matches rather than exhaustive coverage.
Based on this limited literature search, ELViS appears to offer meaningful contributions in similarity-space modeling and benchmark construction, while its optimal transport component overlaps more substantially with prior work. The taxonomy context suggests the paper occupies a moderately explored niche within correspondence-based retrieval, with room to differentiate itself through its specific design choices. Because the analysis covers only the top-30 semantic matches, additional related work may exist in less semantically similar papers or in specialized venues.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors propose ELViS, a novel image-to-image similarity model that operates on local descriptor correspondences rather than raw descriptors. The model refines similarity matrices using optimal transport with descriptor-dependent dustbin gains and aggregates strong correspondences through a learnable voting mechanism, achieving better generalization to unseen domains than prior descriptor-based methods.
The authors introduce a variant of optimal transport that uses descriptor-dependent dustbin gains (computed via a learned function h) to suppress uninformative or background descriptors. This refinement step produces a doubly stochastic similarity matrix that emphasizes mutually consistent strong correspondences while discarding distracting descriptors.
The authors compile a benchmarking protocol unifying eight existing datasets across diverse domains (landmarks, household items, retail products, artworks, and multi-domain sets) and introduce an evaluation framework that distinguishes in-domain and out-of-domain test sets. This is presented as the first extensive evaluation of single-source domain generalization in instance-level retrieval.
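The dustbin refinement described in the second contribution can be illustrated with a minimal sketch. It assumes a SuperGlue-style augmentation (one extra dustbin row and column, unit mass per descriptor) solved with plain Sinkhorn iterations. The function name `refine_with_dustbins`, the parameters `eps` and `iters`, the zero dustbin-to-dustbin score, and the marginals are all assumptions for illustration. The learned function h is replaced by precomputed gain lists. This is not the authors' exact formulation.

```python
import math


def refine_with_dustbins(sim, row_gain, col_gain, eps=0.1, iters=100):
    """Sinkhorn refinement of an n x m similarity matrix with per-descriptor dustbins.

    sim[i][j]   : similarity between descriptor i of image A and descriptor j of image B
    row_gain[i] : dustbin gain for descriptor i of A (stand-in for the learned h)
    col_gain[j] : dustbin gain for descriptor j of B (stand-in for the learned h)
    Returns the full (n+1) x (m+1) transport plan; the last row and column are dustbins.
    """
    n, m = len(sim), len(sim[0])
    # Augment the scores with one dustbin column (per-row gains) and one dustbin row.
    scores = [list(sim[i]) + [row_gain[i]] for i in range(n)]
    scores.append(list(col_gain) + [0.0])  # dustbin-to-dustbin score: assumed 0
    kernel = [[math.exp(s / eps) for s in row] for row in scores]
    # Assumed marginals: unit mass per real descriptor; each dustbin can absorb
    # every descriptor of the other image.
    row_mass = [1.0] * n + [float(m)]
    col_mass = [1.0] * m + [float(n)]
    u = [1.0] * (n + 1)
    v = [1.0] * (m + 1)
    for _ in range(iters):
        # Alternate row and column rescaling (standard Sinkhorn updates).
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m + 1))
             for i in range(n + 1)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n + 1))
             for j in range(m + 1)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m + 1)]
            for i in range(n + 1)]


# Toy input: the third descriptor of image A matches nothing well and has a high
# dustbin gain, so most of its mass should end up in the dustbin column.
sim = [[0.9, 0.1], [0.1, 0.8], [0.0, 0.05]]
plan = refine_with_dustbins(sim, row_gain=[0.2, 0.2, 0.6], col_gain=[0.2, 0.2])
refined = [row[:2] for row in plan[:3]]  # drop the dustbin row and column
```

In this sketch the marginal constraints hold on the augmented matrix. The `refined` block keeps only the mass assigned to real correspondences, so a descriptor with a high gain contributes little to any later aggregation.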
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[13] Neural best-buddies: Sparse cross-domain correspondence
[16] Cross-domain image matching with deep feature maps
[22] Grownbb: Gromov–Wasserstein learning of neural best buddies for cross-domain correspondence
Contribution Analysis
Detailed comparisons for each claimed contribution
ELViS: similarity-based re-ranking model for cross-domain image retrieval
The authors propose ELViS, a novel image-to-image similarity model that operates on local descriptor correspondences rather than raw descriptors. The model refines similarity matrices using optimal transport with descriptor-dependent dustbin gains and aggregates strong correspondences through a learnable voting mechanism, achieving better generalization to unseen domains than prior descriptor-based methods.
[9] Cross-domain image retrieval: methods and applications
[21] Multi-level domain adaptive learning for cross-domain detection
[46] Omniglue: Generalizable feature matching with foundation model guidance
[47] Bridging the domain gap for ground-to-aerial image matching
[48] Cross-domain image retrieval with a dual attribute-aware ranking network
[49] Pixel matching network for cross-domain few-shot segmentation
[50] A Cross-View Image Matching Method with Feature Enhancement
[51] Cvm-net: Cross-view matching network for image-based ground-to-aerial geo-localization
[52] Sketch-based shape retrieval via best view selection and a cross-domain similarity measure
[53] Siamese transformer network-based similarity metric learning for cross-source remote sensing image retrieval
Optimal transport refinement with descriptor-dependent dustbin gains
The authors introduce a variant of optimal transport that uses descriptor-dependent dustbin gains (computed via a learned function h) to suppress uninformative or background descriptors. This refinement step produces a doubly stochastic similarity matrix that emphasizes mutually consistent strong correspondences while discarding distracting descriptors.
[54] Optimal transport aggregation for visual place recognition
[60] Superglue: Learning feature matching with graph neural networks
[55] Optimal transport for transfer learning across spaces
[56] Semantic correspondence as an optimal transport problem
[57] AOT: Aggregation Optimal Transport for Few-Shot SAR Automatic Target Recognition
[58] Feature Robust Optimal Transport for High-dimensional Data
[59] The Self-Optimal-Transport Feature Transform
[61] Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition
[62] Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport
[63] Deep Shells: Unsupervised Shape Correspondence with Optimal Transport
Cross-domain generalization benchmark for instance-level retrieval
The authors compile a benchmarking protocol unifying eight existing datasets across diverse domains (landmarks, household items, retail products, artworks, and multi-domain sets) and introduce an evaluation framework that distinguishes in-domain and out-of-domain test sets. This is presented as the first extensive evaluation of single-source domain generalization in instance-level retrieval.
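The in-domain versus out-of-domain distinction described above can be sketched as a simple partition of test sets by domain label, relative to the single source domain used for training. The function names and dataset names below are hypothetical placeholders; only the domain labels come from the description. How the multi-domain sets are treated is not stated here, so they are left out of the sketch.

```python
def split_test_sets(test_sets, train_domain):
    """Partition test sets into in-domain and out-of-domain groups.

    test_sets    : dict mapping a dataset name to its domain label
    train_domain : the single source domain used for training
    """
    in_domain = sorted(name for name, dom in test_sets.items() if dom == train_domain)
    out_of_domain = sorted(name for name, dom in test_sets.items() if dom != train_domain)
    return in_domain, out_of_domain


def macro_average(scores, names):
    """Average a per-dataset metric (for example mAP) over the given dataset names."""
    return sum(scores[name] for name in names) / len(names)


# Hypothetical dataset names; only the domain labels are taken from the text.
test_sets = {
    "landmark_set": "landmarks",
    "household_set": "household items",
    "retail_set": "retail products",
    "artwork_set": "artworks",
}
in_dom, out_dom = split_test_sets(test_sets, train_domain="landmarks")
```

Reporting `macro_average` separately over `in_dom` and `out_dom` keeps a model's source-domain performance from masking how well it transfers to unseen domains.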