Uncover Underlying Correspondence for Robust Multi-view Clustering

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Multi-view clustering; Noisy correspondence
Abstract:

Multi-view clustering (MVC) aims to group unlabeled data into semantically meaningful clusters by leveraging cross-view consistency. However, real-world datasets collected from the web often suffer from noisy correspondence (NC), which breaks the consistency prior and results in unreliable alignments. In this paper, we identify two critical forms of NC that particularly harm clustering: i) category-level mismatch, where semantically consistent samples from the same class are mistakenly treated as negatives; and ii) sample-level mismatch, where collected cross-view pairs are misaligned and some samples may even lack any valid counterpart. To address these challenges, we propose CorreGen, a generative framework that formulates noisy correspondence learning in MVC as maximum likelihood estimation over underlying cross-view correspondences. The objective is elegantly solved via an Expectation–Maximization algorithm: in the E-step, soft correspondence distributions are inferred across views, capturing class-level relations while adaptively down-weighting noisy or unalignable samples through GMM-guided marginals; in the M-step, the embedding network is updated to maximize the expected log-likelihood. Extensive experiments on both synthetic and real-world noisy datasets demonstrate that our method significantly improves clustering robustness. The code will be released upon acceptance.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes CorreGen, a generative framework that formulates noisy correspondence learning in multi-view clustering as maximum likelihood estimation over latent cross-view correspondences. It sits within the 'General Noisy Correspondence Robustness' leaf of the taxonomy, which contains only two papers including this one. This leaf focuses on preventing overfitting to noisy correspondences through regularization or probabilistic inference, distinguishing it from contrastive-specific or pseudo-label-specific methods. The sparse population of this leaf suggests the paper addresses a relatively focused research direction within the broader noisy correspondence modeling branch.

The taxonomy reveals that the paper's immediate parent branch, 'Noisy Correspondence Modeling and Mitigation', contains five distinct subcategories addressing different aspects of correspondence noise. Neighboring leaves include 'Dual Noisy Correspondence in Contrastive Learning' (handling false positives and negatives in contrastive frameworks) and 'Pseudo-Label Noise and Correspondence Correction' (refining noisy pseudo-labels). The paper's generative probabilistic approach diverges from these contrastive and pseudo-label-centric methods, instead emphasizing latent correspondence inference through EM optimization. This positions it closer to probabilistic frameworks in adjacent branches like 'Probabilistic Multi-View Clustering', though the taxonomy places it firmly within the noisy correspondence domain rather than the broader probabilistic methods category.

Among the twelve candidates examined through limited semantic search, none were found to clearly refute any of the three identified contributions. The formalization of two critical forms of noisy correspondence (category-level and sample-level mismatch) was examined against one candidate with no refutation. The core CorreGen framework was compared against ten candidates, none providing overlapping prior work within this search scope. The EM-based optimization with GMM-guided marginals was examined against one candidate without refutation. These statistics reflect a constrained literature search rather than exhaustive coverage, suggesting that within the top-K semantic matches examined, the specific combination of generative modeling, dual noise formalization, and EM-based inference appears distinctive.

Based on the limited search scope of twelve candidates, the work appears to occupy a relatively sparse position within its immediate taxonomy leaf. None of the examined candidates refutes the claimed contributions, and the leaf contains only two papers; together these suggest that the specific technical approach may be novel within the boundaries of this search. However, the analysis does not cover the full breadth of the multi-view clustering literature, particularly methods in adjacent branches that might employ related probabilistic or generative techniques under different problem formulations.

Taxonomy

Core-task Taxonomy Papers: 34
Claimed Contributions: 3
Contribution Candidate Papers Compared: 12
Refutable Papers: 0

Research Landscape Overview

Core task: multi-view clustering under noisy correspondence. The field addresses scenarios where multiple views of the same data are available but the assumed one-to-one correspondence between samples across views is corrupted by noise, misalignment, or missing entries. The taxonomy reveals a rich landscape organized around several complementary themes. One major branch focuses explicitly on noisy correspondence modeling and mitigation, developing techniques to identify and correct erroneous pairings, as seen in works like Robust Noisy Correspondence[1] and Dual Noisy Correspondence[4]. Another prominent direction tackles incomplete multi-view clustering, where some views lack certain samples entirely, exemplified by Semantic Invariant Incomplete[9] and Error-Resilient Incomplete[20]. A third branch explores unpaired and partially aligned settings, relaxing the strict correspondence assumption and allowing distribution-level or group-level alignment, as in Cross-View Partial Alignment[6] and Distribution-Level Unaligned[7]. Additional branches address multi-source noise in features, subspace and low-rank methods, probabilistic and tensor-based frameworks, and large-scale anchor-based strategies, reflecting the diversity of technical approaches and problem formulations.

Within this landscape, particularly active lines of work contrast different noise assumptions and recovery strategies. Some methods explicitly model correspondence noise and attempt to prune or rectify mismatched pairs, while others adopt soft alignment or causal reasoning to sidestep the need for perfect correspondence. The original paper, Uncover Underlying Correspondence[0], sits squarely within the general noisy correspondence robustness cluster, sharing its branch with Robust Noisy Correspondence[1]. Both emphasize uncovering or recovering the true latent alignment when observed pairings are unreliable. Compared to nearby efforts like Subspace Alignment Constraint[3] or ROLL[5], which may impose geometric or regularization-based constraints, Uncover Underlying Correspondence[0] appears to focus more directly on inferring the hidden correspondence structure itself. This positioning highlights an ongoing tension in the field: whether to treat noisy correspondence as a nuisance to be filtered out, a latent variable to be inferred, or a soft constraint to be relaxed through probabilistic or causal modeling.

Claimed Contributions

Formalization of two critical forms of noisy correspondence in multi-view clustering

The authors formally define and distinguish two types of noisy correspondence problems in multi-view clustering: category-level mismatch (semantically consistent samples from the same class treated as negatives) and sample-level mismatch (cross-view pairs that are misaligned or lack valid counterparts). These definitions provide a structured framework for understanding correspondence noise in multi-view data.

1 retrieved paper
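To make the two noise forms above concrete, the following toy sketch may help. It is an illustration only, not taken from the paper: the helper names `false_negative_rate` and `corrupt_pairs`, the label list, and the noise ratio are all hypothetical. The first function measures how many in-batch "negatives" actually share a class (category-level mismatch); the second simulates misaligned cross-view pairs by shuffling a fraction of the pairing (sample-level mismatch).

```python
import random

def false_negative_rate(labels):
    """Fraction of ordered in-batch negative pairs (i != j) that share a class.

    A pairwise contrastive objective that treats every other sample as a
    negative pushes these same-class pairs apart (category-level mismatch).
    """
    n = len(labels)
    negatives = [(i, j) for i in range(n) for j in range(n) if i != j]
    same_class = sum(1 for i, j in negatives if labels[i] == labels[j])
    return same_class / len(negatives)

def corrupt_pairs(n, noise_ratio, seed=0):
    """Simulate sample-level mismatch (hypothetical noise model).

    Returns a list `pairing` where view-1 sample i is paired with view-2
    sample pairing[i]. A fraction `noise_ratio` of indices is selected and
    shuffled among themselves; a shuffle can leave some of them in place,
    so the number of actually mismatched pairs is at most the selected count.
    """
    rng = random.Random(seed)
    pairing = list(range(n))
    k = int(round(noise_ratio * n))
    chosen = rng.sample(range(n), k)
    shuffled = chosen[:]
    rng.shuffle(shuffled)
    for src, dst in zip(chosen, shuffled):
        pairing[src] = dst
    return pairing

if __name__ == "__main__":
    # Two classes of three samples each: 12 of the 30 ordered negatives share a class.
    print(false_negative_rate([0, 0, 0, 1, 1, 1]))
    print(corrupt_pairs(10, 0.3))
```

With balanced classes the false-negative share grows as the number of classes shrinks, which is one way to see why category-level mismatch matters more for clustering than for instance-level retrieval.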
CorreGen: a generative framework formulating noisy correspondence learning as maximum likelihood estimation

The authors introduce CorreGen, a novel generative framework that reformulates the noisy correspondence problem in multi-view clustering as a maximum likelihood estimation task over latent cross-view correspondences. This approach shifts from discriminative contrastive objectives to a probabilistic generative formulation that does not rely heavily on pre-defined pairs.

10 retrieved papers
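For orientation, the generic latent-variable view behind "maximum likelihood estimation over latent correspondences" can be written as the textbook EM lower bound below. This is a sketch under assumed notation, not the paper's actual objective: $X$ denotes the observed multi-view data, $C$ the latent cross-view correspondence, $\theta$ the embedding-network parameters, and $q$ a distribution over correspondences. The report does not give CorreGen's specific likelihood.

```latex
% Marginal log-likelihood and its evidence lower bound (Jensen's inequality)
\log p(X;\theta)
  = \log \sum_{C} p(X, C;\theta)
  \;\ge\; \sum_{C} q(C)\,\log \frac{p(X, C;\theta)}{q(C)}

% E-step: set q to the posterior over correspondences at the current parameters
q^{(t)}(C) = p\bigl(C \mid X;\theta^{(t)}\bigr)

% M-step: maximize the expected complete-data log-likelihood
\theta^{(t+1)} = \arg\max_{\theta}\;
  \mathbb{E}_{C \sim q^{(t)}}\bigl[\log p(X, C;\theta)\bigr]
```

Under this reading, soft correspondence distributions play the role of $q$, and the embedding update plays the role of the M-step.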
EM-based optimization algorithm with GMM-guided marginals and virtual sample mechanism

The authors develop an Expectation-Maximization algorithm to optimize their generative objective. The E-step infers soft correspondence distributions using GMM-guided marginals to capture category-level relationships and a virtual sample mechanism to handle unalignable samples, while the M-step updates the embedding network to maximize expected log-likelihood.

1 retrieved paper
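The report does not say how the GMM-guided marginals are computed. One pattern that is common in noisy-label and noisy-correspondence work is to fit a two-component Gaussian mixture to per-pair losses and treat the posterior of the low-mean component as a "clean" probability. The sketch below shows that pattern in plain Python, as an assumption about what such a step could look like rather than CorreGen's implementation. The function names `fit_two_gaussians`, `clean_probabilities`, and `to_marginal`, and the example losses, are hypothetical.

```python
import math

def fit_two_gaussians(values, iters=50):
    """Fit a 1-D two-component Gaussian mixture with EM.

    Returns (means, variances, weights, responsibilities). The returned
    responsibilities come from the last E-step, i.e. they were computed
    with the parameters in effect before the final M-step update.
    """
    lo, hi = min(values), max(values)
    span = max(hi - lo, 1e-6)
    means = [lo, hi]                      # initialise at the extremes
    variances = [span ** 2 / 4, span ** 2 / 4]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-((v - means[k]) ** 2) / (2 * variances[k]))
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300   # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate means, variances, and mixing weights.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid degenerate components
            weights[k] = nk / len(values)
    return means, variances, weights, resp

def clean_probabilities(losses):
    """Posterior of the low-mean ("clean") component for each per-pair loss."""
    means, _, _, resp = fit_two_gaussians(losses)
    clean = 0 if means[0] <= means[1] else 1
    return [r[clean] for r in resp]

def to_marginal(probs):
    """Normalise clean probabilities into a distribution over samples."""
    total = sum(probs)
    return [p / total for p in probs]

if __name__ == "__main__":
    # Hypothetical per-pair losses: four well-aligned pairs, three mismatched.
    losses = [0.10, 0.12, 0.09, 0.11, 2.0, 2.1, 1.9]
    print(to_marginal(clean_probabilities(losses)))
```

In this toy setup the high-loss pairs receive almost no mass in the resulting marginal, which is the down-weighting behaviour the contribution describes. How the paper combines such weights with the virtual sample mechanism is not specified in this report.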
