Identifiability and recoverability in self-supervised models

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: identifiability, self-supervised learning, disentanglement
Abstract:

Self-supervised models exhibit a surprising stability in their internal representations. Whereas most prior work treats this stability as a single property, we formalize it as two distinct concepts: statistical identifiability (consistency of representations across runs) and structural identifiability (alignment of representations with some unobserved ground truth). Recognizing that perfect pointwise identifiability is generally unrealistic for modern representation learning models, we propose new model-agnostic definitions of statistical and structural near-identifiability of representations up to some error tolerance ε. Leveraging these definitions, we prove a statistical ε-near-identifiability result for the representations of models with nonlinear decoders, generalizing existing identifiability theory beyond the last-layer representations of, e.g., generative pre-trained transformers (GPTs) to near-identifiability of the intermediate representations of a broad class of models including (masked) autoencoders (MAEs) and supervised learners. Although these weaker assumptions confer weaker identifiability, we show that independent component analysis (ICA) can resolve much of the remaining linear ambiguity for this class of models, and we validate and measure our near-identifiability claims empirically. With additional assumptions on the data-generating process, statistical identifiability extends to structural identifiability, yielding a simple and practical recipe for disentanglement: ICA post-processing of latent representations. On synthetic benchmarks, this approach achieves state-of-the-art disentanglement using a vanilla autoencoder. With a foundation model-scale MAE for cell microscopy, it disentangles biological variation from technical batch effects, substantially improving downstream generalization.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces model-agnostic definitions of statistical and structural near-identifiability, extending identifiability theory beyond last-layer representations to intermediate layers in models with nonlinear decoders. It resides in the 'Identifiability Theory for Self-Supervised Learning' leaf, which contains five papers total, including the original work. This leaf sits within the broader 'Theoretical Foundations of Identifiability' branch, indicating a moderately populated research direction focused on formal guarantees rather than empirical validation or application-specific methods.

The taxonomy reveals neighboring leaves addressing 'Contrastive Learning Identifiability' (three papers) and 'Causal and Supervised Identifiability' (four papers), suggesting the field partitions identifiability research by learning paradigm and supervision type. The sibling papers in the same leaf—such as Contrastive ICA Identifiability and Multiview Correlation Identifiability—establish identifiability conditions for specific self-supervised objectives, while this work aims for broader model-agnostic applicability. The scope note explicitly excludes empirical validation studies, positioning this leaf as a hub for theoretical contributions rather than practical demonstrations.

Among thirty candidates examined, the contribution-level analysis shows varied novelty profiles. For both the model-agnostic definitions (Contribution A) and the statistical near-identifiability theorem for intermediate representations (Contribution B), ten candidates were examined with zero refutable overlaps, suggesting these theoretical formulations occupy relatively unexplored territory within the limited search scope. For the ICA post-processing recipe (Contribution C), ten candidates were examined and one refutable match was found, indicating that some prior work on disentanglement via ICA exists, though the specific integration with near-identifiability may still offer incremental value.

Based on the top-thirty semantic matches and citation expansion, the work appears to advance identifiability theory into less-explored territory—intermediate representations with nonlinear decoders—while the ICA application shows modest overlap with existing disentanglement literature. The analysis does not cover exhaustive field-wide searches, so additional related work may exist beyond the examined candidates.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: representation identifiability in self-supervised learning models. The field is organized around five main branches that together capture the theoretical underpinnings, structural properties, learning dynamics, domain-specific adaptations, and practical applications of identifiable representations. Theoretical Foundations of Identifiability establishes the mathematical conditions under which learned representations can be uniquely recovered or related to ground-truth factors, often drawing on tools from independent component analysis and causal inference, as in Contrastive ICA Identifiability[44] and Multiview Correlation Identifiability[49]. Disentanglement and Structured Representations focuses on methods that separate underlying factors of variation, ranging from group-theoretic frameworks such as Disentangled Group Representation[4] to causal grouping strategies like Causal Grouping Identifiability[21]. Representation Learning Dynamics and Analysis examines how representations evolve during training, including emergent properties captured by SSL Dynamics[5] and Emergent Linear Representations[27]. Domain-Specific Self-Supervised Learning tailors identifiability principles to particular modalities (speech, vision, robotics), while Applications and Downstream Tasks explores how identifiable features improve generalization and interpretability in real-world settings.

Recent work has intensified the dialogue between empirical validation and formal guarantees, with studies like Empirical Identifiability Theory[2] and Empirical Identifiability Position[6] probing the gap between theoretical assumptions and practical learning scenarios. Within the Theoretical Foundations branch, Identifiability Recoverability SSL[0] contributes to this conversation by addressing conditions under which self-supervised objectives provably recover latent structures, positioning itself alongside Content Style Isolation[1] and Dieting Identifiable Features[17], which explore complementary notions of feature separability and minimal sufficient statistics.

A central tension across these lines of work is the trade-off between strong identifiability guarantees, which often require restrictive assumptions, and the flexibility needed for diverse data distributions and architectures. Identifiability Recoverability SSL[0] engages with this challenge by refining recoverability criteria, offering a perspective that bridges rigorous theory with the nuanced realities of modern self-supervised learning pipelines.

Claimed Contributions

Contribution A: Model-agnostic definitions of statistical and structural near-identifiability

The authors introduce formal definitions that distinguish between statistical identifiability (consistency of representations across runs) and structural identifiability (alignment with ground truth), relaxing perfect identifiability to near-identifiability with error tolerance ε. These definitions are presented as the first general-purpose formulations applicable when representations are only nearly identifiable; a plausible formalization of such definitions is sketched after this entry.

Retrieved papers compared: 10
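The report does not reproduce the paper's formal statements. As a hedged illustration only, the LaTeX sketch below shows one common way such ε-near-identifiability notions can be written; the encoders f_1 and f_2, the ground-truth latent z, and the ambiguity class 𝒜 are assumed notation for this sketch, not taken from the paper under review.

```latex
% Hypothetical sketch of epsilon-near-identifiability definitions.
% Notation (f_1, f_2, z, \mathcal{A}) is assumed for illustration
% and is not taken from the paper under review.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

Let $f_1, f_2$ be encoders from two independent training runs on data
$x \sim p(x)$, and let $\mathcal{A}$ be a class of admissible
ambiguities (for example, rigid transformations). One natural reading
of \emph{statistical} $\epsilon$-near-identifiability is
\begin{equation}
  \inf_{A \in \mathcal{A}}
  \mathbb{E}_{x}\,\bigl\lVert f_1(x) - A f_2(x) \bigr\rVert
  \le \epsilon ,
\end{equation}
that is, representations from different runs agree up to an ambiguity
in $\mathcal{A}$ and a tolerance $\epsilon$. \emph{Structural}
$\epsilon$-near-identifiability instead compares against the
ground-truth latent $z$ that generated $x$:
\begin{equation}
  \inf_{A \in \mathcal{A}}
  \mathbb{E}_{(x,z)}\,\bigl\lVert f_1(x) - A z \bigr\rVert
  \le \epsilon .
\end{equation}
Exact identifiability is recovered at $\epsilon = 0$.

\end{document}
```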
Contribution B: Statistical near-identifiability theorem for intermediate representations with nonlinear decoders

The authors prove that intermediate-layer representations of models with nonlinear decoders (including masked autoencoders, GPTs, and supervised learners) are statistically near-identifiable up to rigid transformations, where the nearness tolerance depends on the local bi-Lipschitz constant of the decoder. This extends prior results, which covered only last-layer representations that enter the loss through a linear map; an illustrative empirical check is sketched after this entry.

Retrieved papers compared: 10
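Agreement up to a rigid transformation is directly measurable. The snippet below is a hedged sketch (not the paper's evaluation code): it aligns intermediate representations from two runs with orthogonal Procrustes after centering and reports the residual; a small residual is consistent with statistical ε-near-identifiability up to a rigid transformation.

```python
# Hypothetical check of statistical near-identifiability up to a rigid
# transformation: align representations from two training runs and
# report the residual. One common way to measure such a claim; not the
# paper's own evaluation code.
import numpy as np
from scipy.linalg import orthogonal_procrustes

def rigid_alignment_error(Z1: np.ndarray, Z2: np.ndarray) -> float:
    """Mean residual after optimally aligning Z2 to Z1 with a
    rotation/reflection plus translation (a rigid transformation).

    Z1, Z2: (n_samples, dim) representations of the same inputs from
    two independently trained models.
    """
    # Remove translations by centering both point clouds.
    Z1c = Z1 - Z1.mean(axis=0)
    Z2c = Z2 - Z2.mean(axis=0)
    # Orthogonal matrix R minimizing ||Z2c @ R - Z1c||_F.
    R, _ = orthogonal_procrustes(Z2c, Z1c)
    residual = Z1c - Z2c @ R
    # Average per-sample error; small values are consistent with
    # epsilon-near-identifiability up to rigid transformations.
    return float(np.linalg.norm(residual, axis=1).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy example: Z2 is a rotated, shifted, slightly noisy copy of Z1.
    Z1 = rng.normal(size=(1000, 8))
    Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random rotation
    Z2 = Z1 @ Q + rng.normal(scale=0.01, size=(1000, 8)) + 3.0
    print(f"alignment error: {rigid_alignment_error(Z1, Z2):.4f}")
```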
Contribution C: Practical recipe for disentanglement via ICA post-processing

The authors show that, under bi-Lipschitz assumptions on the data-generating process, applying independent component analysis (ICA) to the latent representations of encoder-decoder models achieves structural identifiability and disentanglement. They demonstrate that this approach achieves state-of-the-art disentanglement on synthetic benchmarks using vanilla autoencoders and improves out-of-distribution generalization in a foundation model for cell microscopy; a minimal sketch of the recipe follows this entry.

Retrieved papers compared: 10
Refutation status: can refute (one refutable match found)
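Since the recipe itself is simple (fit an encoder-decoder model, then run ICA on its latents), a minimal sketch is easy to give. The snippet below is an assumed illustration using scikit-learn's FastICA, not the authors' implementation; the `latents` array here is a synthetic stand-in for encoder outputs, which in the actual recipe would come from a trained autoencoder or MAE encoder.

```python
# Minimal sketch of ICA post-processing of latent representations.
# Assumed illustration only; `latents` stands in for encoder outputs.
import numpy as np
from sklearn.decomposition import FastICA

def disentangle_latents(latents: np.ndarray, n_components=None):
    """Rotate latent representations with ICA to resolve the residual
    linear ambiguity left by (near-)identifiability results."""
    ica = FastICA(n_components=n_components, whiten="unit-variance",
                  random_state=0, max_iter=1000)
    sources = ica.fit_transform(latents)  # (n_samples, n_components)
    return sources, ica

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for encoder outputs: independent non-Gaussian factors
    # observed through an unknown invertible linear mixing.
    s = rng.laplace(size=(5000, 4))
    mixing = rng.normal(size=(4, 4))
    latents = s @ mixing.T
    recovered, _ = disentangle_latents(latents)
    # Up to permutation, sign, and scale, `recovered` should match `s`:
    # the cross-correlation matrix should be near a permutation matrix.
    corr = np.corrcoef(recovered.T, s.T)[:4, 4:]
    print(np.round(np.abs(corr), 2))
```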

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution A

Model-agnostic definitions of statistical and structural near-identifiability

The authors introduce formal definitions that distinguish between statistical identifiability (consistency of representations across runs) and structural identifiability (alignment with ground truth), relaxing perfect identifiability to near-identifiability with error tolerance ε. These definitions are the first general-purpose formulations applicable when representations are nearly identifiable.

Contribution B

Statistical near-identifiability theorem for intermediate representations with nonlinear decoders

The authors prove that intermediate-layer representations of models with nonlinear decoders (including masked autoencoders, GPTs, and supervised learners) are statistically near-identifiable up to rigid transformations, where the nearness tolerance depends on the local bi-Lipschitz constant of the decoder. This extends prior results, which covered only last-layer representations that enter the loss through a linear map.

Contribution C

Practical recipe for disentanglement via ICA post-processing

The authors show that, under bi-Lipschitz assumptions on the data-generating process, applying independent component analysis to the latent representations of encoder-decoder models achieves structural identifiability and disentanglement. They demonstrate that this approach achieves state-of-the-art disentanglement on synthetic benchmarks using vanilla autoencoders and improves out-of-distribution generalization in a foundation model for cell microscopy.