Unsupervised discovery of the shared and private geometry in multi-view data

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: manifold learning, multi-view, neuroscience, neural network, multi-region
Abstract:

Studying complex real-world phenomena often involves data from multiple views (e.g., sensor modalities or brain regions), each capturing different aspects of the underlying system. Within neuroscience, there is growing interest in large-scale simultaneous recordings across multiple brain regions. Understanding the relationship between views (e.g., the neural activity recorded in each region) can reveal fundamental insights into each view and the system as a whole. However, existing methods to characterize such relationships lack the expressivity required to capture nonlinear relationships, describe only shared sources of variance, or discard geometric information that is crucial to drawing insights from data. Here, we present SPLICE: a neural network-based method that infers disentangled, interpretable representations of private and shared latent variables from paired samples of high-dimensional views. Compared to competing methods, we demonstrate that SPLICE 1) disentangles shared and private representations more effectively, 2) yields more interpretable representations by preserving geometry, and 3) is more robust to incorrect a priori estimates of latent dimensionality. We propose our approach as a general-purpose method for finding succinct and interpretable descriptions of paired data sets in terms of disentangled shared and private latent variables.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces SPLICE, a neural network method for disentangling shared and private latent variables from paired multi-view data, with an emphasis on neuroscience applications. It resides in the Multimodal VAE Frameworks leaf, which contains five papers including the original work. This leaf sits within the broader Variational Autoencoder-Based Disentanglement branch, one of seven major branches in the taxonomy. The relatively small cluster suggests a moderately active but not overcrowded research direction focused on VAE architectures for explicit shared-private decomposition.

The taxonomy reveals neighboring approaches across multiple branches. Within the same VAE-based parent category, sibling leaves explore information bottleneck principles and domain-specific adaptations. Adjacent branches include Deep Learning Representation Disentanglement (adversarial and contrastive methods), Matrix and Tensor Factorization (classical decomposition techniques), and Probabilistic Bayesian Models (hierarchical priors). The paper's emphasis on geometry preservation and neuroscience data positions it at the intersection of general VAE frameworks and specialized application domains, distinguishing it from purely generative or domain-agnostic methods.

Among the twenty candidates examined across the three contributions, no clearly refuting prior work was identified. The SPLICE architecture contribution was compared against ten candidates with zero refutations, as was the geometry preservation framework. The predictability minimization approach retrieved no candidate papers, so no comparison could be made for that specific claim. This limited search scope of twenty papers from semantic search and citation expansion suggests the analysis captures closely related work but cannot claim exhaustive coverage. The absence of refutations among examined candidates indicates potential novelty within the sampled literature, though broader searches might reveal additional overlaps.

Based on the limited search scope, SPLICE appears to occupy a distinct position emphasizing geometric interpretability within the multimodal VAE landscape. The analysis covers top-ranked semantic matches and immediate citations but does not encompass the full breadth of multi-view learning or neuroscience-specific methods. The combination of VAE-based disentanglement, geometry preservation, and neuroscience focus differentiates it from examined neighbors, though the small sample size warrants cautious interpretation of novelty claims.

Taxonomy

50 Core-task Taxonomy Papers
3 Claimed Contributions
20 Contribution Candidate Papers Compared
0 Refutable Papers

Research Landscape Overview

Core task: Disentangling shared and private latent variables in multi-view data. The field addresses how to decompose observations from multiple sources or modalities into components that are common across views and components that are unique to each view.

The taxonomy reveals several major branches: Variational Autoencoder-Based Disentanglement leverages deep generative models to learn shared and private latent codes, often through structured inference networks; Deep Learning Representation Disentanglement explores neural architectures and contrastive objectives for separating factors; Matrix and Tensor Factorization Methods apply classical decomposition techniques to multi-view matrices or tensors; Probabilistic and Bayesian Latent Variable Models employ hierarchical priors and graphical models for interpretable factorization; Multi-View Clustering and Embedding focuses on joint clustering or embedding that respects both consensus and view-specific structure; Specialized Application Domains tailor disentanglement to fields like medical imaging or autonomous driving; and Partially Shared and Flexible Disentanglement relaxes strict shared-private assumptions to handle more nuanced overlap patterns.

Representative works such as Multi-VAE[2] and Disentangling Multimodal VAE[1] illustrate the VAE-based approach, while methods like Integrative Decomposition[23] and Covariance Structure Inference[24] exemplify matrix factorization strategies. A particularly active line of work within the VAE-based branch explores how to enforce or encourage disentanglement through architectural choices, regularization terms, and information-theoretic constraints, as seen in Multi-View Information Bottleneck[5] and Disentangled Variational Bottleneck[38]. Trade-offs between reconstruction fidelity, interpretability, and the degree of overlap permitted between shared and private codes remain central questions.
Shared Private Geometry[0] sits within the Multimodal VAE Frameworks cluster, closely related to Disentangling Multimodal VAE[1] and M2VAE[36], which similarly employ variational inference to separate common and view-specific latent factors. Compared to these neighbors, Shared Private Geometry[0] emphasizes geometric structure in the latent space, whereas Disentangling Multimodal VAE[1] focuses on modality-specific encoders and M2VAE[36] on handling missing views. This positioning highlights ongoing efforts to refine how VAE architectures balance flexibility, interpretability, and robustness across diverse multi-view scenarios.

Claimed Contributions

SPLICE neural network architecture for disentangling shared and private latent variables

The authors introduce SPLICE, a two-step neural network method that uses predictability minimization in a crossed autoencoder framework to disentangle shared and private latent variables from multi-view data, then applies geometry-preserving loss terms to maintain interpretable submanifold structure in the learned representations.

10 retrieved papers
Predictability minimization approach for robust disentangling

The method employs adversarial measurement networks that attempt to predict one view from the other's private latents, training encoders to minimize this predictability. This approach prevents shared information from leaking into private latents regardless of whether it appears in inferred shared latents, making the model more robust to mis-specified dimensionality.

0 retrieved papers
Geometry preservation framework for interpretable latent representations

After disentangling, SPLICE projects data onto shared and private submanifolds, estimates geodesic distances using manifold learning techniques, then fine-tunes the network with geometry-preserving loss terms that match latent space Euclidean distances to estimated submanifold geodesic distances, yielding interpretable geometric structure.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

SPLICE neural network architecture for disentangling shared and private latent variables

The authors introduce SPLICE, a two-step neural network method that uses predictability minimization in a crossed autoencoder framework to disentangle shared and private latent variables from multi-view data, then applies geometry-preserving loss terms to maintain interpretable submanifold structure in the learned representations.
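The crossed-autoencoder structure described above can be illustrated with a minimal numpy sketch. Plain linear maps stand in for SPLICE's actual neural encoders and decoders, and all dimensions (`d_a`, `d_b`, `d_sh`, `d_pr`) are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d_a, d_b, d_sh, d_pr = 8, 6, 2, 3  # hypothetical view/latent dimensions

# Linear stand-ins for the encoder/decoder networks.
enc_a = rng.normal(size=(d_a, d_sh + d_pr))
enc_b = rng.normal(size=(d_b, d_sh + d_pr))
dec_a = rng.normal(size=(d_sh + d_pr, d_a))
dec_b = rng.normal(size=(d_sh + d_pr, d_b))

def crossed_forward(x_a, x_b):
    za, zb = x_a @ enc_a, x_b @ enc_b
    sh_a, pr_a = za[:, :d_sh], za[:, d_sh:]  # shared / private split
    sh_b, pr_b = zb[:, :d_sh], zb[:, d_sh:]
    # Crossed reconstruction: each view is decoded from the OTHER
    # view's shared code plus its own private code, so only
    # information present in both views can use the shared channel.
    xa_hat = np.hstack([sh_b, pr_a]) @ dec_a
    xb_hat = np.hstack([sh_a, pr_b]) @ dec_b
    return xa_hat, xb_hat

x_a, x_b = rng.normal(size=(5, d_a)), rng.normal(size=(5, d_b))
xa_hat, xb_hat = crossed_forward(x_a, x_b)
```

The key design point is the crossed wiring: a decoder never sees its own view's shared code, so shared codes are only useful for reconstruction to the extent that they carry information common to both views.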

Contribution

Predictability minimization approach for robust disentangling

The method employs adversarial measurement networks that attempt to predict one view from the other's private latents, training encoders to minimize this predictability. This approach prevents shared information from leaking into private latents regardless of whether it appears in inferred shared latents, making the model more robust to mis-specified dimensionality.
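The idea can be checked numerically on a toy example. In the sketch below, a closed-form linear least-squares predictor stands in for the adversarial measurement network (the method itself trains a neural predictor): a private latent that leaks the shared signal is highly predictive of the other view, while a clean private latent is not. SPLICE's encoders are trained to drive this predictability toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: view B carries a shared signal s plus private noise.
n = 200
s = rng.normal(size=(n, 1))                     # shared latent
x_b = np.hstack([s, rng.normal(size=(n, 1))])   # view B observations

def predictability(z_private, target):
    """R^2 of the best linear 'measurement network' predicting
    `target` from `z_private` (closed form in this linear sketch)."""
    w, *_ = np.linalg.lstsq(z_private, target, rcond=None)
    resid = target - z_private @ w
    return 1.0 - resid.var() / target.var()

# A leaky private latent (contains the shared signal) predicts the
# other view well; an independent private latent does not.
z_leaky = s + 0.1 * rng.normal(size=(n, 1))
z_clean = rng.normal(size=(n, 1))
```

Minimizing this predictability penalizes shared information wherever it hides in the private code, which is why the approach does not depend on the shared latent space having exactly the right dimensionality.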

Contribution

Geometry preservation framework for interpretable latent representations

After disentangling, SPLICE projects data onto shared and private submanifolds, estimates geodesic distances using manifold learning techniques, then fine-tunes the network with geometry-preserving loss terms that match latent space Euclidean distances to estimated submanifold geodesic distances, yielding interpretable geometric structure.
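The pipeline described above can be sketched on a toy manifold. The following numpy sketch estimates geodesic distances Isomap-style (k-nearest-neighbour graph plus shortest paths) and evaluates a distance-matching loss; the half-circle data and the choice k=2 are illustrative assumptions, not details from the paper.

```python
import numpy as np

def geodesic_distances(x, k=2):
    """Isomap-style geodesic estimate: build a k-nearest-neighbour
    graph on the samples, then take shortest-path distances."""
    n = len(x)
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    g = np.full((n, n), np.inf)
    np.fill_diagonal(g, 0.0)
    nbrs = np.argsort(d, axis=1)[:, 1:k + 1]
    for i in range(n):          # symmetric k-NN edges
        g[i, nbrs[i]] = d[i, nbrs[i]]
        g[nbrs[i], i] = d[i, nbrs[i]]
    for m in range(n):          # Floyd-Warshall shortest paths
        g = np.minimum(g, g[:, m:m + 1] + g[m:m + 1, :])
    return g

def geometry_loss(z, geo):
    """Match latent Euclidean distances to estimated geodesics."""
    dz = np.linalg.norm(z[:, None] - z[None, :], axis=-1)
    return np.mean((dz - geo) ** 2)

# Points on a half circle: the geodesic between the endpoints follows
# the arc (length ~pi), while the straight-line distance is only 2.
theta = np.linspace(0.0, np.pi, 40)
x = np.stack([np.cos(theta), np.sin(theta)], axis=1)
geo = geodesic_distances(x)

# A 1-D latent that unrolls the arc (the angle itself) matches the
# estimated geodesics almost exactly, giving a near-zero loss.
loss = geometry_loss(theta[:, None], geo)
```

Matching latent Euclidean distances to geodesic rather than ambient distances is what lets the fine-tuned latent space flatten the curved submanifold into an interpretable coordinate, in the sense the contribution describes.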