FullPart: Generating each 3D Part at Full Resolution

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: 3D Generation · Diffusion Model · Part Generation
Abstract:

Part-based 3D generation holds great potential for various applications. Previous part generators that represent parts with implicit vector-set tokens often suffer from insufficient geometric detail. Another line of work adopts an explicit voxel representation but shares a global voxel grid among all parts; this often leaves small parts with too few voxels, degrading their quality. In this paper, we propose FullPart, a novel framework that combines both implicit and explicit paradigms. It first derives the bounding-box layout through an implicit box vector-set diffusion process, a task that implicit diffusion handles effectively since box tokens contain little geometric detail. It then generates detailed parts, each within its own fixed full-resolution voxel grid. Instead of sharing a global low-resolution space, each part in our method, even a small one, is generated at full resolution, enabling the synthesis of intricate details. We further introduce a center-point encoding strategy to address the misalignment that arises when parts of different actual sizes exchange information, thereby maintaining global coherence. Moreover, to tackle the scarcity of reliable part data, we present PartVerse-XL, the largest human-annotated 3D part dataset to date. Extensive experiments demonstrate that FullPart achieves state-of-the-art results in 3D part generation. We will release all code, data, and models to benefit future research in 3D part generation.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Taxonomy

Core-task Taxonomy Papers: 33
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: part-based 3D object generation with full-resolution representation. The field organizes around several complementary directions. Part-Aware Representation and Decomposition Strategies focus on how to segment and model objects as assemblies of meaningful components, often leveraging latent diffusion or hierarchical encodings to capture part-level semantics. Full-Resolution and High-Fidelity Volumetric Approaches emphasize maintaining geometric detail at scale, exploring techniques such as dual volume packing (Dual Volume Packing[5]) or octree-based representations (OctGPT[10]) to balance memory and fidelity. Compositional and Multi-View Synthesis Methods address the challenge of generating coherent 3D content from multiple viewpoints or by composing learned part priors, with works like Category Aware Composition[1] and Sparc3D[2] illustrating how category-specific knowledge can guide assembly. Domain-Specific Part-Based Applications tailor these ideas to specialized contexts—ranging from human body modeling (GHUM[20]) and talking avatars (TalkingGaussian[13], PoseTalker[29]) to medical imaging (3D MRI Synthesis[27])—while Representation Learning and Encoding Foundations provide the underlying machinery, including variational autoencoders (Shape VAE[14]) and discrete tokenization schemes (Discrete Representation Learning[25]). A particularly active line of work explores part-level latent diffusion, where generative models operate in a structured latent space that respects object decomposition. Within this cluster, Contextual Part Latents[3] conditions diffusion on part relationships to ensure coherent assembly, while FullPart[0] extends this idea by maintaining full-resolution detail throughout the generation process, avoiding the loss of fine geometric features that can occur with coarser representations. 
Nearby efforts such as Diverse Part Synthesis[11] and Assembler[7] similarly emphasize compositional generation but may trade off resolution for broader part diversity or faster sampling. The central tension across these branches is between expressive part-level control and the computational cost of high-fidelity volumetric outputs. FullPart[0] sits at the intersection of contextual part modeling and full-resolution synthesis, aiming to preserve both semantic decomposition and geometric detail—a balance that distinguishes it from methods prioritizing either coarse part assembly or resolution alone.

Claimed Contributions

FullPart framework combining implicit and explicit paradigms

The authors introduce FullPart, a framework that first generates bounding box layouts using implicit vecset diffusion, then generates each part at full resolution within its own dedicated voxel grid using explicit representation. This design addresses limitations of prior methods by enabling fine geometric details while maintaining global coherence.
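The shared-grid limitation this design targets can be made concrete with a small back-of-the-envelope sketch. All numbers below (grid resolutions, the 5% part extent) are illustrative assumptions, not values from the paper:

```python
# Illustrative comparison: voxels available to a small part under a
# shared global grid vs. a dedicated per-part grid (assumed sizes).

def occupied_voxels(part_extent: float, grid_res: int) -> int:
    """Voxels along one axis covered by a part whose extent is given
    as a fraction of the grid's world size (at least one voxel)."""
    return max(1, round(part_extent * grid_res))

GLOBAL_RES = 64        # assumed shared-grid resolution
PART_RES = 64          # assumed dedicated per-part resolution
part_extent = 0.05     # a small part spanning 5% of the object

shared = occupied_voxels(part_extent, GLOBAL_RES) ** 3
dedicated = PART_RES ** 3  # the part fills its own full grid

print(shared, dedicated)  # prints: 27 262144
```

Under these assumed numbers, a small part captured by only 3³ = 27 voxels in a shared grid would receive the full 64³ budget in its own grid, which is the resolution gap the two-stage design exploits.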

10 retrieved papers
Center-corner encoding strategy for part coherence

The authors propose a center-corner encoding mechanism that embeds absolute spatial context for each voxel by encoding the positions of its center and eight corners in a unified super-high-resolution global coordinate system. This addresses the scale misalignment problem when parts of different sizes exchange information through attention mechanisms.
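One way to read this mechanism is sketched below. This is a minimal interpretation, not the authors' implementation: the coordinate conventions, the 512³ global resolution, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Sketch of a center-corner encoding: each voxel of a part's local grid
# is tagged with the global positions of its center and eight corners,
# quantized to an assumed high-resolution shared coordinate frame.

GLOBAL_RES = 512  # assumed "super-high-resolution" global grid


def center_corner_encoding(box_min, box_size, part_res):
    """box_min, box_size: part bounding box in normalized [0, 1]^3
    global coordinates; part_res: per-part voxel resolution.
    Returns a (part_res**3, 27) array: 9 points x 3 coords per voxel."""
    box_min = np.asarray(box_min, dtype=np.float64)
    box_size = np.asarray(box_size, dtype=np.float64)
    voxel = box_size / part_res  # world-space size of one local voxel

    # Local voxel indices (i, j, k) for every voxel in the part grid.
    idx = np.stack(np.meshgrid(*[np.arange(part_res)] * 3,
                               indexing="ij"), axis=-1).reshape(-1, 3)

    # Offsets of the center and the 8 corners within a unit voxel.
    offsets = np.array([[0.5, 0.5, 0.5]] +
                       [[x, y, z] for x in (0, 1)
                                  for y in (0, 1)
                                  for z in (0, 1)], dtype=np.float64)

    # Global positions, snapped to the shared high-resolution grid.
    pts = box_min + (idx[:, None, :] + offsets[None, :, :]) * voxel
    return np.round(pts * GLOBAL_RES).reshape(-1, 27) / GLOBAL_RES
```

Because every part's encoding is expressed on the same global grid regardless of its bounding-box size, tokens from a small part and a large part carry commensurable absolute positions, which is what lets cross-part attention reason about scale and placement consistently.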

10 retrieved papers
PartVerse-XL dataset

The authors introduce PartVerse-XL, the largest human-annotated 3D part dataset to date, containing 40K objects and 320K parts with associated part-aware texture descriptions. The dataset was created through mesh pre-segmentation followed by human refinement to ensure high-quality, semantically consistent annotations.

10 retrieved papers
Can Refute

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

FullPart framework combining implicit and explicit paradigms

The authors introduce FullPart, a framework that first generates bounding box layouts using implicit vecset diffusion, then generates each part at full resolution within its own dedicated voxel grid using explicit representation. This design addresses limitations of prior methods by enabling fine geometric details while maintaining global coherence.

Contribution

Center-corner encoding strategy for part coherence

The authors propose a center-corner encoding mechanism that embeds absolute spatial context for each voxel by encoding the positions of its center and eight corners in a unified super-high-resolution global coordinate system. This addresses the scale misalignment problem when parts of different sizes exchange information through attention mechanisms.

Contribution

PartVerse-XL dataset

The authors introduce PartVerse-XL, the largest human-annotated 3D part dataset to date, containing 40K objects and 320K parts with associated part-aware texture descriptions. The dataset was created through mesh pre-segmentation followed by human refinement to ensure high-quality, semantically consistent annotations.