DexNDM: Closing the Reality Gap for Dexterous In-Hand Rotation via Joint-Wise Neural Dynamics Model

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: In-Hand Object Rotation; Sim-to-Real; Neural Dynamics Model
Abstract:

Achieving generalized in-hand object rotation remains a significant challenge in robotics, largely due to the difficulty of transferring policies from simulation to the real world. The complex, contact-rich dynamics of dexterous manipulation create a "reality gap" that has limited prior work to constrained scenarios involving simple geometries, limited object sizes and aspect ratios, constrained wrist poses, or customized hands. We address this sim-to-real challenge with a novel framework that enables a single policy, trained in simulation, to generalize to a wide variety of objects and conditions in the real world. The core of our method is a joint-wise dynamics model that learns to bridge the reality gap by effectively fitting a limited amount of real-world data and then adapting the sim policy’s actions accordingly. The model is highly data‑efficient and generalizable across different whole‑hand interaction distributions by factorizing dynamics across joints, compressing system-wide influences into low‑dimensional variables, and learning each joint’s evolution from its own dynamic profile, implicitly capturing these net effects. We pair this with a fully autonomous data collection strategy that gathers diverse, real-world interaction data with minimal human intervention. Our complete pipeline demonstrates unprecedented generality: a single policy successfully rotates challenging objects with complex shapes (e.g., animals), high aspect ratios (up to 5.33), and small sizes, all while handling diverse wrist orientations and rotation axes. Comprehensive real-world evaluations and a teleoperation application for complex tasks validate the effectiveness and robustness of our approach. Website: DexNDM.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper contributes a joint-wise neural dynamics model for sim-to-real transfer in dexterous in-hand object rotation, paired with an autonomous data collection strategy. It resides in the 'Adaptive and Fine-Tuning Transfer' leaf, which contains five papers total including the original work. This leaf sits within the broader 'Sim-to-Real Transfer Methods and Frameworks' branch, indicating a moderately populated research direction. The sibling papers explore online adaptation (Rapid Motor Adaptation), controller refinements (DexCtrl), and privileged information distillation (DROP), suggesting this adaptive transfer area is active but not overcrowded compared to domain randomization approaches.

The taxonomy reveals neighboring directions that contextualize this work's positioning. The adjacent 'Domain Randomization and Direct Transfer' leaf contains six papers emphasizing zero-shot deployment without fine-tuning, while 'Simulation-Guided and Digital Twin Approaches' (three papers) explores world models for transfer. The paper's focus on data-efficient adaptation distinguishes it from purely randomization-based methods, yet shares goals with simulation-guided approaches. The 'Sensory Modalities and Perception' branch (thirteen papers across vision, tactile, and multimodal categories) addresses complementary challenges in state estimation, while this work assumes sensory inputs are available and focuses on dynamics modeling.

Among twenty-three candidates examined across three contributions, no clearly refutable prior work emerged. For the joint-wise dynamics model, three candidates were examined with zero refutations, suggesting this factorized approach to dynamics learning may represent a less-explored angle. For the autonomous data collection strategy and the overall sim-to-real framework, ten candidates each were examined without refutation, indicating these contributions appear novel within the limited search scope. However, the search covered only top-K semantic matches plus citations and was not an exhaustive survey, so related work in adjacent subfields (e.g., system identification, residual learning) may exist outside this candidate pool.

The analysis suggests the paper occupies a moderately novel position within adaptive sim-to-real transfer, with no strong prior work overlap detected among examined candidates. The joint-wise factorization and autonomous data collection appear distinctive given the search scope, though the limited candidate pool (twenty-three papers) means adjacent literature in dynamics modeling or automated data collection may not have been fully captured. The taxonomy structure indicates this adaptive transfer direction remains active, with ongoing exploration of data efficiency versus generalization trade-offs.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 23
Refutable Papers: 0

Research Landscape Overview

Core task: dexterous in-hand object rotation with sim-to-real transfer. The field organizes around several complementary branches that address different facets of transferring learned manipulation skills from simulation to physical robots. Sim-to-Real Transfer Methods and Frameworks form the backbone, encompassing domain randomization, system identification, and adaptive fine-tuning strategies that bridge the reality gap. Sensory Modalities and Perception explores how vision, tactile feedback, and multimodal fusion enable robust state estimation during contact-rich tasks. Planning and Control Strategies investigates model-based approaches, reinforcement learning policies, and hybrid methods for generating dexterous motions. Human Demonstration and Imitation Learning leverages teleoperation data and motion retargeting to bootstrap policies, while Task-Specific Applications and Benchmarks provide standardized testbeds for rotation, reorientation, and tool use. System Design and Generative Frameworks address hardware considerations and procedural generation of training scenarios, and Surveys and Modeling Studies offer theoretical perspectives on contact dynamics and skill acquisition.

Within the adaptive and fine-tuning transfer cluster, several works illustrate contrasting strategies for closing the sim-to-real gap. Rapid Motor Adaptation[2] and Simulation Guided Finetuning[21] emphasize online adaptation mechanisms that adjust policies during deployment, while DexCtrl[29] explores controller-level refinements. DexNDM[0] situates itself in this adaptive branch by proposing a neural dynamics model that combines simulation-trained priors with real-world fine-tuning, balancing sample efficiency against the need for physical interaction. Compared to DROP[3], which focuses on privileged information distillation, and ManipTrans[11], which targets cross-task transfer, DexNDM[0] emphasizes iterative refinement of dynamics predictions to handle object variability. This adaptive fine-tuning direction remains active, as researchers navigate trade-offs between purely sim-trained policies like Visual Dexterity[5] and methods requiring substantial real-world data collection.

Claimed Contributions

Joint-wise neural dynamics model for sim-to-real transfer

The authors introduce a novel neural dynamics model that factorizes system dynamics across individual joints rather than modeling the whole hand-object system. Each joint's evolution is predicted from its own proprioceptive history, compressing system-wide influences into low-dimensional variables. This design achieves high data efficiency and generalizability across different interaction distributions.

3 retrieved papers
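
As a rough illustration of the factorization described above, the sketch below gives each joint its own small predictor that sees only that joint's history. It is not the authors' implementation: the class name `JointWiseDynamics`, the input layout (past position and commanded-action pairs), the network size, and the residual output are all assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


class JointWiseDynamics:
    """Hypothetical sketch: one tiny MLP per joint, fed only that joint's history."""

    def __init__(self, num_joints, history_len, hidden=8, seed=0):
        rng = random.Random(seed)
        self.num_joints = num_joints
        self.history_len = history_len
        in_dim = 2 * history_len  # assumed input: (position, action) per past step
        self.params = []
        for _ in range(num_joints):
            w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
            b1 = [0.0] * hidden
            w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]
            self.params.append((w1, b1, w2))

    def predict_joint(self, joint, history):
        """Predict the next position of one joint.

        history: list of (position, action) pairs for this joint, oldest first.
        """
        if len(history) != self.history_len:
            raise ValueError("history length does not match history_len")
        x = [value for pair in history for value in pair]
        w1, b1, w2 = self.params[joint]
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)
        ]
        delta = sum(w * h for w, h in zip(w2, hidden))
        # Assumed design choice: predict a residual on the last observed position.
        return history[-1][0] + delta

    def predict(self, histories):
        """histories[j] is the history of joint j; joints are handled independently."""
        return [self.predict_joint(j, histories[j]) for j in range(self.num_joints)]


if __name__ == "__main__":
    model = JointWiseDynamics(num_joints=3, history_len=2)
    histories = [[(0.0, 0.1), (0.1, 0.2)] for _ in range(3)]
    print(model.predict(histories))
```

Because each predictor reads only its own joint's history, perturbing one joint's input leaves the other joints' predictions unchanged. In this reading, any influence of the object or of other fingers has to show up through that joint's own history, which is one way to interpret "compressing system-wide influences into low-dimensional variables".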
Fully autonomous data collection strategy

The authors develop an autonomous data collection method called Chaos Box that gathers real-world interaction data by placing the robotic hand in a container with soft balls and replaying actions from the simulated policy. This approach eliminates catastrophic failures and human resets while providing diverse, object-loaded interaction data at scale.

10 retrieved papers
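
The replay-and-log idea behind this contribution can be sketched as a simple loop. The report does not describe the hand driver or the logged fields, so everything here is hypothetical: `FakeHand` is a stand-in for a real hand interface, and the transition format (`q`, `action`, `q_next`) is an assumption chosen so that per-joint training samples are easy to extract.

```python
class FakeHand:
    """Hypothetical stand-in for a real hand driver: joints lag behind targets."""

    def __init__(self, num_joints):
        self.q = [0.0] * num_joints

    def read_joint_positions(self):
        return list(self.q)

    def send_position_targets(self, targets):
        # Toy response: move halfway toward each commanded target.
        self.q = [q + 0.5 * (t - q) for q, t in zip(self.q, targets)]


def replay_and_log(hand, action_sequence):
    """Replay recorded policy actions open-loop and log what the hand did."""
    transitions = []
    for action in action_sequence:
        q_before = hand.read_joint_positions()
        hand.send_position_targets(action)
        q_after = hand.read_joint_positions()
        transitions.append(
            {"q": q_before, "action": list(action), "q_next": q_after}
        )
    return transitions


def per_joint_samples(transitions, joint):
    """Slice the whole-hand log into (q, action, q_next) samples for one joint."""
    return [
        (t["q"][joint], t["action"][joint], t["q_next"][joint])
        for t in transitions
    ]


if __name__ == "__main__":
    hand = FakeHand(num_joints=2)
    log = replay_and_log(hand, [[1.0, 0.0], [1.0, 0.0]])
    print(per_joint_samples(log, 0))
```

Since the actions are replayed rather than chosen in response to the object state, nothing in this loop depends on a task succeeding, which is consistent with the report's point that the setup avoids human resets.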
Sim-to-real framework for general-purpose in-hand rotation

The authors present a complete pipeline combining specialist-to-generalist policy training with their joint-wise dynamics model and autonomous data collection. This framework demonstrates unprecedented generality in rotating challenging objects with complex shapes, high aspect ratios, small sizes, and diverse wrist orientations in real-world settings.

10 retrieved papers
