Geometry-aware Policy Imitation

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: imitation learning; diffusion policy
Abstract:

We propose a Geometry-Aware Policy Imitation (GPI) approach that rethinks imitation learning by treating demonstrations as geometric curves rather than collections of state–action samples. From these curves, GPI derives distance fields that give rise to two complementary control primitives: a progression flow that advances along expert trajectories and an attraction flow that corrects deviations. Their combination defines a controllable, non-parametric vector field that directly guides robot behavior. This formulation decouples metric learning from policy synthesis, enabling modular adaptation across low-dimensional robot states and high-dimensional perceptual inputs. GPI naturally supports multimodality by preserving distinct demonstrations as separate models and allows efficient composition of new demonstrations through simple additions to the distance field. We evaluate GPI in simulation and on real robots across diverse tasks. Experiments show that GPI achieves higher success rates than diffusion-based policies while running 20× faster, requiring less memory, and remaining robust to perturbations. These results establish GPI as an efficient, interpretable, and scalable alternative to generative approaches for robotic imitation learning.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes treating demonstrations as geometric curves and deriving distance fields to generate progression and attraction flows for robot control. It resides in the 'Distance Field-Based Policy Synthesis' leaf, which contains only three papers total, including this work and two siblings (DFields and Diff-lfd). This represents a relatively sparse research direction within the broader taxonomy of 31 papers across the field, suggesting the geometric distance field approach to policy synthesis remains an emerging area rather than a crowded subfield.

The taxonomy reveals that neighboring research directions include 'Geometric Constraint Inference' (extracting kinematic constraints from demonstrations) and 'Movement Primitive and Probabilistic Trajectory Methods' (encoding trajectories via dynamic movement primitives). The paper's approach diverges from these by directly synthesizing control policies from distance fields rather than extracting constraints or encoding trajectories probabilistically. It also differs from the 'Optimal Transport and Divergence Minimization' branch, which treats imitation as distribution matching rather than geometric curve following, and from 'Visual and Perceptual Imitation Learning', which focuses on high-dimensional sensory inputs rather than geometric structure.

The analysis examined zero candidate papers for all three contributions, meaning no literature search was conducted to identify potentially overlapping prior work. Without examining any candidates, the contribution-level statistics provide no evidence about whether the geometric distance field formulation, the decoupling of metric learning from policy synthesis, or the efficiency claims have substantial precedent. The absence of a literature search leaves the novelty assessment entirely dependent on the taxonomy structure and the two sibling papers in the same leaf, which address related but distinct aspects of geometric policy learning.

Given the limited search scope (zero candidates examined), this assessment reflects only the paper's position within a sparse taxonomy leaf and its relationship to two sibling works. The geometric distance field approach appears to occupy a relatively unexplored niche, but a comprehensive novelty evaluation would require examining a broader set of candidates from related leaves and potentially from outside the provided taxonomy structure.

Taxonomy

Core-task Taxonomy Papers: 31
Claimed Contributions: 3
Contribution Candidate Papers Compared: 0
Refutable Papers: 0

Research Landscape Overview

Core task: learning from demonstrations via geometric distance fields.

The field organizes around several complementary perspectives on how to extract and reproduce demonstrated behaviors. Geometric Representation and Distance Field Methods emphasize spatial structure and collision-free motion synthesis, often leveraging signed distance functions or manifold-based encodings to capture task constraints. Trajectory Representation and Encoding focuses on compressing and generalizing demonstrated paths through probabilistic models, dynamic movement primitives, or neural embeddings. Optimal Transport and Divergence Minimization treats imitation as a distribution-matching problem, using Wasserstein distances or Sinkhorn divergences to align learned policies with expert data. Reinforcement Learning-Based Imitation blends demonstration data with trial-and-error optimization, while Visual and Perceptual Imitation Learning tackles the challenge of learning from raw sensory inputs. Domain-Specific Imitation Applications explore targeted use cases such as robotic manipulation, autonomous driving, or multi-agent coordination, each imposing unique geometric or temporal constraints.

Within the geometric branch, a small cluster of works explores how distance fields can directly guide policy synthesis. DFields[14] constructs neural distance representations to encode spatial constraints from demonstrations, while Diff-lfd[4] uses diffusion models to generate trajectories that respect learned geometric structure. Geometry-aware Policy Imitation[0] sits naturally alongside these methods, emphasizing the integration of geometric priors into the imitation pipeline to ensure physically plausible and collision-aware behavior. Compared to Trajectory Optimisation Incremental[3], which refines paths through iterative optimization, Geometry-aware Policy Imitation[0] more directly encodes spatial relationships into the policy architecture.

This line of work addresses a central trade-off: balancing the expressiveness of learned representations against the interpretability and safety guarantees that explicit geometric reasoning provides, a question that remains active as methods scale to more complex environments and richer sensory modalities.

Claimed Contributions

Geometry-aware Policy Imitation (GPI) approach

GPI represents expert demonstrations as geometric curves that induce distance fields in state space. These fields give rise to two complementary control primitives: a progression flow advancing along trajectories and an attraction flow correcting deviations. Their combination defines a controllable, non-parametric vector field that directly guides robot behavior.
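The two primitives described above can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the demonstration is assumed to be a polyline of waypoints, the distance field is a plain Euclidean nearest-waypoint query, and the names `w_prog` and `w_attr` are invented weighting parameters.

```python
import numpy as np

def gpi_velocity(state, demo, w_prog=1.0, w_attr=1.0):
    """Combine a progression flow and an attraction flow into one vector field.

    state: (D,) current robot state
    demo:  (T, D) waypoints of one expert demonstration (a discretized curve)
    """
    # Distance field query: nearest waypoint on the demonstrated curve.
    dists = np.linalg.norm(demo - state, axis=1)
    i = int(np.argmin(dists))

    # Attraction flow: pull the state back toward the curve
    # (negative gradient of the squared distance field).
    attraction = demo[i] - state

    # Progression flow: advance along the curve's local tangent.
    j = min(i + 1, len(demo) - 1)
    tangent = demo[j] - demo[i]
    norm = np.linalg.norm(tangent)
    progression = tangent / norm if norm > 1e-9 else np.zeros_like(tangent)

    # Non-parametric policy: a weighted sum of the two flows.
    return w_prog * progression + w_attr * attraction

# Toy 2-D demonstration: a straight line from (0, 0) to (1, 0).
demo = np.stack([np.linspace(0, 1, 11), np.zeros(11)], axis=1)
v = gpi_velocity(np.array([0.3, 0.2]), demo)
# For a state above the curve, the attraction term points back toward it
# while the progression term points along the direction of travel.
```

Because the policy is just this nearest-neighbor query plus two vector additions, adding a new demonstration amounts to appending its waypoints, which is consistent with the composition and efficiency claims above.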

0 retrieved papers
Modular formulation decoupling metric learning from policy synthesis

The approach separates metric learning (defining how states are represented and compared) from behavior synthesis (constructing policies from distance and flow fields). This decoupling enables flexible adaptation across low-dimensional robot states and high-dimensional perceptual inputs, with policy synthesis remaining non-parametric and lightweight.

0 retrieved papers
Extensive validation demonstrating efficiency and performance

The authors validate GPI across diverse simulation benchmarks and real robot platforms, demonstrating that it achieves higher success rates than diffusion-based policies while running substantially faster (20× or more), requiring less memory, and maintaining robustness to perturbations.

0 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a sparsely populated leaf shared with only two sibling works (DFields and Diff-lfd), with no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally near-isolated, which is one partial signal of novelty, though still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

For each of the three claimed contributions — the Geometry-aware Policy Imitation (GPI) approach, the modular formulation decoupling metric learning from policy synthesis, and the extensive validation demonstrating efficiency and performance — zero candidate papers were retrieved, so no contribution-level comparisons are available. The full descriptions of these contributions appear under Claimed Contributions above.