NeMo-map: Neural Implicit Flow Fields for Spatio-Temporal Motion Mapping

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Neural Implicit Representation, Human Motion Representation, Maps of Dynamics
Abstract:

Safe and efficient robot operation in complex human environments benefits from good models of site-specific motion patterns. Maps of Dynamics (MoDs) provide such models by encoding statistical motion patterns in a map, but existing representations rely on discrete spatial sampling and typically require costly offline construction. We propose a continuous spatio-temporal MoD representation based on implicit neural functions that directly map coordinates to the parameters of a Semi-Wrapped Gaussian Mixture Model. This removes the need for discretization and for imputation in unevenly sampled regions, enabling smooth generalization across both space and time. Evaluated on two public datasets with real-world people-tracking data, our method represents motion more accurately and yields smoother velocity distributions in sparse regions than available baselines, while remaining computationally efficient. The proposed approach offers a powerful and efficient way of modeling complex human motion patterns and achieves strong performance on the downstream trajectory prediction task.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a continuous spatio-temporal representation for Maps of Dynamics using implicit neural functions that map coordinates to Semi-Wrapped Gaussian Mixture Model parameters. It resides in the 'Neural Implicit Flow Fields for Motion Mapping' leaf, a newly created category that contains only this work and has no siblings. This positioning reflects a sparse research direction within the broader taxonomy of 50 papers across 36 topics, suggesting the approach occupies a relatively unexplored niche in the field of spatio-temporal human motion modeling.

The taxonomy reveals that most related work falls into discrete or grid-based representations under 'Urban and Geographic Mobility Patterns' or trajectory-focused methods in 'Pedestrian and Agent Trajectory Prediction'. The paper's continuous implicit representation diverges from these established directions, which typically employ LSTMs, graph networks, or discrete spatial sampling. Neighboring branches like 'Trajectory Representation and Reconstruction' focus on learning from sparse data rather than continuous field modeling, while 'Crowd and Aggregate Movement Modeling' addresses collective patterns using hidden Markov models or simulation frameworks rather than neural implicit functions.

Among the 30 candidates examined through semantic search, none clearly refutes any of the three core contributions. Contribution A (continuous spatio-temporal MoD) was compared against 10 candidates with 0 refutable matches, as were Contribution B (neural function mapping to SWGMM parameters) and Contribution C (feature-conditioned architecture with SIREN encoding). This suggests that, within the limited search scope, the specific combination of implicit neural representations, SWGMM parameterization, and continuous spatio-temporal mapping for motion patterns appears relatively novel, though the analysis does not cover prior work exhaustively beyond the top-30 semantic matches.

Based on the limited literature search, the work appears to introduce a distinct methodological approach by applying neural implicit functions to motion pattern encoding, a technique more common in 3D scene representation than human mobility modeling. The absence of sibling papers in its taxonomy leaf and the lack of refuting candidates among 30 examined suggest novelty, though this assessment is constrained by the search scope and does not preclude relevant work outside the examined set.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 0

Research Landscape Overview

Core task: Modeling spatio-temporal human motion patterns in environments. The field encompasses a broad spectrum of approaches, from fine-grained skeletal pose prediction and pedestrian trajectory forecasting to large-scale urban mobility analytics and crowd dynamics. Major branches include methods focused on individual body motion (e.g., skeletal pose sequences and stylized motion synthesis), agent-level trajectory prediction that captures social interactions and scene context, aggregate crowd modeling for evacuation or public space design, and geographic mobility patterns derived from mobile phone data or transportation networks. Sensor-based activity recognition and video analysis form complementary branches that emphasize real-time detection and classification, while specialized applications address domains such as sports analytics, virtual reality motion detection, and even Paleolithic activity reconstruction.
Methodological frameworks span classical probabilistic models, deep learning architectures (LSTMs, transformers, diffusion models), and emerging neural implicit representations that encode motion as continuous fields. Works like Spatial-Temporal LLM[2] and Masked Diffusion Mobility[6] illustrate the integration of modern generative models, whereas Crowded Tracking[7] and Pedestrian Tracking[12] represent foundational vision-based approaches. Recent lines of work reveal contrasting emphases: some studies prioritize interpretability and causal reasoning in trajectory forecasting (e.g., Interpretable Motion Forecasting[42]), while others leverage large-scale data and neural architectures for generalization across diverse environments (e.g., Trajectory Dependencies[1], Robust Trajectories[36]). 
Urban and geographic mobility research (e.g., Urban Mobility Patterns[30], Mobility Science Directions[14]) often grapples with privacy, scalability, and the integration of heterogeneous data sources, whereas sensor-based activity recognition (e.g., AttnSense[47], DTR-HAR[49]) focuses on real-time inference and wearable deployment. NeMo-map[0] sits within the Neural Implicit and Continuous Representations branch, emphasizing the use of implicit flow fields to map motion patterns continuously in space and time. This approach contrasts with discrete trajectory models like LSTM Trajectory[45] or graph-based methods such as STGAT[50], offering a more flexible representation that can capture complex, non-linear motion dynamics without explicit discretization. The work aligns with broader trends toward continuous, differentiable representations in motion modeling, addressing challenges of generalization and scene-aware prediction.

Claimed Contributions

Continuous spatio-temporal map of dynamics using neural implicit representation

The authors introduce NeMo-map, a novel continuous representation of maps of dynamics that uses implicit neural functions to map spatio-temporal coordinates to Semi-Wrapped Gaussian Mixture Model parameters. This eliminates the need for spatial discretization and enables smooth generalization across both space and time while maintaining multimodality in motion patterns.

10 retrieved papers
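This contribution centers on representing motion as a Semi-Wrapped Gaussian Mixture Model over velocity (orientation, speed), where the angular dimension lives on a circle. As a minimal illustration of what such a distribution looks like, the sketch below evaluates an SWGMM density in plain NumPy by summing the Gaussian over a few 2π shifts of the orientation; the component count, parameter values, and truncation at ±2 wraps are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def semi_wrapped_gaussian_pdf(theta, speed, mu, cov, n_wraps=2):
    """Bivariate Gaussian over (orientation, speed) with the angular
    dimension wrapped: the density sums over 2*pi shifts of theta."""
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        d = np.array([theta + 2.0 * np.pi * k - mu[0], speed - mu[1]])
        total += norm * np.exp(-0.5 * d @ inv @ d)
    return total

def swgmm_pdf(theta, speed, weights, mus, covs, n_wraps=2):
    """Mixture of semi-wrapped Gaussians (SWGMM): a weighted sum of
    wrapped components, which keeps the distribution multimodal."""
    return sum(w * semi_wrapped_gaussian_pdf(theta, speed, mu, cov, n_wraps)
               for w, mu, cov in zip(weights, mus, covs))

# Two illustrative components: people moving in opposite directions.
weights = [0.5, 0.5]
mus = [np.array([0.0, 1.0]), np.array([np.pi, 0.5])]
covs = [np.diag([0.2, 0.1]), np.diag([0.3, 0.1])]
```

Because the angular dimension is wrapped, the density is (up to the wrap truncation) periodic in orientation, which is what lets a single mixture capture multiple dominant movement directions at one location.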
Neural function mapping spatio-temporal coordinates to SWGMM parameters

The method learns a neural function parameterized by an MLP that takes spatial and temporal coordinates as input and outputs the full set of parameters for a Semi-Wrapped Gaussian Mixture Model. This formulation enables querying motion distributions at arbitrary locations and times without requiring discrete grid cells.

10 retrieved papers
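This contribution describes an MLP that maps a spatio-temporal coordinate to a full SWGMM parameter set. Below is a minimal sketch of such a head with untrained random weights and assumed design choices (a tanh hidden layer, K = 3 components, softmax mixture weights, an atan2 parameterization of the angular mean, softplus for speed); the paper's actual architecture and output parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3  # number of mixture components (assumed)

# Toy 2-layer MLP with random weights; in the paper this network is trained.
W1 = rng.normal(size=(3, 64)) * 0.5
b1 = np.zeros(64)
W2 = rng.normal(size=(64, 6 * K)) * 0.1  # per component: logit, cos, sin, speed, 2 log-vars
b2 = np.zeros(6 * K)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def query_swgmm(x, y, t):
    """Map a continuous (x, y, t) coordinate to SWGMM parameters,
    with no grid cell lookup anywhere in the pipeline."""
    h = np.tanh(np.array([x, y, t]) @ W1 + b1)
    raw = (h @ W2 + b2).reshape(K, 6)
    weights = softmax(raw[:, 0])                 # mixture weights sum to 1
    mu_theta = np.arctan2(raw[:, 2], raw[:, 1])  # angular mean stays on the circle
    mu_speed = np.log1p(np.exp(raw[:, 3]))       # softplus keeps speed positive
    var = np.exp(raw[:, 4:6])                    # positive variances
    return weights, mu_theta, mu_speed, var
```

The point of the sketch is the interface: any real-valued coordinate can be queried, so motion distributions are defined everywhere rather than only at discrete cell centers.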
Feature-conditioned architecture with spatial grid and SIREN temporal encoding

The architecture combines spatial features from a learnable grid queried via bilinear interpolation with temporal encoding using SIREN networks. This design captures local spatial variations while modeling continuous temporal dynamics through periodic activation functions, enabling the model to represent time-varying motion patterns.

10 retrieved papers
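The two ingredients named here, a learnable spatial grid queried by bilinear interpolation and a sine-activated (SIREN-style) temporal encoding, can be sketched as follows. Assumptions not taken from the paper: a single-resolution 8x8 grid over [0, 1]^2, 16 feature channels, a one-layer sine encoding with frequency w0 = 30 (a common SIREN default), and simple concatenation as the fusion step.

```python
import numpy as np

rng = np.random.default_rng(1)

# Learnable spatial feature grid (random here, trained in practice), shape (H, W, C).
H, W, C = 8, 8, 16
grid = rng.normal(size=(H, W, C))

def bilinear_features(grid, x, y):
    """Query the feature grid at a continuous (x, y) in [0, 1]^2."""
    gx = x * (grid.shape[1] - 1)
    gy = y * (grid.shape[0] - 1)
    x0, y0 = int(np.floor(gx)), int(np.floor(gy))
    x1, y1 = min(x0 + 1, grid.shape[1] - 1), min(y0 + 1, grid.shape[0] - 1)
    fx, fy = gx - x0, gy - y0
    top = (1 - fx) * grid[y0, x0] + fx * grid[y0, x1]
    bot = (1 - fx) * grid[y1, x0] + fx * grid[y1, x1]
    return (1 - fy) * top + fy * bot

# SIREN-style temporal encoding: one sine-activated layer with frequency w0.
w0 = 30.0
Wt = rng.normal(size=(1, 16))
bt = rng.normal(size=16)

def siren_time(t):
    return np.sin(w0 * (np.array([t]) @ Wt + bt)).ravel()

def features(x, y, t):
    """Concatenate interpolated spatial features with the temporal encoding."""
    return np.concatenate([bilinear_features(grid, x, y), siren_time(t)])
```

Bilinear interpolation makes the spatial features piecewise-smooth in (x, y) while the grid keeps them local, and the periodic sine activations give the temporal branch a natural way to represent recurring (e.g., daily) motion patterns.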

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is a partial signal of novelty, though one constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Continuous spatio-temporal map of dynamics using neural implicit representation

The authors introduce NeMo-map, a novel continuous representation of maps of dynamics that uses implicit neural functions to map spatio-temporal coordinates to Semi-Wrapped Gaussian Mixture Model parameters. This eliminates the need for spatial discretization and enables smooth generalization across both space and time while maintaining multimodality in motion patterns.

Contribution

Neural function mapping spatio-temporal coordinates to SWGMM parameters

The method learns a neural function parameterized by an MLP that takes spatial and temporal coordinates as input and outputs the full set of parameters for a Semi-Wrapped Gaussian Mixture Model. This formulation enables querying motion distributions at arbitrary locations and times without requiring discrete grid cells.

Contribution

Feature-conditioned architecture with spatial grid and SIREN temporal encoding

The architecture combines spatial features from a learnable grid queried via bilinear interpolation with temporal encoding using SIREN networks. This design captures local spatial variations while modeling continuous temporal dynamics through periodic activation functions, enabling the model to represent time-varying motion patterns.