P3D: Highly Scalable 3D Neural Surrogates for Physics Simulations with Global Context

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: neural surrogates, physics simulations, transformers, 3D
Abstract:

We present a scalable framework for learning deterministic and probabilistic neural surrogates for high-resolution 3D physics simulations. We introduce P3D, a hybrid CNN-Transformer backbone architecture targeted at 3D physics simulations, which significantly outperforms existing architectures in terms of speed and accuracy. Our proposed network can be pretrained on small patches of the simulation domain, which can be fused to obtain a global solution, optionally guided via a scalable sequence-to-sequence model to include long-range dependencies. This setup allows for training large-scale models with reduced memory and compute requirements for high-resolution datasets. We evaluate our backbone architecture against a large set of baseline methods on the objective of simultaneously learning 14 different types of PDE dynamics in 3D. We demonstrate how to scale our model to high-resolution isotropic turbulence with spatial resolutions of up to 512^3. Finally, we show the versatility of our architecture by training it as a diffusion model to produce probabilistic samples of highly turbulent 3D channel flows across varying Reynolds numbers, accurately capturing the underlying flow statistics.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces P3D, a hybrid CNN-Transformer architecture for learning neural surrogates of high-resolution 3D physics simulations, with a focus on scalability and accuracy. It resides in the 'Hybrid and Multi-Scale Architectures' leaf, which contains four papers total, including the original work. This leaf sits within the broader 'Neural Architecture Design for 3D Physics' branch, indicating a moderately populated research direction. The taxonomy reveals that hybrid architectures combining multiple network types are an active but not overcrowded area, with sibling categories addressing graph-based, transformer-only, and domain-specific designs.

The taxonomy tree shows that neighboring leaves include 'Graph and Geometric Neural Networks' (three papers), 'Transformer-Based Architectures' (two papers), and 'Domain-Specific Network Designs' (three papers). The paper's hybrid approach bridges convolutional and transformer paradigms, distinguishing it from pure transformer methods in the sibling leaf. The 'Computational Efficiency and Scalability' branch (two papers) addresses related concerns about high-resolution simulation, while 'Physics-Informed Learning Frameworks' (seventeen papers across four leaves) represents a more densely populated alternative strategy. The taxonomy's scope and exclude notes clarify that this work focuses on architecture rather than physics integration or training methodologies.

Among twenty candidates examined, three refutable pairs were identified, all associated with the third contribution on flexible finetuning setups with memory-efficient gradient control. The first contribution (P3D hybrid architecture) examined seven candidates with zero refutations, suggesting relative novelty in this specific architectural combination. The second contribution (crop-based pretraining with global context) examined three candidates, also with zero refutations. The third contribution's three refutable candidates indicate that memory-efficient training strategies have more substantial prior work within the limited search scope. The analysis explicitly covers top-K semantic matches plus citation expansion, not an exhaustive literature review.

Based on the limited search scope of twenty candidates, the architectural contributions appear more distinctive than the training methodology. The taxonomy context reveals a moderately active research area with clear boundaries separating hybrid architectures from graph-based, transformer-only, and physics-informed approaches. The contribution-level statistics suggest that while the core P3D design shows novelty signals, the memory-efficient training aspects overlap with existing work. This assessment reflects the examined candidate set and does not claim comprehensive coverage of all relevant prior art.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Papers: 3

Research Landscape Overview

Core task: Learning neural surrogates for high-resolution 3D physics simulations. The field has evolved into several major branches that reflect different strategic emphases. Neural Architecture Design for 3D Physics explores specialized network structures—ranging from graph-based methods like Boundary Graph Networks[9] to hybrid and multi-scale designs that blend convolutional, transformer, and implicit representations. Physics-Informed Learning Frameworks, exemplified by works such as Modified Loss PINN[2] and PINNs Fluid Review[7], embed governing equations directly into training objectives. Surrogate Modeling and Operator Learning focuses on data-driven approximations of solution operators, often leveraging Graph Neural Operators[15] or frequency-domain techniques like Frequency Domain Wind[5]. Meanwhile, Computational Efficiency and Scalability addresses the practical challenge of deploying these surrogates at scale, as seen in P3D Scalable Surrogates[1]. Additional branches cover Benchmarking and Evaluation Frameworks, Specialized Physics Applications (from turbulence to astrophysics), Inverse Problems and Reconstruction, and Scene Generation and Dynamics Modeling, collectively spanning a wide spectrum of simulation tasks and domain requirements.

Within the Neural Architecture Design branch, a particularly active line of work investigates hybrid and multi-scale architectures that combine local feature extraction with global context modeling. P3D Global Context[0] sits squarely in this cluster, emphasizing mechanisms that capture long-range dependencies in high-resolution 3D fields—a recurring challenge when simulating complex fluid or structural dynamics. Nearby efforts such as LRQ Solver[30] and Multi Resolution Hash[33] similarly explore multi-resolution representations, though they differ in how they balance computational cost against fidelity. Flow3DNet[3] offers another perspective by integrating flow-specific inductive biases into the architecture.
The central trade-off across these works revolves around expressiveness versus efficiency: richer global context can improve accuracy on intricate phenomena, but often at the expense of memory and compute. P3D Global Context[0] addresses this by proposing tailored attention or hierarchical encoding strategies, positioning itself as a step toward scalable yet expressive surrogates for demanding 3D physics scenarios.

Claimed Contributions

P3D hybrid CNN-Transformer architecture for 3D physics simulations

The authors propose P3D, a novel backbone architecture that combines convolutional neural networks for efficient local feature extraction with windowed transformer blocks for learning generalizable token representations. This hybrid design is specifically optimized for scaling to very high-resolution 3D physics simulations.

7 retrieved papers
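The hybrid design described above can be illustrated with a minimal NumPy sketch: a fixed depthwise filter stands in for the convolutional stage, followed by self-attention restricted to non-overlapping 3D windows. This is not the authors' implementation; the window size `w`, the projection matrices, and all function names are illustrative.

```python
import numpy as np

def local_mix(x):
    """Stand-in for the CNN stage: per-channel average over each voxel
    and its six axis-aligned neighbours (periodic boundary)."""
    out = x.copy()
    for axis in range(3):
        out += np.roll(x, 1, axis=axis) + np.roll(x, -1, axis=axis)
    return out / 7.0

def window_partition(x, w):
    """Split a (D, H, W, C) field into non-overlapping w*w*w windows,
    returning (num_windows, w**3, C) token groups."""
    D, H, W, C = x.shape
    x = x.reshape(D // w, w, H // w, w, W // w, w, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, w ** 3, C)

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def hybrid_block(x, w, Wq, Wk, Wv):
    """Local mixing followed by windowed self-attention: cost grows with
    num_windows * w**6 rather than (D*H*W)**2 for global attention."""
    tokens = window_partition(local_mix(x), w)      # (nW, w**3, C)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1]))
    return attn @ v                                  # (nW, w**3, C)
```

The key scaling property is in the attention step: because attention is confined to w**3-voxel windows, the quadratic cost applies per window rather than over the full voxel grid.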
Crop-based pretraining with global context model for scalability

The authors introduce a scalable training framework where P3D can be pretrained on small spatial patches and then scaled to full domains. A sequence-to-sequence context model processes global dependencies by linking bottleneck representations, and region tokens inject global information back into decoder layers via adaptive normalization.

3 retrieved papers
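The two mechanisms described above, a context model that mixes information across patch bottlenecks and region tokens injected via adaptive normalization, can be sketched in NumPy. This is a simplified illustration under assumed shapes, not the authors' implementation; a single self-attention pass stands in for the sequence-to-sequence context model, and `adaptive_norm` follows the common AdaLN pattern.

```python
import numpy as np

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def context_model(bottlenecks):
    """Stand-in for the sequence-to-sequence context model: one
    self-attention pass mixes information across all P patch
    bottlenecks, so each region token can carry global context."""
    scale = np.sqrt(bottlenecks.shape[-1])
    attn = softmax(bottlenecks @ bottlenecks.T / scale)
    return attn @ bottlenecks                      # (P, C) region tokens

def layer_norm(h, eps=1e-5):
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def adaptive_norm(h, region_token, W_scale, W_shift):
    """AdaLN-style injection: the region token predicts a per-channel
    scale and shift that modulate normalized decoder features."""
    gamma = region_token @ W_scale
    beta = region_token @ W_shift
    return layer_norm(h) * (1.0 + gamma) + beta
```

Note the design choice this pattern reflects: with `W_scale` and `W_shift` initialized to zero, `adaptive_norm` reduces to a plain layer norm, so a decoder pretrained on isolated patches behaves identically until the global context pathway is trained.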
Flexible finetuning setups with memory-efficient gradient control

The authors develop multiple training and inference configurations that allow selective gradient backpropagation through network components. These setups enable memory-efficient finetuning by freezing encoders or randomly disabling gradient flow, reducing computational costs while maintaining model performance.

10 retrieved papers
Can Refute
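The selective-gradient idea described above can be sketched with a tiny manual-backprop model: two linear stages ("encoder" then "decoder") where the encoder gradient is either always skipped (frozen encoder) or skipped at random. This is a minimal NumPy illustration, not the paper's implementation; the model, the loss, and the parameter `p_enc` are assumptions for the sketch.

```python
import numpy as np

def forward_backward(x, W_enc, W_dec, y, rng=None, p_enc=1.0):
    """Two-stage linear surrogate trained with the squared loss
    0.5 * ||x @ W_enc @ W_dec - y||**2. The encoder gradient is
    computed only with probability p_enc: p_enc = 0.0 mimics a frozen
    encoder, 0 < p_enc < 1 mimics randomly disabled gradient flow.
    When it is skipped, the backward pass through the encoder (and the
    memory it would require) is avoided entirely."""
    h = x @ W_enc
    pred = h @ W_dec
    err = pred - y                          # dL/dpred
    g_dec = h.T @ err                       # dL/dW_dec, always computed
    take_enc = (rng.random() < p_enc) if rng is not None else (p_enc >= 1.0)
    g_enc = x.T @ (err @ W_dec.T) if take_enc else None
    return pred, g_enc, g_dec
```

Usage mirrors the finetuning setups described above: call with `p_enc=0.0` for encoder-frozen finetuning, or pass an `rng` and `0 < p_enc < 1` to drop the encoder gradient stochastically while still updating the decoder every step.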

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: P3D hybrid CNN-Transformer architecture for 3D physics simulations

Contribution 2: Crop-based pretraining with global context model for scalability

Contribution 3: Flexible finetuning setups with memory-efficient gradient control