Differentiable Model Predictive Control on the GPU

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: differentiable optimization, model predictive control, optimal control, gpu-accelerated optimization, reinforcement learning, imitation learning, robotics
Abstract:

Differentiable model predictive control (MPC) offers a powerful framework for combining learning and control. However, its adoption has been limited by the inherently sequential nature of traditional optimization algorithms, which are challenging to parallelize on modern computing hardware like GPUs. In this work, we tackle this bottleneck by introducing a GPU-accelerated differentiable optimization tool for MPC. This solver leverages sequential quadratic programming and a custom preconditioned conjugate gradient (PCG) routine with tridiagonal preconditioning to exploit the problem's structure and enable efficient parallelization. We demonstrate substantial speedups over CPU- and GPU-based baselines, significantly improving upon state-of-the-art training times on benchmark reinforcement learning and imitation learning tasks. Finally, we showcase the method on the challenging task of reinforcement learning for driving at the limits of handling, where it enables robust drifting of a Toyota Supra through water puddles.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces a GPU-accelerated differentiable MPC solver combining sequential quadratic programming with a custom preconditioned conjugate gradient routine featuring tridiagonal preconditioning. It resides in the 'Preconditioned Conjugate Gradient Approaches' leaf, which contains only two papers total (including this one and one sibling). This is a notably sparse research direction within the broader taxonomy of 36 papers across 28 leaf nodes, suggesting the specific combination of GPU acceleration, differentiability, and PCG-based iterative solvers remains relatively underexplored compared to sampling-based or direct shooting methods.

The taxonomy reveals several neighboring directions: the sibling 'Condensed-Space Interior-Point Methods' leaf focuses on eliminating state variables rather than iterative PCG solvers, while the parallel 'Parallel Shooting and Direct Methods' branch employs primal-dual KKT solvers or multilevel SQP without emphasizing PCG preconditioning. Nearby 'Differentiable MPC for End-to-End Learning' nodes integrate MPC with actor-critic frameworks but do not necessarily prioritize iterative linear system solvers. The paper's position bridges gradient-based optimization infrastructure with learning-integrated applications, sitting at the intersection of solver design and end-to-end training pipelines.

Among 16 candidates examined across three contributions, none clearly refute the core claims. The GPU-accelerated differentiable optimization tool examined 7 candidates with 0 refutations; the tridiagonal PCG routine examined 1 candidate with 0 refutations; and the robust drifting application examined 8 candidates with 0 refutations. This limited search scope—focused on top-K semantic matches—suggests that within the examined literature, the specific combination of SQP, custom PCG preconditioning, and differentiability for learning tasks appears distinct. However, the small candidate pool means the analysis does not capture exhaustive prior work in iterative MPC solvers or GPU optimization.

Given the sparse taxonomy leaf and absence of refutations among 16 examined candidates, the work appears to occupy a relatively novel niche within GPU-accelerated MPC. The emphasis on tridiagonal preconditioning for differentiable SQP distinguishes it from both sampling-heavy methods and direct KKT approaches. Nonetheless, the limited search scope and small sibling set mean this assessment reflects positioning within a focused literature subset rather than comprehensive field coverage.

Taxonomy

Core-task Taxonomy Papers: 36
Claimed Contributions: 3
Contribution Candidate Papers Compared: 16
Refutable Papers: 0

Research Landscape Overview

Core task: GPU-accelerated differentiable model predictive control. The field organizes around several complementary directions. Gradient-Based Trajectory Optimization Methods focus on iterative solvers and direct differentiation through dynamics, often leveraging automatic differentiation frameworks to compute sensitivities efficiently. Sampling-Based and Hybrid Methods explore stochastic rollouts and path-integral formulations that naturally parallelize on GPUs, trading off gradient precision for robustness in high-dimensional or uncertain settings. Learning-Integrated MPC Frameworks blend neural network components with classical control loops, using differentiable MPC layers to enable end-to-end training. Domain-Specific MPC Applications tailor these techniques to robotics, autonomous driving, and soft-body systems, while Specialized Optimization and Solver Techniques develop custom algorithms for constrained quadratic programs and nonlinear solvers. Differentiable Simulation and Surrogate Modeling emphasizes physics engines and reduced-order models that support backpropagation, and Hardware Design contributions address VLSI and embedded implementations.

A particularly active line of work targets real-time control for legged robots and manipulators, where GPU parallelism enables rapid trajectory recomputation at high frequencies—examples include Whole Body MPC GPU[4] and GPU Stochastic Predictive Legged[10]. Another contrasting theme is the integration of learned components, as seen in Physics Neural MPC Quadcopter[3] and Adaptive DiffTune MPC[30], which adapt model parameters online.

The original paper, Differentiable MPC GPU[0], sits squarely within the gradient-based optimization branch, specifically employing preconditioned conjugate gradient solvers to handle large-scale linear systems arising from sequential quadratic programming.
This places it close to Mpcgpu[2], which similarly exploits GPU-accelerated iterative methods, but Differentiable MPC GPU[0] emphasizes end-to-end differentiability to support learning pipelines. Compared to sampling-heavy approaches like Feedback-MPPI[14], it trades stochastic exploration for deterministic gradient descent, reflecting a broader tension between computational efficiency and algorithmic flexibility across the taxonomy.

Claimed Contributions

GPU-accelerated differentiable optimization tool for MPC

The authors introduce DiffMPC, a differentiable solver for model predictive control that runs efficiently on GPUs. It uses sequential quadratic programming combined with a custom preconditioned conjugate gradient routine that exploits the sparse-in-time structure of optimal control problems to enable parallelization over time steps.

7 retrieved papers
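To make the "differentiable optimization" idea concrete: for a problem solved to optimality, gradients of a downstream loss with respect to problem parameters can be obtained by one extra linear solve with the same (KKT) matrix used in the forward pass, so the backward pass can reuse the forward solver. The numpy sketch below illustrates this adjoint trick on a toy unconstrained QP with a linearly parameterized cost; it is an illustration of the general technique, not the paper's implementation, and all names (`solve_qp`, `qp_param_grad`, the linear map `B`) are invented for this example.

```python
import numpy as np

def solve_qp(H, q):
    """Unconstrained QP: minimize 0.5 z'Hz + q'z  ->  z* = -H^{-1} q."""
    return np.linalg.solve(H, -q)

def qp_param_grad(H, q_of_theta, dq_dtheta, theta, z_ref):
    """Gradient of L(theta) = 0.5 ||z*(theta) - z_ref||^2 via the adjoint:
    solve H lam = (z* - z_ref), then dL/dtheta = -(dq/dtheta)' lam.
    Note the backward pass solves a system with the SAME matrix H."""
    z_star = solve_qp(H, q_of_theta(theta))
    lam = np.linalg.solve(H, z_star - z_ref)
    return -dq_dtheta(theta).T @ lam

rng = np.random.default_rng(0)
n, p = 6, 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)            # SPD Hessian
B = rng.standard_normal((n, p))
q_of_theta = lambda th: B @ th         # cost linear in parameters theta
dq_dtheta = lambda th: B
theta = rng.standard_normal(p)
z_ref = rng.standard_normal(n)

g = qp_param_grad(H, q_of_theta, dq_dtheta, theta, z_ref)

# Sanity check against central finite differences.
eps = 1e-6
g_fd = np.empty(p)
for j in range(p):
    th_p = theta.copy(); th_p[j] += eps
    th_m = theta.copy(); th_m[j] -= eps
    Lp = 0.5 * np.sum((solve_qp(H, q_of_theta(th_p)) - z_ref) ** 2)
    Lm = 0.5 * np.sum((solve_qp(H, q_of_theta(th_m)) - z_ref) ** 2)
    g_fd[j] = (Lp - Lm) / (2 * eps)
max_err = float(np.max(np.abs(g - g_fd)))
```

In a solver like the one described, `np.linalg.solve` would be replaced by the same iterative PCG routine in both the forward and backward passes, which is what makes the layer cheap to differentiate.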
Preconditioned conjugate gradient routine with tridiagonal preconditioning

The authors adapt and implement a PCG routine with tridiagonal preconditioning that solves the KKT linear systems by exploiting the block-tridiagonal structure of the Schur complement. This design enables parallelization over time steps and supports warm-starting, making it suitable for GPU execution.

1 retrieved paper
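The core numerical kernel can be sketched as standard PCG where the preconditioner solve is a tridiagonal system. The serial numpy version below is a minimal illustration of that combination (Thomas algorithm for the tridiagonal preconditioner, warm-startable PCG), not the paper's GPU implementation; the test matrix is a generic diagonally dominant stand-in for a time-structured Schur complement.

```python
import numpy as np

def thomas_solve(lower, diag, upper, d):
    """Solve a tridiagonal system via the Thomas algorithm.
    lower/upper have length n-1; diag and d have length n."""
    n = len(diag)
    cp = np.empty(n - 1)
    dp = np.empty(n)
    cp[0] = upper[0] / diag[0]
    dp[0] = d[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = upper[i] / denom
        dp[i] = (d[i] - lower[i - 1] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def pcg(A, b, M_solve, x0=None, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient for SPD A.
    M_solve(r) applies the inverse preconditioner; x0 allows warm starts."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = M_solve(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_solve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# SPD test system with a dominant tridiagonal part plus a dense perturbation.
rng = np.random.default_rng(1)
n = 50
A = (np.diag(4.0 * np.ones(n))
     + np.diag(-1.0 * np.ones(n - 1), 1)
     + np.diag(-1.0 * np.ones(n - 1), -1)
     + 0.01 * np.ones((n, n)))
b = rng.standard_normal(n)

# Tridiagonal preconditioner: keep only the three central diagonals of A.
lower, diag, upper = np.diag(A, -1).copy(), np.diag(A).copy(), np.diag(A, 1).copy()
M_solve = lambda r: thomas_solve(lower, diag, upper, r)

x = pcg(A, b, M_solve)
```

On the GPU, both the matrix-vector products and the (block-)tridiagonal preconditioner solves parallelize over time steps, which is the structural property the paper exploits.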
Application to robust drifting via domain randomization and RL

The authors demonstrate DiffMPC on a reinforcement learning task for autonomous drifting under model mismatch. They use domain randomization over nonlinear dynamics to learn MPC cost and vehicle parameters, achieving robust drifting through water puddles on a Toyota Supra.

8 retrieved papers
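The domain-randomization setup can be pictured as sampling a fresh dynamics instance per episode and scoring the controller across the resulting population. The sketch below is purely schematic: the parameter names and ranges (`mass`, `mu`, `yaw_inertia`) are invented for illustration, and the one-dimensional "dynamics" are a trivial stand-in for the paper's nonlinear vehicle model.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_vehicle_params():
    """Draw one randomized dynamics instance per training episode
    (hypothetical ranges; low friction mimics water puddles)."""
    return {
        "mass": rng.uniform(1400.0, 1700.0),   # kg
        "mu": rng.uniform(0.3, 1.0),           # tire-road friction
        "yaw_inertia": rng.uniform(2000.0, 3000.0),
    }

def rollout_return(policy_gain, params, horizon=50, dt=0.05):
    """Toy longitudinal model: velocity tracking under randomized friction.
    Returns the negative accumulated tracking cost."""
    v, v_ref, total = 0.0, 10.0, 0.0
    for _ in range(horizon):
        u = policy_gain * (v_ref - v)                       # stand-in "policy"
        accel = params["mu"] * u / (params["mass"] / 1500.0)
        v += dt * accel
        total -= (v - v_ref) ** 2
    return total

# Evaluate one fixed controller across many randomized worlds. An RL loop
# would instead differentiate these returns through the MPC layer to update
# the learned cost and model parameters.
returns = [rollout_return(2.0, sample_vehicle_params()) for _ in range(100)]
mean_return = float(np.mean(returns))
```

In the paper's setting, the randomized quantities are the nonlinear vehicle dynamics, and the learned quantities are MPC cost and vehicle parameters; the loop structure is the same.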

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: GPU-accelerated differentiable optimization tool for MPC (description as above; 7 candidate papers examined, 0 refutations).

Contribution 2: Preconditioned conjugate gradient routine with tridiagonal preconditioning (description as above; 1 candidate paper examined, 0 refutations).

Contribution 3: Application to robust drifting via domain randomization and RL (description as above; 8 candidate papers examined, 0 refutations).