Random sampling-based gradient descent method for optimal control problems with variance reduction

Jeongho Kim, Dongnam Ko, Chohong Min, Byungjoon Lee

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose and analyze two random sampling-based gradient descent methods with variance reduction for optimal control problems in large-scale multi-agent dynamics, inspired by the Random Batch Method (RBM) [20] and the stochastic variance reduced gradient (SVRG) [21]. The proposed algorithms are based on the gradient descent method with adjoint states from Pontryagin’s maximum principle, which requires computing both the controlled trajectory (forward dynamics) and its adjoint system. To reduce the computational cost of the dynamics, we apply random sampling to the forward dynamics, splitting them into simpler randomized ones. Starting from an initial guess of the control, the control function is updated along the gradient of the randomized cost function, as in a stochastic gradient method. In addition, a variance reduction technique is applied to handle the random error introduced by the sampling approximation. We show that this variance-reduced optimization process converges to the optimal control of the original system in a simple setting, namely linear-quadratic optimal control problems. Numerical simulations are presented to validate the computational efficiency of the stochastic gradient method and the stability of the variance-reduced method.
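To make the variance reduction idea concrete, the following is a minimal sketch of an SVRG-style update on a toy problem. It is not the algorithm from the paper: there are no forward dynamics or adjoint states, and the cost is an assumed finite-sum quadratic J(u) = (1/N) Σ_i ½(a_i u − b_i)² in a scalar "control" u. The names `svrg_scalar`, `a`, `b`, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def svrg_scalar(a, b, u0=0.0, lr=0.1, epochs=40, inner_steps=20,
                batch_size=8, seed=0):
    """SVRG-style descent on J(u) = (1/N) * sum_i 0.5 * (a[i]*u - b[i])**2.

    Toy stand-in for a randomized cost; all names and defaults are assumptions.
    """
    rng = random.Random(seed)
    n = len(a)

    def grad_i(i, u):
        # Gradient of the i-th summand 0.5 * (a[i]*u - b[i])**2 with respect to u.
        return a[i] * (a[i] * u - b[i])

    u = u0
    for _ in range(epochs):
        # Snapshot of the current iterate and the full gradient evaluated there.
        u_snap = u
        full_grad = sum(grad_i(i, u_snap) for i in range(n)) / n
        for _ in range(inner_steps):
            # Random batch of indices, sampled without replacement.
            batch = rng.sample(range(n), batch_size)
            # Batch gradient difference between the iterate and the snapshot;
            # adding the full snapshot gradient gives an unbiased estimate
            # whose variance shrinks as u and u_snap approach the minimizer.
            correction = sum(grad_i(i, u) - grad_i(i, u_snap)
                             for i in batch) / batch_size
            u -= lr * (correction + full_grad)
    return u


# Synthetic data (assumed, for illustration only).
data_rng = random.Random(1)
a = [data_rng.uniform(0.5, 1.5) for _ in range(200)]
b = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]

# Closed-form minimizer of the quadratic cost, used as a reference value.
u_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
u_svrg = svrg_scalar(a, b)
```

In this sketch the full gradient is recomputed only once per outer epoch, while each inner step touches just `batch_size` summands. This mirrors, in a much simplified form, the trade-off between cheap randomized gradients and a periodic correction that the abstract describes.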

Original language: English
Pages (from-to): 2797-2829
Number of pages: 33
Journal: Mathematical Models and Methods in Applied Sciences
Volume: 35
Issue number: 13
State: Published - 15 Dec 2025

Bibliographical note

Publisher Copyright:
© 2025 World Scientific Publishing Company.

Keywords

  • Optimal control problem
  • random batch method
  • stochastic variance reduced gradient
