Skip to main content

Symbolic Optimal Control Problem (OCP) Solver

Project description

Optimal Control

In this repository, MATLAB and Python functions are provided for solving a general nonlinear optimal control problem using the gradient descent approach.

You can define your optimal control problem symbolically using MATLAB symbolic or Python sympy and the functions will solve the problem. Several demos are provided alongside the main functions.

Theoretical Background

In optimal control theory a standard optimal control is defined as

$$ \begin{align} \text{minimize} &\phi(x(t_f) + \int_0^{t_f} g(x,u) dt \ s.t. &\dot{x} = f(x,u) \end{align} $$

The solution to the problem above comes from calculus of variations. A Hamiltonian function is defined as

$$ \mathcal{H} = g(x,u) + p^T f(x,u) $$

And the optimal control input, in absence of input limits, can be calculated using the relationships below:

$$ \begin{align} \dot{x} = \frac{\partial \mathcal{H}}{\partial p} = f(x,u) \ \dot{p} = - \frac{\partial \mathcal{H}}{\partial x} \ \frac{\partial \mathcal{H}}{\partial u} = 0 \end{align} $$

Here p denotes the co-states. This set of equations is generally hard to solve, because they are two point boundary nonlinear equations. The initial values of x and the final values of p are known. One way to solve this set of equations is using the gradient descent algorithm. An initial guess of the control input u is selected and the equations are solved for x and p, given the boundary values. Then u is corrected using the gradient of the Hamiltonian.

Free Final Time Problems

When the final time $t_f$ is not specified and can be freely chosen to minimize the cost, an additional optimality condition must be satisfied. This condition is known as the transversality condition for free final time.

For a problem with free $t_f$, the total cost is:

$$ J = \phi(x(t_f), t_f) + \int_0^{t_f} g(x,u) , dt $$

At the optimal solution, not only must the control satisfy $\frac{\partial \mathcal{H}}{\partial u} = 0$ (in unconstrained input case), but the final time must also satisfy:

$$ \frac{dJ}{dt_f} = \mathcal{H}(x(t_f), u(t_f), p(t_f)) + \frac{\partial \phi}{\partial t}\bigg|_{t_f} = 0 $$

This condition arises because when $t_f$ is perturbed by a small amount $\delta t_f$, the cost changes due to two effects:

  • the terminal cost changes as $\frac{\partial \phi}{\partial t} \delta t_f$
  • the integral cost changes by adding a time slice of width $\delta t_f$ with integrand $g + p^T f = \mathcal{H}$.

At optimality, these combined effects must sum to zero.

The solver implements this by performing alternating optimization: first updating the control u via gradient descent on $\frac{\partial \mathcal{H}}{\partial u}$, then updating the final time $t_f$ via gradient descent on $\frac{dJ}{dt_f}$. Both steps use Armijo backtracking line search to ensure cost reduction. For more details on the transversality condition, see the Wikipedia article on Pontryagin's Maximum Principle.

Documentation

The exact logic explained in the theoretical background has been implemented in the function optimalControlSolver (and the same function for Python). Here, we go over the variables, inputs and outputs of the function.

I've only explained the MATLAB script, but the Python function is similar as well. I've even tried to use the exact same names in both functions.

Problem:

$$ \begin{aligned} \text{minimize}& J = Phi(x(tf)) + ∫_0^{tf} g(x(t), u(t)) dt \ \text{subject to}& ẋ = f(x,u),\quad x(0) = x0 \end{aligned} $$

Usage:

 [sol, info] = optimalControlSolver(symF, symG, symPhi, xSym, uSym, tGrid, x0, U0, opts)

Inputs:

  • symF : symbolic vector field f(x,u) of size [n x 1]
  • symG : symbolic scalar running cost g(x,u)
  • symPhi : symbolic scalar terminal cost Phi(x)
  • xSym : symbolic state vector [x1; x2; ...; xn]
  • uSym : symbolic control vector [u1; u2; ...; um]
  • tGrid : time grid (column or row) of size [N x 1] or [1 x N], increasing, with tGrid(1) = 0
  • x0 : initial state (numeric) [n x 1]
  • U0 : initial control trajectory over tGrid [N x m]
  • opts : options struct (all optional fields):
    • maxIters (default 50)
    • alpha (default 1.0) initial step size for gradient descent
    • beta (default 0.5) backtracking reduction factor (0<beta<1)
    • c1 (default 1e-4) Armijo condition constant
    • tol (default 1e-6) stopping tolerance on ||grad_u||_F
    • odeOptions (default []) options set by odeset
    • interp (default 'linear') 'linear' or 'zoh' for u/x interpolation
    • uLower (default []) lower bounds on u (1x m) or scalar
    • uUpper (default []) upper bounds on u (1x m) or scalar
    • maxLineSearch (default 10)
    • verbose (default true)

Outputs:

  • sol.t : time grid [N x 1]
  • sol.X : state trajectory along tGrid [N x n]
  • sol.U : control trajectory along tGrid [N x m]
  • sol.P : co-state trajectory along tGrid [N x n]
  • sol.J : final cost value at solution
  • sol.J_hist : cost history per iteration
  • sol.grad_norm_hist : gradient-norm history per iteration
  • info.iters : number of iterations performed

Requirements:

  • MATLAB Symbolic Math Toolbox for MATLAB
  • sympy for Python

Notes:

  • Instead of a simple gradient descent with a constant step size, the Armijo condition is checked every time and backtracking is used to find an appropriate step size
  • Three sample scripts have been provided:
    1. demo.m (demo.py for Python) contains a linear system with two states and two inputs.
    2. CSTR.m (CSTR.py for Python) solves the optimal control problem for a CSTR system (example 6.2-2 from Kirk's book).
    3. Free time CSTR (CSTR_freeTf.py for Python) solves the same optimal control problem as CSTR.m, but with free final time

Maintainer

This repo is maintained by me. Contributions are welcome as well. Feel free to create a pull request.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

symocp-1.0.0.tar.gz (5.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

symocp-1.0.0-py3-none-any.whl (5.1 kB view details)

Uploaded Python 3

File details

Details for the file symocp-1.0.0.tar.gz.

File metadata

  • Download URL: symocp-1.0.0.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for symocp-1.0.0.tar.gz
Algorithm Hash digest
SHA256 f722b201d933b272a2746fcb54fe4ac59f44c6b18e58c142f65ad8da747b9772
MD5 b63c2ae5bef03c1c8778a9dd3ff1dc82
BLAKE2b-256 519620665053fbb599c4c7a42dabc991b13776354c211c179a701f11a78ea6c2

See more details on using hashes here.

File details

Details for the file symocp-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: symocp-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 5.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for symocp-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f185d0fa85dfb545ed63ad38b5034fd2588bee40a7a8f56f1b1b6e2c40b489ba
MD5 ad8b17fe8cd5f6acb1708ec7d63965cf
BLAKE2b-256 1e7dcb19ef7857e1450d4311cb9e7138f782911ca9e52c8a528510bde88d24a8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page