5. Continuous-Time Trajectory Optimization#
5.1. State-Space Models#
We now extend our focus from the discrete-time setting to trajectory optimization in continuous time. Continuous-time models are ubiquitous across science and engineering, where the dynamics of physical, biological, or economic systems are often described by differential equations. Here, we consider models given by ordinary differential equations (ODEs). However, continuous-time optimal control methods also exist beyond ODEs; for example, for systems described by stochastic differential equations (SDEs) or partial differential equations (PDEs).
An example of such state-space representation is the following:
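\[ \dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t)), \qquad \mathbf{y}(t) = \mathbf{h}(\mathbf{x}(t), \mathbf{u}(t)) \]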
The function \(\mathbf{f}\) describes the dynamics of the system, giving the rate of change of the state vector as a function of the current state \(\mathbf{x}(t)\) and control input \(\mathbf{u}(t)\). The function \(\mathbf{h}\) is the output mapping, which determines the measured output \(\mathbf{y}(t)\) based on the current state \(\mathbf{x}(t)\) and control input \(\mathbf{u}(t)\). This state-space representation is reminiscent of recurrent neural networks (RNNs), which likewise maintain a hidden state and map it to an observable output, albeit in discrete time.
A state space model is called a linear state space model (or simply a linear system) if the functions \(\mathbf{f}\) and \(\mathbf{h}\) are linear in \(\mathbf{x}(t)\) and \(\mathbf{u}(t)\). In this case, the model can be written as:
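\[ \dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t), \qquad \mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) + \mathbf{D}\mathbf{u}(t) \]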
where \(\mathbf{A}\), \(\mathbf{B}\), \(\mathbf{C}\), and \(\mathbf{D}\) are constant matrices. The matrix \(\mathbf{A}\) is called the dynamics matrix, \(\mathbf{B}\) is the control matrix, \(\mathbf{C}\) is the sensor matrix, and \(\mathbf{D}\) is the direct term matrix. If the model does not have a direct term, it means that the control input \(\mathbf{u}(t)\) does not directly influence the output \(\mathbf{y}(t)\).
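To make the role of these matrices concrete, here is a minimal sketch that simulates a hypothetical damped oscillator written in \((\mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D})\) form using scipy.signal.lsim; the numerical values are illustrative choices, not taken from the text.

```python
import numpy as np
from scipy.signal import lsim

# Illustrative damped oscillator in linear state-space form.
A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])   # dynamics matrix
B = np.array([[0.0], [1.0]])   # control matrix
C = np.array([[1.0, 0.0]])     # sensor matrix: we only measure position
D = np.array([[0.0]])          # no direct term

t = np.linspace(0, 20, 1000)
u = np.ones_like(t)            # unit step input
_, y, x = lsim((A, B, C, D), U=u, T=t, X0=[0.0, 0.0])
print(y[-1])                   # the output settles near the steady state 0.5
```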
It is worth noting that linear models like the one presented above have become increasingly popular thanks to the development of structured state-space models (such as S4) [Gu et al., 2022]. These models leverage the inherent structure and properties of linear systems to design more efficient and interpretable neural networks for processing sequential data.
5.2. Numerical Methods for Solving ODEs#
An ODE is an implicit representation of a state-space trajectory: it tells us how the state changes in time but not precisely what the state is at any given time. To find out this information, we need to either solve the ODE analytically (possible only for systems with special structure) or, as we’re going to do, solve it numerically. This numerical procedure solves what is called an initial value problem (IVP) of the form:
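\[ \dot{x}(t) = f(x(t), t), \qquad x(t_0) = x_0 \]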
5.2.1. Euler’s Method#
The algorithm to solve this problem is, in its simplest form, a for loop which closely resembles the updates encountered in gradient descent (in fact, gradient descent can be derived from the gradient flow ODE, but that’s another discussion). The so-called explicit Euler’s method can be implemented as follows:
(Euler’s method)
Input: \(f(x, t)\), \(x_0\), \(t_0\), \(t_{end}\), \(h\)
Output: Approximate solution \(x(t)\) at discrete time points
Initialize \(t = t_0\), \(x = x_0\)
While \(t < t_{end}\):
Compute \(x_{new} = x + h f(x, t)\)
Update \(t = t + h\)
Update \(x = x_{new}\)
Store or output the pair \((t, x)\)
End While
Consider the following simple dynamical system of a ballistic motion model, neglecting air resistance. The state of the system is described by two variables: the vertical position \(y(t)\) and the vertical velocity \(v(t)\). The corresponding ODE is:
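\[ \frac{dy}{dt} = v(t), \qquad \frac{dv}{dt} = -g \]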
where \(g \approx 9.81 \text{ m/s}^2\) is the acceleration due to gravity. In our code, we use the initial conditions \(y(0) = 0 \text{ m}\) and \(v(0) = v_0 \text{ m/s}\) where \(v_0\) is the initial velocity (in this case, \(v_0 = 20 \text{ m/s}\)). The analytical solution to this system is:
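\[ y(t) = y(0) + v_0 t - \frac{1}{2} g t^2, \qquad v(t) = v_0 - g t \]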
This system models the vertical motion of an object launched upward, reaching a maximum height before falling back down due to gravity.
Euler’s method can be obtained by taking the first-order Taylor expansion of \(x(t)\) at \(t\):
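\[ x(t + h) \approx x(t) + h \dot{x}(t) = x(t) + h f(x(t), t) \]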
Each step of the algorithm therefore approximates the function with a linear function of slope \(f\) over an interval of length \(h\).
import numpy as np
import matplotlib.pyplot as plt
def f(y, t):
"""
Derivative function for vertical motion under gravity.
y[0] is position, y[1] is velocity.
"""
g = 9.81 # acceleration due to gravity (m/s^2)
return np.array([y[1], -g])
def euler_method(f, y0, t0, t_end, h):
"""
Implement Euler's method for the entire time range.
"""
t = np.arange(t0, t_end + h, h)
y = np.zeros((len(t), 2))
y[0] = y0
for i in range(1, len(t)):
y[i] = y[i-1] + h * f(y[i-1], t[i-1])
return t, y
def true_solution(t):
"""
Analytical solution for the ballistic trajectory.
"""
y0, v0 = 0, 20 # initial height and velocity
g = 9.81
return y0 + v0*t - 0.5*g*t**2, v0 - g*t
# Set up the problem
t0, t_end = 0, 4
y0 = np.array([0, 20]) # initial height = 0, initial velocity = 20 m/s
# Different step sizes
step_sizes = [1.0, 0.5, 0.1]
colors = ['r', 'g', 'b']
markers = ['o', 's', '^']
# True solution
t_fine = np.linspace(t0, t_end, 1000)
y_true, v_true = true_solution(t_fine)
# Plotting
plt.figure(figsize=(12, 8))
# Plot Euler approximations
for h, color, marker in zip(step_sizes, colors, markers):
t, y = euler_method(f, y0, t0, t_end, h)
plt.plot(t, y[:, 0], color=color, marker=marker, linestyle='--',
label=f'Euler h = {h}', markersize=6, markerfacecolor='none')
# Plot true solution last so it's on top
plt.plot(t_fine, y_true, 'k-', label='True trajectory', linewidth=2, zorder=10)
plt.xlabel('Time (s)', fontsize=12)
plt.ylabel('Height (m)', fontsize=12)
plt.title("Euler's Method: Effect of Step Size on Ballistic Trajectory Approximation", fontsize=14)
plt.legend(fontsize=10)
plt.grid(True, linestyle=':', alpha=0.7)
# Add text to explain the effect of step size
plt.text(2.5, 15, "Smaller step sizes\nyield better approximations",
bbox=dict(facecolor='white', edgecolor='black', alpha=0.7),
fontsize=10, ha='center', va='center')
plt.tight_layout()
plt.show()
Another way to understand Euler’s method is through the fundamental theorem of calculus:
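\[ x(t + h) = x(t) + \int_{t}^{t+h} f(x(s), s) \, ds \]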
We then approximate the integral term with a box of width \(h\) and height \(f\), and therefore of area \(h f\).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
def v(t):
"""
Velocity function for the ballistic trajectory.
"""
v0 = 20 # initial velocity (m/s)
g = 9.81 # acceleration due to gravity (m/s^2)
return v0 - g * t
def position(t):
"""
Position function (integral of velocity).
"""
v0 = 20
g = 9.81
return v0*t - 0.5*g*t**2
# Set up the problem
t0, t_end = 0, 2
num_points = 1000
t = np.linspace(t0, t_end, num_points)
# Calculate true velocity and position
v_true = v(t)
x_true = position(t)
# Euler's method with a large step size for visualization
h = 0.5
t_euler = np.arange(t0, t_end + h, h)
x_euler = np.zeros_like(t_euler)
for i in range(1, len(t_euler)):
x_euler[i] = x_euler[i-1] + h * v(t_euler[i-1])
# Plotting
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 12), sharex=True)
# Plot velocity function and its approximation
ax1.plot(t, v_true, 'b-', label='True velocity')
ax1.fill_between(t, 0, v_true, alpha=0.3, label='True area (displacement)')
# Add rectangles with hashed pattern, ruler-like annotations, and area values
for i in range(len(t_euler) - 1):
t_i = t_euler[i]
v_i = v(t_i)
rect = Rectangle((t_i, 0), h, v_i,
fill=True, facecolor='red', edgecolor='r',
alpha=0.15, hatch='///')
ax1.add_patch(rect)
# Add ruler-like annotations
# Vertical ruler (height)
ax1.annotate('', xy=(t_i, 0), xytext=(t_i, v_i),
arrowprops=dict(arrowstyle='<->', color='red'))
ax1.text(t_i - 0.05, v_i/2, f'v(t{i}) = {v_i:.2f}', rotation=90,
va='center', ha='right', color='red', fontweight='bold')
# Horizontal ruler (width)
ax1.annotate('', xy=(t_i, -1), xytext=(t_i + h, -1),
arrowprops=dict(arrowstyle='<->', color='red'))
ax1.text(t_i + h/2, -2, f'h = {h}', ha='center', va='top',
color='red', fontweight='bold')
# Add area value in the middle of each rectangle
area = h * v_i
ax1.text(t_i + h/2, v_i/2, f'Area = {area:.2f}', ha='center', va='center',
color='white', fontweight='bold', bbox=dict(facecolor='red', edgecolor='none', alpha=0.7))
# Plot only the points for Euler's method
ax1.plot(t_euler, v(t_euler), 'ro', markersize=6, label="Euler's points")
ax1.set_ylabel('Velocity (m/s)', fontsize=12)
ax1.set_title("Velocity Function and Euler's Approximation", fontsize=14)
ax1.legend(fontsize=10)
ax1.grid(True, linestyle=':', alpha=0.7)
ax1.set_ylim(bottom=-3) # Extend y-axis to show horizontal rulers
# Plot position function and its approximation
ax2.plot(t, x_true, 'b-', label='True position')
ax2.plot(t_euler, x_euler, 'ro--', label="Euler's approximation", markersize=6, linewidth=2)
# Add vertical arrows and horizontal lines to show displacement and time step
for i in range(1, len(t_euler)):
t_i = t_euler[i]
x_prev = x_euler[i-1]
x_curr = x_euler[i]
# Vertical line for displacement
ax2.plot([t_i, t_i], [x_prev, x_curr], 'g:', linewidth=2)
# Horizontal line for time step
ax2.plot([t_i - h, t_i], [x_prev, x_prev], 'g:', linewidth=2)
# Add text to show the displacement value
displacement = x_curr - x_prev
ax2.text(t_i + 0.05, (x_prev + x_curr)/2, f'+{displacement:.2f}',
color='green', fontweight='bold', va='center')
# Add text to show the time step
ax2.text(t_i - h/2, x_prev - 0.5, f'h = {h}',
color='green', fontweight='bold', ha='center', va='top')
ax2.set_xlabel('Time (s)', fontsize=12)
ax2.set_ylabel('Position (m)', fontsize=12)
ax2.set_title("Position: True vs Euler's Approximation", fontsize=14)
ax2.legend(fontsize=10)
ax2.grid(True, linestyle=':', alpha=0.7)
# Add explanatory text
ax1.text(1.845, 15, "Red hashed areas show\nEuler's approximation\nof the area under the curve",
bbox=dict(facecolor='white', edgecolor='black', alpha=0.7),
fontsize=10, ha='center', va='center')
plt.tight_layout()
plt.show()
5.2.2. Implicit Euler’s Method#
An alternative approach is the Implicit Euler method, also known as the Backward Euler method. Instead of using the derivative at the current point to step forward, it uses the derivative at the end of the interval. This leads to the following update rule:
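\[ x_{new} = x + h f(x_{new}, t_{new}), \qquad t_{new} = t + h \]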
Note that \(x_{new}\) appears on both sides of the equation, making this an implicit method. The algorithm for the Implicit Euler method can be described as follows:
(Implicit Euler’s Method)
Input: \(f(x, t)\), \(x_0\), \(t_0\), \(t_{end}\), \(h\)
Output: Approximate solution \(x(t)\) at discrete time points
Initialize \(t = t_0\), \(x = x_0\)
While \(t < t_{end}\):
Set \(t_{new} = t + h\)
Solve for \(x_{new}\) in the equation: \(x_{new} = x + h f(x_{new}, t_{new})\)
Update \(t = t_{new}\)
Update \(x = x_{new}\)
Store or output the pair \((t, x)\)
End While
The key difference in the Implicit Euler method is the solve step, where we need to solve a (potentially nonlinear) equation to find \(x_{new}\). This is typically done using iterative methods such as fixed-point iteration or Newton’s method.
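As a concrete illustration, here is a minimal sketch of the Backward Euler loop in which the implicit equation at each step is handed to a generic root finder (scipy.optimize.fsolve), reusing the ballistic model from above; the function and variable names are our own.

```python
import numpy as np
from scipy.optimize import fsolve

def implicit_euler(f, x0, t0, t_end, h):
    """Backward Euler: solve x_new = x + h * f(x_new, t_new) at every step."""
    t_values = np.arange(t0, t_end + h, h)
    x_values = np.zeros((len(t_values), len(np.atleast_1d(x0))))
    x_values[0] = x0
    for i in range(1, len(t_values)):
        t_new, x_prev = t_values[i], x_values[i - 1]
        # Residual of the implicit equation; the explicit Euler step
        # serves as the initial guess for the root finder.
        residual = lambda x_new: x_new - x_prev - h * f(x_new, t_new)
        x_values[i] = fsolve(residual, x_prev + h * f(x_prev, t_values[i - 1]))
    return t_values, x_values

def f(x, t):
    g = 9.81
    return np.array([x[1], -g])  # ballistic model: d(position)/dt = velocity, d(velocity)/dt = -g

t, x = implicit_euler(f, np.array([0.0, 20.0]), 0.0, 4.0, 0.1)
print(x[-1, 0])  # approximate height at the final time
```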
5.2.2.1. Stiff ODEs#
While the Implicit Euler method requires more computation per step, it often allows for larger step sizes and can provide better stability for certain types of problems, especially stiff ODEs.
Stiff ODEs are differential equations for which certain numerical methods for solving the equation are numerically unstable, unless the step size is taken to be extremely small. These ODEs typically involve multiple processes occurring at widely different rates. In a stiff problem, the fastest-changing component of the solution can make the numerical method unstable unless the step size is extremely small. However, such a small step size may lead to an impractical amount of computation to traverse the entire interval of interest.
For example, consider a chemical reaction where some reactions occur very quickly while others occur much more slowly. The fast reactions quickly approach their equilibrium, but small perturbations in the slower reactions can cause rapid changes in the fast reactions.
A classic example of a stiff ODE is the Van der Pol oscillator with a large parameter. The Van der Pol equation is:
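\[ \frac{d^2 x}{dt^2} - \mu \left(1 - x^2\right) \frac{dx}{dt} + x = 0 \]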
where \(\mu\) is a scalar parameter. This second-order ODE can be transformed into a system of first-order ODEs by introducing a new variable \(y = \frac{dx}{dt}\):
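\[ \frac{dx}{dt} = y, \qquad \frac{dy}{dt} = \mu \left(1 - x^2\right) y - x \]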
When \(\mu\) is large (e.g., \(\mu = 1000\)), this system becomes stiff. The large \(\mu\) causes rapid changes in \(y\) when \(x\) is near ±1, but slower changes elsewhere. This leads to a solution with sharp transitions followed by periods of gradual change.
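As a rough illustration of why stiffness matters in practice, the sketch below integrates the Van der Pol system with SciPy's solve_ivp using an explicit Runge-Kutta method and an implicit (Radau) method; a smaller \(\mu\) and a shorter time span than in the text are used only to keep the run quick, and the tolerances are arbitrary choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

mu = 100.0  # moderately stiff; larger values make the contrast starker

def van_der_pol(t, state):
    x, y = state
    return [y, mu * (1 - x**2) * y - x]

# Explicit solver vs. implicit solver on the same stiff problem
sol_rk45 = solve_ivp(van_der_pol, (0, 200), [2.0, 0.0], method='RK45', rtol=1e-6)
sol_radau = solve_ivp(van_der_pol, (0, 200), [2.0, 0.0], method='Radau', rtol=1e-6)

# The explicit method is forced into many more (tiny) steps by stability,
# while the implicit method takes far fewer, larger steps.
print(f"RK45:  {sol_rk45.t.size} steps, {sol_rk45.nfev} function evaluations")
print(f"Radau: {sol_radau.t.size} steps, {sol_radau.nfev} function evaluations")
```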
5.2.3. Trapezoid Method#
The trapezoid method, also known as the trapezoidal rule, offers improved accuracy and stability compared to the simple Euler method. The name “trapezoid method” comes from the idea of using a trapezoid to approximate the integral term in the fundamental theorem of calculus. This leads to the following update rule:
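\[ x_{new} = x + \frac{h}{2} \left[ f(x, t) + f(x_{new}, t_{new}) \right] \]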
where \( t_{new} = t + h \). Note that this formula involves \( x_{new} \) on both sides of the equation, making it an implicit method, similar to the implicit Euler method discussed earlier.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle, Polygon
def v(t):
"""Velocity function for the ballistic trajectory."""
v0 = 20 # initial velocity (m/s)
g = 9.81 # acceleration due to gravity (m/s^2)
return v0 - g * t
def position(t):
"""Position function (integral of velocity)."""
v0 = 20
g = 9.81
return v0*t - 0.5*g*t**2
# Set up the problem
t0, t_end = 0, 2
num_points = 1000
t = np.linspace(t0, t_end, num_points)
# Calculate true velocity and position
v_true = v(t)
x_true = position(t)
# Euler's method and Trapezoid method with a large step size for visualization
h = 0.5
t_numeric = np.arange(t0, t_end + h, h)
x_euler = np.zeros_like(t_numeric)
x_trapezoid = np.zeros_like(t_numeric)
for i in range(1, len(t_numeric)):
# Euler's method
x_euler[i] = x_euler[i-1] + h * v(t_numeric[i-1])
    # Trapezoid method. The rule is implicit in general, but here the
    # integrand v(t) does not depend on position, so the update can be
    # evaluated directly without any fixed-point iteration.
    x_trapezoid[i] = x_trapezoid[i-1] + h/2 * (v(t_numeric[i-1]) + v(t_numeric[i]))
# Plotting
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 16), sharex=True)
# Plot velocity function and its approximations
ax1.plot(t, v_true, 'b-', label='True velocity')
ax1.fill_between(t, 0, v_true, alpha=0.3, label='True area (displacement)')
# Add trapezoids and rectangles
for i in range(len(t_numeric) - 1):
t_i, t_next = t_numeric[i], t_numeric[i+1]
v_i, v_next = v(t_i), v(t_next)
# Euler's rectangle (hashed pattern)
rect = Rectangle((t_i, 0), h, v_i, fill=True, facecolor='red', edgecolor='r', alpha=0.15, hatch='///')
ax1.add_patch(rect)
# Trapezoid (dot pattern)
trapezoid = Polygon([(t_i, 0), (t_i, v_i), (t_next, v_next), (t_next, 0)],
fill=True, facecolor='green', edgecolor='g', alpha=0.15, hatch='....')
ax1.add_patch(trapezoid)
# Add area values
euler_area = h * v_i
trapezoid_area = h * (v_i + v_next) / 2
ax1.text(t_i + h/2, v_i/2, f'Euler: {euler_area:.2f}', ha='center', va='bottom', color='red', fontweight='bold')
ax1.text(t_i + h/2, (v_i + v_next)/4, f'Trapezoid: {trapezoid_area:.2f}', ha='center', va='top', color='green', fontweight='bold')
# Plot points for Euler's and Trapezoid methods
ax1.plot(t_numeric, v(t_numeric), 'ro', markersize=6, label="Euler's points")
ax1.plot(t_numeric, v(t_numeric), 'go', markersize=6, label="Trapezoid points")
ax1.set_ylabel('Velocity (m/s)', fontsize=12)
ax1.set_title("Velocity Function: True vs Numerical Approximations", fontsize=14)
ax1.legend(fontsize=10)
ax1.grid(True, linestyle=':', alpha=0.7)
# Plot position function and its approximations
ax2.plot(t, x_true, 'b-', label='True position')
ax2.plot(t_numeric, x_euler, 'ro--', label="Euler's approximation", markersize=6, linewidth=2)
ax2.plot(t_numeric, x_trapezoid, 'go--', label="Trapezoid approximation", markersize=6, linewidth=2)
ax2.set_xlabel('Time (s)', fontsize=12)
ax2.set_ylabel('Position (m)', fontsize=12)
ax2.set_title("Position: True vs Numerical Approximations", fontsize=14)
ax2.legend(fontsize=10)
ax2.grid(True, linestyle=':', alpha=0.7)
# Add explanatory text
ax1.text(1.76, 17, "Red hashed areas: Euler's approximation\nGreen dotted areas: Trapezoid approximation",
bbox=dict(facecolor='white', edgecolor='black', alpha=0.7),
fontsize=10, ha='center', va='center')
plt.tight_layout()
plt.show()
Algorithmically, the trapezoid method can be described as follows:
(Trapezoid Method)
Input: \( f(x, t) \), \( x_0 \), \( t_0 \), \( t_{end} \), \( h \)
Output: Approximate solution \( x(t) \) at discrete time points
Initialize \( t = t_0 \), \( x = x_0 \)
While \( t < t_{end} \):
Set \( t_{new} = t + h \)
Solve for \( x_{new} \) in the equation: \( x_{new} = x + \frac{h}{2}[f(x, t) + f(x_{new}, t_{new})] \)
Update \( t = t_{new} \)
Update \( x = x_{new} \)
Store or output the pair \( (t, x) \)
End While
The trapezoid method can also be derived by averaging the forward Euler and backward Euler methods. Recall that:
Forward Euler method:
\[ x_{n+1} = x_n + h f(x_n, t_n) \]
Backward Euler method:
\[ x_{n+1} = x_n + h f(x_{n+1}, t_{n+1}) \]
Taking the average of these two methods yields:
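\[ x_{n+1} = x_n + \frac{h}{2} \left[ f(x_n, t_n) + f(x_{n+1}, t_{n+1}) \right] \]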
This is precisely the update rule for the trapezoid method. Recall that the forward Euler method approximates the solution by extrapolating linearly using the slope at the beginning of the interval \([t_n, t_{n+1}] \). In contrast, the backward Euler method extrapolates linearly using the slope at the end of the interval. The trapezoid method, on the other hand, averages these two slopes. This averaging provides better approximation properties than either of the methods alone, offering both stability and accuracy. Note finally that unlike the forward or backward Euler methods, the trapezoid method is also symmetric in time. This means that if you were to reverse time and apply the method backward, you would get the same results (up to numerical precision).
5.2.4. Trapezoidal Predictor-Corrector#
The trapezoid method can also be implemented under the so-called predictor-corrector framework. This interpretation reformulates the implicit trapezoid rule into an explicit two-step process:
Predictor Step:
We make an initial guess for \( x_{n+1} \) using the forward Euler method:
\[ x_{n+1}^* = x_n + h f(x_n, t_n) \]
This is our “predictor” step, where \( x_{n+1}^* \) is the predicted value of \( x_{n+1} \).
Corrector Step:
We then use this predicted value to estimate \( f(x_{n+1}^*, t_{n+1}) \) and apply the trapezoid formula:
\[ x_{n+1} = x_n + \frac{h}{2} \left[ f(x_n, t_n) + f(x_{n+1}^*, t_{n+1}) \right] \]
This is our “corrector” step, where the initial guess \( x_{n+1}^* \) is corrected by taking into account the slope at \( (x_{n+1}^*, t_{n+1}) \).
This two-step process amounts to performing one fixed-point iteration on the implicit trapezoid equation, starting from the Euler prediction. To fully solve the implicit equation, further iterations would be necessary until convergence is achieved.
import numpy as np
import matplotlib.pyplot as plt
def f(y, t):
"""
Derivative function for vertical motion under gravity.
y[0] is position, y[1] is velocity.
"""
g = 9.81 # acceleration due to gravity (m/s^2)
return np.array([y[1], -g])
def true_solution(t):
"""
Analytical solution for the ballistic trajectory.
"""
y0, v0 = 0, 20 # initial height and velocity
g = 9.81
return y0 + v0*t - 0.5*g*t**2, v0 - g*t
def trapezoid_method_visual(f, y0, t0, t_end, h):
"""
Implement the trapezoid method for the entire time range.
Returns predictor and corrector steps for visualization.
"""
t = np.arange(t0, t_end + h, h)
y = np.zeros((len(t), 2))
y_predictor = np.zeros((len(t), 2))
y[0] = y_predictor[0] = y0
for i in range(1, len(t)):
# Predictor step (Euler forward)
slope_start = f(y[i-1], t[i-1])
y_predictor[i] = y[i-1] + h * slope_start
# Corrector step
slope_end = f(y_predictor[i], t[i])
y[i] = y[i-1] + h * 0.5 * (slope_start + slope_end)
return t, y, y_predictor
# Set up the problem
t0, t_end = 0, 2
y0 = np.array([0, 20]) # initial height = 0, initial velocity = 20 m/s
h = 0.5 # Step size
# Compute trapezoid method steps
t, y_corrector, y_predictor = trapezoid_method_visual(f, y0, t0, t_end, h)
# Plotting
plt.figure(figsize=(12, 8))
# Plot the true solution for comparison
t_fine = np.linspace(t0, t_end, 1000)
y_true, v_true = true_solution(t_fine)
plt.plot(t_fine, y_true, 'k-', label='True trajectory', linewidth=1.5)
# Plot the predictor and corrector steps
for i in range(len(t)-1):
# Points for the predictor step
p0 = [t[i], y_corrector[i, 0]]
p1_predictor = [t[i+1], y_predictor[i+1, 0]]
# Points for the corrector step
p1_corrector = [t[i+1], y_corrector[i+1, 0]]
# Plot predictor step
plt.plot([p0[0], p1_predictor[0]], [p0[1], p1_predictor[1]], 'r--', linewidth=2)
plt.plot(p1_predictor[0], p1_predictor[1], 'ro', markersize=8)
# Plot corrector step
plt.plot([p0[0], p1_corrector[0]], [p0[1], p1_corrector[1]], 'g--', linewidth=2)
plt.plot(p1_corrector[0], p1_corrector[1], 'go', markersize=8)
# Add arrows to show the predictor and corrector adjustments
plt.arrow(p0[0], p0[1], h, y_predictor[i+1, 0] - p0[1], color='r', width=0.005,
head_width=0.02, head_length=0.02, length_includes_head=True, zorder=5)
plt.arrow(p1_predictor[0], p1_predictor[1], 0, y_corrector[i+1, 0] - y_predictor[i+1, 0],
color='g', width=0.005, head_width=0.02, head_length=0.02, length_includes_head=True, zorder=5)
# Add legend entries for predictor and corrector steps
plt.plot([], [], 'r--', label='Predictor step (Forward Euler)')
plt.plot([], [], 'g-', label='Corrector step (Trapezoid)')
# Labels and title
plt.xlabel('Time (s)', fontsize=12)
plt.ylabel('Height (m)', fontsize=12)
plt.title("Trapezoid Method: Predictor-Corrector Structure", fontsize=14)
plt.legend(fontsize=10)
plt.grid(True, linestyle=':', alpha=0.7)
plt.tight_layout()
plt.show()
5.3. Collocation Methods#
The numerical integration methods we’ve discussed before are inherently sequential: given an initial state, we make a guess as to where the system might go over a small time interval. The accuracy of that guess depends on the numerical integration procedure being used and the information available locally.
Collocation methods offer an alternative paradigm. Instead of solving the problem step by step, these methods aim to solve for the entire trajectory simultaneously by reformulating the differential equation into a system of algebraic equations. This approach iteratively refines the values of the discretized trajectory, taking into account both local and global properties of the function.
Sequential methods like Euler’s or Runge-Kutta focus on evolving the system forward in time, starting from known initial conditions. They are simple to use for initial value problems but can accumulate errors over long time scales. Collocation methods, on the other hand, consider the whole time domain at once and tend to distribute errors more evenly across the domain.
This global view has some interesting implications. For one, collocation methods can handle both initial value and boundary value problems more naturally, which is particularly useful when you have path constraints throughout the entire integration interval. This property is especially valuable when solving continuous-time optimal control problems with such constraints: something which would otherwise be difficult to achieve with single-shooting methods.
However, solving for all time points simultaneously can be computationally intensive: the improved overall accuracy, especially for long-time integrations or sensitive systems, comes at the cost of increased computational complexity. Despite this, the ability to handle complex constraints and achieve more robust solutions often makes collocation methods the preferred choice for solving sophisticated COCPs in fields such as aerospace, robotics, and process control.
5.3.1. Quick Primer on Polynomials#
Collocation methods are based on polynomial approximation theory. Therefore, the first step in developing collocation-based optimal control techniques is to review the fundamentals of polynomial functions.
Polynomials are typically introduced through their standard form:
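\[ p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n \]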
In this expression, the \(a_i\) are coefficients which linearly combine the powers of \(t\) to represent a function. The set of functions \(\{ 1, t, t^2, t^3, \ldots, t^n \}\) used in the standard polynomial representation is called the monomial basis.
In linear algebra, a basis is a set of vectors in a vector space such that any vector in the space can be uniquely represented as a linear combination of these basis vectors. In the same way, a polynomial basis is a set of polynomials that allows any function \( f(x) \) (within the function space) to be expressed as:
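\[ f(x) = \sum_{k} c_k p_k(x) \]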
where the coefficients \( c_k \) are generally determined by solving a system of equations.
Just as vectors can be represented in different coordinate systems (bases), functions can also be expressed using various polynomial bases. However, the ability to apply a change of basis does not imply that all types of polynomials are equivalent from a practical standpoint. In practice, our choice of polynomial basis is dictated by considerations of efficiency, accuracy, and stability when approximating a function.
For instance, despite the monomial basis being easy to understand and implement, it often performs poorly in practice due to numerical instability. This instability arises because its coefficients can take on very large values: an ill-conditioning problem. The following kinds of polynomials often remedy this issue.
5.3.1.1. Orthogonal Polynomials#
An orthogonal polynomial basis is a set of polynomials that are not only orthogonal to each other but also form a complete basis for a certain space of functions. This means that any function within that space can be represented as a linear combination of these polynomials.
More precisely, let \( \{ p_0(x), p_1(x), p_2(x), \dots \} \) be a sequence of polynomials where each \( p_n(x) \) is a polynomial of degree \( n \). We say that this set forms an orthogonal polynomial basis if any polynomial \( q(x) \) of degree \( n \) or less can be uniquely expressed as a linear combination of \( \{ p_0(x), p_1(x), \dots, p_n(x) \} \). Furthermore, the orthogonality property means that for any \( i \neq j \):
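\[ \int_a^b p_i(x) \, p_j(x) \, w(x) \, dx = 0 \]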
for some weight function \( w(x) \) over a given interval of orthogonality \( [a, b] \).
The orthogonality property simplifies the computation of the coefficients involved in the polynomial representation of a function. At a high level, taking the inner product of \( f(x) \) with each basis polynomial \( p_k(x) \) isolates the corresponding coefficient \( c_k \), which can be found to be:
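\[ c_k = \frac{\int_a^b f(x) \, p_k(x) \, w(x) \, dx}{\int_a^b p_k(x)^2 \, w(x) \, dx} \]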
Here are some examples of the most common orthogonal polynomials used in practice.
5.3.1.1.1. Legendre Polynomials#
Legendre polynomials \( \{ P_n(x) \} \) are defined on the interval \([-1, 1]\) and satisfy the orthogonality condition:
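\[ \int_{-1}^{1} P_m(x) P_n(x) \, dx = \frac{2}{2n+1} \delta_{mn} \]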
They can be generated using the recurrence relation:
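\[ (n+1) P_{n+1}(x) = (2n+1) \, x \, P_n(x) - n P_{n-1}(x) \]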
with initial conditions:
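\[ P_0(x) = 1, \qquad P_1(x) = x \]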
The first four Legendre polynomials resulting from this recurrence are the following:
import numpy as np
from IPython.display import display, Math
def legendre_polynomial(n, x):
if n == 0:
return np.poly1d([1])
elif n == 1:
return x
else:
p0 = np.poly1d([1])
p1 = x
for k in range(2, n + 1):
p2 = ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
p0, p1 = p1, p2
return p1
def legendre_coefficients(n):
x = np.poly1d([1, 0]) # Define a poly1d object to represent x
poly = legendre_polynomial(n, x)
return poly
def poly_to_latex(poly):
coeffs = poly.coefficients
variable = poly.variable
terms = []
for i, coeff in enumerate(coeffs):
power = len(coeffs) - i - 1
if coeff == 0:
continue
coeff_str = f"{coeff:.2g}" if coeff not in {1, -1} or power == 0 else ("-" if coeff == -1 else "")
if power == 0:
term = f"{coeff_str}"
elif power == 1:
term = f"{coeff_str}{variable}"
else:
term = f"{coeff_str}{variable}^{power}"
terms.append(term)
latex_poly = " + ".join(terms).replace(" + -", " - ")
return latex_poly
for n in range(4):
poly = legendre_coefficients(n)
display(Math(f"P_{n}(x) = {poly_to_latex(poly)}"))
5.3.1.1.2. Chebyshev Polynomials#
There are two types of Chebyshev polynomials: Chebyshev polynomials of the first kind, \( \{ T_n(x) \} \), and Chebyshev polynomials of the second kind, \( \{ U_n(x) \} \). We typically focus on the first kind. They are defined on the interval \([-1, 1]\) and satisfy the orthogonality condition:
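\[ \int_{-1}^{1} \frac{T_m(x) T_n(x)}{\sqrt{1 - x^2}} \, dx = \begin{cases} 0 & m \neq n, \\ \pi & m = n = 0, \\ \frac{\pi}{2} & m = n \neq 0. \end{cases} \]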
The Chebyshev polynomials of the first kind can be generated using the recurrence relation:
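\[ T_{n+1}(x) = 2 x \, T_n(x) - T_{n-1}(x) \]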
with initial conditions:
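\[ T_0(x) = 1, \qquad T_1(x) = x \]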
Remarkably, this recurrence relation also admits an explicit formula:
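\[ T_n(x) = \cos\left(n \arccos(x)\right) \]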
Let’s now implement it in Python:
def chebyshev_polynomial(n, x):
if n == 0:
return np.poly1d([1])
elif n == 1:
return x
else:
t0 = np.poly1d([1])
t1 = x
for _ in range(2, n + 1):
t2 = 2 * x * t1 - t0
t0, t1 = t1, t2
return t1
def chebyshev_coefficients(n):
x = np.poly1d([1, 0]) # Define a poly1d object to represent x
poly = chebyshev_polynomial(n, x)
return poly
for n in range(4):
poly = chebyshev_coefficients(n)
display(Math(f"T_{n}(x) = {poly_to_latex(poly)}"))
5.3.1.1.3. Hermite Polynomials#
Hermite polynomials \( \{ H_n(x) \} \) are defined on the entire real line and are orthogonal with respect to the weight function \( w(x) = e^{-x^2} \). They satisfy the orthogonality condition:
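\[ \int_{-\infty}^{\infty} H_m(x) H_n(x) \, e^{-x^2} \, dx = 2^n n! \sqrt{\pi} \, \delta_{mn} \]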
Hermite polynomials can be generated using the recurrence relation:
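\[ H_{n+1}(x) = 2 x \, H_n(x) - 2 n \, H_{n-1}(x) \]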
with initial conditions:
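\[ H_0(x) = 1, \qquad H_1(x) = 2x \]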
The following code computes the coefficients of the first four Hermite polynomials:
def hermite_polynomial(n, x):
if n == 0:
return np.poly1d([1])
elif n == 1:
return 2 * x
else:
h0 = np.poly1d([1])
h1 = 2 * x
for k in range(2, n + 1):
h2 = 2 * x * h1 - 2 * (k - 1) * h0
h0, h1 = h1, h2
return h1
def hermite_coefficients(n):
x = np.poly1d([1, 0]) # Define a poly1d object to represent x
poly = hermite_polynomial(n, x)
return poly
for n in range(4):
poly = hermite_coefficients(n)
display(Math(f"H_{n}(x) = {poly_to_latex(poly)}"))
5.3.2. Collocation Conditions#
Consider a general ODE of the form:
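\[ \dot{y}(t) = f(y(t), t) \]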
where \( y(t) \in \mathbb{R}^n \) is the state vector, and \( f: \mathbb{R}^n \times \mathbb{R} \rightarrow \mathbb{R}^n \) is a known function. The goal is to approximate the solution \( y(t) \) over a given interval \([t_0, t_f]\). Collocation methods achieve this by:
Choosing a basis to approximate \( y(t) \) using a finite sum of basis functions \( \phi_i(t) \):
\[ y(t) \approx \sum_{i=0}^{N} c_i \phi_i(t), \]
where \( \{c_i\} \) are the coefficients to be determined.
Selecting collocation points \( t_1, t_2, \ldots, t_N \) within the interval \([t_0, t_f]\). These are typically chosen to be the roots of certain orthogonal polynomials, like Legendre or Chebyshev polynomials, or can be equally spaced across the interval.
Enforcing the ODE at the collocation points for each \( t_j \):
\[ \dot{y}(t_j) = f(y(t_j), t_j). \]
To implement this, we differentiate the approximate solution \( y(t) \) with respect to time:
\[ \dot{y}(t) \approx \sum_{i=0}^{N} c_i \dot{\phi}_i(t). \]
Substituting this into the ODE at the collocation points gives:
\[ \sum_{i=0}^{N} c_i \dot{\phi}_i(t_j) = f\left(\sum_{i=0}^{N} c_i \phi_i(t_j), t_j\right), \quad j = 1, \ldots, N. \]
Enforcing the ODE at all collocation points in this way yields a system of nonlinear equations in the coefficients \( \{c_i\} \): these are the collocation equations.
Furthermore, when solving an initial value problem (IVP), we also need to incorporate the initial condition \( y(t_0) = y_0 \) as an additional constraint:
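\[ \sum_{i=0}^{N} c_i \phi_i(t_0) = y_0 \]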
The collocation conditions and IVP condition are combined together to form a root-finding problem, which we can generically solve numerically using Newton’s method.
5.3.3. Common Numerical Integration Techniques as Collocation Methods#
Many common numerical integration techniques can be viewed as special cases of collocation methods. While the general collocation method discussed earlier applies to the entire interval \([t_0, t_f]\), these techniques can be seen as collocation applied locally, step by step.
In practical numerical integration, we often divide the full interval \([t_0, t_f]\) into smaller subintervals or steps. In general, this allows us to use simpler basis functions, thereby reducing computational complexity, and gives us more flexibility in dynamically adjusting the step size using local error estimates. When we apply collocation locally, we’re essentially using the collocation method to “step” from \(t_n\) to \(t_{n+1}\). As we did earlier, we still apply the following three steps:
We choose a basis function to approximate \(y(t)\) over \([t_n, t_{n+1}]\).
We select collocation points within this interval.
We enforce the ODE at these points to determine the coefficients of our basis function.
We can make this idea clearer by re-deriving some of the numerical integration methods seen before using this perspective.
5.3.3.1. Explicit Euler Method#
For the Explicit Euler method, we use a linear basis function for each step:
Note that we use \((t - t_n)\) rather than just \(t\) because we’re approximating the solution locally, relative to the start of each step. We then choose one collocation point at \(t_{n+1}\) where we have:
Our local approximation is:
At \(t = t_{n+1}\), this gives:
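\[ y_{n+1} = y_n + h f(y_n, t_n) \]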
where \(h = t_{n+1} - t_n\). This is the classic Euler update formula.
5.3.3.2. Implicit Euler Method#
The Implicit Euler method uses the same linear basis function:
Again, we choose one collocation point at \(t_{n+1}\). The main difference is that we enforce the ODE using \(y_{n+1}\):
Our approximation remains:
At \(t = t_{n+1}\), this leads to the implicit equation:
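\[ y_{n+1} = y_n + h f(y_{n+1}, t_{n+1}) \]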
5.3.3.3. Trapezoidal Method#
The Trapezoidal method uses a quadratic basis function:
We use two collocation points: \(t_n\) and \(t_{n+1}\). Enforcing the ODE at these points gives:
At \(t_n\):
At \(t_{n+1}\):
Our approximation is:
At \(t = t_{n+1}\), this gives:
Solving the system of equations leads to the trapezoidal update:
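\[ y_{n+1} = y_n + \frac{h}{2} \left[ f(y_n, t_n) + f(y_{n+1}, t_{n+1}) \right] \]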
5.3.3.4. Runge-Kutta Methods#
Higher-order Runge-Kutta methods can also be interpreted as collocation methods. The RK4 method corresponds to a collocation method using a cubic polynomial basis:
Here, we’re using a cubic polynomial to approximate the solution over each step, rather than the linear or quadratic approximations of the other methods above. For RK4, we use four collocation points:
\(t_n\) (the start of the step)
\(t_n + h/2\)
\(t_n + h/2\)
\(t_n + h\) (the end of the step)
These points are called the “Gauss-Lobatto” points, scaled to our interval \([t_n, t_n + h]\). The RK4 method enforces the ODE at these collocation points, leading to four stages:
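\[ \begin{aligned} k_1 &= f(y_n, t_n), \\ k_2 &= f\!\left(y_n + \tfrac{h}{2} k_1, \, t_n + \tfrac{h}{2}\right), \\ k_3 &= f\!\left(y_n + \tfrac{h}{2} k_2, \, t_n + \tfrac{h}{2}\right), \\ k_4 &= f\!\left(y_n + h k_3, \, t_n + h\right). \end{aligned} \]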
The final update formula for RK4 can be derived by solving the system of equations resulting from enforcing the ODE at our collocation points:
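\[ y_{n+1} = y_n + \frac{h}{6} \left( k_1 + 2 k_2 + 2 k_3 + k_4 \right) \]
For reference, here is a minimal sketch of an RK4 stepper applied to the ballistic model used earlier in this chapter; the helper names are our own.

```python
import numpy as np

def rk4_step(f, y, t, h):
    """One step of the classic fourth-order Runge-Kutta method."""
    k1 = f(y, t)
    k2 = f(y + h/2 * k1, t + h/2)
    k3 = f(y + h/2 * k2, t + h/2)
    k4 = f(y + h * k3, t + h)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

def f(y, t):
    g = 9.81
    return np.array([y[1], -g])  # y[0] is position, y[1] is velocity

t, y, h = 0.0, np.array([0.0, 20.0]), 0.5
while t < 2.0:
    y = rk4_step(f, y, t, h)
    t += h
print(f"Height at t = {t:.1f} s: {y[0]:.4f} m")  # matches the analytical 20.38 m
```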
5.3.4. Example: Solving a Simple ODE by Collocation#
Consider a simple ODE:
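\[ \frac{dy}{dt} = -y, \qquad y(0) = 1, \qquad t \in [0, 2] \]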
The analytical solution is \(y(t) = e^{-t}\). We apply the collocation method with a monomial basis of order \(N\):
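\[ y(t) \approx \sum_{i=0}^{N} c_i t^i \]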
We select \(N\) equally spaced points \(\{t_1, \ldots, t_N\}\) in \([0, 2]\) as collocation points.
import numpy as np
from scipy.optimize import root
import matplotlib.pyplot as plt
def ode_function(y, t):
"""Define the ODE: dy/dt = -y"""
return -y
def solve_ode_collocation(ode_func, t_span, y0, order):
t0, tf = t_span
n_points = order + 1 # number of collocation points
t_points = np.linspace(t0, tf, n_points)
def collocation_residuals(coeffs):
residuals = []
# Initial condition residual
y_init = sum(c * t_points[0]**i for i, c in enumerate(coeffs))
residuals.append(y_init - y0)
# Collocation point residuals
for t in t_points[1:]: # Skip the first point as it's used for initial condition
y = sum(c * t**i for i, c in enumerate(coeffs))
dy_dt = sum(c * i * t**(i-1) for i, c in enumerate(coeffs) if i > 0)
residuals.append(dy_dt - ode_func(y, t))
return residuals
# Initial guess for coefficients
initial_coeffs = [y0] + [0] * order
# Solve the system of equations
solution = root(collocation_residuals, initial_coeffs)
if not solution.success:
raise ValueError("Failed to converge to a solution.")
coeffs = solution.x
# Generate solution
t_fine = np.linspace(t0, tf, 100)
y_solution = sum(c * t_fine**i for i, c in enumerate(coeffs))
return t_fine, y_solution, t_points, coeffs
# Example usage
t_span = (0, 2)
y0 = 1
orders = [1, 2, 3, 4, 5] # Different polynomial orders to try
plt.figure(figsize=(12, 8))
for order in orders:
t, y, t_collocation, coeffs = solve_ode_collocation(ode_function, t_span, y0, order)
# Calculate y values at collocation points
y_collocation = sum(c * t_collocation**i for i, c in enumerate(coeffs))
# Plot the results
plt.plot(t, y, label=f'Order {order}')
plt.scatter(t_collocation, y_collocation, s=50, zorder=5)
# Plot the analytical solution
t_analytical = np.linspace(t_span[0], t_span[1], 100)
y_analytical = y0 * np.exp(-t_analytical)
plt.plot(t_analytical, y_analytical, 'k--', label='Analytical')
plt.xlabel('t')
plt.ylabel('y')
plt.title('ODE Solutions: dy/dt = -y, y(0) = 1')
plt.legend()
plt.grid(True)
plt.show()
# Print error for each order
print("Maximum absolute errors:")
for order in orders:
t, y, _, _ = solve_ode_collocation(ode_function, t_span, y0, order)
y_true = y0 * np.exp(-t)
max_error = np.max(np.abs(y - y_true))
print(f"Order {order}: {max_error:.6f}")
Maximum absolute errors:
Order 1: 0.300453
Order 2: 0.074113
Order 3: 0.015265
Order 4: 0.002518
Order 5: 0.000339
6. Trajectory Optimization in Continuous Time#
As studied earlier in the discrete-time setting, we consider three variants of the continuous-time optimal control problem (COCP) with path constraints and bounds:
(Mayer Problem)
(Lagrange Problem)
(Bolza Problem)
In these formulations, the additional constraints are:
Path constraints: \(\mathbf{g}(\mathbf{x}(t), \mathbf{u}(t)) \leq \mathbf{0}\), which represent constraints that must be satisfied at all times along the trajectory.
State bounds: \(\mathbf{x}_{\text{min}} \leq \mathbf{x}(t) \leq \mathbf{x}_{\text{max}}\), which specify the lower and upper bounds on the state variables.
Control bounds: \(\mathbf{u}_{\text{min}} \leq \mathbf{u}(t) \leq \mathbf{u}_{\text{max}}\), which specify the lower and upper bounds on the control inputs.
Furthermore, we may also encounter variations of the above problems under the assumption that the horizon is infinite. For example:
(Infinite-Horizon Trajectory Optimization)
In this formulation, the term \(e^{-\rho t}\) is a discount factor that exponentially decreases the importance of future costs relative to the present. The parameter \( \rho > 0\) is the discount rate. A larger value of \( \rho \) places more emphasis on the immediate cost and diminishes the impact of future costs. In infinite-horizon problems, the integral of the cost function \( \int_{t_0}^{\infty} c(\mathbf{x}(t), \mathbf{u}(t)) \, dt \) could potentially diverge because the cost accumulates over an infinite time period. Introducing the exponential term \( e^{-\rho t} \) guarantees that the integral converges as long as \( c(\mathbf{x}(t), \mathbf{u}(t)) \) grows at a slower rate than \( e^{\rho t} \).
6.1. Transforming Lagrange and Bolza Problems to Mayer#
Recall that, according to the fundamental theorem of calculus, solving an initial value problem (IVP) or using a quadrature method to evaluate an integral are essentially two approaches to the same problem. Similarly, just as we can convert discrete-time Lagrange or Bolza optimal control problems to the Mayer form by keeping a running sum of the cost, we can extend this idea to continuous time. This is done by augmenting the state space with a variable that accumulates the integral cost up to the current time. If our original system is represented by \(\mathbf{f}\), we construct an augmented system \(\tilde{\mathbf{f}}\) as follows:
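\[ \tilde{\mathbf{x}}(t) = \begin{bmatrix} \mathbf{x}(t) \\ z(t) \end{bmatrix} \]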
Here, the newly introduced state variable \(z(t)\) contains the accumulated integral cost over time. The augmented system dynamics, \(\tilde{\mathbf{f}}\), are defined as:
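\[ \tilde{\mathbf{f}}(\tilde{\mathbf{x}}(t), \mathbf{u}(t)) = \begin{bmatrix} \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t)) \\ c(\mathbf{x}(t), \mathbf{u}(t)) \end{bmatrix}, \qquad z(t_0) = 0 \]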
The integral cost is now obtained via \(z(t_f)\), which is added to the terminal cost \(c(\mathbf{x}(t_f))\) to recover the value of the original objective in the Bolza problem.
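Numerically, this reduction amounts to integrating one extra state that accumulates the running cost. Below is a minimal sketch for a hypothetical scalar system; the dynamics f, running cost c, and the open-loop control u_of_t are illustrative stand-ins, not taken from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, u):
    return -x + u              # illustrative scalar dynamics

def c(x, u):
    return x**2 + 0.1 * u**2   # illustrative running cost

def u_of_t(t):
    return np.sin(t)           # an arbitrary open-loop control, just for the demo

def f_tilde(t, x_tilde):
    """Augmented dynamics: the last component integrates the running cost."""
    x, z = x_tilde
    u = u_of_t(t)
    return [f(x, u), c(x, u)]

sol = solve_ivp(f_tilde, (0.0, 5.0), [1.0, 0.0], rtol=1e-8)
x_tf, z_tf = sol.y[:, -1]
print(f"Accumulated integral cost z(t_f) = {z_tf:.4f}")
```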
6.2. Example Problems#
6.2.1. Inverted Pendulum#
The inverted pendulum is a classic problem in control theory and robotics that demonstrates the challenge of stabilizing a dynamic system that is inherently unstable. The objective is to keep a pendulum balanced in the upright position by applying a control force, typically at its base. This setup is analogous to balancing a broomstick on your finger: any deviation from the vertical position will cause the system to tip over unless you actively counteract it with appropriate control actions.
We typically assume that the pendulum is mounted on a cart or movable base, which can move horizontally. The system’s state is then characterized by four variables:
Cart position: \( x(t) \) — the horizontal position of the base.
Cart velocity: \( \dot{x}(t) \) — the speed of the cart.
Pendulum angle: \( \theta(t) \) — the angle between the pendulum and the vertical upright position.
Angular velocity: \( \dot{\theta}(t) \) — the rate at which the pendulum’s angle is changing.
This setup is more complex because the controller must deal with interactions between two different types of motion: linear (the cart) and rotational (the pendulum). This system is said to be “underactuated” because the number of control inputs (one) is less than the number of state variables (four). This makes the problem more challenging and interesting from a control perspective.
We can simplify the problem by assuming that the base of the pendulum is fixed. This is akin to having the bottom of the stick attached to a fixed pivot on a table. You can’t move the base anymore; you can only apply small nudges at the pivot point to keep the stick balanced upright. In this case, you’re only focusing on adjusting the stick’s tilt without worrying about moving the base. This reduces the problem to stabilizing the pendulum’s upright orientation using only the rotational dynamics. The system’s state can now be described by just two variables:
Pendulum angle: \( \theta(t) \) — the angle of the pendulum from the upright vertical position.
Angular velocity: \( \dot{\theta}(t) \) — the rate at which the pendulum’s angle is changing.
The evolution of these two variables is governed by the following ordinary differential equation:
where:
\(m\) is the mass of the pendulum
\(g\) is the acceleration due to gravity
\(l\) is the length of the pendulum
\(\gamma\) is the coefficient of rotational friction
\(J_t = J + ml^2\) is the total moment of inertia, with \(J\) being the pendulum’s moment of inertia about its center of mass
\(u(t)\) is the control force applied at the base
\(y(t) = \theta(t)\) is the measured output (the pendulum’s angle)
We expect that, when no control is applied to the system, a rod started near the upright position will fall over.
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from IPython.display import HTML, display
from matplotlib.animation import FuncAnimation
# System parameters
m = 1.0 # mass of the pendulum (kg)
g = 9.81 # acceleration due to gravity (m/s^2)
l = 1.0 # length of the pendulum (m)
gamma = 0.1 # coefficient of rotational friction
J = 1/3 * m * l**2 # moment of inertia of a rod about its center of mass
J_t = J + m * l**2 # total moment of inertia
# Define the ODE for the inverted pendulum
def pendulum_ode(state, t):
theta, omega = state
dtheta = omega
domega = (m*g*l/J_t) * np.sin(theta) - (gamma/J_t) * omega
return [dtheta, domega]
# Initial conditions: slightly off vertical position
theta0 = 0.1 # initial angle (radians)
omega0 = 0 # initial angular velocity (rad/s)
y0 = [theta0, omega0]
# Time array for integration
t = np.linspace(0, 10, 500) # Reduced number of points
# Solve ODE
solution = odeint(pendulum_ode, y0, t)
# Extract theta and omega from the solution
theta = solution[:, 0]
omega = solution[:, 1]
# Create two separate plots
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
# Plot for angle
ax1.plot(t, theta)
ax1.set_xlabel('Time (s)')
ax1.set_ylabel('Angle (rad)')
ax1.set_title('Pendulum Angle over Time')
ax1.grid(True)
# Plot for angular velocity
ax2.plot(t, omega)
ax2.set_xlabel('Time (s)')
ax2.set_ylabel('Angular velocity (rad/s)')
ax2.set_title('Pendulum Angular Velocity over Time')
ax2.grid(True)
plt.tight_layout()
plt.show()
# Function to create animation frames
def get_pendulum_position(theta):
x = l * np.sin(theta)
y = l * np.cos(theta)
return x, y
# Create animation
fig, ax = plt.subplots(figsize=(8, 8))
line, = ax.plot([], [], 'o-', lw=2)
time_text = ax.text(0.02, 0.95, '', transform=ax.transAxes)
def init():
ax.set_xlim(-1.2*l, 1.2*l)
ax.set_ylim(-1.2*l, 1.2*l) # Adjusted to show full range of motion
ax.set_aspect('equal', adjustable='box')
return line, time_text
def animate(i):
x, y = get_pendulum_position(theta[i])
line.set_data([0, x], [0, y])
time_text.set_text(f'Time: {t[i]:.2f} s')
return line, time_text
anim = FuncAnimation(fig, animate, init_func=init, frames=len(t), interval=40, blit=True)
plt.title('Inverted Pendulum Animation')
ax.grid(True)
# Convert animation to JavaScript
js_anim = anim.to_jshtml()
# Close the figure to prevent it from being displayed
plt.close(fig)
# Display only the JavaScript animation
display(HTML(js_anim))