Homework 1#

Due: April 24th 11:59PM (Submit via Canvas)

In addition to answering the written questions below, please complete the following Jupyter notebook exercises:

  • 01a_dynamics.ipynb

  • 01b_dynamics+JAX.ipynb

  • 02_unconstrained_optimization.ipynb

  • 03a_constrained_optimization_logbarrier.ipynb

  • 03b_constraint_optimization_cvxpy.ipynb

  • 04_control_barrier_function.ipynb

Dynamics (Discrete time, linearization)#

1. Derive and implement discrete-time dynamics given continuous-time dynamics.#

Consider the following continuous-time dynamically-extended simple unicycle model. This is a common model for mobile robots with differential drive.

\[\begin{split} \dot{\mathbf{x}} = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} v\cos\theta \\ v\sin\theta \\ \omega \\ a \end{bmatrix}, \quad u=(\omega, a) \end{split}\]

(a)#

Write down the discrete-time dynamics \(\mathbf{x}_{t+1} = f(\mathbf{x}_t, \mathbf{u}_t)\) using Euler integration with step size \(\Delta t\).

(b)#

Assuming a zero-order hold over each time step of size \(\Delta t\), write down the corresponding discrete-time dynamics \(\mathbf{x}_{k+1} = f_\text{d}(\mathbf{x}_k, \mathbf{u}_k)\) by performing the integration analytically, where \(f_\text{d}(\mathbf{x}_k, \mathbf{u}_k) = \mathbf{x}_k + \int_{t_k}^{t_k+\Delta t}f_\text{c}(\mathbf{x}(\tau), \mathbf{u}_k)d\tau\), \(f_\text{c}\) denotes the continuous-time dynamics above, \(\mathbf{x}_k = \mathbf{x}(t_k)\), and \(\mathbf{x}_{k+1} = \mathbf{x}(t_{k+1}) = \mathbf{x}(t_k + \Delta t)\). You may verify your results with your favorite software (e.g., Mathematica or Wolfram Alpha), but you should outline the intermediate steps.

Note: In your analytic discrete-time dynamics, you will encounter \(\omega\) in the denominator of some expressions. Although the analytic solution is well-behaved in the limit \(\omega \to 0\), your code must handle cases where \(\omega\) is equal to or very near zero to avoid numerical errors or division by zero. (Tip: Instead of using an if statement, you can use jax.numpy.where for better support of batched inputs.)

(c)#

Analytically differentiate your analytic expression from (b) to obtain the state and control Jacobians of the discrete-time dynamics.

(d)#

Generate 100 random state and control pairs and verify that your analytic expressions for the state and control Jacobians match the values you get from jax.jacobian.
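A check of this form might look like the sketch below. For brevity it uses the simple Euler step from part (a) in place of your exact zero-order-hold expression; the function names and step size are illustrative.

```python
import jax
import jax.numpy as jnp

def f_euler(x, u, dt=0.1):
    # Euler step of the dynamically-extended unicycle from part (a)
    _, _, th, v = x
    om, a = u
    return x + dt * jnp.array([v * jnp.cos(th), v * jnp.sin(th), om, a])

def jacobians_analytic(x, u, dt=0.1):
    # Hand-derived state and control Jacobians of the Euler step
    _, _, th, v = x
    A = jnp.eye(4) + dt * jnp.array([
        [0.0, 0.0, -v * jnp.sin(th), jnp.cos(th)],
        [0.0, 0.0,  v * jnp.cos(th), jnp.sin(th)],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ])
    B = dt * jnp.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    return A, B

# 100 random state/control pairs, batched with vmap
xs = jax.random.normal(jax.random.PRNGKey(0), (100, 4))
us = jax.random.normal(jax.random.PRNGKey(1), (100, 2))
A_auto = jax.vmap(jax.jacobian(f_euler, argnums=0))(xs, us)
B_auto = jax.vmap(jax.jacobian(f_euler, argnums=1))(xs, us)
A_hand, B_hand = jax.vmap(jacobians_analytic)(xs, us)
```

For the homework you would substitute your analytic zero-order-hold dynamics and its Jacobians, then compare with `jnp.allclose`.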

2. Study the classical cart-pole system and understand the limits of linearization.#

(This problem is adapted from the Underactuated Robotics notes by Russ Tedrake at MIT.) The cart-pole system is a classic example of a nonlinear system \(\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u})\). Using the same convention as the textbook, we describe the state of the cart-pole by the vector \(\mathbf{x}=[x,\theta, \dot{x}, \dot{\theta}]^T\) of dimension \(n=4\), and let the force on the cart be the control input \(\mathbf{u}=[f_x]\) of dimension \(m=1\). For the sake of simplicity, let \(m_\mathrm{c}=m_\mathrm{p}=\ell=1\) and \(g=9.81\).

(a)#

Let \(\bar{\mathbf{x}} = \mathbf{x} - \mathbf{x}^*\), \(\bar{\mathbf{u}}=\mathbf{u} - \mathbf{u}^*\) denote the error in state and control relative to the equilibrium \((\mathbf{x}^*, \mathbf{u}^*)\). Show that the linearized error dynamics is of the form

\[ \dot{\bar{\mathbf{x}}} \approx \mathbf{f}_\mathrm{lin}(\bar{\mathbf{x}},\bar{\mathbf{u}}) = A_\mathrm{lin}\bar{\mathbf{x}} + B_\mathrm{lin}\bar{\mathbf{u}}\]

where

\[ A_\mathrm{lin}= \left[\nabla_\mathbf{x}f_1(\mathbf{x}^*,\mathbf{u}^*), \nabla_\mathbf{x}f_2(\mathbf{x}^*,\mathbf{u}^*),...,\nabla_\mathbf{x}f_n(\mathbf{x}^*,\mathbf{u}^*)\right]^T, \]
\[ B_\mathrm{lin} = \left[\nabla_\mathbf{u}f_1(\mathbf{x}^*,\mathbf{u}^*), \nabla_\mathbf{u}f_2(\mathbf{x}^*,\mathbf{u}^*),...,\nabla_\mathbf{u}f_n(\mathbf{x}^*,\mathbf{u}^*)\right]^T. \]

Hint: Since \((\mathbf{x}^*, \mathbf{u}^*)\) is an equilibrium point, \(\mathbf{f}(\mathbf{x}^*, \mathbf{u}^*)=0\) and \(\dot{\mathbf{x}}^*=0\).

(b)#

The linearization error is given by \(e(\mathbf{x}, \mathbf{u},\mathbf{x}^*, \mathbf{u}^*) = \| \mathbf{f}(\mathbf{x}, \mathbf{u}) - \mathbf{f}_\mathrm{lin}(\bar{\mathbf{x}}, \bar{\mathbf{u}})\|_2\). Compute the linearization error for the following state and control pairs, and include the numbers in your write-up.

  • \(\mathbf{x}=[0, 0.99\pi, 0, 0]^T\) and \(\mathbf{u}=[0]\)

  • \(\mathbf{x}=[0, 0.9\pi, 0, 0]^T\) and \(\mathbf{u}=[-10]\)

  • \(\mathbf{x}=[0, 0.85\pi, 0, 0]^T\) and \(\mathbf{u}=[0]\)

  • \(\mathbf{x}=[0, 0.5\pi, 0, 0]^T\) and \(\mathbf{u}=[0]\)

  • \(\mathbf{x}=[0, 0, 0, 0]^T\) and \(\mathbf{u}=[0]\)

  • \(\mathbf{x}=[1, \pi, 0, 0]^T\) and \(\mathbf{u}=[10]\)
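One way to organize this computation is sketched below. The cart-pole equations of motion are written in the form used in the Underactuated Robotics notes (with \(\theta = \pi\) upright and \(m_\mathrm{c}=m_\mathrm{p}=\ell=1\)); verify that this matches your textbook's convention before relying on the numbers.

```python
import jax
import jax.numpy as jnp

def cartpole(x, u):
    # Cart-pole dynamics with m_c = m_p = l = 1, g = 9.81
    # (form follows the Underactuated Robotics notes; theta = pi is upright)
    _, th, xd, thd = x
    f_x = u[0]
    s, c = jnp.sin(th), jnp.cos(th)
    denom = 1.0 + s**2                                  # m_c + m_p sin^2(th)
    xdd = (f_x + s * (thd**2 + 9.81 * c)) / denom
    thdd = (-f_x * c - thd**2 * c * s - 2.0 * 9.81 * s) / denom
    return jnp.array([xd, thd, xdd, thdd])

# Linearize about the upright equilibrium
x_star = jnp.array([0.0, jnp.pi, 0.0, 0.0])
u_star = jnp.array([0.0])
A = jax.jacobian(cartpole, argnums=0)(x_star, u_star)
B = jax.jacobian(cartpole, argnums=1)(x_star, u_star)

def lin_error(x, u):
    # e = || f(x, u) - (A xbar + B ubar) ||_2
    f_lin = A @ (x - x_star) + B @ (u - u_star)
    return jnp.linalg.norm(cartpole(x, u) - f_lin)
```

Evaluating `lin_error` at each of the listed state/control pairs gives the numbers requested above.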

(c)#

Provide a brief answer to the following questions.

  • Is our linear approximation valid for all the points we tested?

  • Do we expect a linear controller to do a decent job when \(\theta=\frac{\pi}{2}\)?

  • When \(\theta\) is different from \(\pi\), does the linearization error depend on \(\mathbf{u}\)?

  • Why is the error in the second case bigger than in the third, even though the second \(\theta\) is closer to \(\pi\)?

  • What about the position \(x\) of the cart? Should it affect the linearization error? Why or why not?

Optimization#

1. Cross entropy method.#

Study the cross-entropy method (CEM) as presented in the Algorithms for Optimization textbook, focusing especially on page 133. Your objective is to implement the core CEM algorithm for use in global optimization, specifically targeting the Branin function described below.

Your implementation should include a function named cross_entropy_method(...). At a minimum, this function must:

  • Accept the objective function to optimize,

  • Accept any required arguments for initializing the CEM algorithm,

  • Accept algorithm-specific hyperparameters,

  • Return the best solution found by the algorithm.

Write your code with clear, comprehensive comments that demonstrate a solid understanding of each algorithmic step.

To demonstrate both the correctness and generality of your implementation, apply your cross_entropy_method to at least one additional objective function beyond the Branin function.

For deeper analysis, visualize CEM’s progress and sample distributions over iterations using plots or animations.

Finally, create several GIFs that show how samples evolve and converge over time under differing hyperparameter settings and objective functions. Comment on the effectiveness of CEM for locating local versus global minima, and discuss which factors (such as population size, number of elite samples, initialization, etc.) should be considered when selecting hyperparameters for different optimization scenarios.

Branin function#

The Branin function is a popular benchmark function for optimization algorithms. It is defined as:

\[ f(x, y) = a(y - bx^2 + cx - r)^2 + s(1 - t)\cos(x) + s \]

where the typical parameter values are:

  • \( a = 1 \)

  • \( b = \frac{5.1}{4\pi^2} \)

  • \( c = \frac{5}{\pi} \)

  • \( r = 6 \)

  • \( s = 10 \)

  • \( t = \frac{1}{8\pi} \)

The function is usually evaluated on the domain \( x \in [-5, 10] \) and \( y \in [0, 15] \).

The Branin function has three global minima, making it useful for testing the performance of global optimization algorithms. Its landscape is characterized by multiple valleys and ridges, which can challenge optimization methods.
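A minimal sketch of the required structure is shown below, using a diagonal-Gaussian sampling distribution refit to the elite samples each iteration. The hyperparameter names and default values are illustrative, and a full submission would also need the plotting and GIF generation described above.

```python
import numpy as np

def branin(p):
    # Branin function with the standard parameters listed above
    x, y = p
    a, b, c = 1.0, 5.1 / (4 * np.pi**2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (y - b * x**2 + c * x - r)**2 + s * (1 - t) * np.cos(x) + s

def cross_entropy_method(f, mu0, sigma0, n_samples=100, n_elite=10,
                         n_iters=50, rng=None):
    """Minimize f by repeatedly sampling a Gaussian, keeping the best
    (elite) samples, and refitting the Gaussian to them."""
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu0, dtype=float)
    sigma = np.asarray(sigma0, dtype=float)
    best_x, best_f = mu.copy(), f(mu)
    for _ in range(n_iters):
        # 1. Sample a population from the current distribution
        samples = rng.normal(mu, sigma, size=(n_samples, mu.size))
        values = np.array([f(s) for s in samples])
        # 2. Select the elite subset with the lowest objective values
        elite = samples[np.argsort(values)[:n_elite]]
        # 3. Refit the sampling distribution to the elites
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-8
        # Track the best sample seen so far
        if values.min() < best_f:
            best_f = values.min()
            best_x = samples[np.argmin(values)]
    return best_x, best_f

best_x, best_f = cross_entropy_method(branin, mu0=[0.0, 5.0],
                                      sigma0=[5.0, 5.0], rng=0)
```

Note the design choice of a diagonal covariance: the full-covariance variant in the textbook refits a complete Gaussian to the elites and can follow correlated valleys better.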

2. Gradient descent.#

Given the different functions studied in notebook 02, discuss how the choice of initial conditions and step size affects the convergence and the final solution when using (vanilla) gradient descent. Provide some plots to support your discussion.

3. (Optional) Gradient descent algorithms.#

Read Chapters 5 and 6 of the Algorithms for Optimization textbook. In your own words, summarize how the various first-order and second-order variants of gradient descent differ from basic gradient descent, and explain in what ways they may offer improved performance. Consider aspects such as convergence speed, ability to escape local minima, sensitivity to step size, and robustness to the shape of the objective landscape. You don’t need to write an essay here; rather, demonstrate an understanding that more sophisticated gradient descent methods exist, the intuition behind them, and their strengths and weaknesses.

Control Certificates#

1. Class-K function#

Consider a class-\(\mathcal{K}\) function of the form \(\alpha(z) = a z\). Explain how adjusting the parameter \(a\) impacts both the system’s response and the level of conservatism inherent in a Control Barrier Function (CBF) safety filter. Specifically, discuss the trade-offs associated with choosing higher versus lower values of \(a\) in terms of safety margin, responsiveness, and potential limitations for practical implementation.
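A toy 1D example can help ground your discussion. The sketch below (everything here is an illustrative assumption, not part of the demo code) uses \(\dot{x} = u\) with safe set \(b(x) = x \ge 0\); for this scalar case the CBF-QP filter reduces to the closed form \(u = \max(u_\text{des}, -a x)\).

```python
import numpy as np

def filtered_rollout(a, x0=2.0, u_des=-1.0, dt=0.01, steps=100):
    # 1D toy system: x' = u, barrier b(x) = x, alpha(z) = a * z.
    # CBF condition: u >= -a * x, so the QP solution is max(u_des, -a * x).
    xs = [x0]
    for _ in range(steps):
        u = max(u_des, -a * xs[-1])
        xs.append(xs[-1] + dt * u)
    return np.array(xs)

slow = filtered_rollout(a=0.5)  # conservative: intervenes far from boundary
fast = filtered_rollout(a=5.0)  # aggressive: tracks u_des until near x = 0
```

Both rollouts remain safe, but the small-\(a\) filter starts overriding \(u_\text{des}\) much earlier and keeps the state farther from the boundary, which is exactly the conservatism trade-off the question asks about.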

2. Class-K function and control actuation limits.#

When implementing a CBF safety filter using a class-\(\mathcal{K}\) function \(\alpha(z) = a z\), how should actuator constraints—such as control saturation or rate limits—influence your selection of the parameter \(a\)? Address how such constraints can limit the allowable range of \(a\), and describe the potential consequences of choosing an inappropriate value in the presence of these practical limitations.

3. High-order CBF.#

The concept of relative degree is important when analyzing functions with respect to system dynamics. Here, we provide a more precise definition, specifying it for each control input component.

Consider a (sufficiently) differentiable function \(b : \mathbb{R}^n \rightarrow \mathbb{R}\) and a control-affine dynamical system given by \(\dot{z} = f(z) + g(z)u\), where \(z \in \mathbb{R}^n\) and \(u \in \mathbb{R}^m\). The relative degree \(m_\mathrm{r}\) of \(b\) with respect to the \(j\)-th control input is defined as the smallest integer such that

\[ \left[\mathcal{L}_{g}\mathcal{L}_{f}^{m_\mathrm{r}-1}b(\cdot)\right]_j \neq 0, \]

but

\[ \left[\mathcal{L}_{g}\mathcal{L}_{f}^{m_\mathrm{r}-2}b(\cdot)\right]_j = 0. \]

Here, \(\mathcal{L}_{v}h(z) = \nabla h(z)^\top v(z)\), \(\mathcal{L}_v^k b(\cdot)\) denotes the \(k\)-th Lie derivative (with \(\mathcal{L}_v^0 b(\cdot) = b(\cdot)\)), and \(\left[x\right]_j\) refers to the \(j\)-th component of the vector \(x\).

Now, consider the continuous-time, dynamically-extended unicycle model, which is a control-affine system, along with a candidate control barrier function (CBF) \(b(z) = x^2 + y^2 - r^2\), where \(r\) is the radius of an obstacle:

\[\begin{split} \dot{z} = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} v\cos\theta \\ v\sin\theta \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \omega \\ a \end{bmatrix} , \qquad b(z) = x^2 + y^2 - r^2 \end{split}\]

(a)#

Demonstrate that the relative degree of \(b\) with respect to the dynamically-extended unicycle dynamics is \(2\) for both control inputs.
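You can sanity-check your hand derivation numerically with autodiff. The sketch below builds Lie derivatives from the definitions above; the obstacle radius and test state are arbitrary choices for illustration.

```python
import jax
import jax.numpy as jnp

def f(z):
    # Drift term of the dynamically-extended unicycle
    x, y, th, v = z
    return jnp.array([v * jnp.cos(th), v * jnp.sin(th), 0.0, 0.0])

def g(z):
    # Control matrix (columns correspond to omega and a)
    return jnp.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

def b(z, r=1.0):
    # Candidate CBF: squared distance to origin minus obstacle radius squared
    return z[0]**2 + z[1]**2 - r**2

def lie(h, v):
    # L_v h(z) = grad h(z)^T v(z); also works when v(z) is a matrix
    return lambda z: jax.grad(h)(z) @ v(z)

z0 = jnp.array([2.0, 1.0, 0.3, 1.5])   # arbitrary non-degenerate test state
Lg_b = lie(b, g)(z0)                    # should be [0, 0] for both inputs
LgLf_b = lie(lie(b, f), g)(z0)          # generically nonzero in both entries
```

Seeing \(\mathcal{L}_g b = 0\) but \(\left[\mathcal{L}_g \mathcal{L}_f b\right]_j \neq 0\) at a generic state is consistent with relative degree \(2\) for both inputs, though the proof itself must be done symbolically.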

(b)#

When the relative degree of a barrier function exceeds \(1\), we must use high order control barrier functions (HOCBFs)—analogous to high order control Lyapunov functions—to ensure safety. For a system with relative degree \(2\) and control-affine dynamics \(\dot{z} = f(z) + g(z) u\), the HOCBF constraint takes the following form:

\[ \text{HOCBF:} \qquad \mathcal{L}_{f}^{2}b(z) + \mathcal{L}_{g}\mathcal{L}_{f} b(z)\, u + \mathcal{L}_{f}(\alpha_1 \circ b)(z) + \alpha_2\big( \mathcal{L}_{f} b(z) + \alpha_1(b(z)) \big) \geq 0 \]

where \(\mathcal{L}_{f}^{2} h(z) = \mathcal{L}_{f}(\mathcal{L}_{f} h)(z)\), \((\alpha_1 \circ h)(z) = \alpha_1(h(z))\), and \(\alpha_1\), \(\alpha_2\) are class \(\mathcal{K}_\infty\) functions.

Given that the CBF for this system has relative degree \(2\), can we directly implement it in a standard CBF-QP (Quadratic Program) safety filter? Discuss why or why not.

4. Closer look at CBF-CLF-QP Filter demo.#

In this problem, you will take a closer look at the cbf-clf-filter demo. Make sure you first familiarize yourself with the code because you will need to change some of the parameter values to complete this problem.

(a)#

Set the initial state to be [-7, 0.0, 0.0, 2.0] and the desired control to be [0.0, 0.0]. Run the simulation. You will find that the CBF-CLF-QP filter is unsuccessful in keeping the system away from the obstacle. Explain why this behavior occurs.

(b)#

Set cbf_alpha=0.5 and dt=0.5. Run the simulation. You will find that the CBF-CLF-QP filter is unsuccessful in keeping the system away from the obstacle. Explain why this behavior occurs.