Introduction to optimization#

In this chapter, we present a concise overview of mathematical optimization and introduce several strategies for approaching and solving optimization problems.

At its core, mathematical optimization involves finding the best solution to a problem by minimizing (or maximizing) an objective function, subject to a set of constraints. We frequently solve optimization problems in our daily lives, often without realizing it. For instance, when trying to get to class on time, you might seek the shortest path while avoiding traffic and stairways. In aerospace engineering, designers aim to minimize the weight and cost of an aircraft—adjusting factors like wing shape, fuselage size, and engine placement—while ensuring stability, aerodynamic efficiency, and structural integrity. Similarly, in control systems, we strive to select the sequence of control inputs that minimizes fuel consumption, reaches a desired goal, avoids obstacles, and respects the dynamics of the system.

Although optimization problems are often easy to describe in words, translating them into precise mathematical formulations—especially in a way that enables efficient computation—can be challenging.

Note: This course will not explore the full depth of optimization theory or solver algorithms. Our primary focus will be on expressing control synthesis problems as mathematical optimization problems, and in particular, reformulating them in ways (e.g., to make them convex) that permit efficient solution with modern, off-the-shelf solvers and packages such as cvxpy.

Unconstrained Optimization#

Mathematical Formulation#

Let us begin by considering an unconstrained optimization problem. Such a problem involves two principal components:

  • The decision variable \(x \in \mathbb{R}^n\), representing the \(n\) unknowns whose values we seek, and

  • The objective function \(f: \mathbb{R}^n \rightarrow \mathbb{R}\), which assigns a numerical value (often representing cost) to each possible \(x\).

If \(f(x)\) represents the cost associated with a particular \(x\), our aim is to select \(x\) so as to minimize the cost. The problem is typically written as:

\[ \min_x \; f(x), \qquad x^\star = \arg\min_x f(x) \]

The first expression denotes the minimum value of \(f\) over all \(x\), while the second denotes the minimizer: the particular value \(x^\star\) attaining that minimum. By convention, the superscript \(\star\) indicates an optimal value of a variable.
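To make the distinction between the minimum value and the minimizer concrete, here is a minimal sketch using `scipy.optimize.minimize` on a simple quadratic objective (the function and starting point are illustrative choices, not part of the formulation above):

```python
import numpy as np
from scipy.optimize import minimize

# Objective: f(x) = (x1 - 1)^2 + (x2 + 2)^2, minimized at x* = (1, -2)
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

result = minimize(f, x0=np.zeros(2))  # start the search from the origin
print(result.x)    # the minimizer x*, approximately [1, -2]
print(result.fun)  # the minimum value f(x*), approximately 0
```

Here `result.fun` corresponds to \(\min_x f(x)\), while `result.x` corresponds to \(x^\star = \arg\min_x f(x)\).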

Approaches to Solving Unconstrained Optimization Problems#

The difficulty of solving an unconstrained optimization problem depends heavily on the properties of the objective function \(f\).

  • If \(f\) is smooth (differentiable) and has a structure amenable to analysis, techniques like gradient descent can efficiently find stationary points.

  • However, if \(f\) is non-differentiable, highly non-convex, or expensive to evaluate (e.g., each evaluation requires a costly simulation), then specialized derivative-free methods may be required, and obtaining an optimal solution can become substantially more challenging.

Gradient Descent#

One of the most popular and straightforward methods for solving unconstrained optimization problems is gradient descent. The key idea behind gradient descent is to iteratively improve your estimate of the optimal solution by moving in the direction of steepest descent—that is, the direction in which the function decreases most rapidly.

Suppose \(f\) is a smooth (differentiable) function and it is feasible—ideally inexpensive—to compute its gradient, \(\nabla f(x)\). Then, at each iteration, we can update our current guess \(x_k\) by taking a small step \(\alpha\) (the step size, sometimes called the learning rate) in the negative gradient direction:

\[ x_{k+1} = x_k - \alpha \nabla f(x_k) \]

Here, \(x_{k+1}\) becomes the new guess, and the process is repeated until convergence, or a maximum number of steps is reached.
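The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the helper name, the quadratic test function, and the stopping criterion (halting when the step becomes negligible) are all choices made for this example:

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, max_iters=1000, tol=1e-8):
    """Minimize a smooth function via the update x_{k+1} = x_k - alpha * grad_f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        step = alpha * grad_f(x)
        x = x - step
        if np.linalg.norm(step) < tol:  # stop once updates become negligible
            break
    return x

# Example: f(x) = x1^2 + 10 x2^2, with gradient (2 x1, 20 x2); the minimizer is the origin
grad_f = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
x_star = gradient_descent(grad_f, x0=[5.0, 5.0], alpha=0.05)
print(x_star)  # approximately [0, 0]
```

Note how the step size interacts with the curvature of each coordinate: too large a step size can cause divergence, while too small a step size slows convergence.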

Pause and think

What are some practical considerations when performing gradient descent?

Derivative-free methods#

What if you can’t compute gradients, or doing so would be too expensive or difficult? In such cases, there are many alternative methods that do not require gradient information.

A common strategy is to use sampling-based or population-based methods, where you explore the search space by evaluating the objective function at a set of sampled points. Based on these samples, you can iteratively refine your search—focusing more on promising regions of the space.

Popular derivative-free optimization methods include the cross-entropy method, simulated annealing, and genetic algorithms.

Constrained optimization#

Now, let’s consider situations where certain values of \(x\) are inadmissible. For instance, negative values might be physically impossible or undesirable (such as negative weights), or some variables may need to be fixed at specific values. More generally, we can represent such requirements using two constraint functions: \(g:\mathbb{R}^n \rightarrow \mathbb{R}^{m_\leq}\) for inequality constraints and \(h:\mathbb{R}^n \rightarrow \mathbb{R}^{m_=}\) for equality constraints, where we require that \(g(x) \leq 0\) and \(h(x) = 0\). Here, \(m_\leq\) and \(m_=\) denote the number of inequality and equality constraints, respectively.

With these constraints in place, we arrive at the general constrained optimization problem:

\[\begin{split} \min_x \quad & f(x) \\\\ \text{subject to} \quad & g(x) \leq 0 \\\\ & h(x) = 0 \end{split}\]

Solving constrained optimization problems is generally much more challenging than the unconstrained case. Standard gradient descent is typically insufficient, since the algorithm may step into infeasible regions that violate the constraints.

Pause and think

What are some effective strategies for handling constraints in optimization problems?
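One common strategy is projected gradient descent: take an ordinary gradient step, then project the iterate back onto the feasible set. Projection is cheap for simple sets such as boxes. The sketch below assumes box constraints \(\ell \leq x \leq u\); the helper name and test problem are illustrative:

```python
import numpy as np

def projected_gradient_descent(grad_f, x0, lower, upper, alpha=0.1, n_iters=500):
    """Gradient descent with projection onto the box lower <= x <= upper."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - alpha * grad_f(x)     # usual gradient step
        x = np.clip(x, lower, upper)  # project back onto the feasible box
    return x

# Example: minimize f(x) = (x - 3)^2 subject to 0 <= x <= 2.
# The unconstrained minimizer x = 3 is infeasible; the constrained optimum is x = 2.
grad_f = lambda x: 2.0 * (x - 3.0)
x_star = projected_gradient_descent(grad_f, x0=[0.0], lower=0.0, upper=2.0)
print(x_star)  # approximately [2.0]
```

Other widely used strategies include penalty and barrier methods (which fold constraint violations into the objective) and Lagrangian-based methods, which underlie the interior-point solvers mentioned later in this chapter.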

As a final note, throughout this course we will emphasize formulating control problems as optimization problems, ensuring they are structured so that off-the-shelf optimization solvers can be applied directly and efficiently.

Optimization solvers#

There is a wide variety of optimization solvers available, each with its own strengths and interfaces. Some packages are modeling languages—user-friendly frameworks that allow you to define optimization problems in mathematical terms, freeing you from the rigid standard forms required by many solvers and enabling you to easily switch between different solver backends. Other packages are dedicated solvers with their own specific interfaces, while some environments offer optimization functionality through toolbox functions.

In this course, we will primarily use cvxpy because of its intuitive, math-like syntax and ease of use. However, for your project work, you are welcome to use any solver, modeling language, or library that best fits your needs or interests.

Below is a curated list of commonly used optimization solvers and modeling languages, with links and brief descriptions for each. Please note: while this list draws on various sources (including AI-assisted compilation), the teaching team has not comprehensively tested every tool listed.

🔹 Optimization solvers#

Free & open-source solvers#

  • CVXOPT – Convex optimization library in Python, supports linear and quadratic programming.

  • SciPy Optimize – General-purpose optimization module in SciPy (Python).

  • OSQP – Quadratic programming solver designed for fast embedded optimization.

  • GLPK (GNU Linear Programming Kit) – Solves large-scale linear programming (LP) and mixed-integer programming (MILP) problems.

  • Ipopt – Interior-point solver for nonlinear programming (NLP).

  • CBC (Coin-or Branch and Cut) – Open-source MILP solver.

  • Clp (Coin-or Linear Programming) – Open-source LP solver.

  • CasADi – Symbolic framework for nonlinear optimization and optimal control.

  • GEKKO – Python package for solving LP, NLP, and MILP problems.


Commercial solvers (often more powerful; some offer free academic licenses)#

  • Gurobi – Industry-leading solver for LP, QP, and MILP. Free academic licenses available.

  • CPLEX (IBM) – Powerful solver for LP, MILP, and NLP. Academic licenses available.

  • MOSEK – Optimized for large-scale conic and linear problems.

  • Knitro – High-performance nonlinear optimization solver.

  • MATLAB Optimization Toolbox – Various solvers for LP, QP, and NLP problems.


🔹 Optimization modeling languages#

  • CVXPY – Python-based modeling framework for convex optimization.

  • PuLP – Python library for linear programming modeling.

  • Pyomo – Python-based modeling for LP, NLP, and MILP.

  • JuMP – High-performance optimization modeling in Julia.

  • AMPL – A powerful modeling language for mathematical programming.

  • GAMS – General Algebraic Modeling System for LP, NLP, and MILP.

  • YALMIP – Optimization modeling toolbox for MATLAB.

  • CVX (MATLAB) – Convex optimization modeling in MATLAB.

🔹 Optimization functions within toolboxes#

Popular programming environments such as MATLAB, Python (SciPy), and Julia also provide optimization routines through their standard toolboxes; a selection is listed below.


🔹 Optimization functions in MATLAB (from Optimization Toolbox and other toolboxes)#

  • fmincon – Solves constrained nonlinear optimization problems.

  • linprog – Linear programming solver for minimizing linear objectives subject to linear constraints.

  • quadprog – Solves quadratic programming (QP) problems.

  • fminunc – Unconstrained nonlinear optimization.

  • lsqnonlin – Solves nonlinear least-squares problems.

  • ga – Genetic algorithm solver for global optimization.

  • particleswarm – Particle swarm optimization (PSO) for global optimization.

  • simulannealbnd – Simulated annealing for constrained or unconstrained problems.


🔹 Optimization functions in SciPy (Python) (from scipy.optimize module)#

  • minimize – General-purpose minimization of scalar functions of several variables, supporting many methods (e.g., BFGS, Nelder-Mead, SLSQP).

  • minimize_scalar – Minimization of a scalar function of one variable.

  • linprog – Linear programming solver.

  • least_squares – Solves nonlinear least-squares problems, with optional bounds on the variables.

  • root – Finds roots of nonlinear systems of equations.

  • differential_evolution – Global optimization via differential evolution.


🔹 Optimization functions in Julia (from JuMP and Optim.jl packages)#

  • optimize (Optim.jl) – Unified interface to gradient-based and derivative-free minimization methods (e.g., BFGS, Nelder-Mead).

  • Model, @variable, @objective, @constraint, optimize! (JuMP) – Building blocks for constructing and solving optimization models with pluggable solver backends.