Quadratic Optimization and Matrices

In this post, I want to talk about a way to connect critical points of two variable polynomials of degree 2 to the null space of a matrix. A critical point of a function is a place where all partial derivatives of that function are equal to zero. All polynomials like this are of the form f(x,y)=ax^2+bxy+cy^2+d, where a,b,c,d \in \mathbb{R}. As a reminder, the null space of a matrix A is the set \{x \vert Ax = \vec{0}\}.

Let’s look at a few examples.

Example 1: Let’s find the critical points of the polynomial f(x,y)=3x^2+2xy+4y^2. To do this, we will find all partial derivatives of f(x,y). First, we see that \frac{\partial f}{\partial x}=6x+2y and \frac{\partial f}{\partial y}=2x+8y. So, to find the critical points of f(x,y), we need to solve 6x+2y = 0 and 2x+8y = 0.

This is a system of two equations with two unknowns! Let’s rewrite it using matrices. \begin{pmatrix}6&2\\2&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.

This matrix is invertible, since det\begin{pmatrix}6&2\\2&8\end{pmatrix}=6\cdot 8-2\cdot 2=44\neq 0, so the Invertible Matrix Theorem tell us that the only vector that solves this equation is \begin{pmatrix}0\\0\end{pmatrix}.

This means that the only critical point of f(x,y)=3x^2+2xy+4y^2 occurs at the point (0,0).

Now let’s look at another example with a different polynomial.

Example 2: Let’s find the critical points of the polynomial f(x,y)=4x^2+8xy+4y^2. The partial derivatives of f(x,y) are \frac{\partial f}{\partial x}=8x+8y and \frac{\partial f}{\partial y}=8x+8y. This means we want to solve 8x+8y=0 and 8x+8y=0.

That’s just the same equation twice! So, the corresponding matrix equation to examine is \begin{pmatrix}8&8\\8&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}.

This matrix is not invertible since det\begin{pmatrix}8&8\\8&8\end{pmatrix}=8\cdot 8-8\cdot 8=0, which, by the Invertible Matrix Theorem, means that this matrix equation has infinitely many solutions.

That also tells us that the corresponding polynomial has infinitely many critical points! If you graph this polynomial it makes a lot of sense. There’s a flat line along y=-x, which all make f(x,y) attain the same value.

What’s cool about this is that there is a pretty simple way of finding critical points of a two variable quadratic function using linear algebra. And solutions to matrix equations like the ones we made will always either have no solution, one solution, or infinitely many solutions (that’s just a fact about systems of linear equations). Since we’re only every setting the left hand side of the matrix equation equal to a vector of zeros, we will never be in the “no solution” case (i.e. \begin{pmatrix}0\\0\end{pmatrix} will be a valid solution).

This means that polynomials of this form will always either have a single critical point OR infinitely many critical points. (If you had a polynomial with 2 critical points, for example, you would know that it is not a quadratic polynomial.) This is fun to think about because critical points are either local minima, maxima, or saddle points. By just finding the matrix associated with a polynomial like this, you can understand the shape of that polynomial without having to try and visualize it!


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s