Multivariable Optimization

Optimization is the process of finding the maximum or minimum value of a function. It is central to many fields, including those not directly related to the natural sciences:

In economics, you might model a company's profit as a function of price, quantity, and cost. You would want to find the values of those parameters that maximize the profit.
In engineering, you might have a cylindrical tank with a fixed volume and want to minimize the cost of the material used to build it. You would want to find the dimensions that minimize the tank's surface area, and with it the cost.
In computer science, you might have a function that represents the time complexity of an algorithm. You would want to find the minimum value to optimize the algorithm's performance.
In physics, you might have a function that represents the potential energy of a system. You would want to find the minimum value to find the equilibrium position of the system.

Previously, we learned how to find minima and maxima of functions in single-variable calculus with techniques like the first and second derivative tests. However, in many real-world problems, we often deal with functions of multiple variables. In this section, we will extend these concepts to multivariable functions and learn how to find maxima, minima, and saddle points of these functions.

Table of Contents
Refresher on Single-Variable Optimization
- Second Derivative Test
- Local and Global Extrema
What Extrema Look Like
Critical Points in Multivariable Functions
- Tangent Planes
Inflection Points in Multivariable Functions
Saddle Points
Formal Definition of Extrema
Summary and Next Steps

Refresher on Single-Variable Optimization

When optimizing a function, we typically look for points where the function "flattens" out. In a parabola, that would be the vertex.

Such points are called critical points, stable points, or stationary points. When optimizing, the first step is to find these critical points. How do we do that?

Consider an arbitrary function $f(x)$. We want to find the points where the derivative of the function is zero. These points are critical points of the function:

$$f'(x) = 0$$

However, these points are not always maxima or minima, and additionally, some maxima and minima might not have a derivative of zero.

At some inflection points, such as $x = 0$ for $f(x) = x^3$, the derivative is zero, but the point is neither a maximum nor a minimum.
Checking the derivative requires the function to be differentiable. However, maxima and minima can also occur at points where the function is not differentiable, so the derivative test does not apply there. The absolute value function $f(x) = |x|$ is an example: its minimum at $x = 0$ is a corner where the derivative is undefined.

Hence, in addition to the points where the derivative is zero, we also count the points where the function is not differentiable, that is, where the derivative is undefined.
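As a rough illustration of the "derivative is zero" part, the sketch below approximates $f'$ with a central difference and scans an interval for sign changes of the slope. The example function $f(x) = x^3 - 3x$, the helper names, and the step sizes are assumptions chosen for illustration.

```python
def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def bisect_root(g, lo, hi, tol=1e-10):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            # Same sign as the left end: the root lies in the right half.
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return (lo + hi) / 2


def critical_points(f, a, b, steps=1000):
    """Scan [a, b] for sign changes of f' and refine each one by bisection."""
    df = lambda x: derivative(f, x)
    points = []
    width = (b - a) / steps
    for i in range(steps):
        lo, hi = a + i * width, a + (i + 1) * width
        if (df(lo) < 0) != (df(hi) < 0):
            points.append(bisect_root(df, lo, hi))
    return points


# Assumed example: f(x) = x^3 - 3x, whose derivative 3x^2 - 3 vanishes at x = -1 and x = 1.
f = lambda x: x**3 - 3 * x
print([round(x, 4) for x in critical_points(f, -3, 3)])
```

A sign-change scan like this misses critical points where the slope touches zero without changing sign, such as $x = 0$ for $x^3$, so it is a starting point rather than a complete search.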

Second Derivative Test

To sum up the previous section:

Critical points are points where the derivative is zero or undefined.
They can be maxima, minima, or neither.

To determine whether a critical point is a maximum or minimum, we use the second derivative test. Consider this:

Imagine the maximum point of a function and consider the slope around it: on the left it is positive (the function goes up), and on the right it is negative (the function goes down). In other words, the slope changes from positive to negative, so it is decreasing.
Now, consider the minimum point of a function. The slope changes from negative to positive, so it is increasing.

We can check the sign of the second derivative at a critical point to determine whether the slope is increasing or decreasing:

$f''(x) > 0$: The function is concave up, and the point is a minimum.
$f''(x) < 0$: The function is concave down, and the point is a maximum.

If $f''(x) = 0$, the test is inconclusive.
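The test can be sketched numerically. The snippet below is a minimal sketch that assumes the critical points are already known and approximates $f''$ with a central difference; the function names, the tolerance, and the example $f(x) = x^3 - 3x$ are illustrative choices.

```python
def second_derivative(f, x, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def classify(f, x, tol=1e-3):
    """Apply the second derivative test at a critical point x."""
    d2 = second_derivative(f, x)
    if d2 > tol:
        return "local minimum"   # concave up
    if d2 < -tol:
        return "local maximum"   # concave down
    return "inconclusive"        # second derivative is (numerically) zero


# Assumed example: f(x) = x^3 - 3x has critical points at x = -1 and x = 1.
f = lambda x: x**3 - 3 * x
for x in (-1.0, 1.0):
    print(x, classify(f, x))

# For g(x) = x^3 the critical point x = 0 has g''(0) = 0, so the test says nothing.
g = lambda x: x**3
print(0.0, classify(g, 0.0))
```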

Local and Global Extrema

In single-variable calculus, we classify maxima and minima as follows:

If the point is a peak, it is a local maximum. We use "local" because it's only the maximum in a small neighborhood.
If the point is the highest point the function reaches, it is an absolute maximum or global maximum. It's the maximum over the entire domain of the function.
Similarly, we have local minimum and absolute minimum.
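To make the distinction concrete, here is a small sketch that samples a function on an interval, marks every sample that beats both of its neighbours as a local maximum, and then picks the largest sample as the global maximum on that interval. The function $f(x) = \sin x + 0.1x$ on $[0, 4\pi]$ is an assumed example, and sampling only approximates the true locations.

```python
import math


def local_maxima(values):
    """Indices of samples strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    ]


def f(x):
    # Assumed example with two peaks of different heights on [0, 4*pi].
    return math.sin(x) + 0.1 * x


n = 2000
xs = [4 * math.pi * i / n for i in range(n + 1)]
ys = [f(x) for x in xs]

peaks = local_maxima(ys)
global_index = max(range(len(ys)), key=ys.__getitem__)

print("local maxima near x =", [round(xs[i], 2) for i in peaks])
print("global maximum near x =", round(xs[global_index], 2))
```

Both peaks are local maxima, but only the higher one is the global maximum on this interval.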

What Extrema Look Like

Before discussing the computation of maxima and minima, let's talk about what they look like.

Maxima in a function are points where the function is at its highest value. In a graph, they look like peaks or "hills", where every point within a small neighborhood is lower than the peak.

This also applies to multivariable functions. Consider the following function:

Here's a 3D plot of the function:

Using this surface as a reference, we can see 5 peaks and 4 valleys. These correspond to the local maxima and minima of the function.

The peak in red is the global maximum, meaning it's the highest point that the function can ever reach over its entire domain.

Critical Points in Multivariable Functions

Let's begin by extending the concept of critical points to multivariable functions.

In single-variable calculus, we found critical points by setting the derivative of the function to zero. The extension to multivariable functions is similar. Imagine slicing the graph with any vertical plane that passes through the critical point: the slice is the graph of a single-variable function, and its slope is zero at the critical point. In other words, all directional derivatives of the function are zero there.

Using the critical point at the origin, here's a visualization of the function sliced at a certain angle:

Since all directional derivatives are zero at the critical point, so are the partial derivatives, and the gradient of the function at the critical point is the zero vector:

$$\nabla f = \mathbf{0}$$

Note that the gradient is a vector, and the zero vector is a vector where all components are zero.
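The link between the gradient and the slices can be checked numerically. The sketch below uses an assumed stand-in surface, $f(x, y) = \cos x \cos y$, which has a critical point at the origin; the helper names and step size are illustrative.

```python
import math


def gradient(f, x, y, h=1e-6):
    """Central-difference approximation of the gradient (f_x, f_y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def directional_derivative(f, x, y, angle, h=1e-6):
    """Slope of the slice of f through (x, y) in the direction (cos angle, sin angle)."""
    ux, uy = math.cos(angle), math.sin(angle)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)


# Assumed stand-in surface with a critical point at the origin.
f = lambda x, y: math.cos(x) * math.cos(y)

# At the critical point, the gradient and every sliced slope are (numerically) zero.
print(gradient(f, 0.0, 0.0))
for k in range(8):
    angle = k * math.pi / 4
    print(round(directional_derivative(f, 0.0, 0.0, angle), 8))

# At a non-critical point, the gradient is nonzero and the slope depends on the slice.
print(gradient(f, 0.5, 0.2))
```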

Tangent Planes

If we draw a tangent plane at the critical point, it would be horizontal. This is similar to the tangent line in single-variable calculus, where the tangent line (whose slope equals the derivative) is flat (i.e. its slope is $0$) at the critical point.
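For a function of two variables, this can be read off the standard tangent-plane formula. The point $(a, b)$ and the subscript notation for partial derivatives are notation chosen here for illustration:

```latex
z = f(a, b) + f_x(a, b)\,(x - a) + f_y(a, b)\,(y - b)
\quad\xrightarrow{\;\nabla f(a, b) = \mathbf{0}\;}\quad
z = f(a, b)
```

When both partial derivatives vanish, the $x$ and $y$ terms drop out and the plane has constant height $f(a, b)$.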

Note that in higher dimensions, the tangent isn't really a plane, but something known as a hyperplane. For a function whose graph sits in $\mathbb{R}^n$, the tangent is an $(n-1)$-dimensional hyperplane.

For instance, in $\mathbb{R}^2$ (single-variable calculus), the tangent is a line (1D). In $\mathbb{R}^3$, the tangent is a plane (2D). In $\mathbb{R}^4$, the tangent is a 3D hyperplane, and so on.

Inflection Points in Multivariable Functions

Recall that critical points can occur at inflection points, where the function is flat but not a maximum or minimum. There is a similar concept in multivariable functions.

For example, consider the following function:

This function has infinitely many critical points along the line .

Why?

To show this, we need to find the gradient of the function and set it to zero.

The gradient of the function is:

Setting this to zero, we get a system of equations:

Consider any of the equations. We have multiplied by something squared equals zero. This implies that the something squared is zero, which means that the something is zero; .

Hence, for the gradient to be zero, we must have .

However, once we look at the graph, it becomes clear that these points are not maxima or minima, but inflection points:

In the graph, the function is flat along the line , but it's not a maximum or minimum.
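The function discussed above is not reproduced here, so the sketch below uses a hypothetical stand-in, $f(x, y) = (x - y)^3$, which shows the same kind of behavior: its gradient vanishes along a whole line ($y = x$ for the stand-in), yet none of those points is a maximum or minimum. The stand-in and its line are assumptions, not necessarily the example from the text.

```python
def f(x, y):
    # Hypothetical stand-in, not necessarily the function from the text.
    return (x - y) ** 3


def gradient(x, y):
    """Exact gradient of the stand-in: (3(x - y)^2, -3(x - y)^2)."""
    g = 3 * (x - y) ** 2
    return g, -g


# Every point on the line y = x is a critical point of the stand-in.
for t in (-2.0, 0.0, 1.5):
    print((t, t), all(c == 0 for c in gradient(t, t)))

# Stepping off the line in opposite directions gives values of opposite sign,
# so f(t, t) = 0 is neither the highest nor the lowest value nearby.
eps = 1e-3
for t in (-2.0, 0.0, 1.5):
    above, below = f(t + eps, t), f(t - eps, t)
    print((t, t), above > 0 > below)
```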

Saddle Points

In single-variable calculus, we learned that a critical point can be a maximum, a minimum, or neither. In multivariable calculus, there is another type of critical point: the saddle point. Saddle points do not occur in single-variable functions; they result from the function behaving differently in different directions.

Consider the function:

$$f(x, y) = x^2 - y^2$$

Its gradient at $(0, 0)$ is $\mathbf{0}$:

$$\nabla f(x, y) = (2x, -2y), \qquad \nabla f(0, 0) = (0, 0)$$

So there is a critical point at $(0, 0)$. However, consider slicing the function along the $x$-axis and $y$-axis:

Along the $x$-axis, the function looks like $f(x, 0) = x^2$. This is a parabola that opens upwards, so $x = 0$ is a minimum.
Along the $y$-axis, the function looks like $f(0, y) = -y^2$. This is a parabola that opens downwards, so $y = 0$ is a maximum.

When a function's behavior conflicts in different directions, it results in a saddle point. Saddle points are not maxima or minima, and they are not inflection points.

In the graph, the function has a saddle point at $(0, 0)$.

Saddle points get their name from the shape of a saddle, where the function curves upwards in one direction and downwards in another.
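The conflicting behavior can be checked directly. The sketch below assumes the standard saddle example $f(x, y) = x^2 - y^2$ and compares values near the origin along each axis; the step `eps` is an arbitrary small number.

```python
def f(x, y):
    # Assumed saddle example: rises along x, falls along y.
    return x**2 - y**2


eps = 1e-3
center = f(0.0, 0.0)

# Slice along the x-axis (y = 0): nearby values are higher, like a minimum.
along_x = [f(-eps, 0.0), f(eps, 0.0)]
# Slice along the y-axis (x = 0): nearby values are lower, like a maximum.
along_y = [f(0.0, -eps), f(0.0, eps)]

print(all(v > center for v in along_x))
print(all(v < center for v in along_y))

# Along the diagonal y = x the two effects cancel and the slice is flat.
print(f(eps, eps) == center)
```

Because some nearby points are higher than the center and others are lower, the origin can be neither a local maximum nor a local minimum.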

Formal Definition of Extrema

While we have an intuitive understanding of maxima and minima, we need a formal definition to determine these points in multivariable functions.

Recall one interpretation we had for maxima and minima: the function is at its highest or lowest point in a small neighborhood.

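Let the point $\mathbf{a}$ be a local maximum of a function $f$. We can describe this "neighborhood" as a ball centered at $\mathbf{a}$ with radius $\delta > 0$. Within this ball, the function attains its highest value at $\mathbf{a}$.

Hence, for any $\mathbf{x}$ within $\delta$ of $\mathbf{a}$, $f(\mathbf{x}) \leq f(\mathbf{a})$. In other words:

For a function $f: \mathbb{R}^n \to \mathbb{R}$, the point $\mathbf{a}$ is a local maximum if:

$$\exists\, \delta > 0 \text{ such that } \|\mathbf{x} - \mathbf{a}\| < \delta \implies f(\mathbf{x}) \leq f(\mathbf{a})$$

Similarly, we can define a local minimum:

For a function $f: \mathbb{R}^n \to \mathbb{R}$, the point $\mathbf{a}$ is a local minimum if:

$$\exists\, \delta > 0 \text{ such that } \|\mathbf{x} - \mathbf{a}\| < \delta \implies f(\mathbf{x}) \geq f(\mathbf{a})$$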
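The definition can be probed numerically by sampling points inside the ball. The sketch below is only a spot check under stated assumptions: the helper name, the sample count, and the two test functions are hypothetical, and random sampling can disprove a local maximum but never prove one.

```python
import math
import random


def looks_like_local_max(f, a, delta, samples=10_000, seed=0):
    """Sample points x with ||x - a|| < delta and check f(x) <= f(a).

    A False result disproves a local maximum for this delta; a True result is
    only evidence, since sampling cannot cover every point of the ball.
    """
    rng = random.Random(seed)
    fa = f(*a)
    for _ in range(samples):
        # Random direction (normalised Gaussian vector) and radius in [0, delta).
        direction = [rng.gauss(0.0, 1.0) for _ in a]
        norm = math.sqrt(sum(c * c for c in direction))
        if norm == 0.0:
            continue
        radius = delta * rng.random()
        x = [ai + radius * c / norm for ai, c in zip(a, direction)]
        if f(*x) > fa:
            return False
    return True


# Assumed examples: a downward bowl (peak at the origin) and a saddle.
peak = lambda x, y: -(x**2 + y**2)
saddle = lambda x, y: x**2 - y**2

print(looks_like_local_max(peak, (0.0, 0.0), 0.5))
print(looks_like_local_max(saddle, (0.0, 0.0), 0.5))
```

Flipping the inequality to `f(*x) < fa` gives the corresponding check for a local minimum.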

Summary and Next Steps

In this section, we introduced the concept of optimization in multivariable functions.

Here are the key things to remember:

Critical points are points where the gradient of the function is zero or undefined.
In single-variable calculus, critical points can be maxima, minima, inflection points, or none of these.
In multivariable calculus, we have the new concept of saddle points, which are neither maxima nor minima.
Maxima look like peaks, minima look like valleys, and saddle points look like... saddles.
We can define local extrema formally as points where the function is at its highest or lowest in a small neighborhood around the point.

In the next section, we will continue our discussion on optimization and learn about the second derivative test for multivariable functions.
