Multivariable Chain Rule
The chain rule is a fundamental concept in calculus, and it can be extended to multivariable functions.
Definition
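Consider a function $f(x, y)$ of two variables, along with two single-variable functions $x(t)$ and $y(t)$.
Then, consider applying $f$ to the outputs of those functions, which gives the composition $f(x(t), y(t))$.
This can be thought of as a series of transformations:
- Start with a number line for $t$.
- Transform to a plane with $(x(t), y(t))$.
- Transform to a number line with $f(x(t), y(t))$.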
Notice how, although a plane is involved, it still starts with a single number and ends with a single number. Therefore, it is still a single-variable function.
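Next, consider taking the derivative of this composition with respect to $t$: $\frac{d}{dt} f(x(t), y(t))$.
We can start with an example. For instance (these particular functions are an illustrative choice), take:
$$f(x, y) = x^2 y, \quad x(t) = \cos t, \quad y(t) = \sin t$$
Then, consider the derivative of $f(x(t), y(t)) = \cos^2 t \, \sin t$, computed with the ordinary product and chain rules:
$$\frac{d}{dt}\left(\cos^2 t \, \sin t\right) = -2 \cos t \, \sin^2 t + \cos^3 t$$
This is fine, but there's a more general way to think about this.
Consider the partial derivatives of $f$:
$$\frac{\partial f}{\partial x} = 2xy, \quad \frac{\partial f}{\partial y} = x^2$$
Next, consider the derivative of each function $x(t)$ and $y(t)$:
$$\frac{dx}{dt} = -\sin t, \quad \frac{dy}{dt} = \cos t$$
Notice that the derivative of $f(x(t), y(t))$ is exactly the sum of each partial derivative times the matching ordinary derivative, once $x = \cos t$ and $y = \sin t$ are substituted:
$$\frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} = (2 \cos t \, \sin t)(-\sin t) + (\cos^2 t)(\cos t)$$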
This is known as the multivariable chain rule.
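A quick numerical check can make the pattern feel less coincidental. The sketch below assumes the illustrative functions $f(x, y) = x^2 y$, $x(t) = \cos t$, $y(t) = \sin t$ (chosen for this example, with hand-computed derivatives) and compares a finite-difference estimate of the derivative against the chain-rule sum. The helper names are hypothetical.

```python
import math

# Illustrative functions (assumptions for this sketch).
def f(x, y):
    return x**2 * y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def composite(t):
    """f(x(t), y(t)) viewed as an ordinary single-variable function of t."""
    return f(x_of(t), y_of(t))

def central_difference(g, t, h=1e-6):
    """Numerical estimate of g'(t) using a symmetric difference quotient."""
    return (g(t + h) - g(t - h)) / (2 * h)

def chain_rule(t):
    """(df/dx)(dx/dt) + (df/dy)(dy/dt), using hand-computed derivatives."""
    x, y = x_of(t), y_of(t)
    df_dx = 2 * x * y       # partial derivative of x^2 * y with respect to x
    df_dy = x**2            # partial derivative of x^2 * y with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return df_dx * dx_dt + df_dy * dy_dt

t0 = 0.7
print(central_difference(composite, t0), chain_rule(t0))
```

The two printed numbers should agree to many decimal places. Trying other values of `t0` is an easy way to see that the agreement does not depend on the point chosen.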
Intuition for the Multivariable Chain Rule
We've just used one example and noticed a (possibly coincidental) pattern, but this should also make intuitive sense.
First, recall the intuition for the regular chain rule.
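Consider a function $f(x(t))$, the composition of two single-variable functions. As a series of transformations:
- Start with a number line for $t$.
- Transform to a number line with $x(t)$.
- Transform to a number line with $f(x(t))$.
Now, consider a change in $t$:
- The change in $t$ is $dt$.
- The change in $x$ is $dx = \frac{dx}{dt} \, dt$, since the fraction essentially cancels out.
- The change in $f$ is $df = \frac{df}{dx} \, dx$.
Then:
$$df = \frac{df}{dx} \cdot \frac{dx}{dt} \, dt$$
And dividing by $dt$:
$$\frac{df}{dt} = \frac{df}{dx} \cdot \frac{dx}{dt}$$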
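The nudge-by-nudge reasoning above can be traced with actual numbers. This is a minimal sketch, assuming the illustrative functions $x(t) = t^2$ and $f(x) = \sin x$ (not taken from the text) and a small but finite nudge `dt` standing in for the differential.

```python
import math

# Illustrative functions (assumptions for this sketch).
def x_of(t):
    return t**2

def f(x):
    return math.sin(x)

t, dt = 1.3, 1e-6

# Propagate the nudge through each transformation in turn.
dx = (2 * t) * dt                  # dx = (dx/dt) * dt, with dx/dt = 2t
df = math.cos(x_of(t)) * dx        # df = (df/dx) * dx, with df/dx = cos(x)

# Compare with the change in f actually produced by nudging t.
actual_df = f(x_of(t + dt)) - f(x_of(t))

print(df / dt, actual_df / dt)
```

The two ratios differ only by a term proportional to `dt`, which is why the "cancelling" picture becomes exact in the limit.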
Let's extend this intuition to the multivariable case.
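- You start with a number line for $t$. The change in $t$ is $dt$.
- The change in $t$ causes a change in $x$ and $y$. The change in $x$ is $dx = \frac{dx}{dt} \, dt$ and the change in $y$ is $dy = \frac{dy}{dt} \, dt$, due to the cancelling of differentials.
- Both of these changes result in a change in $f$. You could think of this as the sum of a change in $f$ due to a change in $x$ and a change in $f$ due to a change in $y$.
  - The change in $f$ due to a change in $x$ is $\frac{\partial f}{\partial x} \, dx$.
  - The change in $f$ due to a change in $y$ is $\frac{\partial f}{\partial y} \, dy$.
Then:
$$df = \frac{\partial f}{\partial x} \frac{dx}{dt} \, dt + \frac{\partial f}{\partial y} \frac{dy}{dt} \, dt$$
And dividing by $dt$:
$$\frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}$$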
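The same bookkeeping can be followed numerically, keeping the two contributions to the change in $f$ separate until the final sum. The sketch assumes the illustrative functions $f(x, y) = x^2 y$, $x(t) = \cos t$, $y(t) = \sin t$ and a small finite nudge `dt`. These choices are made for this example only.

```python
import math

# Illustrative functions (assumptions for this sketch).
def f(x, y):
    return x**2 * y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

t, dt = 0.7, 1e-6

# Step 1: the nudge in t moves both x and y.
dx = -math.sin(t) * dt          # dx = (dx/dt) * dt
dy = math.cos(t) * dt           # dy = (dy/dt) * dt

# Step 2: each of those moves changes f on its own.
x, y = x_of(t), y_of(t)
df_from_x = (2 * x * y) * dx    # (partial f / partial x) * dx
df_from_y = (x**2) * dy         # (partial f / partial y) * dy

# Step 3: the total change in f is the sum of the two contributions.
df_predicted = df_from_x + df_from_y
df_actual = f(x_of(t + dt), y_of(t + dt)) - f(x, y)

print(df_predicted / dt, df_actual / dt)
```

Printing `df_from_x` and `df_from_y` separately shows how much of the total change flows through each intermediate variable.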
Vector Form of the Multivariable Chain Rule
The multivariable chain rule can be written in vector form.
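We've used $x(t)$ and $y(t)$ as two separate functions, but they can be combined into a single vector-valued function:
$$\vec{v}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}$$
The vector $\vec{v}(t)$ is the point in the plane that $t$ maps to, so the composition can be written as $f(\vec{v}(t))$.
Then, the derivative of $\vec{v}(t)$ is taken component by component:
$$\vec{v}'(t) = \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{bmatrix}$$
Recall the multivariable chain rule:
$$\frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}$$
Notice that this is basically a dot product:
$$\frac{d}{dt} f(\vec{v}(t)) = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} \cdot \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{bmatrix} = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
This should also make intuitive sense, as it is very similar to the regular chain rule:
$$\frac{d}{dt} f(x(t)) = f'(x(t)) \, x'(t)$$
- $\nabla f(\vec{v}(t))$ corresponds to $f'(x(t))$; the gradient is sort of an extension of the full derivative.
- $\vec{v}'(t)$ corresponds to $x'(t)$.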
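In code, the vector form reduces to a single dot product between a gradient and a velocity. The sketch below is one way to write it, assuming the illustrative path $\vec{v}(t) = (\cos t, \sin t)$ and function $f(x, y) = x^2 y$ with a hand-computed gradient. All function names are hypothetical.

```python
import math

# Illustrative path and function (assumptions for this sketch).
def v(t):
    return (math.cos(t), math.sin(t))

def v_prime(t):
    return (-math.sin(t), math.cos(t))

def f(p):
    x, y = p
    return x**2 * y

def grad_f(p):
    """Hand-computed gradient of x^2 * y."""
    x, y = p
    return (2 * x * y, x**2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def chain_rule_vector(t):
    """d/dt f(v(t)) = grad f(v(t)) . v'(t)"""
    return dot(grad_f(v(t)), v_prime(t))

def numeric_derivative(t, h=1e-6):
    """Central-difference estimate of d/dt f(v(t)) for comparison."""
    return (f(v(t + h)) - f(v(t - h))) / (2 * h)

print(chain_rule_vector(0.7), numeric_derivative(0.7))
```

Because `dot` and the tuple-based functions do not hard-code two components, the same structure would carry over to a path through three or more dimensions.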
Duality of the Multivariable Chain Rule and the Directional Derivative
One thing to notice is that the multivariable chain rule looks very similar to the directional derivative.
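Recall the directional derivative of $f$ along a vector $\vec{w}$:
$$\nabla_{\vec{w}} f = \nabla f \cdot \vec{w}$$
And the vector-form multivariable chain rule, where $\vec{v}(t) = (x(t), y(t))$:
$$\frac{d}{dt} f(\vec{v}(t)) = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
The vector-form rule is essentially the directional derivative of $f$ along $\vec{v}'(t)$, evaluated at the point $\vec{v}(t)$.
Consider why this is the case. Recall that the composition of functions can be thought of as a series of transformations:
- Start with a number line for $t$.
- Transform to a plane with $\vec{v}(t)$.
- Transform to a number line with $f(\vec{v}(t))$.
When you increment a value by a vector in the plane, and measure the change in $f$, you are computing exactly what the directional derivative measures.
The vector in question is caused by the change in $t$: nudging $t$ by $dt$ moves the point $\vec{v}(t)$ by $\vec{v}'(t) \, dt$.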
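The correspondence can be checked numerically by estimating both quantities independently. The following sketch assumes the illustrative choices $f(x, y) = x y^2$ and $\vec{v}(t) = (t^2, 3t)$. It uses the convention that the direction vector is not normalized, matching the dot-product form of the directional derivative used here.

```python
# Illustrative function and path (assumptions for this sketch).
def f(p):
    x, y = p
    return x * y**2

def v(t):
    return (t**2, 3 * t)

def v_prime(t):
    return (2 * t, 3)

def directional_derivative(func, p, w, h=1e-5):
    """Central-difference estimate of the rate of change of func at p along w."""
    forward = tuple(pi + h * wi for pi, wi in zip(p, w))
    backward = tuple(pi - h * wi for pi, wi in zip(p, w))
    return (func(forward) - func(backward)) / (2 * h)

def time_derivative(t, h=1e-5):
    """Central-difference estimate of d/dt f(v(t))."""
    return (f(v(t + h)) - f(v(t - h))) / (2 * h)

t0 = 1.0
along_velocity = directional_derivative(f, v(t0), v_prime(t0))
print(along_velocity, time_derivative(t0))
```

For this choice, $f(\vec{v}(t)) = 9t^4$, so both estimates should land close to $36t^3 = 36$ at $t = 1$. Neither function calls the other, so the agreement reflects the chain rule rather than shared code.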
Formalizing the Multivariable Chain Rule
We have shown various ways to intuitively think about the multivariable chain rule, but let's treat it more formally now.
Recall that we used the cancellation of differentials to derive the chain rule. This is not rigorous, but it is still helpful because it very closely resembles the formal treatment.
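Recall the vector form of the multivariable chain rule, where $\vec{v}(t) = (x(t), y(t))$:
$$\frac{d}{dt} f(\vec{v}(t)) = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
And the limit definition of the derivative (since this is a single-variable function):
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f(\vec{v}(t + h)) - f(\vec{v}(t))}{h}$$
Recall our intuition for the chain rule: the change in $t$ causes a change in $\vec{v}$, which in turn causes a change in $f$. So we first need an expression for $\vec{v}(t + h)$, which comes from the definition of $\vec{v}'(t)$:
$$\vec{v}'(t) = \lim_{h \to 0} \frac{\vec{v}(t + h) - \vec{v}(t)}{h}$$
Now we're going to do something that might be unfamiliar, but we're going to rewrite the limit as a sum of two terms:
$$\frac{\vec{v}(t + h) - \vec{v}(t)}{h} = \vec{v}'(t) + \vec{\varepsilon}(h)$$
The $\vec{\varepsilon}(h)$ term is the error between the difference quotient and the true derivative, and it goes to $\vec{0}$ as $h \to 0$.
Multiply both sides by $h$:
$$\vec{v}(t + h) - \vec{v}(t) = h \, \vec{v}'(t) + h \, \vec{\varepsilon}(h)$$
Rewrite this to isolate $\vec{v}(t + h)$:
$$\vec{v}(t + h) = \vec{v}(t) + h \, \vec{v}'(t) + h \, \vec{\varepsilon}(h)$$
Then, rewrite $f(\vec{v}(t + h))$ using this expression:
$$f(\vec{v}(t + h)) = f\left(\vec{v}(t) + h \, \vec{v}'(t) + h \, \vec{\varepsilon}(h)\right)$$
This is based on the definition of the derivative as a slope. We apply the slope $\vec{v}'(t)$ to the step $h$ to get the change in $\vec{v}$, plus an error that shrinks faster than $h$.
Substitute this back into the definition of the derivative of $f(\vec{v}(t))$:
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f\left(\vec{v}(t) + h \, \vec{v}'(t) + h \, \vec{\varepsilon}(h)\right) - f(\vec{v}(t))}{h}$$
Since $\vec{\varepsilon}(h) \to \vec{0}$ as $h \to 0$, the term $h \, \vec{\varepsilon}(h)$ vanishes faster than $h$ and, provided $f$ is differentiable at $\vec{v}(t)$, it does not affect the limit:
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f\left(\vec{v}(t) + h \, \vec{v}'(t)\right) - f(\vec{v}(t))}{h}$$
Finally, recall the limit definition of the directional derivative of $f$ along $\vec{w}$ at a point $\vec{a}$:
$$\nabla_{\vec{w}} f(\vec{a}) = \lim_{h \to 0} \frac{f(\vec{a} + h \, \vec{w}) - f(\vec{a})}{h}$$
Looks familiar? The multivariable chain rule is essentially the directional derivative of $f$ along $\vec{w} = \vec{v}'(t)$ at the point $\vec{a} = \vec{v}(t)$:
$$\frac{d}{dt} f(\vec{v}(t)) = \nabla_{\vec{v}'(t)} f(\vec{v}(t)) = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$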
This also illustrates the power of using vectors, as well as an interplay between intuition and formalization - our entire manipulation was to evaluate the different nudges in a formal way.
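The key step of the formal argument, dropping the error term, can be watched numerically. The sketch below assumes the illustrative choices $f(x, y) = x^2 y$ and $\vec{v}(t) = (\cos t, \sin t)$, and compares the difference quotient built from the true $\vec{v}(t + h)$ with the one built from the straight-line step $\vec{v}(t) + h \, \vec{v}'(t)$. The function names are hypothetical.

```python
import math

# Illustrative function and path (assumptions for this sketch).
def f(p):
    x, y = p
    return x**2 * y

def v(t):
    return (math.cos(t), math.sin(t))

def v_prime(t):
    return (-math.sin(t), math.cos(t))

def quotient_exact(t, h):
    """Difference quotient from the limit definition of d/dt f(v(t))."""
    return (f(v(t + h)) - f(v(t))) / h

def quotient_linear(t, h):
    """Same quotient with v(t + h) replaced by v(t) + h * v'(t) (error term dropped)."""
    step = tuple(vi + h * dvi for vi, dvi in zip(v(t), v_prime(t)))
    return (f(step) - f(v(t))) / h

t0 = 0.7
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    exact = quotient_exact(t0, h)
    linear = quotient_linear(t0, h)
    print(f"h={h:g}  exact={exact:.6f}  linear={linear:.6f}  gap={abs(exact - linear):.2e}")
```

The gap between the two quotients should shrink roughly in proportion to `h`, which is the numerical counterpart of the error term not affecting the limit.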