Multivariable Chain Rule
The chain rule is a fundamental concept in calculus, and it can be extended to multivariable functions.
Definition
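Consider a function $f(x, y)$ of two variables, together with two single-variable functions $x(t)$ and $y(t)$.
Then, consider applying $f$ to their outputs to form the composition $f(x(t), y(t))$.
This can be thought of as a series of transformations:
- Start with a number line for $t$.
- Transform to a plane with $(x(t), y(t))$.
- Transform to a number line with $f(x(t), y(t))$.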
Notice how, although a plane is involved, it still starts with a single number and ends with a single number. Therefore, it is still a single-variable function.
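Next, consider taking the derivative of $f(x(t), y(t))$ with respect to $t$.
We can start with an example. Take, for instance:
$$f(x, y) = x^2 y, \quad x(t) = \cos t, \quad y(t) = \sin t$$
Then, consider the derivative of $f(x(t), y(t)) = \cos^2 t \sin t$, found by substituting first and differentiating directly:
$$\frac{d}{dt}\left[\cos^2 t \sin t\right] = -2\cos t \sin^2 t + \cos^3 t$$
This is fine, but there's a more general way to think about this.
Consider the partial derivatives of $f$:
$$\frac{\partial f}{\partial x} = 2xy, \quad \frac{\partial f}{\partial y} = x^2$$
Next, consider the derivative of each function $x(t)$ and $y(t)$:
$$\frac{dx}{dt} = -\sin t, \quad \frac{dy}{dt} = \cos t$$
Notice that the derivative of $f(x(t), y(t))$ is the sum of each partial derivative times the matching ordinary derivative:
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = (2\cos t \sin t)(-\sin t) + (\cos^2 t)(\cos t) = -2\cos t \sin^2 t + \cos^3 t$$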
This is known as the multivariable chain rule.
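One way to sanity-check the rule is to compare it against a numerical derivative of the composition. The sketch below does this in Python; the particular functions ($f(x, y) = x^2 y$, $x(t) = \cos t$, $y(t) = \sin t$) and the helper names are illustrative assumptions, and any differentiable choices should behave the same way.

```python
import math

# Illustrative choices (assumptions for this sketch):
# f(x, y) = x^2 * y, x(t) = cos t, y(t) = sin t.
def f(x, y):
    return x * x * y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def composed(t):
    # The single-variable function t -> f(x(t), y(t)).
    return f(x_of(t), y_of(t))

def chain_rule(t):
    # df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt), with hand-computed pieces.
    x, y = x_of(t), y_of(t)
    df_dx = 2 * x * y
    df_dy = x * x
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return df_dx * dx_dt + df_dy * dy_dt

def central_difference(g, t, h=1e-6):
    # Numerical derivative of a single-variable function g at t.
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.7
print(chain_rule(t0), central_difference(composed, t0))
```

The two printed numbers should agree to many decimal places, which is what the rule predicts.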
Intuition for the Multivariable Chain Rule
We've just used one example and noticed a (possibly coincidental) pattern, but this should also make intuitive sense.
First, recall the intuition for the regular chain rule.
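Consider a function $f(g(t))$. Again, this is a series of transformations:
- Start with a number line for $t$.
- Transform to a number line with $g(t)$.
- Transform to a number line with $f(g(t))$.
Now, consider a small change in $t$:
- The change in $t$ is $dt$.
- The change in $g$ is $dg = \frac{dg}{dt}\,dt$, since the fraction essentially cancels out.
- The change in $f$ is $df = \frac{df}{dg}\,dg$.
Then:
$$df = \frac{df}{dg}\frac{dg}{dt}\,dt$$
And dividing by $dt$:
$$\frac{df}{dt} = \frac{df}{dg}\frac{dg}{dt}$$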
Let's extend this intuition to the multivariable case.
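- You start with a number line for $t$. The change in $t$ is $dt$.
- The change in $t$ causes a change in $x$ and $y$. The change in $x$ is $dx = \frac{dx}{dt}\,dt$ and the change in $y$ is $dy = \frac{dy}{dt}\,dt$, due to the cancelling of differentials.
- Both of these changes result in a change in $f$. You could think of this as the sum of a change in $f$ due to a change in $x$ and a change in $f$ due to a change in $y$.
  - The change in $f$ due to a change in $x$ is $\frac{\partial f}{\partial x}\,dx$.
  - The change in $f$ due to a change in $y$ is $\frac{\partial f}{\partial y}\,dy$.
- The change in $f$ is therefore $df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$.
Then:
$$df = \frac{\partial f}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial f}{\partial y}\frac{dy}{dt}\,dt$$
And dividing by $dt$:
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$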
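The "sum of two nudges" picture can be tried with small but finite steps. The sketch below is a rough illustration in Python, assuming the example functions $f(x, y) = x^2 y$, $x(t) = \cos t$, $y(t) = \sin t$ (chosen here for illustration; `nudge_report` is a hypothetical helper name). It nudges $t$, measures the resulting $dx$ and $dy$, and compares the predicted change in $f$ with the actual change.

```python
import math

# Illustrative choices (assumptions): f(x, y) = x^2 * y, x(t) = cos t, y(t) = sin t.
def f(x, y):
    return x * x * y

def nudge_report(t, dt):
    x, y = math.cos(t), math.sin(t)
    # Step 1: the nudge dt moves x and y.
    dx = math.cos(t + dt) - x
    dy = math.sin(t + dt) - y
    # Step 2: each of those moves f; add the two contributions.
    df_from_x = (2 * x * y) * dx   # (df/dx) * dx
    df_from_y = (x * x) * dy       # (df/dy) * dy
    predicted = df_from_x + df_from_y
    # Actual change in f, for comparison.
    actual = f(x + dx, y + dy) - f(x, y)
    return predicted, actual

for dt in (1e-1, 1e-2, 1e-3):
    predicted, actual = nudge_report(0.7, dt)
    print(dt, predicted, actual, abs(predicted - actual))
```

The gap between the predicted and actual change should shrink much faster than `dt` itself, which is why it disappears once you divide by $dt$ and take the limit.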
Vector Form of the Multivariable Chain Rule
The multivariable chain rule can be written in vector form.
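We've used $x(t)$ and $y(t)$ as two separate functions, but they can be combined into a single vector-valued function:
$$\vec{v}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}$$
The vector $\vec{v}(t)$ is the point in the plane that gets passed to $f$, so the composition is $f(\vec{v}(t))$.
Then, the derivative of $\vec{v}(t)$ is taken component by component:
$$\vec{v}'(t) = \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{bmatrix}$$
Recall the multivariable chain rule:
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$
Notice that this is basically a dot product:
$$\frac{d}{dt} f(\vec{v}(t)) = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} \cdot \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{bmatrix} = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
This should also make intuitive sense, as it is very similar to the regular chain rule:
$$\frac{d}{dt} f(g(t)) = f'(g(t))\, g'(t)$$
$\nabla f(\vec{v}(t))$ corresponds to $f'(g(t))$; the gradient is sort of an extension of the full derivative. $\vec{v}'(t)$ corresponds to $g'(t)$.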
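The dot-product form translates almost directly into code. The following Python sketch estimates both the gradient and $\vec{v}'(t)$ with central differences and then takes their dot product. The helper names (`gradient`, `velocity`, `chain_rule_dot`) and the example functions are assumptions made for illustration.

```python
import math

def gradient(f, p, h=1e-6):
    # Central-difference estimate of [df/dx, df/dy] at the point p = (x, y).
    x, y = p
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def velocity(v, t, h=1e-6):
    # Central-difference estimate of v'(t) = [dx/dt, dy/dt].
    (x1, y1), (x0, y0) = v(t + h), v(t - h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def chain_rule_dot(f, v, t):
    # d/dt f(v(t)) = grad f(v(t)) . v'(t)
    g = gradient(f, v(t))
    w = velocity(v, t)
    return g[0] * w[0] + g[1] * w[1]

# Illustrative choices (assumptions): f(x, y) = x^2 * y, v(t) = (cos t, sin t).
f = lambda x, y: x * x * y
v = lambda t: (math.cos(t), math.sin(t))

print(chain_rule_dot(f, v, 0.7))
```

Because nothing here depends on the specific $f$ or $\vec{v}$, the same three helpers should work for any smooth two-variable function and plane curve.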
Duality of the Multivariable Chain Rule and the Directional Derivative
One thing to notice is that the multivariable chain rule looks very similar to the directional derivative.
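Recall the directional derivative of $f$ along a vector $\vec{w}$:
$$\nabla_{\vec{w}} f = \nabla f \cdot \vec{w}$$
And the vector-form multivariable chain rule, with $\vec{v}(t) = [x(t), y(t)]$:
$$\frac{d}{dt} f(\vec{v}(t)) = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
The vector-form rule is essentially the directional derivative of $f$ along the vector $\vec{v}'(t)$, evaluated at the point $\vec{v}(t)$.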
Consider why this is the case. Recall that the composition of functions can be thought of as a series of transformations:
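- Start with a number line for $t$.
- Transform to a plane with $\vec{v}(t) = [x(t), y(t)]$.
- Transform to a number line with $f(\vec{v}(t))$.
When you increment a value by a vector in the plane, and measure the change in $f$, you are computing a directional derivative.
The vector in question is caused by the change in $t$: a nudge $dt$ moves the point in the plane by $\vec{v}'(t)\,dt$.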
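This correspondence can be observed numerically by computing two difference quotients: one that follows the curve from $t$ to $t + h$, and one that steps in a straight line from $\vec{v}(t)$ in the direction $\vec{v}'(t)$. The Python sketch below assumes the illustrative choices $f(x, y) = x^2 y$ and $\vec{v}(t) = [\cos t, \sin t]$; the function names are hypothetical.

```python
import math

# Illustrative choices (assumptions): f(x, y) = x^2 * y, v(t) = (cos t, sin t).
def f(x, y):
    return x * x * y

def v(t):
    return (math.cos(t), math.sin(t))

def v_prime(t):
    return (-math.sin(t), math.cos(t))

def along_curve(t, h):
    # Difference quotient of the composition: follow the curve from t to t + h.
    return (f(*v(t + h)) - f(*v(t))) / h

def along_direction(t, h):
    # Difference quotient of the directional derivative:
    # step from v(t) in the straight-line direction v'(t).
    (x, y), (wx, wy) = v(t), v_prime(t)
    return (f(x + h * wx, y + h * wy) - f(x, y)) / h

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, along_curve(0.7, h), along_direction(0.7, h))
```

For large `h` the two columns differ, because the curve bends away from the straight line. As `h` shrinks they should approach the same value.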
Formalizing the Multivariable Chain Rule
We have shown various ways to intuitively think about the multivariable chain rule, but let's treat it more formally now.
Recall that we used the cancellation of differentials to derive the chain rule. This is not rigorous, but it is still helpful because it very closely resembles the formal treatment.
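Recall the vector form of the multivariable chain rule, where $\vec{v}(t) = [x(t), y(t)]$:
$$\frac{d}{dt} f(\vec{v}(t)) = \nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$$
And the limit definition of the derivative (since this is a single-variable function):
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f(\vec{v}(t + h)) - f(\vec{v}(t))}{h}$$
Recall our intuition for the chain rule: the change in $t$ causes a change in $\vec{v}$, which in turn causes a change in $f$. So the first step is to express $\vec{v}(t + h)$ in terms of $\vec{v}(t)$.
Now we're going to do something that might be unfamiliar: take the difference quotient from the limit definition of $\vec{v}'(t)$ and rewrite it as a sum of two terms:
$$\frac{\vec{v}(t + h) - \vec{v}(t)}{h} = \vec{v}'(t) + \vec{\epsilon}(h)$$
The $\vec{\epsilon}(h)$ term is the error between the difference quotient and the true derivative, and it goes to $\vec{0}$ as $h \to 0$.
Multiply both sides by $h$:
$$\vec{v}(t + h) - \vec{v}(t) = h\,\vec{v}'(t) + h\,\vec{\epsilon}(h)$$
Rewrite to isolate $\vec{v}(t + h)$:
$$\vec{v}(t + h) = \vec{v}(t) + h\,\vec{v}'(t) + h\,\vec{\epsilon}(h)$$
Then, rewrite $f(\vec{v}(t + h))$ using this expression:
$$f(\vec{v}(t + h)) = f\left(\vec{v}(t) + h\,\vec{v}'(t) + h\,\vec{\epsilon}(h)\right)$$
This is based on the definition of the derivative as a slope. We apply the slope $\vec{v}'(t)$ to the step $h$ to get the change in $\vec{v}$, plus an error that shrinks faster than $h$.
Substitute this back into the definition of the derivative of $f(\vec{v}(t))$:
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f\left(\vec{v}(t) + h\,\vec{v}'(t) + h\,\vec{\epsilon}(h)\right) - f(\vec{v}(t))}{h}$$
Since $\vec{\epsilon}(h) \to \vec{0}$ as $h \to 0$, and assuming $f$ is differentiable at $\vec{v}(t)$, the $h\,\vec{\epsilon}(h)$ term contributes nothing in the limit:
$$\frac{d}{dt} f(\vec{v}(t)) = \lim_{h \to 0} \frac{f\left(\vec{v}(t) + h\,\vec{v}'(t)\right) - f(\vec{v}(t))}{h}$$
Finally, recall the definition of the directional derivative of $f$ at a point $\vec{p}$ along a vector $\vec{w}$:
$$\nabla_{\vec{w}} f(\vec{p}) = \lim_{h \to 0} \frac{f(\vec{p} + h\,\vec{w}) - f(\vec{p})}{h}$$
Look familiar? The multivariable chain rule is essentially the directional derivative of $f$ at the point $\vec{p} = \vec{v}(t)$ along the vector $\vec{w} = \vec{v}'(t)$, which equals $\nabla f(\vec{v}(t)) \cdot \vec{v}'(t)$.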
This also illustrates the power of using vectors, as well as an interplay between intuition and formalization - our entire manipulation was to evaluate the different nudges in a formal way.