In this section, we introduce the derivative as a measure of the instantaneous rate of change of a function.
The central idea of differential calculus is a measure of change, and specifically, how fast a function is changing at a particular point.
Imagine you throw a ball up into the air.
The ball's height above the ground, $h$, can be modelled by a function of time, $h(t)$.
The graph above shows the height of the ball as a function of time.
Note that the left-right axis does not represent left-right motion; it represents time - rightwards is the future, and leftwards is the past.
Now imagine you want to know how fast the ball is moving.
One way to do this is to calculate the average rate of change of the height function over a small interval of time.
This means we take the difference in height between two points on the graph, $\Delta h$, and divide it by the difference in time, $\Delta t$.
To calculate these differences, denote the starting time as $t$ and the ending time as $t + \Delta t$.
Then, we can calculate the difference in height as $h(t + \Delta t) - h(t)$, and the difference in time as simply $\Delta t$.
The average rate of change is then given by:
$$\frac{\Delta h}{\Delta t} = \frac{h(t + \Delta t) - h(t)}{\Delta t}$$
The expression on the right is sometimes called the difference quotient.
We can model this as the slope of a line connecting the two points on the graph.
This line is called the secant line:
There are two important points to note about the average rate of change:
The rate of change isn't just the slope; the slope is a representation of the rate of change.
When we study higher-level mathematics, such as multivariable calculus, we will see that the rate of change can be more complex than just measuring slopes.
The way we can think of it is that the answer to "how fast is the ball moving between $t$ and $t + \Delta t$" is "it's equal to the slope of the secant line between $t$ and $t + \Delta t$".
It's just a way to calculate the rate of change.
Sometimes we hear the word "gradient" used instead of "slope".
In calculus, however, "gradient" is reserved for a different concept that arises in multivariable calculus.
Hence, it's better to stick with "slope" when talking about the rate of change of a function to avoid confusion.
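The average rate of change described above can be sketched numerically. This is only an illustration: the height model `height` below is a hypothetical one (an assumption, since the text does not give a formula for the ball), and `average_rate_of_change` is simply a name chosen here for the difference quotient.

```python
def height(t):
    """Hypothetical height model (an assumption): h(t) = 20t - 4.9t^2, in metres."""
    return 20.0 * t - 4.9 * t**2


def average_rate_of_change(f, t, dt):
    """Difference quotient (f(t + dt) - f(t)) / dt, i.e. the slope of the secant line."""
    return (f(t + dt) - f(t)) / dt


# Average rate of change of the height between t = 1 s and t = 1.5 s.
print(average_rate_of_change(height, 1.0, 0.5))
```

With this assumed model, the value printed is about 7.75 metres per second, which is the slope of the secant line through the two chosen points.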
The average rate of change gives us a good idea of how fast something is changing over a period of time.
However, what if we want to know how fast something is changing at a single instant in time?
Before even considering how to calculate this, let's first think about what it means.
The question "how fast is something changing at a single instant" is a bit of a paradox.
After all, how can something change at a single instant?
It's like asking how fast a car is moving at a single point in time - it doesn't make sense.
Speedometers in cars don't measure speed at a single instant; they measure speed over a very short period of time.
Think of it like this: the speedometer measures the position now, as well as the position $\Delta t$ seconds ago.
Then, it calculates the average speed over that time period.
Let's take the same approach with the height of the ball: make $\Delta t$ very small.
Go back to the graph of the ball's height above the ground, and drag the second point closer to the first point.
As the two points get closer together, the secant line becomes more and more like a tangent line, which touches the curve at a single point.
In order to calculate the instantaneous rate of change, we need to make $\Delta t$ as small as possible.
Of course, we can't make it exactly zero, because then we'd be dividing by zero.
This should immediately ring a bell: limits.
Although we cannot set $\Delta t$ to $0$, we can take the limit as $\Delta t$ approaches $0$:
$$\lim_{\Delta t \to 0} \frac{h(t + \Delta t) - h(t)}{\Delta t}$$
This limit of the difference quotient gives us the instantaneous rate of change of the function at a particular point.
It's also called the derivative of the function at that point.
The key point is that "instantaneous" really just means taking $\Delta t$ very, very close to $0$.
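To see the limit at work, here is a small numerical sketch that shrinks the time step and watches the difference quotient settle down. The height model is again a hypothetical one (an assumption, not taken from the text), and the step sizes are arbitrary choices.

```python
def height(t):
    """Hypothetical height model (an assumption): h(t) = 20t - 4.9t^2, in metres."""
    return 20.0 * t - 4.9 * t**2


def difference_quotient(f, t, dt):
    """Average rate of change of f over the interval from t to t + dt."""
    return (f(t + dt) - f(t)) / dt


def shrinking_quotients(f, t, steps=(1.0, 0.1, 0.01, 0.001, 0.0001)):
    """Evaluate the difference quotient at t for progressively smaller steps."""
    return [(dt, difference_quotient(f, t, dt)) for dt in steps]


for dt, quotient in shrinking_quotients(height, 1.0):
    print(f"dt = {dt:<8} quotient = {quotient:.5f}")
```

For this assumed model the quotient works out to $10.2 - 4.9\,\Delta t$ at $t = 1$, so the printed values approach $10.2$ as the step shrinks, without the step ever being set to zero.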
We denote the derivative of this function as $\frac{dh}{dt}$, which reads as "the derivative of $h$ with respect to $t$".
Using $d$ instead of $\Delta$ indicates that the limit is "built into" the expression.
This notation is called Leibniz notation, named after the mathematician Gottfried Wilhelm Leibniz.
Another common notation for the derivative is $h'(t)$, which also reads as "the derivative of $h$ with respect to $t$".
This is called prime notation, and is a shorthand for the derivative.
It's a bit more concise than Leibniz notation, but it doesn't explicitly show the variable with respect to which the derivative is taken.
This can be a disadvantage later on, when we deal with functions of multiple variables.
This gives us the formal definition of the derivative, sometimes called the first principle of differentiation:
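The derivative of a function $f(x)$ at a point $x$ is given by:
$$f'(x) = \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$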
It's important to note that the derivative of a function is another function.
It gives the rate of change of the original function at every point.
For example, $h'(t)$ gives the rate of change of the height of the ball at time $t$.
If a function is explicitly stated, we can substitute the $f$ in the notation with the expression of the function.
For example, if $f(x) = x^2$, then the derivative can be written as $\frac{d(x^2)}{dx}$ or $\frac{d}{dx}(x^2)$, where $\frac{d}{dx}$ can be thought of as an operator that transforms a function into its derivative.
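The idea of an operator that takes a function and returns another function can be sketched in code. The helper `derivative` below is a hypothetical name, and it only approximates the limit with a small fixed step (an assumption made for illustration), so its results are close to the true derivative rather than exact.

```python
def derivative(f, dx=1e-6):
    """Return a new function that approximates the derivative of f.

    This is a numerical sketch: it uses a small fixed step dx instead of
    a true limit, so the result is an approximation.
    """
    def f_prime(x):
        return (f(x + dx) - f(x)) / dx

    return f_prime


def square(x):
    return x**2


# derivative(...) takes a function in and hands a function back.
square_prime = derivative(square)
print(square_prime(3.0))
```

The printed value is close to 6, the rate of change of $x^2$ at $x = 3$. The point of the sketch is the shape of the operation: a function goes in and a function comes out.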
We will try to answer the following question algebraically:
Calculate the derivative of the following functions:
$f(x) = x^2$
$f(x) = x^3$
$f(x) = x^n$, where $n$ is a positive integer
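To calculate the derivative of $x^2$, we need to use the definition of the derivative.
Substitute $f(x) = x^2$ into the definition:
$$\begin{aligned}
\frac{d}{dx}x^2 &= \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{x^2 + 2x\,\Delta x + (\Delta x)^2 - x^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} \left(2x + \Delta x\right) \\
&= 2x
\end{aligned}$$
Next, calculate the derivative of $x^3$ with the same procedure:
$$\begin{aligned}
\frac{d}{dx}x^3 &= \lim_{\Delta x \to 0} \frac{(x + \Delta x)^3 - x^3}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{x^3 + 3x^2\,\Delta x + 3x(\Delta x)^2 + (\Delta x)^3 - x^3}{\Delta x} \\
&= \lim_{\Delta x \to 0} \left(3x^2 + 3x\,\Delta x + (\Delta x)^2\right) \\
&= 3x^2
\end{aligned}$$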
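Finally, we need to calculate the derivative of $x^n$.
This will let us generalize the results for $x^2$ and $x^3$ to any positive integer $n$.
Substitute $f(x) = x^n$ into the definition of the derivative:
$$\frac{d}{dx}x^n = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^n - x^n}{\Delta x}$$
This expression is a bit more difficult than the previous ones, but we can still calculate it.
We can expand $(x + \Delta x)^n$ using the binomial theorem.
(The appendix contains a short review of the binomial theorem.)
Using the binomial theorem, we can expand $(x + \Delta x)^n$ as a sum of terms:
$$\begin{aligned}
(x + \Delta x)^n &= x^n + n x^{n-1}\,\Delta x + \binom{n}{2} x^{n-2} (\Delta x)^2 + \cdots + (\Delta x)^n \\
&= x^n + n x^{n-1}\,\Delta x + (\Delta x)^2 \left[\binom{n}{2} x^{n-2} + \cdots + (\Delta x)^{n-2}\right]
\end{aligned}$$
In the second line, I simply grouped the terms with $(\Delta x)^2$ and higher.
Then, we can substitute this expansion into the derivative expression:
$$\frac{d}{dx}x^n = \lim_{\Delta x \to 0} \frac{x^n + n x^{n-1}\,\Delta x + (\Delta x)^2 \left[\cdots\right] - x^n}{\Delta x}$$
Once again, the $x^n$ terms cancel out, leaving us with:
$$\frac{d}{dx}x^n = \lim_{\Delta x \to 0} \left(n x^{n-1} + \Delta x \left[\cdots\right]\right) = n x^{n-1}$$
This result shows that the derivative of $x^n$ is $n x^{n-1}$.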
This is known as the power rule:
Power Rule: The derivative of a function, $f(x) = x^n$, is given by:
$$\frac{d}{dx}x^n = n x^{n-1}$$
(Note that our approach using the binomial theorem only works for positive integer values of $n$.
However, the power rule actually holds for any real number $n$, not just integers.
The proof for non-integer values of $n$ requires more advanced techniques including the chain rule and exponential functions:
$$x^n = e^{n \ln x} \quad\Longrightarrow\quad \frac{d}{dx}x^n = e^{n \ln x} \cdot \frac{n}{x} = n x^{n-1} \qquad (x > 0)$$
)
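The power rule can also be sanity-checked numerically. The sketch below compares a difference quotient with a small step against $n x^{n-1}$ for a few positive integers; the test point $x = 2$, the step size, and the function names are arbitrary choices made here for illustration.

```python
def difference_quotient(f, x, dx):
    """Approximate the derivative of f at x using a small step dx."""
    return (f(x + dx) - f(x)) / dx


def power_rule(n, x):
    """Derivative of x**n predicted by the power rule: n * x**(n - 1)."""
    return n * x ** (n - 1)


x = 2.0
for n in range(1, 6):
    approx = difference_quotient(lambda v: v**n, x, 1e-6)
    exact = power_rule(n, x)
    print(f"n = {n}: difference quotient = {approx:.4f}, power rule = {exact:.4f}")
```

The two columns agree to several decimal places. This is consistent with the rule, although it is a numerical check rather than a proof.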
We used the limit definition to find the derivatives of $x^2$ and $x^3$.
We can also understand these geometrically, which is a good way to emphasize that the derivative is not a slope, but a rate of change.
We can visualize $x^2$ as the area of a square with side length $x$.
Then, the derivative of $x^2$ is the rate at which the area of the square changes as we change the side length.
So all we have to do is increase the side length by a small amount $\Delta x$ and see how much the area changes:
Looking at the diagram, we can see that the change in the area can be divided into the green areas and the red area.
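Each green area is a rectangle with side lengths $x$ and $\Delta x$, so the two green areas total $2x\,\Delta x$.
The red area is a square with side length $\Delta x$, so its area is $(\Delta x)^2$.
Thus, $\Delta A = 2x\,\Delta x + (\Delta x)^2$.
Dividing by $\Delta x$:
$$\frac{\Delta A}{\Delta x} = 2x + \Delta x$$
And, of course, as $\Delta x$ approaches $0$, the $\Delta x$ term disappears, leaving us with $2x$.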
Similarly, we can visualize $x^3$ as the volume of a cube with side length $x$.
Then, the derivative of $x^3$ is the rate at which the volume of the cube changes as we change the side length.
In the diagram above, if we increase the side length, the volume of the cube changes.
This change can be divided into 3 parts: the green parts, the yellow parts, and the red part:
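The green parts are 3 rectangular slabs with dimensions $x \times x \times \Delta x$, so the total volume of the green parts is $3x^2\,\Delta x$.
The yellow parts are 3 rectangular bars with dimensions $x \times \Delta x \times \Delta x$, so the total volume of the yellow parts is $3x(\Delta x)^2$.
The red part is a cube with side length $\Delta x$, so its volume is $(\Delta x)^3$.
Thus:
$$\Delta V = 3x^2\,\Delta x + 3x(\Delta x)^2 + (\Delta x)^3$$
Dividing by $\Delta x$:
$$\frac{\Delta V}{\Delta x} = 3x^2 + 3x\,\Delta x + (\Delta x)^2$$
Once again, as $\Delta x$ approaches $0$, the $3x\,\Delta x$ and $(\Delta x)^2$ terms disappear, leaving us with $3x^2$.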
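The decomposition of the cube's volume change can be checked with a few lines of code. This is only a sketch: the function name and the sample values are illustrative assumptions, and the three parts are labelled by the colours used in the text.

```python
def cube_volume_change_parts(x, dx):
    """Split the volume change of a cube (side x grows to x + dx) into three parts."""
    green = 3 * x**2 * dx   # three slabs of size x by x by dx
    yellow = 3 * x * dx**2  # three bars of size x by dx by dx
    red = dx**3             # one small corner cube of side dx
    return green, yellow, red


x, dx = 2.0, 0.5
parts = cube_volume_change_parts(x, dx)
actual_change = (x + dx) ** 3 - x**3

print("parts:", parts)
print("sum of parts:", sum(parts))
print("actual change:", actual_change)
```

For these sample values the parts are 6.0, 1.5 and 0.125, which sum to 7.625, the same as the direct difference of the two volumes. The green part dominates as `dx` shrinks, which is why only $3x^2$ survives in the limit.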
This is where things get difficult.
In order to visualize $x^n$, we need to think of it as the volume of a "hypercube" in $n$ dimensions.
This is difficult to visualize, but we can still think of it as the volume of a cube in 3D space, a "hypercube" in 4D space, and so on.
In the $x^3$ case, we had $7$ different parts of the volume change as we increased the side length.
Three of them scaled linearly with $\Delta x$, three of them scaled quadratically with $\Delta x$, and one of them scaled cubically with $\Delta x$.
This corresponds with the binomial theorem expansion of $(x + \Delta x)^3$.
Hence, the binomial expansion of higher powers can give us a geometric sense of how hypercubes work in higher dimensions.
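Although higher-dimensional hypercubes are hard to draw, the pieces can at least be counted. The sketch below assumes, following the binomial expansion, that the number of pieces scaling with the $k$-th power of the small increase is the binomial coefficient "$n$ choose $k$"; the function name is a hypothetical one chosen here.

```python
from math import comb


def hypercube_parts(n):
    """Map each power k (1..n) to the number of added pieces that scale as dx**k."""
    return {k: comb(n, k) for k in range(1, n + 1)}


print(hypercube_parts(3))  # {1: 3, 2: 3, 3: 1}
print(hypercube_parts(4))  # {1: 4, 2: 6, 3: 4, 4: 1}
```

For $n = 3$ this reproduces the counts from the cube. In every dimension there are exactly $n$ pieces that scale linearly, and those are the only ones that survive in the limit, matching the factor $n$ in the power rule.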
In this section, we introduced the concept of the derivative as a measure of the instantaneous rate of change of a function.
We explored how the derivative can be calculated using the definition of the derivative, and derived the power rule for derivatives of functions of the form $x^n$.
Here are the key points to remember:
The average rate of change of a function over an interval is given by the difference in the function values divided by the difference in the input values.
The derivative of a function at a point gives the instantaneous rate of change of the function at that point.
It is not itself a slope, but it can be represented by a tangent line whose slope equals the derivative.
The derivative of a function is another function that gives the rate of change of the original function at every point.
Denoted as $\frac{df}{dx}$ or $f'(x)$, it is formally defined as:
$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
The power rule states that the derivative of $x^n$ is $n x^{n-1}$.
In the next section, we will explore certain properties of derivatives that make them easy to calculate and manipulate.
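The binomial theorem is a formula that allows us to expand expressions of the form $(a + b)^n$.
We can first expand $(a + b)^2$ and $(a + b)^3$ to see the pattern:
$$\begin{aligned}
(a + b)^2 &= a^2 + 2ab + b^2 \\
(a + b)^3 &= a^3 + 3a^2 b + 3ab^2 + b^3
\end{aligned}$$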
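Notice that during the expansion of something like $(a + b)^3 = (a + b)(a + b)(a + b)$, we are essentially choosing all possible combinations of $a$ and $b$ from each factor:

| Chosen terms | Result | Total |
| --- | --- | --- |
| $aaa$ | $a^3$ | $a^3$ |
| $aab$, $aba$, $baa$ | $a^2 b$ | $3a^2 b$ |
| $abb$, $bab$, $bba$ | $ab^2$ | $3ab^2$ |
| $bbb$ | $b^3$ | $b^3$ |
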
Therefore, if, for example, we want to know the term with $a^2 b$ (notice how the powers always add up to $3$) in the expansion of $(a + b)^3$, we can just count the number of ways we can choose two $a$s and one $b$ from the three factors.
That is just the combination $\binom{3}{1} = 3$.
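Generalizing, in the expansion of $(a + b)^n$, if we want to know the term with $a^{n-k} b^k$, we can count the number of ways we can choose $n - k$ $a$s and $k$ $b$s from the $n$ factors, which is $\binom{n}{k}$.
Then, the term is $\binom{n}{k} a^{n-k} b^k$.
Finally, we can add up all the terms from $k = 0$ to $k = n$ to get the full expansion of $(a + b)^n$:
$$(a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n} b^n$$
We can also write this in a more compact form using the summation notation:
$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$$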
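The counting argument behind the binomial theorem can be reproduced by brute force. In the sketch below, the two terms are written as `a` and `b` (names assumed here for illustration). `count_choices` enumerates every way of picking one term from each factor, and `binomial_expansion` evaluates the summation formula so it can be compared with a direct power.

```python
from itertools import product
from math import comb


def count_choices(n):
    """For each k, count the ways to pick 'b' from exactly k of the n factors."""
    counts = [0] * (n + 1)
    for choice in product("ab", repeat=n):
        counts[choice.count("b")] += 1
    return counts


def binomial_expansion(a, b, n):
    """Evaluate the sum over k of C(n, k) * a**(n - k) * b**k."""
    return sum(comb(n, k) * a ** (n - k) * b**k for k in range(n + 1))


print(count_choices(3))                            # [1, 3, 3, 1]
print(binomial_expansion(2, 3, 4), (2 + 3) ** 4)   # 625 625
```

The brute-force counts match the binomial coefficients, and the summed expansion agrees with computing the power directly.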