Tensors
A tensor, roughly speaking, is a mathematical object that is invariant under a change of basis. Its components, however, transform in a specific way under a change of basis. In this page, we will introduce the concept of tensors and build up our understanding of them. We will first explore some examples of tensors, like dual vectors and linear maps. Then, we will formalize our understanding of tensors with tensor products.
Change of Basis
The most fundamental concept in tensors is the idea of a change of basis. A basis is a set of vectors that can be used to represent any vector in a vector space. Recall that vectors can be represented as a linear combination of basis vectors:
$$\vec{v} = v^1 \vec{e}_1 + v^2 \vec{e}_2 + \cdots + v^n \vec{e}_n,$$

where $\vec{e}_1, \dots, \vec{e}_n$ are the basis vectors. The coefficients $v^1, \dots, v^n$ are called the components of $\vec{v}$ in this basis. Written in matrix form, this is:

$$\vec{v} = \begin{bmatrix} \vec{e}_1 & \vec{e}_2 & \cdots & \vec{e}_n \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix}.$$
The different theories of relativity essentially boil down to different ways of changing the basis of a vector space. Galilean relativity uses the Galilean transformation while special relativity uses the Lorentz transformation. In both cases, the transformation matrix is a linear transformation that transforms the old basis vectors into the new basis vectors.
Vectors Under Change of Basis
When we change the basis, the components of a vector transform in a specific way.
Suppose we have a vector $\vec{v}$, expanded in a basis $\{\vec{e}_i\}$ as:

$$\vec{v} = v^1 \vec{e}_1 + v^2 \vec{e}_2 + \cdots + v^n \vec{e}_n.$$

In textbooks, one might often see it just written as:

$$\vec{v} = \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix}.$$

This is just a shorthand notation for the above equation with an implied basis.

If we replace the basis with another one (by introducing a transformation matrix $T$ such that $\vec{e}\,'_j = T^i{}_j \vec{e}_i$), the vector itself must stay the same, so its components must change to compensate:

$$v'^i = (T^{-1})^i{}_j v^j.$$

As such, vector components transform with the inverse of $T$; we say that they are contravariant.
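As a quick numerical check, here is a minimal NumPy sketch (the basis and the matrix $T$ are made-up example values): the components transform with $T^{-1}$, yet the vector itself is unchanged.

```python
import numpy as np

# Old basis vectors as columns of E; new basis vectors are the columns of E @ T.
E = np.eye(2)                      # standard basis (example)
T = np.array([[2.0, 1.0],
              [1.0, 1.0]])         # change-of-basis matrix (example values)

v_old = np.array([3.0, 5.0])       # components in the old basis

# Components transform with the inverse of T:
v_new = np.linalg.inv(T) @ v_old

# The vector itself is invariant: reconstructing it from either basis agrees.
vec_from_old = E @ v_old
vec_from_new = (E @ T) @ v_new
print(np.allclose(vec_from_old, vec_from_new))  # True
```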
Dual Vectors
A dual vector or covector is a linear map that takes a vector and returns a scalar.
They have many names—dual vectors, covectors, one-forms, linear functionals, etc.—but they all refer to the same concept.
If you have read the quantum mechanics section, you will know that the bra vector $\langle \psi |$ is exactly this kind of object: it takes a ket $| \phi \rangle$ and returns the scalar $\langle \psi | \phi \rangle$.
Let $f$ be a dual vector, and let $\vec{u}$ and $\vec{v}$ be vectors.

Suppose now we apply $f$ to the sum of the two vectors, giving $f(\vec{u} + \vec{v})$.

Notice that this is the same as if we applied $f$ to each vector separately and added the results: $f(\vec{u}) + f(\vec{v})$.

Additionally, suppose we apply $f$ to a scaled vector, giving $f(a\vec{v})$.

Notice that this is the same as if we applied $f$ to $\vec{v}$ first and then scaled the result: $a f(\vec{v})$.
From these examples, we can see that dual vectors are linear maps, meaning that they satisfy the following properties:

$$f(\vec{u} + \vec{v}) = f(\vec{u}) + f(\vec{v}), \qquad f(a\vec{v}) = a f(\vec{v}).$$

Hence, more formally, we define dual vectors symbolically as:

$$f: V \to \mathbb{R},$$

where $V$ is the vector space on which $f$ acts.
Index Notation for Dual Vectors
When we apply a dual vector to a vector, we take each component of the dual vector and multiply it by the corresponding component of the vector.
For a dual vector $f$ with components $f_i$ and a vector $\vec{v}$ with components $v^i$, this is written as:

$$f(\vec{v}) = f_i v^i = f_1 v^1 + f_2 v^2 + \cdots + f_n v^n.$$
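This component-wise pairing can be sketched directly with NumPy's `einsum` (the component values below are arbitrary examples):

```python
import numpy as np

f = np.array([2.0, -1.0, 0.5])   # dual vector components f_i (example values)
v = np.array([1.0, 4.0, 2.0])    # vector components v^i (example values)

# f(v) = f_i v^i: multiply matching components and sum over i.
value = np.einsum('i,i->', f, v)
print(value)   # 2*1 + (-1)*4 + 0.5*2 = -1.0
```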
Visualizing Dual Vectors
Suppose we have dual vectors acting on two-dimensional vectors.
We can imagine a dual vector $f$ as a surface over the plane, where the height of the surface above a point $\vec{v}$ is the value $f(\vec{v})$.
However, these can be difficult to visualize and draw, especially in higher dimensions.
We instead prefer to use a series of lines to represent dual vectors, akin to a contour plot.
Each line represents a constant value of the dual vector.
Then, the value of $f(\vec{v})$ is given by the number of these lines that the vector $\vec{v}$ crosses.
Alternatively, instead of writing the number on each line, we simply draw an arrow pointing in the direction of increasing values of the dual vector.
Dual Vector Arithmetic
Dual vectors can be added and multiplied by scalars in the same way as vectors.
This is because the dual space $V^*$—the set of all dual vectors on $V$—is itself a vector space.

When a dual vector is scaled, say from $f$ to $2f$, every vector is mapped to twice its previous value, so in the contour picture the lines become twice as dense.
Dual Vector Basis and Components
Just like vectors, dual vectors have a basis.
In order to define a basis, they should be linearly independent and span the entire dual space $V^*$. The standard choice is the dual basis $\epsilon^i$, defined by its action on the vector basis:

$$\epsilon^i(\vec{e}_j) = \delta^i_j,$$

where $\delta^i_j$ is the Kronecker delta—it equals $1$ when $i = j$ and $0$ otherwise.
Notice that unlike vector bases, dual vector bases have a superscript index—dual vector bases are contravariant.
To see this more concretely, let $f$ be a dual vector applied to a vector $\vec{v}$. In matrix form, this is a row of dual components times a column of vector components:

$$f(\vec{v}) = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix}.$$

We know that we can insert the identity matrix in between the two matrices without changing the result. As such, we can insert $I = T T^{-1}$:

$$f(\vec{v}) = \left( \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix} T \right) \left( T^{-1} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix} \right).$$

We already know that vector components are contravariant and transform with the inverse matrix. As such, the right factor contains the new vector components, so the left factor must contain the new dual vector components: $f'_j = f_i T^i{}_j$. Dual vector components transform with $T$ itself—they are covariant.

Now, let's expand $f$ in both the old and new dual bases: $f = f_i \epsilon^i$ and $f = f'_j \epsilon'^j = f_i T^i{}_j \epsilon'^j$.

Comparing the two expressions, we find that $\epsilon^i = T^i{}_j \epsilon'^j$, or equivalently $\epsilon'^j = (T^{-1})^j{}_i \epsilon^i$: the dual basis transforms with the inverse of $T$—contravariantly—as claimed.
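The covariant/contravariant pairing can be checked numerically. In this sketch (with made-up values for $T$, $f$, and $\vec{v}$), the dual components transform with $T$, the vector components with $T^{-1}$, and the scalar $f(\vec{v})$ comes out the same in both bases:

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 1.0]])        # change-of-basis matrix (example values)

v = np.array([3.0, 4.0])          # vector components (contravariant)
f = np.array([5.0, 6.0])          # dual vector components (covariant)

v_new = np.linalg.inv(T) @ v      # transforms with the inverse of T
f_new = f @ T                     # transforms with T itself: f'_j = f_i T^i_j

# The scalar f(v) is basis-independent:
print(f @ v, f_new @ v_new)       # both 39.0
```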
Linear Maps
See Linear Transformations for a more detailed introduction to linear maps.
See Matrix Representation of Operators and Change of Basis for an introduction in the context of quantum mechanics.
A more general concept than dual vectors is that of a linear map.
It is defined as a map $L: V \to W$ between vector spaces that respects addition and scalar multiplication:

$$L(a\vec{u} + b\vec{v}) = a L(\vec{u}) + b L(\vec{v}).$$

Linear maps can be represented by matrices:

$$L = \begin{bmatrix} L(\vec{e}_1) & L(\vec{e}_2) & \cdots & L(\vec{e}_n) \end{bmatrix},$$

where each column represents the image of the basis vector $\vec{e}_i$ under $L$.

We can see that the components of the image of a vector are obtained by weighting the columns by the components of the vector: $(L\vec{v})^i = L^i{}_j v^j$.
Linear Maps under Change of Basis
When we change the basis of the vector space, the components of vectors and linear maps change:

$$\vec{v}\,' = T^{-1} \vec{v}, \qquad L' = T^{-1} L T.$$

Intuitively, to apply the new matrix $L'$ to a vector in the new basis, we first convert the vector to the old basis (multiplying by $T$), apply the old matrix $L$, and then convert the result back to the new basis (multiplying by $T^{-1}$).
To prove this more formally, we will introduce the Einstein summation convention.
This states that when an index appears twice in a term, it is implicitly summed over.
For example, $a_i b^i$ means $\sum_i a_i b^i = a_1 b^1 + a_2 b^2 + \cdots + a_n b^n$.
Let $\vec{w} = L\vec{v}$. In components, this is:

$$w^i = L^i{}_j v^j,$$

with an implicit sum over $j$. The same relation must hold in the new basis: $w'^i = L'^i{}_j v'^j$. Substituting $w'^i = (T^{-1})^i{}_k w^k = (T^{-1})^i{}_k L^k{}_l v^l$ and $v^l = T^l{}_j v'^j$ gives:

$$(T^{-1})^i{}_k L^k{}_l T^l{}_j v'^j = L'^i{}_j v'^j.$$

On the left-hand side, we have a sum over $k$, $l$, and $j$. Since this holds for every vector $\vec{v}\,'$, the coefficients of $v'^j$ on both sides must match. Thus:

$$L'^i{}_j = (T^{-1})^i{}_k L^k{}_l T^l{}_j.$$
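The index expression and the matrix expression $T^{-1} L T$ are the same computation, which a short NumPy sketch can confirm (both $L$ and $T$ are random example matrices here):

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))   # a linear map in the old basis (example)
T = rng.standard_normal((3, 3))   # a change-of-basis matrix (example)
T_inv = np.linalg.inv(T)

# Component transformation law: L'^i_j = (T^{-1})^i_k L^k_l T^l_j
L_new = np.einsum('ik,kl,lj->ij', T_inv, L, T)

# This is exactly the matrix product T^{-1} L T:
print(np.allclose(L_new, T_inv @ L @ T))  # True
```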
Indices vs Matrices
Suppose we have three matrices, $A$, $B$, and $C$, multiplied together as $D = ABC$.

However, we can also write this in index notation as:

$$D^i{}_j = A^i{}_k B^k{}_l C^l{}_j.$$

Notice that each repeated index—$k$ and $l$—appears once as a superscript and once as a subscript. They "cancel out" and we are left with the indices that are not repeated—one superscript $i$ and one subscript $j$.

This is how we know that the entire expression $D$ is again a matrix (a linear map), with rows indexed by $i$ and columns indexed by $j$.
Metric Tensor and Bilinear Forms
We will now introduce the concept of a metric tensor.
Suppose we want to find out the dot product of two vectors $\vec{u}$ and $\vec{v}$, each expanded in a basis:

$$\vec{u} \cdot \vec{v} = (u^i \vec{e}_i) \cdot (v^j \vec{e}_j).$$

Because the dot product is linear, we can expand this out:

$$\vec{u} \cdot \vec{v} = u^i v^j (\vec{e}_i \cdot \vec{e}_j).$$

We are left with a sum of terms of the form $u^i v^j (\vec{e}_i \cdot \vec{e}_j)$—products of components, weighted by dot products of basis vectors.

The dot products of the basis vectors $\vec{e}_i \cdot \vec{e}_j$ are denoted $g_{ij}$, the components of the metric tensor:

$$\vec{u} \cdot \vec{v} = g_{ij} u^i v^j.$$

While $g_{ij}$ denotes the components, the metric tensor itself is written as $g$.
Because the dot product is linear in each of its arguments, the metric tensor is also linear in each argument. Furthermore, it takes two inputs, so it is more formally written as a map $g: V \times V \to \mathbb{R}$, known as a bilinear form:
- Bilinear: Linear in both arguments.
- Form: A function that takes vectors and outputs a scalar. For example, dual vectors are 1-forms.
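To see the metric at work, here is a small NumPy sketch using a made-up skewed (non-orthonormal) basis: computing $g_{ij} u^i v^j$ agrees with converting both vectors to Cartesian coordinates and dotting them directly.

```python
import numpy as np

# A skewed basis for R^2, stored as the columns of E (example basis).
E = np.column_stack([np.array([1.0, 0.0]),    # e_1
                     np.array([1.0, 1.0])])   # e_2

# Metric components: g_ij = e_i · e_j
g = E.T @ E

u = np.array([2.0, 1.0])   # components of u in this basis (example)
v = np.array([0.0, 3.0])   # components of v in this basis (example)

# Dot product via the metric: u·v = g_ij u^i v^j
dot_via_metric = np.einsum('ij,i,j->', g, u, v)

# Same result as dotting the Cartesian vectors directly.
print(dot_via_metric, (E @ u) @ (E @ v))   # both 12.0
```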
Tensor Products
If you have read the quantum mechanics section, you will know that an operator can be written as a sum of projection operators. The fundamental idea is that the outer product of a vector and a dual vector gives a linear map:

$$(\vec{v} \otimes f)(\vec{u}) = \vec{v}\, f(\vec{u}),$$

or, in bra-ket notation, $|v\rangle\langle f|$.
To see why this is the case, matrix multiplication can be carried out as follows:
- Move the second matrix upwards and draw an empty matrix where it used to be.
- Each element in the empty matrix sits at the intersection of a row of the first matrix and a column of the second; its value is the sum of the products of the corresponding elements.
Here's a concrete example:

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$$

To multiply them, we can shift the second matrix upwards and draw an empty $2 \times 2$ matrix in its old position.

Then, the elements in the product matrix are given by the sum of the products of the corresponding elements. For example, the element in the first row and first column is:

$$1 \cdot 5 + 2 \cdot 7 = 19.$$
For an outer product, it looks like this:

$$\begin{bmatrix} v^1 \\ v^2 \end{bmatrix} \begin{bmatrix} f_1 & f_2 \end{bmatrix} = \begin{bmatrix} v^1 f_1 & v^1 f_2 \\ v^2 f_1 & v^2 f_2 \end{bmatrix}.$$
Since the outer product involves scaling one vector by two different scalars (in the dual vector), each column is a scaled version of the vector. As such, the matrix has a determinant of zero; its columns are linearly dependent. These matrices are called pure matrices.
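This rank-one structure is easy to verify with NumPy (the component values are arbitrary examples):

```python
import numpy as np

v = np.array([1.0, 2.0])    # a vector (example values)
f = np.array([3.0, 5.0])    # a dual vector, as a row (example values)

# Outer product: column j is f_j times v, so the columns are linearly dependent.
M = np.outer(v, f)
print(np.linalg.det(M))              # ≈ 0: a pure matrix is singular
print(np.linalg.matrix_rank(M))      # 1
```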
In order to write any linear map in a similar way, we could combine pure matrices together. In a sense, the pure matrices act as a "basis" for all linear maps. We can make the following four pure matrices the bases:

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$

Then, any matrix can be written as a sum of these four matrices:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$

And hence, we can write any linear map as:

$$L = L^i{}_j\, \vec{e}_i \otimes \epsilon^j.$$
Formally, each basis pure matrix is the tensor product of a basis vector and a basis dual vector. They are more formally written as $\vec{e}_i \otimes \epsilon^j$; for example, $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \vec{e}_1 \otimes \epsilon^2$.
In fact, a tensor can be defined as a sum of tensor products of vectors and dual vectors.
Defining a tensor as such makes it trivial to derive the transformation rules for tensors under a change of basis.
For example, for a linear map $L = L^i{}_j\, \vec{e}_i \otimes \epsilon^j$, the basis vectors $\vec{e}_i$ transform with $T$ and the dual basis $\epsilon^j$ transforms with $T^{-1}$, so the components must compensate with the opposite factors. Thus it is clear that:

$$L'^i{}_j = (T^{-1})^i{}_k L^k{}_l T^l{}_j.$$
To take another example of a tensor product, consider two qubits (or two spin-1/2 particles).
The state of a single qubit is given by a vector in a two-dimensional complex vector space, $\mathbb{C}^2$, spanned by the basis states $|0\rangle$ and $|1\rangle$.
Now suppose we have a composite system of these qubits. There are now four basis states, each corresponding to a different combination of the two qubits:
- $|00\rangle$: both qubits are in the state $|0\rangle$.
- $|01\rangle$: the first qubit is in the state $|0\rangle$ and the second qubit is in the state $|1\rangle$.
- $|10\rangle$: the first qubit is in the state $|1\rangle$ and the second qubit is in the state $|0\rangle$.
- $|11\rangle$: both qubits are in the state $|1\rangle$.
These are actually the tensor products of the two qubit states:

$$|00\rangle = |0\rangle \otimes |0\rangle, \quad |01\rangle = |0\rangle \otimes |1\rangle, \quad |10\rangle = |1\rangle \otimes |0\rangle, \quad |11\rangle = |1\rangle \otimes |1\rangle.$$

And the state of the composite system is given by a linear combination of these basis states:

$$|\psi\rangle = c_{00} |00\rangle + c_{01} |01\rangle + c_{10} |10\rangle + c_{11} |11\rangle.$$
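In component form, these tensor products can be sketched with NumPy's Kronecker product (the equal-superposition state below is just an example):

```python
import numpy as np

ket0 = np.array([1, 0])   # |0>
ket1 = np.array([0, 1])   # |1>

# Composite basis states as Kronecker products of single-qubit states.
ket00 = np.kron(ket0, ket0)
ket01 = np.kron(ket0, ket1)
ket10 = np.kron(ket1, ket0)
ket11 = np.kron(ket1, ket1)

print(ket01)   # [0 1 0 0]

# A general two-qubit state is a linear combination of the four basis states;
# here, an equal superposition (example coefficients).
psi = 0.5 * (ket00 + ket01 + ket10 + ket11)
print(np.sum(psi**2))   # 1.0 — normalized
```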
Kronecker Product
Tensor products and Kronecker products are similar but not the same. Tensor products act on tensors, while Kronecker products act on arrays. However, they are essentially identical in our context, so we will use the terms interchangeably.
To take an example, a bilinear form is a tensor product of two dual vectors:

$$g = f \otimes h, \qquad g(\vec{u}, \vec{v}) = f(\vec{u})\, h(\vec{v}).$$

To apply the tensor product, we simply give a copy of the left object for each element in the right object:

$$f \otimes h = \begin{bmatrix} h_1 \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} & h_2 \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} f_1 h_1 & f_1 h_2 \\ f_2 h_1 & f_2 h_2 \end{bmatrix}.$$
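A short NumPy sketch (with made-up component values) shows that the array built this way really does act as the bilinear form $g(\vec{u}, \vec{v}) = f(\vec{u})\,h(\vec{v})$:

```python
import numpy as np

f = np.array([1.0, 2.0])   # dual vector components f_i (example values)
h = np.array([3.0, 4.0])   # dual vector components h_j (example values)

# The tensor product f ⊗ h has components g_ij = f_i h_j.
g = np.outer(f, h)

u = np.array([1.0, 0.0])   # example vectors
v = np.array([0.0, 1.0])

# Feeding two vectors into g gives f(u) * h(v).
g_uv = np.einsum('ij,i,j->', g, u, v)
print(g_uv, (f @ u) * (h @ v))   # 4.0 4.0
```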