Tensors

A tensor, roughly speaking, is a mathematical object that is invariant under a change of basis. However, its components transform in a specific way under a change of basis. In this page, we will introduce the concept of tensors and build up our understanding of them. We will first explore some examples of tensors, like dual vectors and linear maps. Then, we will formalize our understanding of tensors with tensor products.

Change of Basis

The most fundamental concept in tensors is the idea of a change of basis. A basis is a set of vectors that can be used to represent any vector in a vector space. Recall that vectors can be represented as a linear combination of basis vectors:

$$\vec{v} = \sum_i v^i\,\vec{e}_i$$

where $\vec{e}_i$ are the basis vectors and $v^i$ are the components of the vector $\vec{v}$. A change of basis is a transformation that changes the basis vectors from one set to another. Suppose we change to a new basis $\{\tilde{\vec{e}}_i\}$. We can express the new basis vectors in terms of the old basis vectors:

$$\tilde{\vec{e}}_1 = T^1_{\ 1}\,\vec{e}_1 + T^2_{\ 1}\,\vec{e}_2 + \cdots, \qquad \tilde{\vec{e}}_2 = T^1_{\ 2}\,\vec{e}_1 + T^2_{\ 2}\,\vec{e}_2 + \cdots

The coefficients $T^i_{\ j}$ form a transformation matrix $T$ that transforms the old basis vectors into the new basis vectors:

$$\tilde{\vec{e}}_j = \sum_i T^i_{\ j}\,\vec{e}_i$$

Written in matrix form, this is:

$$\begin{pmatrix} \tilde{\vec{e}}_1 & \tilde{\vec{e}}_2 & \cdots \end{pmatrix} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 & \cdots \end{pmatrix} T$$

The different theories of relativity essentially boil down to different ways of changing the basis of a vector space. Galilean relativity uses the Galilean transformation while special relativity uses the Lorentz transformation. In both cases, the transformation matrix is a linear transformation that transforms the old basis vectors into the new basis vectors.

Vectors Under Change of Basis

When we change the basis, the components of a vector transform in a specific way. Suppose we have a vector $\vec{v}$ in the old basis, represented by its components $v^i$. One reason we use a superscript for components (as opposed to a subscript) and a subscript for basis vectors is that vector components are typically placed in a column vector, while basis vectors are typically placed in a row vector:

$$\vec{v} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 & \cdots \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \\ \vdots \end{pmatrix}$$

In textbooks, one might often see it just written as:

$$\vec{v} = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \end{pmatrix}$$

This is just shorthand notation for the above equation with an implied basis. If we replace the basis with another one (by introducing a transformation matrix $T$), we must also use the inverse of the transformation matrix on the components of the vector to keep the vector itself the same:

$$\vec{v} = \begin{pmatrix} \vec{e}_1 & \vec{e}_2 & \cdots \end{pmatrix} T\,T^{-1} \begin{pmatrix} v^1 \\ v^2 \\ \vdots \end{pmatrix} = \begin{pmatrix} \tilde{\vec{e}}_1 & \tilde{\vec{e}}_2 & \cdots \end{pmatrix} \begin{pmatrix} \tilde{v}^1 \\ \tilde{v}^2 \\ \vdots \end{pmatrix}$$

As such, vector components transform with the inverse of $T$; we say that vector components are contravariant. On the other hand, because vector bases transform with $T$ itself, we say that vector bases are covariant.
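As a small numerical sketch of this behavior (the specific matrix $T$ and vector below are made up for illustration, and NumPy is only used to carry out the arithmetic):

```python
import numpy as np

# Hypothetical forward transformation T: its columns are the new basis
# vectors written in terms of the old (standard) basis.
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
e_new_1, e_new_2 = T[:, 0], T[:, 1]   # basis vectors transform with T (covariant)

# A fixed geometric vector, given by its components in the old basis.
v_old = np.array([3.0, 4.0])

# Components transform with the inverse of T (contravariant).
v_new = np.linalg.inv(T) @ v_old

# Rebuilding the vector from the new components and the new basis
# recovers the same geometric object.
print(np.allclose(v_new[0] * e_new_1 + v_new[1] * e_new_2, v_old))  # True
```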

Dual Vectors

A dual vector or covector is a linear map that takes a vector and returns a scalar. They have many names—dual vectors, covectors, one-forms, linear functionals, etc.—but they all refer to the same concept. If you have read the quantum mechanics section, you will know that the bra vector is a dual vector. The Riesz representation theorem states that every dual vector can be represented as an inner product with a vector. We will re-introduce the concept of dual vectors, this time in the context of tensors.

Let $\alpha$ be a dual vector. When applied to a vector $\vec{v}$, it returns a scalar, denoted by $\alpha(\vec{v})$. Dual vectors are represented by row vectors. For a simple example, consider a dual vector with components $(\alpha_1, \alpha_2)$. When applied to a vector with components $(v^1, v^2)$, we get:

$$\alpha(\vec{v}) = \alpha_1 v^1 + \alpha_2 v^2$$

Suppose now we apply $\alpha$ to a sum of vectors $\vec{u} = \vec{v} + \vec{w}$. This is given by:

$$\alpha(\vec{v} + \vec{w}) = \alpha_1 (v^1 + w^1) + \alpha_2 (v^2 + w^2)$$

Notice that this is the same as if we applied $\alpha$ to $\vec{v}$ and $\vec{w}$ separately and then added the results:

$$\alpha(\vec{v} + \vec{w}) = \alpha(\vec{v}) + \alpha(\vec{w})$$

Additionally, suppose we apply $\alpha$ to a scaled vector $c\vec{v}$. This is given by:

$$\alpha(c\vec{v}) = \alpha_1 (c\,v^1) + \alpha_2 (c\,v^2)$$

Notice that this is the same as if we applied $\alpha$ to $\vec{v}$ and then scaled the result by $c$:

$$\alpha(c\vec{v}) = c\,\alpha(\vec{v})$$

From these examples, we can see that dual vectors are linear maps, meaning that they satisfy the following properties:

$$\alpha(\vec{v} + \vec{w}) = \alpha(\vec{v}) + \alpha(\vec{w}), \qquad \alpha(c\vec{v}) = c\,\alpha(\vec{v})$$

Hence, more formally, we define dual vectors symbolically as:

$$\alpha : V \to \mathbb{F}$$

where $V$ is the vector space and $\mathbb{F}$ is the field that the vector space is defined over. The set of all dual vectors on a vector space $V$ is called the dual space of $V$, denoted by $V^*$.

Index Notation for Dual Vectors

When we apply a dual vector to a vector, we take each component of the dual vector and multiply it by the corresponding component of the vector. For a dual vector $\alpha$, we write its components as $\alpha_1, \alpha_2, \dots$ As such, when we apply $\alpha$ to a vector $\vec{v}$, we get:

$$\alpha(\vec{v}) = \sum_i \alpha_i\,v^i$$
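For instance, a quick numerical sketch of this component-wise pairing (the component values below are arbitrary, and NumPy is only used for the arithmetic):

```python
import numpy as np

# A hypothetical dual vector (row vector) and vector (column vector).
alpha = np.array([[2.0, 1.0]])      # components (alpha_1, alpha_2)
v = np.array([[3.0],
              [4.0]])               # components (v^1, v^2)

# Applying the dual vector is a row-times-column product: sum_i alpha_i v^i.
print((alpha @ v).item())           # 2*3 + 1*4 = 10

# Linearity: alpha(v + w) = alpha(v) + alpha(w) and alpha(c v) = c alpha(v).
w = np.array([[1.0], [2.0]])
print(np.isclose((alpha @ (v + w)).item(), (alpha @ v).item() + (alpha @ w).item()))  # True
print(np.isclose((alpha @ (3.0 * v)).item(), 3.0 * (alpha @ v).item()))               # True
```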

Visualizing Dual Vectors

Suppose we have dual vectors acting on two-dimensional vectors. We can imagine a dual vector as a surface, where the height of the surface at a point is given by the value of the dual vector at that point. For example, a dual vector with components $(\alpha_1, \alpha_2)$ is visualized as a plane with a slope of $\alpha_1$ in the $x$-direction and $\alpha_2$ in the $y$-direction.

However, these surfaces can be difficult to visualize and draw, especially in higher dimensions. We instead prefer to use a series of lines to represent dual vectors, akin to a contour plot. Each line represents a constant value of the dual vector. Then, the value of $\alpha$ on a vector $\vec{v}$ is given by how many lines $\vec{v}$ crosses.

Alternatively, instead of writing the number on each line, we simply draw an arrow pointing in the direction of increasing values of the dual vector.

Dual Vector Arithmetic

Dual vectors can be added and multiplied by scalars in the same way as vectors. This is because the dual space is itself a vector space.

When a dual vector is scaled up, say to $2\alpha$, the contour lines become more densely packed. When a dual vector is added to another, say $\alpha + \beta$, the resulting contour lines combine those of the two dual vectors. These operations follow linearity rules:

$$(\alpha + \beta)(\vec{v}) = \alpha(\vec{v}) + \beta(\vec{v}), \qquad (c\alpha)(\vec{v}) = c\,\alpha(\vec{v})$$

Dual Vector Basis and Components

Just like vectors, dual vectors have a basis. In order to form a basis, the basis dual vectors should be linearly independent and span the entire dual space $V^*$. We define them as follows: if $V$ has a basis $\{\vec{e}_i\}$, then $V^*$ has a basis $\{\epsilon^i\}$ such that:

$$\epsilon^i(\vec{e}_j) = \delta^i_{\ j}$$

where $\delta^i_{\ j}$ is the Kronecker delta. Thus a dual vector $\alpha$ can be written as:

$$\alpha = \sum_i \alpha_i\,\epsilon^i$$

Notice that unlike vector bases, dual vector bases have a superscript index: dual vector bases are contravariant. To see this more concretely, let $T$ be the transformation matrix from an old basis to a new basis, write the components of a dual vector $\alpha$ as a row vector $\boldsymbol{\alpha}$, and write the components of a vector $\vec{v}$ as a column vector $\mathbf{v}$. In the old basis, we have:

$$\alpha(\vec{v}) = \boldsymbol{\alpha}\,\mathbf{v}$$

We know that we can insert the identity matrix between the two without changing the result. As such, we can insert $I = T\,T^{-1}$:

$$\alpha(\vec{v}) = \boldsymbol{\alpha}\,T\,T^{-1}\,\mathbf{v}$$

We already know that vector components are contravariant and transform with the inverse matrix. As such, $T^{-1}\,\mathbf{v} = \tilde{\mathbf{v}}$:

$$\alpha(\vec{v}) = (\boldsymbol{\alpha}\,T)\,\tilde{\mathbf{v}}$$

Now, let's expand $\alpha(\vec{v})$ in the new basis:

$$\alpha(\vec{v}) = \tilde{\boldsymbol{\alpha}}\,\tilde{\mathbf{v}}$$

Comparing the two expressions, we find that $\tilde{\boldsymbol{\alpha}} = \boldsymbol{\alpha}\,T$. Hence, dual vector components are covariant and transform with the transformation matrix. And because the dual vector itself is invariant, the dual vector basis must transform with the inverse matrix.
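As a sanity check of these transformation rules, here is a small numerical sketch (the matrix $T$ and the components below are arbitrary):

```python
import numpy as np

# Hypothetical change-of-basis matrix (columns = new basis in the old basis).
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
T_inv = np.linalg.inv(T)

alpha_old = np.array([2.0, 1.0])    # dual vector components (row) in the old basis
v_old = np.array([3.0, 4.0])        # vector components (column) in the old basis

v_new = T_inv @ v_old               # contravariant: transform with T^-1
alpha_new = alpha_old @ T           # covariant: transform with T

# The scalar alpha(v) is the same in both bases.
print(np.isclose(alpha_old @ v_old, alpha_new @ v_new))  # True
```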

Linear Maps

See Linear Transformations for a more detailed introduction to linear maps.

See Matrix Representation of Operators and Change of Basis for an introduction in the context of quantum mechanics.

A more general concept than dual vectors is that of a linear map. It is defined as a map $L : V \to W$, where $V$ and $W$ are vector spaces. Geometrically, when we apply a linear map to every point on a grid, gridlines remain parallel and evenly spaced, and the origin remains fixed. Algebraically, a linear map satisfies the following linearity properties:

$$L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w}), \qquad L(c\vec{v}) = c\,L(\vec{v})$$

Linear maps can be represented by matrices:

$$L = \begin{pmatrix} L^1_{\ 1} & L^1_{\ 2} \\ L^2_{\ 1} & L^2_{\ 2} \end{pmatrix}$$

where each column represents the image of a basis vector under $L$. Applying $L$ to a vector $\vec{v}$ yields:

$$L(\vec{v}) = \begin{pmatrix} L^1_{\ 1} & L^1_{\ 2} \\ L^2_{\ 1} & L^2_{\ 2} \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \end{pmatrix} = \begin{pmatrix} L^1_{\ 1}\,v^1 + L^1_{\ 2}\,v^2 \\ L^2_{\ 1}\,v^1 + L^2_{\ 2}\,v^2 \end{pmatrix}$$

We can see that the components of the image of $\vec{v}$ are given by the matrix-vector product. In other words, if $\vec{w} = L(\vec{v})$, then $w^i = \sum_j L^i_{\ j}\,v^j$.
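A short numerical sketch of these statements (the matrix entries below are arbitrary):

```python
import numpy as np

# Hypothetical linear map on R^2; its columns are the images of e1 and e2.
L = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(L @ np.array([1.0, 0.0]))   # image of e1 -> first column [1, 3]
print(L @ np.array([0.0, 1.0]))   # image of e2 -> second column [2, 4]

# Applying L to any vector is the matrix-vector product w^i = L^i_j v^j.
v = np.array([5.0, 6.0])
print(L @ v)                      # [1*5 + 2*6, 3*5 + 4*6] = [17, 39]
```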

Linear Maps under Change of Basis

When we change the basis of the vector space, the components of vectors and linear maps change:

$$\tilde{\mathbf{v}} = T^{-1}\,\mathbf{v}, \qquad \tilde{L} = T^{-1}\,L\,T$$

Intuitively, to transform a vector by $L$ under a new basis, we first transform the vector back to the old basis (multiplying by $T$), apply $L$, and then transform the result back to the new basis (multiplying by $T^{-1}$):

$$\tilde{L}\,\tilde{\mathbf{v}} = T^{-1}\,L\,T\,\tilde{\mathbf{v}}$$

To prove this more formally, we will introduce the Einstein summation convention. This states that when an index appears twice in a term, it is implicitly summed over. For example, $\alpha_i v^i$ is equivalent to $\sum_i \alpha_i v^i$. This is useful for simplifying expressions. See Change of Basis for a different proof.

Let $T$ be the transformation matrix from the old basis to the new basis. Applying the linear map to a new basis vector, and expanding the result first in the new basis and then in the old basis, gives:

$$L(\tilde{\vec{e}}_j) = \tilde{L}^i_{\ j}\,\tilde{\vec{e}}_i = \tilde{L}^i_{\ j}\,T^k_{\ i}\,\vec{e}_k$$

with an implicit sum over $i$ and $k$. We also know that, by definition, $\tilde{\vec{e}}_j = T^k_{\ j}\,\vec{e}_k$, so by linearity:

$$L(\tilde{\vec{e}}_j) = T^k_{\ j}\,L(\vec{e}_k) = T^k_{\ j}\,L^l_{\ k}\,\vec{e}_l$$

On the left-hand side, the basis vectors carry a sum over $k$, and on the right-hand side, a sum over $l$. Since these are dummy (summed-over) indices, we can rename $k$ to $l$ on the left-hand side and compare the coefficients of $\vec{e}_l$:

$$\tilde{L}^i_{\ j}\,T^l_{\ i}\,\vec{e}_l = T^k_{\ j}\,L^l_{\ k}\,\vec{e}_l \quad\Longrightarrow\quad T^l_{\ i}\,\tilde{L}^i_{\ j} = L^l_{\ k}\,T^k_{\ j}$$

Thus, multiplying both sides by the inverse transformation:

$$\tilde{L}^i_{\ j} = (T^{-1})^i_{\ l}\,L^l_{\ k}\,T^k_{\ j}, \qquad \tilde{L} = T^{-1}\,L\,T$$
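The result $\tilde{L} = T^{-1}\,L\,T$ can also be checked numerically; here is a minimal sketch with arbitrary $L$ and $T$:

```python
import numpy as np

# Hypothetical linear map and change-of-basis matrix.
L = np.array([[1.0, 2.0],
              [3.0, 4.0]])
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
T_inv = np.linalg.inv(T)

L_new = T_inv @ L @ T               # components of the same map in the new basis

# Applying the map in the old basis and converting the result to new
# components agrees with applying the transformed map to new components.
v_old = np.array([3.0, 4.0])
v_new = T_inv @ v_old
print(np.allclose(T_inv @ (L @ v_old), L_new @ v_new))  # True
```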

Indices vs Matrices

Suppose we have three matrices, $A$, $B$, and $C$, and we want to multiply them. Typically, we simply write $D = ABC$.

However, we can also write this in index notation as $D^i_{\ l} = A^i_{\ j}\,B^j_{\ k}\,C^k_{\ l}$. Notice that the indices $j$ and $k$ are repeated, and hence we sum over them. The question is: is there a way to quickly translate between the index expression and the matrix product? It turns out there is a heuristic way to do this. Looking back at the indices, we see that the repeated indices connect diagonally:

$$D^i_{\ l} = A^i_{\ j}\,B^j_{\ k}\,C^k_{\ l}$$

They "cancel out" and we are left with the indices that are not repeated— and . This also helps us make sure that we are multiplying the matrices in the correct order. For instance, in the proof for the transformation of linear maps, we see that the indices are connected diagonally:

This is how we know that the full matrix expression is $\tilde{L} = T^{-1}\,L\,T$ and not any other order.
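The same index bookkeeping is what `numpy.einsum` performs; a small sketch comparing it with the ordinary matrix product (the matrices below are random placeholders):

```python
import numpy as np

# Three arbitrary matrices; the repeated indices j and k are summed over,
# which is exactly the matrix product (A B C)^i_l.
A, B, C = np.random.rand(2, 2), np.random.rand(2, 2), np.random.rand(2, 2)

via_matmul = A @ B @ C
via_einsum = np.einsum('ij,jk,kl->il', A, B, C)
print(np.allclose(via_matmul, via_einsum))  # True
```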

Metric Tensor and Bilinear Forms

We will now introduce the concept of a metric tensor. Suppose we want to find the dot product of two vectors $\vec{v}$ and $\vec{w}$. We can write this as:

$$\vec{v}\cdot\vec{w} = \left(\sum_i v^i\,\vec{e}_i\right)\cdot\left(\sum_j w^j\,\vec{e}_j\right)$$

Because the dot product is linear, we can expand this out:

$$\vec{v}\cdot\vec{w} = \sum_i\sum_j v^i\,w^j\,(\vec{e}_i\cdot\vec{e}_j)$$

We are left with a sum of terms of the form $v^i\,w^j\,(\vec{e}_i\cdot\vec{e}_j)$. In Einstein summation notation this is simply:

$$\vec{v}\cdot\vec{w} = v^i\,w^j\,(\vec{e}_i\cdot\vec{e}_j)$$

The dot products of the basis vectors define the metric tensor: $g_{ij} = \vec{e}_i\cdot\vec{e}_j$, so that $\vec{v}\cdot\vec{w} = g_{ij}\,v^i\,w^j$. Notice that we can immediately see that the metric tensor must have two covariant indices: the dot product must be invariant under a change of basis, so the two contravariant factors $v^i$ and $w^j$ must be compensated by two covariant indices. We say that the metric tensor is twice-covariant.

While $g_{ij}$ is the $(i,j)$-component of the metric tensor, the tensor itself is denoted by $g$ and is defined as:

$$g(\vec{v}, \vec{w}) = \vec{v}\cdot\vec{w}$$

Because the dot product is linear in each argument, the metric tensor is also linear in each argument. Furthermore, it takes two inputs, so it is more formally written as $g : V \times V \to \mathbb{R}$. This is known as a bilinear form:

  • Bilinear: Linear in both arguments.
  • Form: A function that takes vectors and outputs a scalar. For example, dual vectors are 1-forms.
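As a small numerical sketch of the metric tensor (a hypothetical non-orthonormal basis in $\mathbb{R}^2$, with NumPy doing the arithmetic):

```python
import numpy as np

# Hypothetical non-orthonormal basis, written in standard Cartesian coordinates.
e1 = np.array([2.0, 0.0])
e2 = np.array([1.0, 1.0])
basis = np.column_stack([e1, e2])     # columns are the basis vectors

# Metric tensor components g_ij = e_i . e_j (twice-covariant).
g = basis.T @ basis

# Dot product from components in this basis: v . w = g_ij v^i w^j.
v = np.array([3.0, 4.0])              # components of v in the (e1, e2) basis
w = np.array([1.0, 2.0])              # components of w in the (e1, e2) basis
via_metric = np.einsum('ij,i,j->', g, v, w)
via_cartesian = (basis @ v) @ (basis @ w)   # same dot product in Cartesian form
print(np.isclose(via_metric, via_cartesian))  # True
```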

Tensor Products

If you have read the quantum mechanics section, you will know that an operator can be written as a sum of projection operators. The fundamental idea is that the outer product of a vector and dual vector gives a linear map:

To see why this is the case, matrix multiplication can be carried out as follows:

  • Move the second matrix up and to the right, and draw an empty matrix where it used to be.
  • Each element in the empty matrix is the sum of the products of the corresponding elements in the two matrices (the row it lines up with in the first matrix and the column it lines up with in the second).

Here's a concrete example:

To multiply them, we can shift the second matrix up and to the right:

Then, the elements in the product matrix are given by the sum of the products of the corresponding elements. For example, the element in the first row and first column is:

For an outer product, it looks like this:

Since the outer product scales the same vector by each of the scalar components of the dual vector, every column of the result is a scaled copy of that vector. As such, the matrix has linearly dependent columns and a determinant of zero. These matrices are called pure matrices.

In order to write any linear map in a similar way, we can combine pure matrices together. In a sense, the pure matrices act as a "basis" for all linear maps. In two dimensions, we can take the following four pure matrices as the basis:

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

Then, any matrix can be written as a linear combination of these four matrices:

$$\begin{pmatrix} L^1_{\ 1} & L^1_{\ 2} \\ L^2_{\ 1} & L^2_{\ 2} \end{pmatrix} = L^1_{\ 1}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + L^1_{\ 2}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + L^2_{\ 1}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} + L^2_{\ 2}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

And hence, we can write any linear map as a combination of outer products of basis vectors with dual basis vectors:

$$L = \sum_{i,j} L^i_{\ j}\;\vec{e}_i\,\epsilon^j$$

Formally, each basis pure matrix is the tensor product of a basis vector and a dual basis vector, written as $\vec{e}_i \otimes \epsilon^j$. As such:

$$L = L^i_{\ j}\;\vec{e}_i \otimes \epsilon^j$$

In fact, a tensor can be defined as a set of vectors and dual vectors that are combined using tensor products. Defining a tensor in this way makes it trivial to derive the transformation rules for tensors under a change of basis. For example, for a linear map as above, the components transform as:

$$\tilde{L}^i_{\ j} = (T^{-1})^i_{\ l}\,L^l_{\ k}\,T^k_{\ j}$$

Thus it is clear that the linear map picks up one factor of $T^{-1}$ (from its vector part) and one factor of $T$ (from its dual vector part); in matrix form, it transforms as $\tilde{L} = T^{-1}\,L\,T$.
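A short numerical sketch of pure matrices and of rebuilding a linear map from the basis $\vec{e}_i \otimes \epsilon^j$ (the particular vectors and matrix below are arbitrary):

```python
import numpy as np

# The outer product of a vector and a dual vector is a "pure" (rank-1) matrix.
v = np.array([1.0, 2.0])
alpha = np.array([3.0, 4.0])
pure = np.outer(v, alpha)
print(np.linalg.det(pure))             # 0.0: the columns are linearly dependent
print(np.linalg.matrix_rank(pure))     # 1

# The four pure matrices e_i (x) eps^j form a basis for all 2x2 linear maps.
e = np.eye(2)                          # standard basis vectors / dual basis
L = np.array([[1.0, 2.0],
              [3.0, 4.0]])
rebuilt = sum(L[i, j] * np.outer(e[:, i], e[j, :])
              for i in range(2) for j in range(2))
print(np.allclose(rebuilt, L))         # True: L = L^i_j e_i (x) eps^j
```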

To take another example of a tensor product, consider two qubits (or two spin-1/2 particles). The state of a single qubit is given by a vector in a two-dimensional complex vector space, $\mathbb{C}^2$. The vector is written as a superposition of the two basis states $|0\rangle$ and $|1\rangle$:

$$|\psi\rangle = a\,|0\rangle + b\,|1\rangle$$

Now suppose we have a composite system of these qubits. There are now four basis states, each corresponding to a different combination of the two qubits:

  • $|00\rangle$: both qubits are in the state $|0\rangle$.
  • $|01\rangle$: the first qubit is in the state $|0\rangle$ and the second qubit is in the state $|1\rangle$.
  • $|10\rangle$: the first qubit is in the state $|1\rangle$ and the second qubit is in the state $|0\rangle$.
  • $|11\rangle$: both qubits are in the state $|1\rangle$.

These are actually the tensor products of the two single-qubit states:

$$|00\rangle = |0\rangle\otimes|0\rangle, \quad |01\rangle = |0\rangle\otimes|1\rangle, \quad |10\rangle = |1\rangle\otimes|0\rangle, \quad |11\rangle = |1\rangle\otimes|1\rangle$$

And the state of the composite system is given by a linear combination of these basis states:

$$|\Psi\rangle = c_{00}\,|00\rangle + c_{01}\,|01\rangle + c_{10}\,|10\rangle + c_{11}\,|11\rangle$$
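Numerically, these composite states can be built with the Kronecker product of the component arrays; a minimal sketch (the superposition chosen at the end is just an example):

```python
import numpy as np

# Single-qubit basis states |0> and |1> as vectors in C^2.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# Two-qubit basis states are tensor (Kronecker) products of single-qubit states.
ket00 = np.kron(ket0, ket0)
ket01 = np.kron(ket0, ket1)
ket10 = np.kron(ket1, ket0)
ket11 = np.kron(ket1, ket1)
print(ket00, ket01, ket10, ket11)   # the four standard basis vectors of C^4

# Example product state: each qubit in the superposition (|0> + |1>)/sqrt(2).
plus = (ket0 + ket1) / np.sqrt(2)
print(np.kron(plus, plus))          # amplitude 1/2 on each of |00>, |01>, |10>, |11>
```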

Kronecker Product

Tensor products and Kronecker products are similar but not the same. Tensor products act on tensors, while Kronecker products act on arrays. However, they are essentially identical in our context, so we will use the terms interchangeably.

To take an example, a bilinear form is a tensor product of two dual vectors:

$$B = \alpha \otimes \beta, \qquad B(\vec{v}, \vec{w}) = \alpha(\vec{v})\,\beta(\vec{w})$$

To apply the tensor product, we simply give a copy of the left object for each element in the right object:
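As a small numerical sketch of a bilinear form built from two dual vectors (the component values are arbitrary; `np.outer` gives the component matrix $B_{ij} = \alpha_i\,\beta_j$):

```python
import numpy as np

# Two hypothetical dual vectors, given by their components.
alpha = np.array([1.0, 2.0])
beta = np.array([3.0, 4.0])

# Components of the tensor product: B_ij = alpha_i * beta_j.
B = np.outer(alpha, beta)

# Feeding B two vectors returns a scalar, and it factors as alpha(v) * beta(w).
v = np.array([1.0, 1.0])
w = np.array([2.0, 0.0])
print(np.isclose(v @ B @ w, (alpha @ v) * (beta @ w)))  # True
```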