A lot of linear algebra is concerned with operations on vectors and matrices, and there are many different types of matrices.
There are a few types of matrices that you may encounter again and again when getting started in linear algebra, particularly the parts of linear algebra relevant to machine learning.
In this tutorial, you will discover a suite of different types of matrices from the field of linear algebra that you may encounter in machine learning.
After completing this tutorial, you will know:
- Square, symmetric, triangular, and diagonal matrices that are much as their names suggest.
- Identity matrices that are all zero values except along the main diagonal where the values are 1.
- Orthogonal matrices that generalize the idea of perpendicular vectors and have useful computational properties.
Tutorial Overview
This tutorial is divided into 6 parts to cover the main types of matrices; they are:
- Square Matrix
- Symmetric Matrix
- Triangular Matrix
- Diagonal Matrix
- Identity Matrix
- Orthogonal Matrix
Square Matrix
A square matrix is a matrix where the number of rows (n) equals the number of columns (m).
The square matrix is contrasted with the rectangular matrix, where the number of rows and the number of columns are not equal.
Given that the number of rows and columns match, the dimensions are usually denoted as n, e.g. n×n. The size of the matrix is called the order, so an order 4 square matrix is 4×4.
The vector of values along the diagonal of the matrix from the top left to the bottom right is called the main diagonal.
Below is an example of an order 3 square matrix.
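As a minimal sketch, an order 3 square matrix can be defined in NumPy as follows. The values are arbitrary and chosen only for illustration; the `diagonal()` method is used here to return the main diagonal as a vector.

```python
# define an order 3 square matrix (illustrative values)
from numpy import array

M = array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]])
print(M)
# the number of rows equals the number of columns
print(M.shape)
# main diagonal, from the top left to the bottom right
print(M.diagonal())
```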
Square matrices are readily added and multiplied together and are the basis of many simple linear transformations, such as rotations (as in the rotations of images).
Symmetric Matrix
A symmetric matrix is a type of square matrix where the top-right triangle is a mirror image of the bottom-left triangle.
It is no exaggeration to say that symmetric matrices S are the most important matrices the world will ever see – in the theory of linear algebra and also in the applications.
— Page 338, Introduction to Linear Algebra, Fifth Edition, 2016.
To be symmetric, the axis of symmetry is always the main diagonal of the matrix, from the top left to the bottom right.
Below is an example of a 5×5 symmetric matrix.
A symmetric matrix is always square and equal to its own transpose.
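The transpose property suggests a simple check. The sketch below defines a 5×5 matrix with illustrative values that mirror across the main diagonal and compares it with its own transpose using NumPy's `array_equal()`.

```python
# check that a matrix is symmetric (illustrative values)
from numpy import array
from numpy import array_equal

M = array([
    [1, 2, 3, 4, 5],
    [2, 1, 2, 3, 4],
    [3, 2, 1, 2, 3],
    [4, 3, 2, 1, 2],
    [5, 4, 3, 2, 1]])
print(M)
# a symmetric matrix is equal to its own transpose
print(array_equal(M, M.T))
```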
Triangular Matrix
A triangular matrix is a type of square matrix that has all of its non-zero values in the upper-right or lower-left of the matrix, with the remaining elements filled with zero values.
A triangular matrix with non-zero values only on and above the main diagonal is called an upper triangular matrix, whereas a triangular matrix with non-zero values only on and below the main diagonal is called a lower triangular matrix.
Below is an example of a 3×3 upper triangular matrix.
Below is an example of a 3×3 lower triangular matrix.
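For illustration, with arbitrarily chosen values, a 3×3 upper triangular matrix and a 3×3 lower triangular matrix could be written as follows. The names U and L are labels assumed here, not notation taken from the tutorial.

```latex
U = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{pmatrix},
\qquad
L = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ 1 & 2 & 3 \end{pmatrix}
```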
The LU decomposition resolves a given matrix into upper and lower triangular matrices.
NumPy provides functions to calculate a triangular matrix from an existing square matrix: the tril() function calculates the lower triangular matrix from a given matrix, and the triu() function calculates the upper triangular matrix from a given matrix.
The example below defines a 3×3 square matrix and calculates the lower and upper triangular matrix from it.
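A minimal sketch of such an example is given here, using illustrative values for the matrix and the NumPy functions `tril()` and `triu()`.

```python
# triangular matrices
from numpy import array
from numpy import tril
from numpy import triu

# define a 3x3 square matrix (illustrative values)
M = array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3]])
print(M)
# lower triangular matrix: values above the main diagonal become zero
lower = tril(M)
print(lower)
# upper triangular matrix: values below the main diagonal become zero
upper = triu(M)
print(upper)
```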
Running the example prints the defined matrix followed by the lower and upper triangular matrices.
Diagonal Matrix
A diagonal matrix is one where values outside of the main diagonal have a zero value, where the main diagonal is taken from the top left of the matrix to the bottom right.
A diagonal matrix is often denoted with the variable D and may be represented as a full matrix or as a vector of values on the main diagonal.
Diagonal matrices consist mostly of zeros and have non-zero entries only along the main diagonal.
— Page 40, Deep Learning, 2016.
Below is an example of a 3×3 square diagonal matrix.
As a vector, it would be represented as:
Or, with the specified scalar values:
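As an illustration with arbitrarily chosen values, the three forms described above (the full matrix D, the vector of its diagonal entries, and that vector with scalar values filled in) might be written as follows. The lowercase name d is a label assumed here.

```latex
D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix},
\qquad
d = (d_{11}, d_{22}, d_{33}),
\qquad
d = (1, 2, 3)
```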
A diagonal matrix does not have to be square. In the case of a rectangular matrix, the diagonal would cover the shortest dimension; for example:
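A rectangular case can be sketched in NumPy as follows. The 5×4 shape and the diagonal values 1 to 4 are assumptions made for illustration; the main diagonal has as many entries as the shorter dimension, so the last row is left as zeros.

```python
# rectangular diagonal matrix (illustrative shape and values)
from numpy import zeros

# start with a 5x4 matrix of zeros
D = zeros((5, 4))
# fill the main diagonal, which has min(5, 4) = 4 entries
for i, value in enumerate([1, 2, 3, 4]):
    D[i, i] = value
print(D)
```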
NumPy provides the function diag() that can extract the main diagonal of an existing matrix as a vector, or transform a vector into a diagonal matrix.
The example below defines a 3×3 square matrix, extracts the main diagonal as a vector, and then creates a diagonal matrix from the extracted vector.
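A minimal sketch of such an example follows, using illustrative values for the matrix. Calling `diag()` on a 2D array returns its main diagonal, and calling it on a 1D array builds a square diagonal matrix.

```python
# diagonal matrix
from numpy import array
from numpy import diag

# define a 3x3 square matrix (illustrative values)
M = array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3]])
print(M)
# extract the main diagonal as a vector
d = diag(M)
print(d)
# create a diagonal matrix from the vector
D = diag(d)
print(D)
```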
Running the example first prints the defined matrix, followed by the vector of the main diagonal and the diagonal matrix constructed from the vector.
Identity Matrix
An identity matrix is a square matrix that does not change a vector when the vector is multiplied by it.
The values of an identity matrix are known. All of the scalar values along the main diagonal (top-left to bottom-right) have the value one, while all other values are zero.
An identity matrix is a matrix that does not change any vector when we multiply that vector by that matrix.
— Page 36, Deep Learning, 2016.
An identity matrix is often represented using the notation “I” or with the dimensionality “In”, where n is a subscript that indicates the dimensionality of the square identity matrix. In some notations, the identity may be referred to as the unit matrix, or “U”, to honor the one value it contains (this is different from a Unitary matrix).
For example, an identity matrix with the size 3 or I3 would be as follows:
1, 0, 0
0, 1, 0
0, 0, 1
In NumPy, an identity matrix can be created with a specific size using the identity() function.
The example below creates an I3 identity matrix.
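A minimal sketch of this example, using the `identity()` function named above:

```python
# identity matrix
from numpy import identity

# create a 3x3 identity matrix
I = identity(3)
print(I)
```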
Running the example prints the created identity matrix.
Alone, the identity matrix is not that interesting, although it is a component in other important matrix operations, such as matrix inversion.
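Both roles can be checked with a short sketch. The vector v and the matrix M below are arbitrary illustrative values; M was chosen to be invertible so that `inv()` from `numpy.linalg` can be applied, and `allclose()` is used because the inverse is computed in floating point.

```python
# identity matrix properties (illustrative values)
from numpy import allclose
from numpy import array
from numpy import identity
from numpy.linalg import inv

I = identity(3)
v = array([2.0, -1.0, 5.0])
# multiplying a vector by the identity matrix leaves it unchanged
print(I.dot(v))
# an invertible matrix multiplied by its inverse gives the identity matrix
M = array([
    [2.0, 0.0, 1.0],
    [1.0, 3.0, 0.0],
    [0.0, 1.0, 4.0]])
print(allclose(M.dot(inv(M)), I))
```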
Orthogonal Matrix
Two vectors are orthogonal when their dot product equals zero; if each vector also has a length of 1, they are called orthonormal.
v · w = 0
or
v^T · w = 0
This is intuitive when we consider that one line is orthogonal with another if it is perpendicular to it.
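A short sketch makes this concrete. The vectors v and w are illustrative values chosen to be perpendicular; dividing each by its length, computed with `norm()` from `numpy.linalg`, gives a pair of unit-length vectors that are still perpendicular.

```python
# orthogonal vectors (illustrative values)
from numpy import array
from numpy.linalg import norm

v = array([1, 2])
w = array([-2, 1])
# the dot product of perpendicular vectors is zero
print(v.dot(w))
# scale each vector to a length of 1
v_unit = v / norm(v)
w_unit = w / norm(w)
print(norm(v_unit), norm(w_unit))
```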
An orthogonal matrix is a type of square matrix whose columns and rows are orthonormal unit vectors, i.e. they are mutually perpendicular and each has a length or magnitude of 1.
An orthogonal matrix is a square matrix whose rows are mutually orthonormal and whose columns are mutually orthonormal
— Page 41, Deep Learning, 2016.
An Orthogonal matrix is often denoted as uppercase “Q”.
Multiplication by an orthogonal matrix preserves lengths.
— Page 277, No Bullshit Guide To Linear Algebra, 2017
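The length-preserving property can be checked numerically. The sketch below uses a 2×2 rotation matrix, a standard example of an orthogonal matrix; the 30 degree angle and the vector v are assumptions chosen for illustration.

```python
# multiplication by an orthogonal matrix preserves length (illustrative values)
from math import cos, sin, pi
from numpy import array
from numpy.linalg import norm

# rotation by 30 degrees
theta = pi / 6
Q = array([
    [cos(theta), -sin(theta)],
    [sin(theta), cos(theta)]])
v = array([3.0, 4.0])
# the two lengths should agree up to floating-point rounding
print(norm(v))
print(norm(Q.dot(v)))
```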
The orthogonal matrix is defined formally as follows:
Q^T · Q = Q · Q^T = I
Where Q is the orthogonal matrix, Q^T indicates the transpose of Q, and I is the identity matrix.
A matrix is orthogonal if its transpose is equal to its inverse.
Q^T = Q^-1
Another equivalence for an orthogonal matrix is that the dot product of the matrix and its transpose equals the identity matrix.
Q · Q^T = I
Orthogonal matrices are used a lot for linear transformations, such as reflections and permutations.
A simple 2×2 orthogonal matrix is listed below, which is an example of a reflection matrix or coordinate reflection.
The example below creates this orthogonal matrix and checks the above equivalences.
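A minimal sketch of such an example follows. The matrix used is a reflection across the x-axis, assumed here as a simple choice of 2×2 reflection matrix, and the inverse is calculated with `inv()` from `numpy.linalg`.

```python
# orthogonal matrix
from numpy import array
from numpy.linalg import inv

# define a 2x2 reflection matrix (flips the sign of the second coordinate)
Q = array([
    [1, 0],
    [0, -1]])
print(Q)
# inverse equivalence: the inverse should match the transpose
V = inv(Q)
print(V)
print(Q.T)
# identity equivalence: Q multiplied by its transpose gives the identity
I = Q.dot(Q.T)
print(I)
```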
Running the example first prints the orthogonal matrix. The inverse and the transpose of the orthogonal matrix are printed next and are shown to be equivalent. Finally, the identity matrix is printed, calculated from the dot product of the orthogonal matrix with its transpose.
Orthogonal matrices are useful tools because their inverse is simply their transpose, which is computationally cheap and numerically stable to calculate.
Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.
- Modify each example using your own small contrived data.
- Write your own functions to implement each operation.
- Research one example where each operation was used in machine learning.
If you explore any of these extensions, I’d love to know.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Books
- Section 6.2 Special types of matrices. No Bullshit Guide To Linear Algebra, 2017.
- Introduction to Linear Algebra, Fifth Edition, 2016.
- Section 2.3 Identity and Inverse Matrices, Deep Learning, 2016.
- Section 2.6 Special Kinds of Matrices and Vectors, Deep Learning, 2016.
API
- numpy.tril() API
- numpy.triu() API
- numpy.diag() API
- numpy.identity() API
Articles
- Square matrix on Wikipedia
- Main diagonal on Wikipedia
- Symmetric matrix on Wikipedia
- Triangular Matrix on Wikipedia
- Diagonal matrix on Wikipedia
- Identity matrix on Wikipedia
- Orthogonal matrix on Wikipedia
Summary
In this tutorial, you discovered a suite of different types of matrices from the field of linear algebra that you may encounter in machine learning.
Specifically, you learned:
- Square, symmetric, triangular, and diagonal matrices that are much as their names suggest.
- Identity matrices that are all zero values except along the main diagonal where the values are 1.
- Orthogonal matrices that generalize the idea of perpendicular vectors and have useful computational properties.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.