Matrix theory



   

Matrix (mathematics)

Specific entries of a matrix are often referenced by using pairs of subscripts.

In mathematics, a matrix (plural matrices, or less commonly matrixes) is a rectangular array of numbers, such as


\begin{bmatrix}
1 & 2 & 3 \\
6 & 5 & 4
\end{bmatrix}.

Entries of a matrix are often denoted by a variable with two subscripts, as shown on the right. Matrices of the same size can be added and subtracted entrywise and matrices of compatible size can be multiplied. These operations have many of the properties of ordinary arithmetic, except that matrix multiplication is not commutative, that is, AB and BA are not equal in general. Matrices consisting of only one column or row are called vectors, while higher-dimensional, e.g. three-dimensional, arrays of numbers are called tensors. Matrices with entries in other fields or rings are also studied.

Matrices are a key tool in linear algebra. One use of matrices is to represent linear transformations, which are higher-dimensional analogs of linear functions of the form f(x) = cx, where c is a constant; matrix multiplication corresponds to composition of linear transformations. Matrices can also keep track of the coefficients in a system of linear equations. For a square matrix, the determinant and inverse matrix (when it exists) govern the behavior of solutions to the corresponding system of linear equations, and eigenvalues and eigenvectors provide insight into the geometry of the associated linear transformation.

Matrices find many applications. Physics makes use of matrices in various domains, for example in geometrical optics and matrix mechanics; the latter led to studying in more detail matrices with an infinite number of rows and columns. Graph theory uses matrices to keep track of distances between pairs of vertices in a graph. Computer graphics uses matrices to project 3-dimensional space onto a 2-dimensional screen. Matrix calculus generalizes classical analytical notions such as derivatives of functions or exponentials to matrices. The latter is a recurring need in solving ordinary differential equations. Serialism and dodecaphonism are musical movements of the 20th century that utilize a square mathematical matrix to determine the pattern of music intervals.

A major branch of numerical analysis is devoted to the development of efficient algorithms for matrix computations, a subject that is centuries old but still an active area of research. Matrix decomposition methods simplify computations, both theoretically and practically. For sparse matrices, specifically tailored algorithms can provide speedups; such matrices arise in the finite element method, for example.


Definition

A matrix is a rectangular arrangement of numbers.[1] For example,

\mathbf{A} = \begin{bmatrix}

9 & 8 & 6 \\
1 & 2 & 7 \\
4 & 9 & 2 \\
6 & 0 & 5 \end{bmatrix}.

An alternative notation uses large parentheses instead of box brackets:

\mathbf{A} = \begin{pmatrix}
9 & 8 & 6 \\
1 & 2 & 7 \\
4 & 9 & 2 \\
6 & 0 & 5 \end{pmatrix}.

The horizontal and vertical lines in a matrix are called rows and columns, respectively. The numbers in the matrix are called its entries or its elements. To specify a matrix's size, a matrix with m rows and n columns is called an m-by-n matrix or m × n matrix, while m and n are called its dimensions. The above is a 4-by-3 matrix.

A matrix where one of the dimensions equals one is also called a vector, and may be interpreted as an element of real coordinate space. An m × 1 matrix (one column and m rows) is called a column vector and a 1 × n matrix (one row and n columns) is called a row vector. For example, the second row vector of the above matrix A is

\begin{bmatrix}
1 & 2 & 7 \\ \end{bmatrix}.

Most of this article focuses on real and complex matrices, i.e., matrices whose entries are real or complex numbers. More general types of entries are discussed below.

Notation

The entry that lies in the i-th row and the j-th column of a matrix is typically referred to as the i,j, (i,j), or (i,j)th entry of the matrix. For example, the (2,3) entry of the above matrix A is 7. Matrices are usually denoted using upper-case letters, while the corresponding lower-case letters, with two subscript indices, represent the entries. For example, the (i, j)th entry of a matrix A is most commonly written as a_{i,j}. Alternative notations for that entry are A[i,j] or A_{i,j}. In addition to using upper-case letters to symbolize matrices, many authors use a special typographical style, commonly boldface upright (non-italic), to further distinguish matrices from other variables. An asterisk is commonly used to refer to all of the rows or columns in a matrix. For example, a_{i,∗} refers to the ith row of A, and a_{∗,j} refers to the jth column of A. The set of all m-by-n matrices is denoted M(m, n).

A common shorthand is

A = [a_{i,j}]_{i=1,...,m; j=1,...,n} or more briefly A = [a_{i,j}]_{m×n}

to define an m × n matrix A. Usually the entries a_{i,j} are defined separately for all integers 1 ≤ i ≤ m and 1 ≤ j ≤ n. They can however sometimes be given by one formula; for example the 3-by-4 matrix

\mathbf A = \begin{bmatrix}
0 & -1 & -2 & -3\\
1 & 0 & -1 & -2\\
2 & 1 & 0 & -1\\
\end{bmatrix}

can alternatively be specified by A = [i − j]_{i=1,2,3; j=1,...,4}.

Some programming languages start the numbering of rows and columns at zero, in which case the entries of an m-by-n matrix are indexed by 0 ≤ i ≤ m − 1 and 0 ≤ j ≤ n − 1.[2] This article follows the more common convention in mathematical writing where enumeration starts from 1.
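The two indexing conventions can be compared in a short sketch. It builds the 3-by-4 matrix from the formula a_{i,j} = i − j and reads an entry with 1-based indices; the helper name entry is an assumption made for illustration, not notation from the text.

```python
# Build the 3-by-4 matrix with entries a_{i,j} = i - j, using 1-based i and j.
m, n = 3, 4
A = [[i - j for j in range(1, n + 1)] for i in range(1, m + 1)]

def entry(M, i, j):
    """Return the (i, j) entry of M in the 1-based convention of the text.

    Python lists are 0-based, so the indices are shifted by one.
    (Hypothetical helper, introduced only for this illustration.)
    """
    return M[i - 1][j - 1]

first_row = A[0]          # the row the text calls row 1
a_23 = entry(A, 2, 3)     # the (2,3) entry, equal to 2 - 3
```

Keeping the shift inside one helper avoids scattering "− 1" corrections through code that follows a 1-based mathematical description.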

Basic operations

Matrices can be modified or combined by several basic operations: matrix addition, scalar multiplication and transposition.[3] These form the basic techniques for dealing with matrices.

Operation: Addition
Definition: The sum A + B of two m-by-n matrices A and B is calculated entrywise:
(A + B)_{i,j} = A_{i,j} + B_{i,j}, where 1 ≤ i ≤ m and 1 ≤ j ≤ n.
Example:

\begin{bmatrix}
1 & 3 & 1 \\
1 & 0 & 0
\end{bmatrix}
+
\begin{bmatrix}
0 & 0 & 5  \\
7 & 5 & 0
\end{bmatrix}
=
\begin{bmatrix}
1+0 & 3+0 & 1+5 \\
1+7 & 0+5 & 0+0
\end{bmatrix}
=
\begin{bmatrix}
1 & 3 & 6 \\
8 & 5 & 0
\end{bmatrix}

Operation: Scalar multiplication
Definition: The scalar multiplication cA of a matrix A and a number c (also called a scalar in the parlance of abstract algebra) is given by multiplying every entry of A by c:
(cA)_{i,j} = c · A_{i,j}.
Example:

2 \cdot
\begin{bmatrix}
1 & 8 & -3 \\
4 & -2 & 5
\end{bmatrix}
=
\begin{bmatrix}
2 \cdot 1 & 2 \cdot 8 & 2 \cdot (-3) \\
2 \cdot 4 & 2 \cdot (-2) & 2 \cdot 5
\end{bmatrix}
=
\begin{bmatrix}
2 & 16 & -6 \\
8 & -4 & 10
\end{bmatrix}

Operation: Transpose
Definition: The transpose of an m-by-n matrix A is the n-by-m matrix A^T (also denoted A^tr or ^tA) formed by turning rows into columns and vice versa:
(A^T)_{i,j} = A_{j,i}.
Example:

\begin{bmatrix}
1 & 2 & 3 \\
0 & -6 & 0
\end{bmatrix}^T =
\begin{bmatrix}
1 & 0 \\
2 & -6 \\
3 & 0
\end{bmatrix}

Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, i.e. the matrix sum does not depend on the order of the summands: A + B = B + A.[4] The transpose is compatible with addition and scalar multiplication, as expressed by (cA)^T = c(A^T) and (A + B)^T = A^T + B^T. Finally, (A^T)^T = A.
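The three basic operations can be sketched in a few lines, with a matrix stored as a list of rows. The function names add, scalar_mul and transpose are assumptions chosen for this illustration; the sample matrices are the ones used in the examples above.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

def transpose(A):
    """Turn rows into columns: (A^T)[i][j] = A[j][i]."""
    return [list(col) for col in zip(*A)]

total = add([[1, 3, 1], [1, 0, 0]], [[0, 0, 5], [7, 5, 0]])
doubled = scalar_mul(2, [[1, 8, -3], [4, -2, 5]])
flipped = transpose([[1, 2, 3], [0, -6, 0]])
```

The properties quoted in the text, such as (A^T)^T = A, can be checked directly by calling transpose twice.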

Row operations are ways to change matrices. There are three types of row operations: row switching, that is, interchanging two rows of a matrix; row multiplication, that is, multiplying all entries of a row by a non-zero constant; and row addition, that is, adding a multiple of one row to another row. These row operations are used in a number of ways, including solving linear equations and finding inverses.

Matrix multiplication, linear equations and linear transformations
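A minimal sketch of the three row operations follows. Rows are numbered from 0, as is usual in Python, and every function returns a new matrix instead of changing its argument; the function names are assumptions made for this illustration.

```python
def swap_rows(A, r, s):
    """Row switching: interchange rows r and s."""
    B = [row[:] for row in A]
    B[r], B[s] = B[s], B[r]
    return B

def scale_row(A, r, c):
    """Row multiplication: multiply row r by a non-zero constant c."""
    if c == 0:
        raise ValueError("the constant must be non-zero")
    B = [row[:] for row in A]
    B[r] = [c * x for x in B[r]]
    return B

def add_multiple(A, src, dst, c):
    """Row addition: add c times row src to row dst."""
    B = [row[:] for row in A]
    B[dst] = [y + c * x for x, y in zip(B[src], B[dst])]
    return B

A = [[1, 2], [3, 4]]
swapped = swap_rows(A, 0, 1)
scaled = scale_row(A, 0, 2)
eliminated = add_multiple(A, 0, 1, -3)   # clears the entry below the first pivot
```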

Schematic depiction of the matrix product AB of two matrices A and B.

Multiplication of two matrices is defined only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix product AB is the m-by-p matrix whose entries are given by

 [\mathbf{AB}]_{i,j} = A_{i,1}B_{1,j} + A_{i,2}B_{2,j} + ... + A_{i,n}B_{n,j} = \sum_{r=1}^n A_{i,r}B_{r,j},

where 1 ≤ i ≤ m and 1 ≤ j ≤ p.[5] For example (the underlined entry 1 in the product is calculated as the product 1 · 1 + 0 · 1 + 2 · 0 = 1):


\begin{align}
\begin{bmatrix}
\underline{1} & \underline 0 & \underline 2 \\
-1 & 3 & 1 \\
\end{bmatrix}
\times
\begin{bmatrix}
3 & \underline 1 \\
2 & \underline 1 \\
1 & \underline 0 \\
\end{bmatrix}
&=
\begin{bmatrix}
5 & \underline 1 \\
4 & 2 \\
\end{bmatrix}.
\end{align}
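The defining sum translates directly into code. The following is a plain sketch of the definition, not an efficient implementation; the name matmul is an assumption, and the sample matrices are those of the example above.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B, by the definition.

    Entry (i, j) of the result is the sum over r of A[i][r] * B[r][j].
    """
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of the left factor must match rows of the right factor")
    p = len(B[0])
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 0, 2],
     [-1, 3, 1]]
B = [[3, 1],
     [2, 1],
     [1, 0]]
AB = matmul(A, B)
underlined = AB[0][1]   # first row of A times second column of B
```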

Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A+B)C = AC+BC as well as C(A+B) = CA+CB (left and right distributivity), whenever the size of the matrices is such that the various products are defined.[6] The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they need not be equal, i.e. generally one has

AB ≠ BA,

i.e., matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers whose product is independent of the order of the factors. An example of two matrices not commuting with each other is:

\begin{bmatrix}
1 & 2\\
3 & 4\\
\end{bmatrix}
\times
\begin{bmatrix}
0 & 1\\
0 & 0\\
\end{bmatrix}=
\begin{bmatrix}
0 & 1\\
0 & 3\\
\end{bmatrix},

whereas

\begin{bmatrix}
0 & 1\\
0 & 0\\
\end{bmatrix}
\times
\begin{bmatrix}
1 & 2\\
3 & 4\\
\end{bmatrix}=
\begin{bmatrix}
3 & 4\\
0 & 0\\
\end{bmatrix}
.

The identity matrix I_n of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, e.g.



\mathbf{I}_3 =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}.

It is called the identity matrix because multiplication with it leaves a matrix unchanged: M I_n = I_m M = M for any m-by-n matrix M.
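Both facts, that the order of the factors matters and that the identity matrix leaves a matrix unchanged, can be checked on small cases. The sketch below reuses the two non-commuting matrices from the example above; matmul and identity are names assumed for this illustration.

```python
def matmul(A, B):
    """Matrix product by the definition (sum over the shared index)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    """The n-by-n identity matrix: ones on the main diagonal, zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
AB = matmul(A, B)
BA = matmul(B, A)

M = [[1, 2, 3], [4, 5, 6]]          # a 2-by-3 matrix
right = matmul(M, identity(3))      # M I_3
left = matmul(identity(2), M)       # I_2 M
```

Note that the identity on the right has size 3 and the one on the left has size 2, matching the statement M I_n = I_m M = M for an m-by-n matrix.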

Besides the ordinary matrix multiplication just described, there exist other less frequently used operations on matrices that can be considered forms of multiplication, such as the Hadamard product and the Kronecker product.[7] They arise in solving matrix equations such as the Sylvester equation.

Linear equations

A particular case of matrix multiplication is tightly linked to linear equations: if x designates a column vector (i.e. n×1-matrix) of n variables x_1, x_2, ..., x_n, and A is an m-by-n matrix, then the matrix equation

Ax = b,

where b is some m×1-column vector, is equivalent to the system of linear equations

A_{1,1}x_1 + A_{1,2}x_2 + ... + A_{1,n}x_n = b_1
...
A_{m,1}x_1 + A_{m,2}x_2 + ... + A_{m,n}x_n = b_m.[8]

This way, matrices can be used to compactly write and deal with multiple linear equations, i.e. systems of linear equations.

Linear transformations

Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation R^n → R^m mapping each vector x in R^n to the (matrix) product Ax, which is a vector in R^m. Conversely, each linear transformation f: R^n → R^m arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the ith coordinate of f(e_j), where e_j = (0,...,0,1,0,...,0) is the unit vector with 1 in the jth position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.
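The recipe "column j of A is f(e_j)" can be sketched as follows. The map f below, from R^3 to R^2, is a made-up example chosen for illustration, and the names matrix_of and matvec are likewise assumptions.

```python
def matrix_of(f, n):
    """Matrix representing a linear map f on R^n: column j is f(e_j)."""
    columns = []
    for j in range(n):
        e_j = [1 if k == j else 0 for k in range(n)]   # j-th unit vector
        columns.append(f(e_j))
    # The images are columns, so transpose the list of columns into rows.
    return [list(row) for row in zip(*columns)]

def matvec(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def f(v):
    """A sample linear map R^3 -> R^2 (hypothetical, for illustration only)."""
    x, y, z = v
    return [x + 2 * z, 3 * y - z]

A = matrix_of(f, 3)
x = [1, 2, 3]
same = matvec(A, x) == f(x)   # the matrix reproduces the map
```

The check on a single vector x is only a spot check; it relies on f actually being linear, which matrix_of does not verify.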

The following table shows a number of 2-by-2 matrices with the associated linear maps of R^2. The blue original is mapped to the green grid and shapes; the origin (0,0) is marked with a black point.

Horizontal shear with m = 1.25:
\begin{bmatrix}
1 & 1.25  \\
0 & 1 \end{bmatrix}

Horizontal flip:
\begin{bmatrix}
-1 & 0  \\
0 & 1 \end{bmatrix}

Squeeze mapping with r = 3/2:
\begin{bmatrix}
3/2 & 0  \\
0 & 2/3 \end{bmatrix}

Scaling by a factor of 3/2:
\begin{bmatrix}
3/2 & 0  \\
0 & 3/2 \end{bmatrix}

Rotation by π/6 = 30°:
\begin{bmatrix}\cos(\pi / 6) & -\sin(\pi / 6)\\ \sin(\pi / 6) & \cos(\pi / 6)\end{bmatrix}

Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps[9]: if a k-by-m matrix B represents another linear map g : R^m → R^k, then the composition g ∘ f is represented by BA since

(g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.

The last equality follows from the above-mentioned associativity of matrix multiplication.
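The identity B(Ax) = (BA)x can be checked numerically on a small case. The matrices and the vector below are arbitrary sample values, and the helper names are assumptions made for this sketch.

```python
def matmul(A, B):
    """Matrix product by the definition."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[1, 2], [0, 1], [3, 0]]     # represents f: R^2 -> R^3
B = [[1, 0, 1], [2, 1, 0]]       # represents g: R^3 -> R^2
x = [1, -1]

step_by_step = matvec(B, matvec(A, x))   # g(f(x)) = B(Ax)
BA = matmul(B, A)                        # matrix of the composition
in_one_go = matvec(BA, x)                # (BA)x
```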

The rank of a matrix A is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors.[10] Equivalently it is the dimension of the image of the linear map represented by A.[11] The rank-nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.[12]
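One way to compute the rank is to reduce the matrix to row echelon form and count the pivots. The sketch below does this with exact fractions to avoid rounding issues; it is one possible approach under that assumption, and the sample matrix is made up for illustration. The nullity then follows from the rank-nullity theorem.

```python
from fractions import Fraction

def rank(A):
    """Rank of A, computed as the number of pivots found by row reduction."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0                                    # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]      # row switching
        for i in range(r + 1, rows):         # row addition clears entries below
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

A = [[1, 2, 3],
     [2, 4, 6],      # twice the first row, so it adds nothing to the rank
     [1, 0, 1]]
A_T = [list(col) for col in zip(*A)]
row_rank = rank(A)
column_rank = rank(A_T)            # rank of the transpose
nullity = len(A[0]) - row_rank     # rank-nullity: kernel dimension
```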

Square matrices

A square matrix is a matrix which has the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. A square matrix A is called invertible or non-singular if there exists a matrix B such that

AB = I_n.[13]

This is equivalent to BA = I_n.[14] Moreover, if B exists, it is unique and is called the inverse matrix of A, denoted A^{−1}.

The entries A_{i,i} form the main diagonal of a matrix. The trace, tr(A), of a square matrix A is the sum of its diagonal entries. While, as mentioned above, matrix multiplication is not commutative, the trace of the product of two matrices is independent of the order of the factors: tr(AB) = tr(BA).[15]
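A quick check of tr(AB) = tr(BA) on a pair of matrices that do not commute; the function names are assumptions made for this sketch.

```python
def matmul(A, B):
    """Matrix product by the definition."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    """Sum of the entries on the main diagonal of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
AB, BA = matmul(A, B), matmul(B, A)   # two different matrices
tr_AB, tr_BA = trace(AB), trace(BA)   # but with the same trace
```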

If all entries outside the main diagonal are zero, A is called a diagonal matrix. If only all entries above (below) the main diagonal are zero, A is called a lower triangular matrix (upper triangular matrix, respectively). For example, if n = 3, they look like


      \begin{bmatrix}
           d_{11} & 0 & 0 \\
           0 & d_{22} & 0 \\
           0 & 0 & d_{33} \\
        \end{bmatrix}
(diagonal), 
      \begin{bmatrix}
           l_{11} & 0 & 0 \\
           l_{21} & l_{22} & 0 \\
           l_{31} & l_{32} & l_{33} \\
        \end{bmatrix}
(lower) and 
        \begin{bmatrix}
           u_{11} & u_{12} & u_{13} \\
           0 & u_{22} & u_{23} \\
           0 & 0 & u_{33} \\
        \end{bmatrix} (upper triangular matrix).

Determinant

A linear transformation on R^2 given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.

The determinant det(A) or |A| of a square matrix A is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R^2) or volume (in R^3) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.

The determinant of 2-by-2 matrices is given by

\det \begin{pmatrix}a&b\\c&d\end{pmatrix} = ad-bc,

while the determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions.[16]

The determinant of a product of square matrices equals the product of their determinants: det(AB) = det(A) · det(B).[17] Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1.[18] Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, i.e., determinants of smaller matrices.[19] This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables.[20]
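Two of the methods just described can be sketched side by side: reduction to triangular form, tracking the sign changes caused by row swaps, and the recursive Laplace expansion along the first row with the 0-by-0 starting case. Exact fractions are used in the elimination; the function names and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction

def det_elimination(A):
    """Determinant via reduction to upper triangular form."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)               # a zero column below the diagonal
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]  # a swap multiplies the determinant by -1
            sign = -sign
        for i in range(c + 1, n):            # row addition leaves it unchanged
            factor = M[i][c] / M[c][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]                    # product of the diagonal entries
    return result

def det_laplace(A):
    """Determinant via Laplace expansion along the first row."""
    if len(A) == 0:
        return 1                             # determinant of the 0-by-0 matrix
    total = 0
    for j, a in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * a * det_laplace(minor)
    return total

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
d_elim = det_elimination(A)
d_laplace = det_laplace(A)
```

The Laplace expansion needs on the order of n! operations, so it is only practical for small matrices, whereas the elimination needs on the order of n^3.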

Eigenvalues and eigenvectors

A number λ and a non-zero vector v satisfying

Av = λv

are called an eigenvalue and an eigenvector of A, respectively.[nb 1][21] The number λ is an eigenvalue of an n×n-matrix A if and only if A − λI_n is not invertible, which is equivalent to

det(A−λI) = 0.[22]

The function p_A(t) = det(A − tI) is called the characteristic polynomial of A; its degree is n. Therefore p_A(t) has at most n different roots, i.e., eigenvalues of the matrix.[23] They may be complex even if the entries of A are real. According to the Cayley-Hamilton theorem, p_A(A) = 0, that is to say, the characteristic polynomial applied to the matrix itself yields the zero matrix.

Symmetry

A square matrix A that is equal to its transpose, i.e. A = A^T, is a symmetric matrix; if it is equal to the negative of its transpose, i.e. A = −A^T, then it is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy A* = A, where the star denotes the conjugate transpose of the matrix, i.e. the transpose of the complex conjugate of A.
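For a 2-by-2 matrix the characteristic polynomial works out to det(A − tI) = t^2 − tr(A)·t + det(A), so the eigenvalues follow from the quadratic formula. The sketch below covers only this 2-by-2 case; the function names are assumptions, and the second sample matrix (a rotation by 90°) illustrates complex eigenvalues of a real matrix.

```python
import cmath

def char_poly_2x2(A):
    """Coefficients (of t^2, t, 1) of det(A - tI) for a 2-by-2 matrix A."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial, possibly complex."""
    _, p, q = char_poly_2x2(A)
    root = cmath.sqrt(p * p - 4 * q)
    return ((-p + root) / 2, (-p - root) / 2)

def cayley_hamilton_residual(A):
    """Evaluate p_A(A) = A^2 - tr(A) A + det(A) I; the theorem says it is zero."""
    (a, b), (c, d) = A
    _, p, q = char_poly_2x2(A)
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    I = [[1, 0], [0, 1]]
    return [[A2[i][j] + p * A[i][j] + q * I[i][j] for j in range(2)] for i in range(2)]

S = [[2, 1], [1, 2]]      # symmetric: real eigenvalues
R = [[0, -1], [1, 0]]     # rotation by 90 degrees: no real eigenvalues
eig_S = eigenvalues_2x2(S)
eig_R = eigenvalues_2x2(R)
residual = cayley_hamilton_residual(S)
```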

By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; i.e., every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real.[24] This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns, see below.

Definiteness

Matrix A, its definiteness, the associated quadratic form Q_A(x, y), and the set of vectors (x, y) such that Q_A(x, y) = 1:

\begin{bmatrix}
1/4 & 0\\
0 & 1\end{bmatrix}
positive definite; Q_A(x, y) = 1/4 x^2 + y^2; the set Q_A(x, y) = 1 is an ellipse.

\begin{bmatrix}
1/4 & 0\\
0 & -1/4\end{bmatrix}
indefinite; Q_A(x, y) = 1/4 x^2 − 1/4 y^2; the set Q_A(x, y) = 1 is a hyperbola.

A symmetric n×n-matrix is called positive definite (negative definite, indefinite, resp.), if for all nonzero vectors x ∈ R^n the associated quadratic form given by

Q(x) = x^T A x

takes only positive values (negative, both negative and positive values, respectively).[25] Allowing as input two different vectors instead yields the bilinear form associated to A:

B_A(x, y) = x^T A y.[26]

A symmetric matrix is positive definite if and only if all its eigenvalues are positive.[27] The table at the right shows two possibilities for 2-by-2 matrices.
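The eigenvalue criterion can be tried out on the two matrices of the table. The sketch below handles only symmetric 2-by-2 matrices, whose eigenvalues are (tr ± sqrt((a − d)^2 + 4b^2))/2; the function names are assumptions, and the fourth return value covers matrices with a zero eigenvalue, which fall outside the three cases defined above.

```python
import math

def quadratic_form(A, x):
    """Q(x) = x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def classify_symmetric_2x2(A):
    """Classify a symmetric 2-by-2 matrix by the signs of its eigenvalues."""
    (a, b), (c, d) = A
    if b != c:
        raise ValueError("matrix must be symmetric")
    spread = math.hypot(a - d, 2 * b)        # sqrt((a - d)^2 + 4 b^2), never negative
    low, high = (a + d - spread) / 2, (a + d + spread) / 2
    if low > 0:
        return "positive definite"
    if high < 0:
        return "negative definite"
    if low < 0 < high:
        return "indefinite"
    return "some eigenvalue is zero"         # not covered by the definition above

P = [[0.25, 0.0], [0.0, 1.0]]
N = [[0.25, 0.0], [0.0, -0.25]]
kind_P = classify_symmetric_2x2(P)
kind_N = classify_symmetric_2x2(N)
on_ellipse = quadratic_form(P, [2, 0])       # a point with Q = 1
both_signs = (quadratic_form(N, [2, 0]), quadratic_form(N, [0, 2]))
```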

Computational aspects

In addition to theoretical knowledge of properties of matrices and their relation to other fields, it is important for practical purposes to perform matrix calculations effectively and precisely. The domain studying these matters is called numerical linear algebra.[28] As with other numerical situations, two main aspects are the complexity of algorithms and their numerical stability. Many problems can be solved by both direct algorithms or iterative approaches. For example, finding eigenvectors can be done by finding a sequence of vectors xn converging to an eigenvector when n tends to infinity.[29]

Determining the complexity of an algorithm means finding upper bounds or estimates of how many elementary operations such as additions and multiplications of scalars are necessary to perform some algorithm, e.g. multiplication of matrices. For example, calculating the matrix product of two n-by-n matrices using the definition given above needs n^3 multiplications, since for any of the n^2 entries of the product, n multiplications are necessary. The Strassen algorithm outperforms this "naive" algorithm; it needs only on the order of n^{2.807} multiplications.[30] A refined approach also incorporates specific features of the computing devices.

In many practical situations additional information about the matrices involved is known. An important case is that of sparse matrices, i.e. matrices most of whose entries are zero. There are specifically adapted algorithms for, say, solving linear systems Ax = b for sparse matrices A, such as the conjugate gradient method.[31]

An algorithm is, roughly speaking, numerically stable if small deviations (such as rounding errors) do not lead to big deviations in the result. For example, calculating the inverse of a matrix via Laplace's formula (Adj(A) denotes the adjugate matrix of A)

A^{−1} = Adj(A) / det(A)

may lead to significant rounding errors if the determinant of the matrix is very small. The norm of a matrix can be used to capture the conditioning of linear algebraic problems, such as computing a matrix's inverse.[32]
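The operation count for the naive product can be confirmed by instrumenting the triple loop. The sketch below counts scalar multiplications only; the function name is an assumption. The exponent quoted for the Strassen algorithm is log2(7), which rounds to 2.807.

```python
import math

def matmul_counting(A, B):
    """Multiply two n-by-n matrices by the definition, counting scalar multiplications."""
    n = len(A)
    count = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):              # n^2 entries of the product ...
        for j in range(n):
            for r in range(n):      # ... each needing n multiplications
                C[i][j] += A[i][r] * B[r][j]
                count += 1
    return C, count

n = 4
I = [[int(i == j) for j in range(n)] for i in range(n)]
product, multiplications = matmul_counting(I, I)
strassen_exponent = math.log2(7)
```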

Although most computer languages are not designed with commands or libraries for matrices, as early as the 1970s, some engineering desktop computers such as the HP 9830 had ROM cartridges to add BASIC commands for matrices. Some computer languages such as APL were designed to manipulate matrices, and various mathematical programs can be used to aid computing with matrices.[33]

Matrix decomposition methods

There are several methods to render matrices into a more easily accessible form. They are generally referred to as matrix transformation or matrix decomposition techniques. The interest of all these decomposition techniques is that they preserve certain properties of the matrices in question, such as determinant, rank or inverse, so that these quantities can be calculated after applying the transformation, or that certain matrix operations are algorithmically easier to carry out for some types of matrices.

The LU decomposition factors a matrix as a product of a lower triangular matrix (L) and an upper triangular matrix (U).[34] Once this decomposition is calculated, linear systems can be solved more efficiently, by a simple technique called forward and back substitution. Likewise, inverses of triangular matrices are algorithmically easier to calculate. Gaussian elimination is a similar algorithm; it transforms any matrix to row echelon form.[35] Both methods proceed by multiplying the matrix by suitable elementary matrices, which correspond to permuting rows or columns and adding multiples of one row to another row. Singular value decomposition expresses any matrix A as a product UDV*, where U and V are unitary matrices and D is a diagonal matrix.
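A minimal sketch of LU decomposition followed by forward and back substitution is given below. It assumes that no row permutations are needed (every pivot is non-zero) and puts ones on the diagonal of L, which is one common convention; practical implementations add pivoting. The function names and the sample system are assumptions made for illustration.

```python
from fractions import Fraction

def lu_decompose(A):
    """Factor a square matrix as A = LU, without row permutations."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        if U[c][c] == 0:
            raise ValueError("zero pivot: this sketch does not permute rows")
        for i in range(c + 1, n):
            L[i][c] = U[i][c] / U[c][c]      # multiplier used to clear U[i][c]
            U[i] = [a - L[i][c] * b for a, b in zip(U[i], U[c])]
    return L, U

def solve_lu(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b, then back substitution for Ux = y."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        y[i] = Fraction(b[i]) - sum(L[i][k] * y[k] for k in range(i))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

A = [[2, 1], [4, 5]]
b = [3, 9]
L, U = lu_decompose(A)
x = solve_lu(L, U, b)
```

Once L and U are known, further right-hand sides b can be solved by substitution alone, without repeating the elimination.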

A matrix in Jordan normal form. The grey blocks are called Jordan blocks.

The eigendecomposition or diagonalization expresses A as a product VDV^{−1}, where D is a diagonal matrix and V is a suitable invertible matrix.[36] If A can be written in this form, it is called diagonalizable. More generally, and applicable to all matrices, the Jordan decomposition transforms a matrix into Jordan normal form, that is to say matrices whose only nonzero entries are the eigenvalues λ_1 to λ_n of A, placed on the main diagonal and possibly entries equal to one directly above the main diagonal, as shown at the right.[37] Given the eigendecomposition, the nth power of A (i.e. n-fold iterated matrix multiplication) can be calculated via

A^n = (VDV^{−1})^n = VDV^{−1}VDV^{−1}...VDV^{−1} = VD^nV^{−1}

and the power of a diagonal matrix can be calculated by taking the corresponding powers of the diagonal entries, which is much easier than doing the exponentiation for A instead. This can be used to compute the matrix exponential e^A, a need frequently arising in solving linear differential equations, matrix logarithms and square roots of matrices.[38] To avoid numerically ill-conditioned situations, further algorithms such as the Schur decomposition can be employed.[39]
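The power formula can be verified on a small diagonalizable matrix. In the sketch below the eigendecomposition of the sample matrix A is supplied by hand (eigenvalues 3 and 1, with eigenvectors (1, 1) and (1, −1)) instead of being computed; the function names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product by the definition."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def power_by_eigendecomposition(V, eigenvalues, V_inv, n):
    """Compute A^n = V D^n V^{-1}, where D is diagonal with the given eigenvalues."""
    size = len(eigenvalues)
    D_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(size)]
           for i in range(size)]
    return matmul(matmul(V, D_n), V_inv)

half = Fraction(1, 2)
A = [[2, 1], [1, 2]]
V = [[1, 1], [1, -1]]                  # columns are eigenvectors of A
V_inv = [[half, half], [half, -half]]  # inverse of V, worked out by hand
eigenvalues = [3, 1]

A5 = power_by_eigendecomposition(V, eigenvalues, V_inv, 5)

A5_direct = A                          # the same power by repeated multiplication
for _ in range(4):
    A5_direct = matmul(A5_direct, A)
```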

Abstract algebraic aspects and generalizations

Matrices can be generalized in different ways. Abstract algebra uses matrices with entries in more general fields or even rings, while linear algebra codifies properties of matrices in the notion of linear maps. It is possible to consider matrices with infinitely many columns and rows. Another extension is tensors, which can be seen as higher-dimensional arrays of numbers, as opposed to vectors, which can often be realised as sequences of numbers, while matrices are rectangular or two-dimensional arrays of numbers.[40] Matrices, subject to certain requirements, tend to form groups known as matrix groups.

Matrices with more general entries

This article focuses on matrices whose entries are real or complex numbers. However, matrices can be considered with much more general types of entries than real or complex numbers. As a first step of generalization, any CC-BY-SA.

