
Matrix factorizations related to solving matrix equations


The concept of matrix decompositions is what makes Numerical Linear Algebra an efficient tool in Scientific Computing. If the matrix representing a problem is simple enough, any basic generic algorithm can find the solutions optimally (that is, fast, with minimal storage of data, and without significant roundoff error). But, in real life, this situation seldom occurs. In the general case, we find a suitable matrix factorization and tailor an algorithm that is optimal for each factor, thus gaining a clear advantage at each step. In this section, we explore the different factorizations included in the modules scipy.linalg and scipy.sparse.linalg that help us achieve a robust solution to matrix equations.

Relevant factorizations

We have the following factorizations in this category:

Pivoted LU decomposition

It is always possible to perform a factorization of a square matrix A as a product A = P ● L ● U of a permutation matrix P (which performs a permutation of the rows of A), a lower triangular matrix L, and an upper triangular matrix U:

Constructor                      Description
scipy.linalg.lu(A)               Pivoted LU decomposition
scipy.linalg.lu_factor(A)        Pivoted LU decomposition
scipy.sparse.linalg.splu(A)      Pivoted LU decomposition
scipy.sparse.linalg.spilu(A)     Incomplete pivoted LU decomposition

Cholesky decomposition

For a square, symmetric, and positive definite matrix A, we can realize the matrix as the product A = U^T ● U of an upper triangular matrix U with its transpose, or as the product A = L ● L^T of a lower triangular matrix L with its transpose. All the diagonal entries of U or L are strictly positive numbers:

Constructor                          Description
scipy.linalg.cholesky(A)             Cholesky decomposition
scipy.linalg.cholesky_banded(AB)     Cholesky decomposition for Hermitian positive-definite banded matrices

QR decomposition

We can realize any matrix of size m × n as the product A = Q ● R of a square orthogonal matrix Q of size m × m, with an upper triangular matrix R of the same size as A.

Constructor              Description
scipy.linalg.qr(A)       QR decomposition of a matrix

Singular value decomposition

We can realize any matrix A as the product A = U ● D ● V^H of a unitary matrix U with a diagonal matrix D (where all entries in the diagonal are non-negative numbers), and the Hermitian transpose of another unitary matrix V. The values on the diagonal of D are called the singular values of A.

Constructor                      Description
scipy.linalg.svd(A)              Singular value decomposition
scipy.linalg.svdvals(A)          Singular values
scipy.linalg.diagsvd(s, m, n)    Diagonal matrix of an SVD, from singular values s and prescribed size
scipy.sparse.linalg.svds(A)      Largest k singular values/vectors of a sparse matrix

Matrix equations

In SciPy, we have robust algorithms to solve any matrix equation based on the following cases:

  • Given a square matrix A, and a right-hand side b (which can be a one-dimensional vector or another matrix with the same number of rows as A), the basic systems are as follows:

    • A ● x = b

    • A^T ● x = b

    • A^H ● x = b

  • Given any matrix A (not necessarily square) and a right-hand side vector/matrix b of an appropriate size, the least squares solution to the equation A ● x = b. That is, finding a vector x that minimizes the Frobenius norm of the expression A ● x - b.

  • For the same case as before, and an extra damping coefficient d, the regularized least squares solution to the equation A ● x = b that minimizes the functional norm(A * x - b, 'f')**2 + d**2 * norm(x, 'f')**2.

  • Given square matrices A and B, and a right-hand side matrix Q with appropriate sizes, the Sylvester system is A ● X + X ● B = Q.

  • For a square matrix A and matrix Q of an appropriate size, the continuous Lyapunov equation is A ● X + X ● A^H = Q.

  • For matrices A and Q, as in the previous case, the discrete Lyapunov equation is X - A ● X ● A^H = Q.

  • Given square matrices A, Q, and R, and another matrix B with an appropriate size, the continuous algebraic Riccati equation is A^T ● X + X ● A - X ● B ● R^-1 ● B^T ● X + Q = 0.

  • For matrices as in the previous case, the discrete algebraic Riccati equation is X = A^T ● X ● A - (A^T ● X ● B) ● (R + B^T ● X ● B)^-1 ● (B^T ● X ● A) + Q.

In any case, mastering matrix equations with SciPy basically means identifying the matrices involved and choosing the most adequate algorithm in the libraries to perform the requested operations. Besides being able to compute a solution with the least possible amount of roundoff error, we need to do so in the fastest possible way, and by using as few memory resources as possible.

Back and forward substitution

Let us start with the easiest possible case: The basic system of linear equations A ● x = b (or the other two variants), where A is a generic lower or upper triangular square matrix. In theory, these systems are easily solved by forward substitution (for lower triangular matrices) or back substitution (for upper triangular matrices). In SciPy, we accomplish this task with the function solve_triangular in the module scipy.linalg.

For this initial example, we will construct A as a lower triangular Pascal matrix of size 1024 × 1024, where the non-zero values have been filtered: odd values are turned into ones, while even values are turned into zeros. The right-hand side b is a vector with 1024 ones.

In [1]: import numpy as np, \
   ...: scipy.linalg as spla, scipy.sparse as spsp, \
   ...: scipy.sparse.linalg as spspla
In [2]: A = (spla.pascal(1024, kind='lower')%2 != 0)
In [3]: %timeit spla.solve_triangular(A, np.ones(1024), lower=True)
10 loops, best of 3: 6.64 ms per loop

To solve the other related systems that involve the matrix A, we employ the optional parameter trans (by default set to 0 or 'N', giving the basic system A ● x = b). If trans is set to 'T' or 1, we solve the system A^T ● x = b instead. If trans is set to 'C' or 2, we solve A^H ● x = b instead.
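
For instance, the transposed and conjugate-transposed systems can be solved as in the following sketch (reusing A and the right-hand side of ones from above; the names x_T and x_C are just illustrative):

x_T = spla.solve_triangular(A, np.ones(1024), trans='T', lower=True)  # solves A^T . x = b
x_C = spla.solve_triangular(A, np.ones(1024), trans='C', lower=True)  # solves A^H . x = b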

Note

The function solve_triangular is a wrapper for the LAPACK function trtrs.

Basic systems: banded matrices

The next cases in terms of algorithm simplicity are those of basic systems A ● x = b, where A is a square banded matrix. We use the routines solve_banded (for a generic banded matrix) or solveh_banded (for a real symmetric or complex Hermitian banded matrix). Both of them belong to the module scipy.linalg.

Note

The functions solve_banded and solveh_banded are wrappers for the LAPACK functions GBSV and PBSV, respectively.

Neither function accepts a matrix in the usual format. For example, since solveh_banded expects a symmetric banded matrix, the function requires as input only the elements of the diagonals on and under/over the main diagonal, stored sequentially from the top to the bottom.

This input method is best explained through a concrete example. Take the following symmetric banded matrix:

 2 -1  0  0  0  0
-1  2 -1  0  0  0
 0 -1  2 -1  0  0
 0  0 -1  2 -1  0
 0  0  0 -1  2 -1
 0  0  0  0 -1  2 

The size of the matrix is 6 × 6, and there are only three non-zero diagonals, two of which are identical due to symmetry. We collect the two relevant non-zero diagonals in an ndarray of size 2 × 6 in one of two ways, as follows:

  • If we decide to input the entries from the upper triangular matrix, we collect first the diagonals from the top to the bottom (ending in the main diagonal), right justified:

    * -1 -1 -1 -1 -1
    2  2  2  2  2  2
    
  • If we decide to input the entries from the lower triangular matrix, we collect the diagonals from the top to the bottom (starting from the main diagonal), left justified:

     2  2  2  2  2  2
     -1 -1 -1 -1 -1  *
    In [4]: B_banded = np.zeros((2,6)); \
       ...: B_banded[0,1:] = -1; \
       ...: B_banded[1,:] = 2
    In [5]: spla.solveh_banded(B_banded, np.ones(6))
    Out[5]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
    

For a non-symmetric banded square matrix, we use solve_banded instead, and the input matrix also needs to be stored in this special way:

  • Count the number of non-zero diagonals under the main diagonal (set that to l). Count the number of non-zero diagonals over the main diagonal (set that to u). Set r = l + u + 1.

  • If the matrix has size n × n, create an ndarray with n columns and r rows. We refer to this storage as a matrix in the AB form, or an AB matrix, for short.

  • Store in the AB matrix only the relevant non-zero diagonals, from the top to the bottom, in order. Diagonals over the main diagonal are right justified; diagonals under the main diagonal are left justified.

Let us illustrate this process with another example. We input the following matrix:

 2 -1  0  0  0  0
-1  2 -1  0  0  0
 3 -1  2 -1  0  0
 0  3 -1  2 -1  0
 0  0  3 -1  2 -1
 0  0  0  3 -1  2
In [6]: C_banded = np.zeros((4,6)); \
   ...: C_banded[0,1:] = -1; \
   ...: C_banded[1,:] = 2; \
   ...: C_banded[2,:-1] = -1; \
   ...: C_banded[3,:-2] = 3; \
   ...: print C_banded
[[ 0. -1. -1. -1. -1. -1.]
 [ 2.  2.  2.  2.  2.  2.]
 [-1. -1. -1. -1. -1.  0.]
 [ 3.  3.  3.  3.  0.  0.]]

To call the solver, we need to input manually the number of diagonals over and under the diagonal, together with the AB matrix and the right-hand side of the system:

In [7]: spla.solve_banded((2,1), C_banded, np.ones(6))
Out[7]:
array([ 0.86842105,  0.73684211, -0.39473684,  0.07894737, 
        1.76315789,  1.26315789])

Let us examine the optional parameters that we can include in the call of these two functions:

Parameter       Type/format            Description
l_and_u         (int, int)             Number of non-zero lower/upper diagonals
ab              Matrix in AB format    A banded square matrix
b               ndarray                Right-hand side
overwrite_ab    Boolean                Discard data in ab
overwrite_b     Boolean                Discard data in b
check_finite    Boolean                Whether to check that input matrices contain finite numbers

Tip

All the functions in the scipy.linalg module that take matrices as input, and output either a solution to a system of equations or a factorization, have two kinds of optional parameters with which we need to be familiar: overwrite_x (one for each matrix/vector in the input) and check_finite. They are both Boolean.

The overwrite options are set to False by default. If we do not care about retaining the values of the input matrices, we may use the same object in the memory to perform operations, rather than creating another object with the same size in the memory. We gain speed and use fewer resources in such a case.

The check_finite option is set to True by default. In the algorithms where it is present, there are optional checks for the integrity of the data. If at any given moment, any of the values is (+/-)numpy.inf or NaN, the process is halted, and an exception is raised. We may turn this option off, thus resulting in much faster solutions, but the code might crash if the data is corrupted at any point in the computations.
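
For instance (a sketch reusing the banded system from above), both kinds of options can be passed to solve_banded as follows; the solution is the same, but the array holding b may be overwritten and no finiteness check is performed:

x = spla.solve_banded((2, 1), C_banded, np.ones(6),
                      overwrite_b=True, check_finite=False)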

The function solveh_banded has an extra optional Boolean parameter, lower, which is initially set to False. If set to True, we must input the lower triangular form of the AB matrix instead of the upper one (with the same input convention as before).
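
As a sketch, the system solved earlier can be entered in the lower (left-justified) form; the name B_lower is just illustrative:

B_lower = np.zeros((2, 6))
B_lower[0, :] = 2         # main diagonal
B_lower[1, :-1] = -1      # first subdiagonal, left justified; last entry unused
x = spla.solveh_banded(B_lower, np.ones(6), lower=True)
# x should match the earlier result: array([ 3.,  5.,  6.,  6.,  5.,  3.])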

Basic systems: generic square matrices

For solutions of basic systems where A is a generic square matrix, it is a good idea to factorize A so that some (or all) of the factors are triangular and then apply back and forward substitution, where appropriate. This is the idea behind pivoted LU and Cholesky decompositions.

If matrix A is real symmetric (or complex Hermitian) and positive definite, the optimal strategy goes through applying either of the two possible Cholesky decompositions, A = U^H ● U or A = L ● L^H, with U and L upper/lower triangular matrices.

For example, if we use the form with the upper triangular matrix, the solution of the basic system of equations A ● x = b turns into U^H ● U ● x = b. Set y = U ● x and solve the system U^H ● y = b for y by forward substitution. We now have a new triangular system U ● x = y that we solve for x by back substitution.

To perform the solution of such a system with this technique, we first compute the factorization by using either the functions cholesky, cho_factor or cholesky_banded. The output is then used in the solver cho_solve.

For Cholesky decompositions, the three relevant functions called cholesky, cho_factor, and cholesky_banded have a set of options similar to those of solveh_banded. They admit an extra Boolean option lower (set by default to False) that decides whether to output a lower or an upper triangular factorization. The function cholesky_banded requires a matrix in the AB format as input.

Let us now test the Cholesky decomposition of matrix B with all three methods:

In [8]: B = spsp.diags([[-1]*5, [2]*6, [-1]*5], [-1,0,1]).todense(); \
   ...: print B
[[ 2. -1.  0.  0.  0.  0.]
 [-1.  2. -1.  0.  0.  0.]
 [ 0. -1.  2. -1.  0.  0.]
 [ 0.  0. -1.  2. -1.  0.]
 [ 0.  0.  0. -1.  2. -1.]
 [ 0.  0.  0.  0. -1.  2.]]
In [9]: np.set_printoptions(suppress=True, precision=3)
In [10]: print spla.cholesky(B)
[[ 1.414 -0.707  0.     0.     0.     0.   ]
 [ 0.     1.225 -0.816  0.     0.     0.   ]
 [ 0.     0.     1.155 -0.866  0.     0.   ]
 [ 0.     0.     0.     1.118 -0.894  0.   ]
 [ 0.     0.     0.     0.     1.095 -0.913]
 [ 0.     0.     0.     0.     0.     1.08 ]]
In [11]: print spla.cho_factor(B)[0]
[[ 1.414 -0.707  0.     0.     0.     0.   ]
 [-1.     1.225 -0.816  0.     0.     0.   ]
 [ 0.    -1.     1.155 -0.866  0.     0.   ]
 [ 0.     0.    -1.     1.118 -0.894  0.   ]
 [ 0.     0.     0.    -1.     1.095 -0.913]
 [ 0.     0.     0.     0.    -1.     1.08 ]]
In [12]: print spla.cholesky_banded(B_banded)
[[ 0.    -0.707 -0.816 -0.866 -0.894 -0.913]
 [ 1.414  1.225  1.155  1.118  1.095  1.08 ]]

The output of cho_factor is a tuple: the second element is the Boolean lower. The first element is an ndarray representing a square matrix. If lower is set to True, the lower triangular sub-matrix of this ndarray is L in the Cholesky factorization of A. If lower is set to False, the upper triangular sub-matrix is U in the factorization of A. The remaining elements in the matrix are arbitrary values, rather than zeros, since they are not used by cho_solve. In a similar way, we can call cho_solve_banded with the output of cholesky_banded to solve the appropriate system.

Note

Both cholesky and cho_factor are wrappers to the same LAPACK function called potrf, with different output options. cholesky_banded calls pbtrf. The cho_solve function is a wrapper for potrs, and cho_solve_banded calls pbtrs.

We are then ready to solve the system, with either of the two options:

In [13]: spla.cho_solve((spla.cholesky(B), False), np.ones(6))
Out[13]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
In [13]: spla.cho_solve(spla.cho_factor(B), np.ones(6))
Out[13]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
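
As a sketch of the two-step substitution described earlier, and of the banded solver cho_solve_banded mentioned above, the same solution can be reproduced as follows (the names U, y, x, and x_banded are just illustrative):

U = spla.cholesky(B)                                 # upper Cholesky factor, B = U^H . U
y = spla.solve_triangular(U, np.ones(6), trans='C')  # forward substitution: U^H . y = b
x = spla.solve_triangular(U, y)                      # back substitution:    U . x = y
x_banded = spla.cho_solve_banded((spla.cholesky_banded(B_banded), False), np.ones(6))
# both x and x_banded should agree with the cho_solve results above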

For any other kind of generic square matrix A, the next best method to solve the basic system A ● x = b is pivoted LU factorization. This is equivalent to finding a permutation matrix P, and triangular matrices U (upper) and L (lower) so that P ● A = L ● U. In such a case, a permutation of the rows in the system according to P gives the equivalent equation (P ● A) ● x = P ● b. Set c = P ● b and y = U ● x, and solve for y in the system L ● y = c using forward substitution. Then, solve for x in the system U ● x = y with back substitution.

The relevant functions to perform this operation are lu and lu_factor (for the factorization), and lu_solve (for the solution), in the module scipy.linalg. For sparse matrices we have splu and spilu in the module scipy.sparse.linalg.

Let us start experimenting with factorizations first. We use a large circulant matrix (non-symmetric) for this example:

In [14]: D = spla.circulant(np.arange(4096))
In [15]: %timeit spla.lu(D)
1 loops, best of 3: 7.04 s per loop
In [16]: %timeit spla.lu_factor(D)
1 loops, best of 3: 5.48 s per loop

Note

The lu_factor function is a wrapper to all *getrf routines from LAPACK. The lu_solve function is a wrapper for getrs.

The function lu has an extra Boolean option: permute_l (set to False by default). If set to True, the function outputs only two matrices PL = P ● L (the properly permuted lower triangular matrix), and U. Otherwise, the output is the triple P, L, U, in that order.

In [17]: P, L, U = spla.lu(D)
In [17]: PL, U = spla.lu(D, permute_l=True)

The outputs of the function lu_factor are resource-efficient. We obtain a matrix LU, whose upper triangle is U and whose lower triangle is L. We also obtain a one-dimensional ndarray of integer dtype, piv, indicating the pivot indices that represent the permutation matrix P.

In [18]: LU, piv = spla.lu_factor(D)

The solver lu_solve takes the two outputs from lu_factor, a right-hand side matrix b, and the optional indicator trans to the kind of basic system to solve:

In [19]: spla.lu_solve(spla.lu_factor(D), np.ones(4096))
Out[19]: array([ 0.,  0.,  0., ...,  0.,  0.,  0.])
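
As a check on the substitution scheme described above, we can sketch the same solution with the explicit factors returned by lu. Note that scipy.linalg.lu returns P so that D = P ● L ● U, hence the permutation to apply to the right-hand side is its transpose (the names c, y, and x are just illustrative):

c = np.dot(P.T, np.ones(4096))               # undo the row permutation: c = P^T . b
y = spla.solve_triangular(L, c, lower=True)  # forward substitution: L . y = c
x = spla.solve_triangular(U, y)              # back substitution:    U . x = y
# x should agree with the output of lu_solve above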

Tip

At this point, we must comment on the general function solve in the module scipy.linalg. It is a wrapper to both LAPACK functions POSV and GESV. It allows us to input matrix A and right-hand side matrix b, and indicate whether A is symmetric and positive definite. In any case, the routine internally decides which of the two factorizations to use (Cholesky or pivoted LU), and computes a solution accordingly.
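
A minimal sketch, assuming the symmetric positive definite matrix B from before (depending on the SciPy version, the flag indicating positive definiteness is sym_pos or assume_a='pos'):

x = spla.solve(B, np.ones(6), sym_pos=True)   # dispatches to the Cholesky-based routine POSV
# x should match the cho_solve results above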

For large sparse matrices, provided they are stored in the CSC format, the pivoted LU decomposition is more efficiently performed with either of the functions splu or spilu from the module scipy.sparse.linalg. Both functions use the SuperLU library directly. Their output is not a set of matrices, but a Python object called scipy.sparse.linalg.dsolve._superlu.SciPyLUType. This object has four attributes and one instance method:

  • shape: 2-tuple containing the shape of matrix A

  • nnz: The number of non-zero entries in matrix A

  • perm_c, perm_r: The permutations applied to the columns and rows (respectively) to the matrix A to obtain the computed LU decomposition

  • solve: an instance method, object.solve(b, trans), that accepts an ndarray b and an optional description string trans

The big idea is that, when dealing with large amounts of data, the actual matrices in the LU decomposition are not as important as the main application behind the factorization: the solution of the system. All the relevant information to perform this operation is optimally stored in the object's method solve.

The main difference between splu and spilu is that the latter computes an incomplete decomposition. With it, we can obtain really good approximations to the inverse of matrix A, and use matrix multiplication to compute the solution of large systems in a fraction of the time that it would take to calculate the actual solution.

Note

The usage of these two functions is rather complex. The purpose is to compute a factorization of the form Pr*Dr*A*Dc*Pc = L*U with diagonal matrices Dr and Dc and permutation matrices Pr and Pc. The idea is to equilibrate matrix A manually so that the product B = Dr*A*Dc is better conditioned than A. If the problem can be solved on a parallel architecture, we may also help by rearranging the rows and columns optimally; the permutation matrices Pr and Pc are then manually input to pre-order the rows and columns of B. All of these options can be fed to either splu or spilu.

The algorithm exploits the idea of relaxing supernodes to reduce inefficient indirect addressing and symbolic time (besides permitting the use of higher-level BLAS operations). We are given the option to determine the degree of these objects, to tailor the algorithm to the matrix at hand.

For a complete explanation of the algorithms and all the different options, the best reference is SuperLU User Guide, which can be found online at crd-legacy.lbl.gov/~xiaoye/SuperLU/superlu_ug.pdf.

Let us illustrate this with a simple example, where the permutation of rows or columns is not needed. In a large lower triangular Pascal matrix, turn all the even-valued entries into zeros and all the odd-valued entries into ones (this is precisely the matrix A constructed at the beginning of the section). For the right-hand side, use a vector of ones:

In [20]: A_csc = spsp.csc_matrix(A, dtype=np.float64)
In [21]: invA = spspla.splu(A_csc)
In [22]: %time invA.solve(np.ones(1024))
CPU times: user 4.32 ms, sys: 105 µs, total: 4.42 ms
Wall time: 4.44 ms
Out[22]: array([ 1., -0.,  0., ..., -0.,  0.,  0.])
In [23]: invA = spspla.spilu(A_csc)
In [24]: %time invA.solve(np.ones(1024))
CPU times: user 656 µs, sys: 22 µs, total: 678 µs
Wall time: 678 µs
Out[24]: array([ 1.,  0.,  0., ...,  0.,  0.,  0.]) 

Note

Compare the time of execution of the procedures on sparse matrices, with the initial solve_triangular procedure on the corresponding matrix A at the beginning of the section. Which process is faster?

However, in general, if a basic system must be solved and matrix A is large and sparse, we prefer to use iterative methods with fast convergence to the actual solutions. When they converge, they are consistently less sensitive to roundoff errors and thus more suitable when the number of computations is extremely high.

In the module scipy.sparse.linalg, we have eight different iterative methods, all of which accept the following as parameters:

  • Matrix A in any format (matrix, ndarray, sparse matrix, or even a linear operator!), and right-hand side vector/matrix b as ndarray.

  • Initial guess x0, as ndarray.

  • The tolerance tol, a floating-point number. Once the (relative) norm of the residual falls below this value, the iteration stops and the last computed values are output as the solution.

  • Maximum number of iterations allowed, maxiter, an integer.

  • A preconditioner sparse matrix (or linear operator) M that should approximate the inverse of A.

  • A callback function, called after each iteration with the current solution vector xk as its argument.

Constructor    Description
bicg           Biconjugate Gradient Iteration
bicgstab       Biconjugate Gradient Stabilized Iteration
cg             Conjugate Gradient Iteration
cgs            Conjugate Gradient Squared Iteration
gmres          Generalized Minimal Residual Iteration
lgmres         LGMRES Iteration
minres         Minimum Residual Iteration
qmr            Quasi-minimal Residual Iteration

Choosing the right iterative method, a good initial guess, and especially a successful preconditioner is an art in itself. It involves learning about topics such as operators in Functional Analysis or Krylov subspace methods, which are far beyond the scope of this book. At this point, we are content with showing a few simple examples for the sake of comparison:

In [25]: spspla.cg(A_csc, np.ones(1024), x0=np.zeros(1024))
Out[25]: (array([ nan,  nan,  nan, ...,  nan,  nan,  nan]), 1)
In [26]: %time spspla.gmres(A_csc, np.ones(1024), x0=np.zeros(1024))
CPU times: user 4.26 ms, sys: 712 µs, total: 4.97 ms
Wall time: 4.45 ms
Out[26]: (array([ 1.,  0.,  0., ..., -0., -0.,  0.]), 0)
In [27]: Nsteps = 1
   ....: def callbackF(xk):
   ....:     global Nsteps
   ....:     print '{0:4d}  {1:3.6f}  {2:3.6f}'.format(Nsteps, \
   ....:     xk[0],xk[1])
   ....:     Nsteps += 1
   ....:
In [28]: print '{0:4s}  {1:9s}  {2:9s}'.format('Iter', \
   ....: 'X[0]','X[1]'); \
   ....: spspla.bicg(A_csc, np.ones(1024), x0=np.zeros(1024),
   ....:             callback=callbackF)
   ....:
Iter  X[0]       X[1]
   1  0.017342  0.017342
   2  0.094680  0.090065
   3  0.258063  0.217858
   4  0.482973  0.328061
   5  0.705223  0.337023
   6  0.867614  0.242590
   7  0.955244  0.121250
   8  0.989338  0.040278
   9  0.998409  0.008022
  10  0.999888  0.000727
  11  1.000000  -0.000000
  12  1.000000  -0.000000
  13  1.000000  -0.000000
  14  1.000000  -0.000000
  15  1.000000  -0.000000
  16  1.000000  0.000000
  17  1.000000  0.000000
Out[28]: (array([ 1.,  0.,  0., ...,  0.,  0., -0.]), 0)
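
A preconditioner can noticeably accelerate these methods. As a sketch, we may reuse the incomplete factorization invA computed with spilu earlier, wrapped as a LinearOperator whose action approximates the inverse of A:

M = spspla.LinearOperator((1024, 1024), matvec=invA.solve)   # M approximates the inverse of A
x, info = spspla.gmres(A_csc, np.ones(1024), x0=np.zeros(1024), M=M)
# info == 0 signals successful convergence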

Least squares

Given a generic matrix A (not necessarily square) and a right-hand side vector/matrix b, we look for a vector/matrix x such that the Frobenius norm of the expression A ● x - b is minimized.

The three main methods to solve this problem numerically are implemented in SciPy:

  • Normal equations

  • QR factorization

  • Singular value decomposition

Normal equations

Normal equations reduce the least squares problem to solving a basic system of linear equations with a symmetric (though not necessarily positive definite) matrix. This approach is very fast but can be inaccurate due to the presence of roundoff errors. Basically, it amounts to solving the system (A^H ● A) ● x = A^H ● b. This is equivalent to computing x = (A^H ● A)^-1 ● A^H ● b = pinv(A) ● b.

Let us show by example:

In [29]: E = D[:512,:256]; b = np.ones(512)
In [30]: sol1 = np.dot(spla.pinv2(E), b)
In [31]: sol2 = spla.solve(np.dot(E.T, E), np.dot(E.T, b))

QR factorization

The QR factorization turns any matrix into the product A = Q ● R of an orthogonal/unitary matrix Q with an upper triangular matrix R. This allows us to solve the system without the need to invert any matrix (since Q^H = Q^-1), and thus, A ● x = b turns into R ● x = Q^H ● b, which is easily solvable by back substitution. Note that the two methods below are equivalent, since the mode economic reports the sub-matrices of maximum rank:

In [32]: Q, R = spla.qr(E); \
   ....: RR = R[:256, :256]; BB = np.dot(Q.T, b)[:256]; \
   ....: sol3 = spla.solve_triangular(RR, BB)
In [32]: Q, R = spla.qr(E, mode='economic'); \
   ....: sol3 = spla.solve_triangular(R, np.dot(Q.T, b))

Singular value decomposition

Both methods of normal equations and QR factorization work fast and are reliable only when the rank of A is full. If this is not the case, we must use singular value decomposition A = U ● D ● V^H with unitary matrices U and V and a diagonal matrix D, where all the entries in the diagonal are positive values. This allows for a fast solution x = V ● D^-1 ● U^H ● b.

Note that the two methods discussed below are equivalent, since the option full_matrices set to False reports the sub-matrices of the minimum possible size:

In [33]: U, s, Vh = spla.svd(E); \
   ....: Uh = U.T; \
   ....: Si = spla.diagsvd(1./s, 256, 256); \
   ....: V = Vh.T; \
   ....: sol4 = np.dot(V, Si).dot(np.dot(Uh, b)[:256])  
In [33]: U, s, Vh = spla.svd(E, full_matrices=False); \
   ....: Uh = U.T; \
   ....: Si = spla.diagsvd(1./s, 256, 256); \
   ....: V = Vh.T; \
   ....: sol4 = np.dot(V, Si).dot(np.dot(Uh, b))

The module scipy.linalg has one function that actually performs least squares with the SVD method: lstsq. There is no need to manually transpose, invert, and multiply all the required matrices. It is a wrapper to the LAPACK function GELSS. It outputs the desired solution, together with the residuals of the computation, the effective rank, and the singular values of the input matrix A.

In [34]: sol5, residue, rank, s = spla.lstsq(E, b)

Note how all the computations that we have carried out offer solutions that are very close to each other (if not equal!):

In [35]: map(lambda x: np.allclose(sol5,x), [sol1, sol2, sol3, sol4])
Out[35]: [True, True, True, True]

Regularized least squares

The module scipy.sparse.linalg has two iterative methods for least squares in the context of large sparse matrices, lsqr and lsmr, which allow for a more generalized version with a damping factor d for regularization. We seek to minimize the functional norm(A * x - b, 'f')**2 + d**2 * norm(x, 'f')**2. The usage and parameters are very similar to the iterative functions we studied before.
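
A minimal sketch of a damped solve with lsqr follows (reusing E and b from the previous examples, converted to a sparse format; the damping value and the names E_sparse and sol6 are just illustrative):

E_sparse = spsp.csc_matrix(E)
sol6 = spspla.lsqr(E_sparse, b, damp=0.01)[0]   # the first element of the returned tuple is x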

Other matrix equation solvers

The rest of the matrix equation solvers are summarized in the following table. None of these routines offer parameters to tune performance or memory management, or to check the integrity of the data:

Constructor                         Description
solve_sylvester(A, B, Q)            Sylvester equation
solve_continuous_are(A, B, Q, R)    Continuous algebraic Riccati equation
solve_discrete_are(A, B, Q, R)      Discrete algebraic Riccati equation
solve_lyapunov(A, Q)                Continuous Lyapunov equation
solve_discrete_lyapunov(A, Q)       Discrete Lyapunov equation
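
As a closing sketch, a small Sylvester equation can be solved and verified as follows (the matrices M1, M2, and Q are purely illustrative):

M1 = np.array([[1., 2.], [0., 3.]])
M2 = np.array([[4., 0.], [1., 5.]])
Q  = np.eye(2)
X  = spla.solve_sylvester(M1, M2, Q)
np.allclose(M1.dot(X) + X.dot(M2), Q)   # True: X solves M1 . X + X . M2 = Q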