#### Overview of this book

Mastering SciPy
Credits
www.PacktPub.com
Preface
Free Chapter
Numerical Linear Algebra
Interpolation and Approximation
Differentiation and Integration
Nonlinear Equations and Optimization
Initial Value Problems for Ordinary Differential Equations
Computational Geometry
Descriptive Statistics
Inference and Data Analysis
Mathematical Imaging
Index

## Matrix factorizations related to solving matrix equations

The concept of matrix decompositions is what makes Numerical Linear Algebra an efficient tool in Scientific Computing. If the matrix representing a problem is simple enough, any basic generic algorithm can find the solutions optimally (that is, fast, with minimal storage of data, and without a significant roundoff error). But, in real life, this situation seldom occurs. What we do in the general case is finding a suitable matrix factorization and tailoring an algorithm that is optimal on each factor, thus gaining on each step an obvious advantage. In this section, we explore the different factorizations included in the modules `scipy.linalg` and `scipy.sparse.linalg` that help us achieve a robust solution to matrix equations.

### Relevant factorizations

We have the following factorizations in this category:

#### Pivoted LU decomposition

It is always possible to perform a factorization of a square matrix `A` as a product A = P ● L ● U of a permutation matrix `P` (which performs a permutation of the rows of `A`), a lower triangular matrix `L`, and an upper triangular matrix `U`:

Constructor

Description

`scipy.linalg.lu(A)`

Pivoted LU decomposition

`scipy.linalg.lu_factor(A)`

Pivoted LU decomposition

`scipy.sparse.linalg.splu(A)`

Pivoted LU decomposition

`scipy.sparse.linalg.spilu(A)`

Incomplete pivoted LU decomposition

#### Cholesky decomposition

For a square, symmetric, and positive definite matrix `A`, we can realize the matrix as the product A = UT ● U of an upper triangular matrix `U` with its transpose, or as the product A = LT ● L of a lower triangular matrix `L` with its transpose. All the diagonal entries of `U` or `L` are strictly positive numbers:

Constructor

Description

`scipy.linalg.cholesky(A)`

Cholesky decomposition

`scipy.linalg.cholesky_banded(AB)`

Cholesky decomposition for Hermitian positive-definite banded matrices

#### QR decomposition

We can realize any matrix of size m × n as the product A=Q ● R of a square orthogonal matrix `Q` of size m × m, with an upper triangular matrix `R` of the same size as `A`.

Constructor

Description

`scipy.linalg.qr(A)`

QR decomposition of a matrix

#### Singular value decomposition

We can realize any matrix `A` as the product A = U ● D ● VH of a unitary matrix `U` with a diagonal matrix `D` (where all entries in the diagonal are positive numbers), and the Hermitian transpose of another unitary matrix `V`. The values on the diagonal of `D` are called the singular values of `A`.

Constructor

Description

`scipy.linalg.svd(A)`

Singular value decomposition

`scipy.linalg.svdvals(A)`

Singular values

`scipy.linalg.diagsvd(s, m, n)`

Diagonal matrix of an SVD, from singular values `s` and prescribed size

`scipy.sparse.linalg.svds(A)`

Largest `k` singular values/vectors of a sparse matrix

### Matrix equations

In SciPy, we have robust algorithms to solve any matrix equation based on the following cases:

• Given a square matrix `A`, and a right-hand side `b` (which can be a one-dimensional vector or another matrix with the same number of rows as `A`), the basic systems are as follows:

• A ● x = b

• AT ● x = b

• AH ● x = b

• Given any matrix `A` (not necessarily square) and a right-hand side vector/matrix `b` of an appropriate size, the least squares solution to the equation A ● x = b. This is, finding a vector `x` that minimizes the Frobenius norm of the expression A ● x - b.

• For the same case as before, and an extra damping coefficient `d`, the regularized least squares solution to the equation A ● x = b that minimizes the functional `norm(A * x - b, 'f')**2 + d**2 * norm(x, 'f')**2`.

• Given square matrices `A` and `B`, and a right-hand side matrix `Q` with appropriate sizes, the Sylvester system is A ● X + X ● B = Q.

• For a square matrix `A` and matrix `Q` of an appropriate size, the continuous Lyapunov equation is A ● X + X ● AH = Q.

• For matrices `A` and `Q`, as in the previous case, the discrete Lyapunov equation is X - A ● X ● AH = Q.

• Given square matrices `A`, `Q`, and `R`, and another matrix `B` with an appropriate size, the continuous algebraic Riccati equation is AT ● X + X ● A - X ● B ● R-1 ● BT ● X + Q = 0.

• For matrices as in the previous case, the Discrete Algebraic Riccati equation is X = AT ● X ● A - (AT ● X ● B) ● (R+BT ● X ● B)-1 ● (BT ● X ● A) + Q.

In any case, mastering matrix equations with SciPy basically means identifying the matrices involved and choosing the most adequate algorithm in the libraries to perform the requested operations. Besides being able to compute a solution with the least possible amount of roundoff error, we need to do so in the fastest possible way, and by using as few memory resources as possible.

#### Back and forward substitution

Let us start with the easiest possible case: The basic system of linear equations A ● x = b (or the other two variants), where `A` is a generic lower or upper triangular square matrix. In theory, these systems are easily solved by forward substitution (for lower triangular matrices) or back substitution (for upper triangular matrices). In SciPy, we accomplish this task with the function `solve_triangular` in the module `scipy.linalg`.

For this initial example, we will construct `A` as a lower triangular Pascal matrix of size 1024 × 1024, where the non-zero values have been filtered: odd values are turned into ones, while even values are turned into zeros. The right-hand side `b` is a vector with `1024` ones.

```In [1]: import numpy as np, \
...: scipy.linalg as spla, scipy.sparse as spsp, \
...: scipy.sparse.linalg as spspla
In [2]: A = (spla.pascal(1024, kind='lower')%2 != 0)
In [3]: %timeit spla.solve_triangular(A, np.ones(1024))
10 loops, best of 3: 6.64 ms per loop
```

To solve the other related systems that involve the matrix `A`, we employ the optional parameter `trans` (by default set to `0` or `N`, giving the basic system A ● x = b). If `trans` is set to `T` or `1`, we solve the system AT ● x = b instead. If `trans` is set to `C` or `2`, we solve AH ● x = b instead.

### Note

The function `solve_triangular` is a wrapper for the `LAPACK` function `trtrs`.

#### Basic systems: banded matrices

The next cases in terms of algorithm simplicity are those of basic systems A ● x = b, where `A` is a square banded matrix. We use the routines `solve_banded` (for a generic banded matrix) or `solveh_banded` (for a generic real symmetric of complex Hermitian banded matrix). Both of them belong to the module `scipy.linalg`.

### Note

The functions `solve_banded` and `solveh_banded` are wrappers for the `LAPACK` functions `GBSV`, and `PBSV`, respectively.

Neither function accepts a matrix in the usual format. For example, since `solveh_banded` expects a symmetric banded matrix, the function requires as input only the elements of the diagonals on and under/over the main diagonal, stored sequentially from the top to the bottom.

This input method is best explained through a concrete example. Take the following symmetric banded matrix:

``` 2 -1  0  0  0  0
-1  2 -1  0  0  0
0 -1  2 -1  0  0
0  0 -1  2 -1  0
0  0  0 -1  2 -1
0  0  0  0 -1  2
```

The size of the matrix is 6 × 6, and there are only three non-zero diagonals, two of which are identical due to symmetry. We collect the two relevant non-zero diagonals in `ndarray` of size 2 × 6 in one of two ways, as follows:

• If we decide to input the entries from the upper triangular matrix, we collect first the diagonals from the top to the bottom (ending in the main diagonal), right justified:

```* -1 -1 -1 -1 -1
2  2  2  2  2  2
```
• If we decide to input the entries from the lower triangular matrix, we collect the diagonals from the top to the bottom (starting from the main diagonal), left justified:

``` 2  2  2  2  2  2
-1 -1 -1 -1 -1  *
In [4]: B_banded = np.zeros((2,6)); \
...: B_banded[0,1:] = -1; \
...: B_banded[1,:] = 2
In [5]: spla.solveh_banded(B_banded, np.ones(6))
Out[5]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
```

For a non-symmetric banded square matrix, we use `solve_banded` instead, and the input matrix also needs to be stored in this special way:

• Count the number of non-zero diagonals under the main diagonal (set that to `l`). Count the number of non-zero diagonals over the main diagonal (set that to `u`). Set `r = l + u + 1`.

• If the matrix has size n × n, create `ndarray` with n columns and r rows. We refer to this storage as a matrix in the `AB` form, or an `AB` matrix, for short.

• Store in the AB matrix only the relevant non-zero diagonals, from the top to the bottom, in order. Diagonals over the main diagonal are right justified; diagonals under the main diagonal are left justified.

Let us illustrate this process with another example. We input the following matrix:

``` 2 -1  0  0  0  0
-1  2 -1  0  0  0
3 -1  2 -1  0  0
0  3 -1  2 -1  0
0  0  3 -1  2 -1
0  0  0  3 -1  2
In [6]: C_banded = np.zeros((4,6)); \
...: C_banded[0,1:] = -1; \
...: C_banded[1,:] = 2; \
...: C_banded[2,:-1] = -1; \
...: C_banded[3,:-2] = 3; \
...: print C_banded
[[ 0. -1. -1. -1. -1. -1.]
[ 2.  2.  2.  2.  2.  2.]
[-1. -1. -1. -1. -1.  0.]
[ 3.  3.  3.  3.  0.  0.]]
```

To call the solver, we need to input manually the number of diagonals over and under the diagonal, together with the `AB` matrix and the right-hand side of the system:

```In [7]: spla.solve_banded((2,1), C_banded, np.ones(6))
Out[7]:
array([ 0.86842105,  0.73684211, -0.39473684,  0.07894737,
1.76315789,  1.26315789])
```

Let us examine the optional parameters that we can include in the call of these two functions:

Parameter

Default values

Description

`l_and_u`

`(int, int)`

Number of non-zero lower/upper diagonals

`ab`

Matrix in `AB` format

A banded square matrix

`b`

`ndarray`

Right-hand side

`overwrite_ab`

Boolean

Discard data in `ab`

`overwrite_b`

Boolean

Discard data in `b`

`check_finite`

Boolean

Whether to check that input matrices contain finite numbers

### Tip

All the functions in the `scipy.linalg` module that require matrices as input and output either a solution to a system of equations, or a factorization, have two optional parameters with which we need to familiarize: `overwrite_x` (for each matrix/vector in the input) and `check_finite`. They are both Boolean.

The `overwrite` options are set to `False` by default. If we do not care about retaining the values of the input matrices, we may use the same object in the memory to perform operations, rather than creating another object with the same size in the memory. We gain speed and use fewer resources in such a case.

The `check_finite` option is set to `True` by default. In the algorithms where it is present, there are optional checks for the integrity of the data. If at any given moment, any of the values is `(+/-)numpy.inf` or `NaN`, the process is halted, and an exception is raised. We may turn this option off, thus resulting in much faster solutions, but the code might crash if the data is corrupted at any point in the computations.

The function `solveh_banded` has an extra optional Boolean parameter, `lower`, which is initially set to `False`. If set to `True`, we must input the lower triangular matrix of the target AB matrix instead of the upper one (with the same input convention as before).

#### Basic systems: generic square matrices

For solutions of basic systems where `A` is a generic square matrix, it is a good idea to factorize `A` so that some (or all) of the factors are triangular and then apply back and forward substitution, where appropriate. This is the idea behind pivoted `LU` and Cholesky decompositions.

If matrix `A` is real symmetric (or complex Hermitian) and positive definite, the optimal strategy goes through applying any of the two possible Cholesky decompositions A = UH ● U or A = L ● LH with the `U` and `L` upper/lower triangular matrices.

For example, if we use the form with the upper triangular matrices, the solution of the basic system of equations A ● x = b turns into UH ● U ● x = b. Set y = U ● x and solve the system UH ● y = b for `y` by forward substitution. We have now a new triangular system U ● x = y that we solve for `x`, by back substitution.

To perform the solution of such a system with this technique, we first compute the factorization by using either the functions `cholesky`, `cho_factor` or `cholesky_banded`. The output is then used in the solver `cho_solve`.

For Cholesky decompositions, the three relevant functions called `cholesky`, `cho_factor`, and `cholesky_banded` have a set of options similar to those of `solveh_banded`. They admit an extra Boolean option lower (set by default to `False`) that decides whether to output a lower or an upper triangular factorization. The function `cholesky_banded` requires a matrix in the `AB` format as input.

Let us now test the Cholesky decomposition of matrix `B` with all three methods:

```In [8]: B = spsp.diags([[-1]*5, [2]*6, [-1]*5], [-1,0,1]).todense()
...: print B
[[ 2. -1.  0.  0.  0.  0.]
[-1.  2. -1.  0.  0.  0.]
[ 0. -1.  2. -1.  0.  0.]
[ 0.  0. -1.  2. -1.  0.]
[ 0.  0.  0. -1.  2. -1.]
[ 0.  0.  0.  0. -1.  2.]]
In [9]: np.set_printoptions(suppress=True, precision=3)
In [10]: print spla.cholesky(B)
[[ 1.414 -0.707  0.     0.     0.     0.   ]
[ 0.     1.225 -0.816  0.     0.     0.   ]
[ 0.     0.     1.155 -0.866  0.     0.   ]
[ 0.     0.     0.     1.118 -0.894  0.   ]
[ 0.     0.     0.     0.     1.095 -0.913]
[ 0.     0.     0.     0.     0.     1.08 ]]
In [11]: print spla.cho_factor(B)[0]
[[ 1.414 -0.707  0.     0.     0.     0.   ]
[-1.     1.225 -0.816  0.     0.     0.   ]
[ 0.    -1.     1.155 -0.866  0.     0.   ]
[ 0.     0.    -1.     1.118 -0.894  0.   ]
[ 0.     0.     0.    -1.     1.095 -0.913]
[ 0.     0.     0.     0.    -1.     1.08 ]]
In [12]: print spla.cholesky_banded(B_banded)
[[ 0.    -0.707 -0.816 -0.866 -0.894 -0.913]
[ 1.414  1.225  1.155  1.118  1.095  1.08 ]]
```

The output of `cho_factor` is a tuple: the second element is the Boolean lower. The first element is `ndarray` representing a square matrix. If `lower` is set to `True`, the lower triangular sub-matrix of this `ndarray` is `L` in the Cholesky factorization of `A`. If `lower` is set to `False`, the upper triangular sub-matrix is `U` in the factorization of `A`. The remaining elements in the matrix are random, instead of zeros, since they are not used by `cho_solve`. In a similar way, we can call `cho_solve_banded` with the output of `cho_banded` to solve the appropriate system.

### Note

Both `cholesky` and `cho_factor` are wrappers to the same LAPACK function called `potrf`, with different output options. `cholesky_banded` calls `pbtrf`. The `cho_solve` function is a wrapper for `potrs`, and `cho_solve_banded` calls `pbtrs`.

We are then ready to solve the system, with either of the two options:

```In [13]: spla.cho_solve((spla.cholesky(B), False), np.ones(6))
Out[13]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
In [13]: spla.cho_solve(spla.cho_factor(B), np.ones(6))
Out[13]: array([ 3.,  5.,  6.,  6.,  5.,  3.])
```

For any other kind of generic square matrix `A`, the next best method to solve the basic system A ● x = b is pivoted `LU` factorization. This is equivalent to finding a permutation matrix `P`, and triangular matrices `U` (upper) and `L` (lower) so that P ● A = L ● U. In such a case, a permutation of the rows in the system according to `P` gives the equivalent equation (P ● A) ● x = P ● b. Set `c = P ● b` and `y = U ● x`, and solve for `y` in the system L ● y = c using forward substitution. Then, solve for `x` in the system U ● x = y with back substitution.

The relevant functions to perform this operation are `lu`, `lu_factor` (for factorization), and `lu_solve` (for solution) in the module `scipy.linalg`. For sparse matrices we have `splu`, and `spilu`, in the module `scipy.sparse.linalg`.

Let us start experimenting with factorizations first. We use a large circulant matrix (non-symmetric) for this example:

```In [14]: D = spla.circulant(np.arange(4096))
In [15]: %timeit spla.lu(D)
1 loops, best of 3: 7.04 s per loop
In [16]: %timeit spla.lu_factor(D)
1 loops, best of 3: 5.48 s per loop
```

### Note

The `lu_factor` function is a wrapper to all `*getrf` routines from LAPACK. The `lu_solve` function is a wrapper for `getrs`.

The function `lu` has an extra Boolean option: `permute_l` (set to `False` by default). If set to `True`, the function outputs only two matrices PL = P ● L (the properly permuted lower triangular matrix), and `U`. Otherwise, the output is the triple `P`, `L`, `U`, in that order.

```In [17]: P, L, U = spla.lu(D)
In [17]: PL, U = spla.lu(D, permute_l=True)
```

The outputs of the function `lu_factor` are resource-efficient. We obtain a matrix `LU`, with upper triangle `U` and lower triangle `L`. We also obtain a one-dimensional `ndarray` class of integer `dtype`, `piv`, indicating the pivot indices representing the permutation matrix `P`.

```In [18]: LU, piv = spla.lu_factor(D)
```

The solver `lu_solve` takes the two outputs from `lu_factor`, a right-hand side matrix `b`, and the optional indicator `trans` to the kind of basic system to solve:

```In [19]: spla.lu_solve(spla.lu_factor(D), np.ones(4096))
Out[19]: array([ 0.,  0.,  0., ...,  0.,  0.,  0.])
```

### Tip

At this point, we must comment on the general function `solve` in the module `scipy.linalg`. It is a wrapper to both `LAPACK` functions `POSV` and `GESV`. It allows us to input matrix `A` and right-hand side matrix `b`, and indicate whether `A` is symmetric and positive definite. In any case, the routine internally decides which of the two factorizations to use (Cholesky or pivoted LU), and computes a solution accordingly.

For large sparse matrices, provided they are stored in the CSC format, the pivoted `LU` decomposition is more efficiently performed with either functions `splu` or `spilu` from the module `scipy.sparse.linalg`. Both functions use the `SuperLU` library directly. Their output is not a set of matrices, but a Python object called `scipy.sparse.linalg.dsolve._superlu.SciPyLUType`. This object has four attributes and one instance method:

• `shape`: 2-tuple containing the shape of matrix `A`

• `nnz`: The number of non-zero entries in matrix `A`

• `perm_c, perm_r`: The permutations applied to the columns and rows (respectively) to the matrix `A` to obtain the computed `LU` decomposition

• `solve`: instance method that converts the object into a function `object.solve(b,trans)` accepting `ndarray b`, and the optional description string `trans`.

The big idea is that, dealing with large amounts of data, the actual matrices in the `LU` decomposition are not as important as the main application behind the factorization: the solution of the system. All the relevant information to perform this operation is optimally stored in the object's method `solve`.

The main difference between `splu` and `spilu` is that the latter computes an incomplete decomposition. With it, we can obtain really good approximations to the inverse of matrix `A`, and use matrix multiplication to compute the solution of large systems in a fraction of the time that it would take to calculate the actual solution.

### Note

The usage of these two functions is rather complex. The purpose is to compute a factorization of the form Pr*Dr*A*Dc*Pc = L*U with diagonal matrices `Dr` and `Dc` and permutation matrices `Pr` and `Pc`. The idea is to equilibrate matrix `A` manually so that the product B = Dr*A*Dc is better conditioned than `A`. In case of the possibility of solving this problem in a parallel architecture, we are allowed to help by rearranging the rows and columns optimally. The permutation matrices `Pr` and `Pc` are then manually input to pre-order the rows and columns of `B`. All of these options can be fed to either `splu` or `spilu`.

The algorithm exploits the idea of relaxing supernodes to reduce inefficient indirect addressing and symbolic time (besides permitting the use of higher-level BLAS operations). We are given the option to determine the degree of these objects, to tailor the algorithm to the matrix at hand.

For a complete explanation of the algorithms and all the different options, the best reference is SuperLU User Guide, which can be found online at crd-legacy.lbl.gov/~xiaoye/SuperLU/superlu_ug.pdf.

Let us illustrate this with a simple example, where the permutation of rows or columns is not needed. In a large lower triangular Pascal matrix, turn into zero all the even-valued entries and into ones all the odd-valued entries. Use this as matrix `A`. For the right-hand side, use a vector of ones:

```In [20]: A_csc = spsp.csc_matrix(A, dtype=np.float64)
In [21]: invA = spspla.splu(A_csc)
In [22]: %time invA.solve(np.ones(1024))
CPU times: user: 4.32 ms, sys: 105 µs, total: 4.42 ms
Wall time: 4.44 ms
Out[22]: array([ 1., -0.,  0., ..., -0.,  0.,  0.])
In [23]: invA = spspla.spilu(A_csc)
In [24]: %time invA.solve(np.ones(1024))
CPU times: user 656 µs, sys: 22 µs, total: 678 µs
Wall time: 678 µs
Out[24]: array([ 1.,  0.,  0., ...,  0.,  0.,  0.])
```

### Note

Compare the time of execution of the procedures on sparse matrices, with the initial `solve_triangular` procedure on the corresponding matrix `A` at the beginning of the section. Which process is faster?

However, in general, if a basic system must be solved and matrix `A` is large and sparse, we prefer to use iterative methods with fast convergence to the actual solutions. When they converge, they are consistently less sensitive to rounding-off errors and thus more suitable when the number of computations is extremely high.

In the module `scipy.sparse.linalg`, we have eight different iterative methods, all of which accept the following as parameters:

• Matrix `A` in any format (matrix, ndarray, sparse matrix, or even a linear operator!), and right-hand side vector/matrix `b` as `ndarray`.

• Initial guess `x0`, as `ndarray`.

• Tolerance to `l`, a floating point number. If the difference of successive iterations is less than this value, the code stops and the last computed values are output as the solution.

• Maximum number of iterations allowed, maxiter, an integer.

• A Preconditioner sparse matrix `M` that should approximate the inverse of `A`.

• A `callback` function of the current solution vector `xk`, called after each iteration.

Constructor

Description

`bicg`

`bicgstab`

`cg`

`cgs`

`gmres`

Generalized Minimal Residual Iteration

`lgmres`

LGMRES Iteration

`minres`

Minimum Residual Iteration

`qmr`

Quasi-minimal Residual Iteration

Choosing the right iterative method, a good initial guess, and especially a successful Preconditioner is an art in itself. It involves learning about topics such as operators in Functional Analysis, or Krylov subspace methods, which are far beyond the scope of this book. At this point, we are content with showing a few simple examples for the sake of comparison:

```In [25]: spspla.cg(A_csc, np.ones(1024), x0=np.zeros(1024))
Out[25]: (array([ nan,  nan,  nan, ...,  nan,  nan,  nan]), 1)
In [26]: %time spspla.gmres(A_csc, np.ones(1024), x0=np.zeros(1024))
CPU times: user 4.26 ms, sys: 712 µs, total: 4.97 ms
Wall time: 4.45 ms
Out[26]: (array([ 1.,  0.,  0., ..., -0., -0.,  0.]), 0)
In [27]: Nsteps = 1
....: def callbackF(xk):
....:     global Nsteps
....:     print'{0:4d}  {1:3.6f}  {2:3.6f}'.format(Nsteps, \
....:     xk[0],xk[1])
....:     Nsteps += 1
....:
In [28]: print '{0:4s}  {1:9s}  {1:9s}'.format('Iter', \
....: 'X[0]','X[1]'); \
....: spspla.bicg(A_csc, np.ones(1024), x0=np.zeros(1024),
....:             callback=callbackF)
....:
Iter  X[0]       X[1]
1  0.017342  0.017342
2  0.094680  0.090065
3  0.258063  0.217858
4  0.482973  0.328061
5  0.705223  0.337023
6  0.867614  0.242590
7  0.955244  0.121250
8  0.989338  0.040278
9  0.998409  0.008022
10  0.999888  0.000727
11  1.000000  -0.000000
12  1.000000  -0.000000
13  1.000000  -0.000000
14  1.000000  -0.000000
15  1.000000  -0.000000
16  1.000000  0.000000
17  1.000000  0.000000
Out[28]: (array([ 1.,  0.,  0., ...,  0.,  0., -0.]), 0)
```

#### Least squares

Given a generic matrix `A` (not necessarily square) and a right-hand side vector/matrix `b`, we look for a vector/matrix `x` such that the Frobenius norm of the expression A ● x - b is minimized.

The main three methods to solve this problem numerically are contemplated in `scipy`:

• Normal equations

• QR factorization

• Singular value decomposition

##### Normal equations

Normal equations reduce the least square problem to solving a basic system of linear equations, with a symmetric (not-necessarily positive-definite) matrix. It is very fast but can be inaccurate due to presence of roundoff errors. Basically, it amounts to solving the system (AH ● A) ● x = AH ● b. This is equivalent to solving x = (AH ● A)-1 ● AH ● b = pinv(A) ● b.

Let us show by example:

```In [29]: E = D[:512,:256]; b = np.ones(512)
In [30]: sol1 = np.dot(spla.pinv2(E), b)
In [31]: sol2 = spla.solve(np.dot(F.T, F), np.dot(F.T, b))
```
##### QR factorization

The QR factorization turns any matrix into the product A = Q ● R of an orthogonal/unitary matrix `Q` with a square upper triangular matrix `R`. This allows us to solve the system without the need to invert any matrix (since QH = Q-1), and thus, A ● x = b turns into R ● x = QH ● b, which is easily solvable by back substitution. Note that the two methods below are equivalent, since the mode `economic` reports the sub-matrices of maximum rank:

```In [32]: Q, R = spla.qr(E); \
....: RR = R[:256, :256]; BB = np.dot(Q.T, b)[:256]; \
....: sol3 = spla.solve_triangular(RR, BB)
In [32]: Q, R = spla.qr(E, mode='economic'); \
....: sol3 = spla.solve_triangular(R, np.dot(Q.T, b))
```
##### Singular value decomposition

Both methods of normal equations and QR factorization work fast and are reliable only when the rank of `A` is full. If this is not the case, we must use singular value decomposition A = U ● D ● VH with unitary matrices `U` and `V` and a diagonal matrix `D`, where all the entries in the diagonal are positive values. This allows for a fast solution x = V ● D-1 ● UH ● b.

Note that the two methods discussed below are equivalent, since the option `full_matrices` set to `False` reports the sub-matrices of the minimum possible size:

```In [33]: U, s, Vh = spla.svd(E); \
....: Uh = U.T; \
....: Si = spla.diagsvd(1./s, 256, 256); \
....: V = Vh.T; \
....: sol4 = np.dot(V, Si).dot(np.dot(Uh, b)[:256])
In [33]: U, s, Vh = spla.svd(E, full_matrices=False); \
....: Uh = U.T; \
....: Si = spla.diagsvd(1./s, 256, 256); \
....: V = Vh.T; \
....: sol4 = np.dot(V, Si).dot(np.dot(Uh, b))
```

The module `scipy.linalg` has one function that actually performs least squares with the SVD method: `lstsq`. There is no need to manually transpose, invert, and multiply all the required matrices. It is a wrapper to the `LAPACK` function `GELSS`. It outputs the desired solution, together with the residues of computation, the effective rank, and the singular values of the input matrix `A`.

```In [34]: sol5, residue, rank, s = spla.lstsq(E, b)
```

Note how all the computations that we have carried out offer solutions that are very close to each other (if not equal!):

```In [35]: map(lambda x: np.allclose(sol5,x), [sol1, sol2, sol3, sol4])
Out[35]: [True, True, True, True]
```

#### Regularized least squares

The module `scipy.sparse.linalg` has two iterative methods for least squares in the context of large sparse matrices, `lsqr` and `lsmr`, which allow for a more generalized version with a damping factor `d` for regularization. We seek to minimize the functional `norm(A * x - b, 'f')**2 + d**2 * norm(x, 'f')**2`. The usage and parameters are very similar to the iterative functions we studied before.

#### Other matrix equation solvers

The rest of the matrix equation solvers are summarized in the following table. None of these routines enjoy any parameters to play around with performance or memory management, or check for the integrity of data:

Constructor

Description

`solve_sylvester(A, B, Q)`

Sylvester equation

`solve_continuous_are(A, B, Q, R)`

continuous algebraic Riccati equation

`solve_discrete_are(A, B, Q, R)`

discrete algebraic Riccati equation

`solve_lyapunov(A, Q)`

continuous Lyapunov equation

`solve_discrete_lyapunov(A, Q)`

discrete Lyapunov equation