Gaussian Elimination and Backsubstitution Tutorial¶

Many statistical procedures such as least squares involve solving large systems of equations. We saw a numerical algorithm to accomplish this task by splitting the computation into two subroutines. As we discussed, we form an augmented matrix $[A\ b]$. We then:

i) reduce the system into upper triangular form

ii) back substitute the values of the variables to find a solution.

Let's define a matrix A to represent the left side of the system of equations:

\begin{align} x_1 + 2x_2 + x_3 &= 5\\ 3x_2 + 2x_2 + 4x_3 &= 17 \\ 4x_1 + 4x_2 + 3x_3 &= 26 \end{align}

import numpy as np

A = np.array([[1,2,1],[3,2,4],[4,4,3]])

b = np.array([5,17,26])

Can you combine the system into an augmented matrix using a numpy function? It should look like this:

system = np.array([[1,2,1,5],[3,2,4,17],[4,4,3,26]])
print system

[[ 1  2  1  5]
 [ 3  2  4 17]
 [ 4  4  3 26]]

check the dimensions:

system.shape

(3, 4)

# colon in the first position means include all rows
# we're only selecting the last one
b = system[:,system.shape[1]-1]
b

array([ 5, 17, 26])

# include all rows, all columns except for the last one
a = system[:,:system.shape[1]-1]
a

array([[1, 2, 1],
       [3, 2, 4],
       [4, 4, 3]])

Here was the code I gave you based on the algorithm we derived in class. Did you find the bug?

def solver_w_typo(system):
    """
    Write a solver that takes as input a system of equations via an augmented matrix
    and returns the system in upper triangular form.
    """
    # define a and b from the augmented matrix
    b = system[:,system.shape[1]-1]
    a = system[:,:system.shape[1]-1]
    
    
    m = np.zeros([a.shape[0],a.shape[1]])
    for j in range(a.shape[1]-1):
        print "j", j
        for i in range(j+1,a.shape[0]):
            print "i", i
            # a[i,j] is the component of the variable we want to eliminate
            # a[j,j] is the component we are multiplying
            # the multiplier is defined by the ratio of the two
            print "a[i,j]", a[i,j], "\na[j,j]", a[j,j]
            m[i,j] = a[i,j]/a[j,j]
            print "m[i,j]", m[i,j]
            for k in range(j, a.shape[0]):
                print "k", k
                print "a[i,k]", a[i,k]
                a[i,k] = a[i,k] - m[i,j]*a[j,k]
                print "new a[i,k]", a[i,k]
            b[i] = b[i] - m[i,j]*b[j]
    
    U = np.zeros([system.shape[0],system.shape[1]])
    U[:,:system.shape[1]-1] = a
    U[:,system.shape[1]-1] = b
    return U

The index in the third for loop should start at j not at j+1. Here it is correct:

def solver(system):
    """
    Write a solver that takes as input a system of equations via an augmented matrix 
    and returns the system in upper triangular form.
    -----
    Input 
        * system : Augmented matrix of the form [A b]
    Output 
        * U      : A matrix in upper triangular form
        * a      : RHS of system upper triangular matrix
        * b      : LHS of system
    """
    # define a and b from the augmented matrix
    b = system[:,system.shape[1]-1]
    a = system[:,:system.shape[1]-1]
    
    m = np.zeros([a.shape[0],a.shape[1]])
    # iterate over columns
    for j in range(a.shape[1]-1):
        # iterate over rows
        for i in range(j+1,a.shape[0]):
            # a[i,j] is the component of the variable we want to eliminate
            # a[j,j] is the component we are multiplying
            # the multiplier is defined by the ratio of the two
            m[i,j] = a[i,j]/a[j,j]
            # iterate over components of the equation
            for k in range(j, a.shape[0]):
                a[i,k] = a[i,k] - m[i,j]*a[j,k]
            # update the right side of the augmented system
            b[i] = b[i] - m[i,j]*b[j]
    
    # store the results in a new upper triangular matrix U
    U = np.zeros([system.shape[0],system.shape[1]])
    U[:,:system.shape[1]-1] = a
    U[:,system.shape[1]-1] = b
    return U,a,b

Let's test it out:

U,a,b = solver(system)

print "the system in upper triangular form:\n", U

the system in upper triangular form:
[[ 1.  2.  1.  5.]
 [ 0. -4.  1.  2.]
 [ 0.  0. -2.  4.]]

print "the RHS of the system:\n", b

the RHS of the system:
[5 2 4]

We now need to plug in our values of x iteratively, starting with $x_n$. This algorithm is known as backsubstitution:

def backsubstitute(U,y):
    x = np.zeros(U.shape[0])
    for ind in range(U.shape[0]):
        i = U.shape[0] - ind - 1
        x[i] = y[i]
        for j in range(i+1,U.shape[0]):
            x[i] = x[i] - U[i,j]*x[j]
        x[i] = x[i]/U[i,i]
    return x

Let's try it out on a larger example:

aug_mat1 = np.array([[1,2,1,-1,5],[3,2,4,4,16],[4,4,3,4,22],[2,0,1,5,15]])

print aug_mat1

[[ 1  2  1 -1  5]
 [ 3  2  4  4 16]
 [ 4  4  3  4 22]
 [ 2  0  1  5 15]]

U,a,b = solver(aug_mat1)

print "the system in upper triangular form:\n", U

the system in upper triangular form:
[[ 1.  2.  1. -1.  5.]
 [ 0. -4.  1.  7.  1.]
 [ 0.  0. -2.  1.  1.]
 [ 0.  0.  0. -1.  3.]]

Calling backsubstitute on our upper triangular matrix and RHS of the system returns the solution:

backsubstitute(U,b)

array([ 16.,  -6.,  -2.,  -3.])

Try your own system of equations by entering large augmented matrices. Time your code and plot the running time as a function of input size for various size matrices [A b].

How do we know whether or not a system will have a solution? We also saw a function called the determinant:

\begin{equation} det(A) = \sum\limits_{\sigma \in S_n}^{}(-1)^{\# inv}\prod\limits_{i=1}^{n}a_{\sigma(i),i} \end{equation}

The summand on the left side refers to all permutations of rows (or equivalently columns) in the matrix. What is the order of the number of permutations? That is, how many permutations are there of rows (or columns) for an n x n matrix?

Can you write a function that takes as input a list (or a numpy array) and returns all permutations?

Recall that we defined the number of inversions as the number of pairs out of order in the list. That is, $ \forall i < j$, count all pairs such that $a[i] > a[j]$.

The final term is a product over all components of the permutation. That is, we multiply n components of A where the components are selected by first taking the row via the permutation and the column via the original index.

Your Assignment¶

Try our code on a few of your own systems of equations. Plot running time as a function of input size.
What is the order of the number of permutations? That is, how many permutations are there for n rows or columns?
Write a function that takes as input a list (or numpy array) and returns all permutations.
Write a function that takes as input a list and counts the number of inversions (or returns its parity).
What is the order of your algorithm for counting inversions? Can you beat $\mathcal{O}(n^2)$?
Write a function that computes the determinant using the pieces above. Why is Leibniz formula a poor way of solving a system of equations?