CRC Algorithm Notes  12-23-03  Rev. 4/24/04  S. H. Unger

P is the CRC polynomial, of degree n; e.g., D^3+D+I, also written
I+D+D^3, has n=3.  (D denotes a unit delay and I the identity.)
All addition here is modulo 2.

M is a sequence of k message bits.  Normally, k>n.  We can represent
M as a polynomial, with the leftmost bit as the coefficient of I;
e.g., 1001101 can be written as I+D^3+D^4+D^6.  In practice, the
polynomials used are usually of high degree, with 16 and 32 being
common choices.  The message blocks often consist of many kilobytes.
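
The bit-to-polynomial convention can be sketched in Python as follows.
The function name bits_to_poly is ours, chosen only for illustration;
the leftmost bit is taken as the coefficient of I.

```python
def bits_to_poly(bits):
    """Render a bit string (leftmost bit first) as a polynomial in D."""
    terms = []
    for power, bit in enumerate(bits):
        if bit == "1":
            if power == 0:
                terms.append("I")          # D^0 is written I in these notes
            elif power == 1:
                terms.append("D")
            else:
                terms.append(f"D^{power}")
    return "+".join(terms) if terms else "0"


print(bits_to_poly("1001101"))  # I+D^3+D^4+D^6
```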

We can divide one polynomial by another to obtain a quotient
polynomial Q and a remainder polynomial R.  M/P=Q+R/P, i.e., M=QP+R

Given a message M, the transmitter generates a check sequence (CS) by
first extending M by appending n 0's at the right end, and then
dividing the result (call it X) by P (which is very carefully chosen).
The remainder, R, is used as the CS for M.  It is n bits long and is
appended to the right end of M to form the message to be transmitted.
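
A minimal Python sketch of this step is given here, under the
assumptions that bits are held in lists with index i the coefficient of
D^i, and that P contains the term I.  The names mod2_divide and
check_sequence are illustrative.  The division cancels the lowest
remaining power of D first, as in the worked examples in these notes.

```python
def mod2_divide(x_bits, p_bits):
    """Divide X by P modulo 2, lowest power of D first.

    Returns (quotient, remainder); the remainder is what is left in the
    last n positions, where n is the degree of P.
    """
    n = len(p_bits) - 1
    work = list(x_bits)
    quotient = []
    for i in range(len(work) - n):
        q = work[i]
        quotient.append(q)
        if q:                              # cancel D^i by adding D^i * P
            for j, p in enumerate(p_bits):
                work[i + j] ^= p
    return quotient, work[len(work) - n:]


def check_sequence(m_bits, p_bits):
    """Append n 0's to M, divide by P, and return the n-bit remainder."""
    n = len(p_bits) - 1
    _, remainder = mod2_divide(m_bits + [0] * n, p_bits)
    return remainder


M = [1, 0, 0, 1, 1]        # 10011
P = [1, 1, 1]              # I+D+D^2
cs = check_sequence(M, P)
print(cs, M + cs)          # [1, 1] [1, 0, 0, 1, 1, 1, 1]
```

With M=10011 and P=I+D+D^2 this gives the CS 11 and the transmitted
word 1001111.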

At the receiving end, the received message Y, k+n bits long, is
divided by the same P.  If the remainder is not all 0's, then there
must have been an error.

This scheme will detect a very wide range of errors: all single errors
and a great many multiple errors and error bursts.  Even for a message
so corrupted that almost ALL its bits are changed, the likelihood of
its being accepted as error free is only about (1/2)^n.
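
These claims can be checked by brute force on a small case.  The sketch
below uses P=I+D+D^2 and the 7-bit word 1001111 (M=10011 followed by
its CS 11).  It assumes that a badly corrupted message behaves like an
arbitrary 7-bit word; the function names are illustrative.

```python
from itertools import product


def remainder_mod2(y_bits, p_bits):
    """Remainder of Y/P modulo 2, lowest power of D first.

    Index i is the coefficient of D^i; assumes P contains the term I.
    """
    n = len(p_bits) - 1
    work = list(y_bits)
    for i in range(len(work) - n):
        if work[i]:                        # cancel D^i by adding D^i * P
            for j, p in enumerate(p_bits):
                work[i + j] ^= p
    return work[len(work) - n:]


def accepted(y_bits, p_bits):
    """The receiver's test: accept only if the remainder is all 0's."""
    return not any(remainder_mod2(y_bits, p_bits))


P = [1, 1, 1]                  # I+D+D^2, n=2
Y = [1, 0, 0, 1, 1, 1, 1]      # 10011 followed by the CS 11

# Flip each bit of Y in turn; every such word must be rejected.
single_errors_caught = all(
    not accepted(Y[:pos] + [Y[pos] ^ 1] + Y[pos + 1:], P)
    for pos in range(len(Y))
)

# Count how many of all possible 7-bit words pass the check.
words = list(product([0, 1], repeat=len(Y)))
passing = sum(accepted(w, P) for w in words)

print(accepted(Y, P), single_errors_caught)        # True True
print(passing, len(words), passing / len(words))   # 32 128 0.25
```

Here 32 of the 128 words pass, a fraction of (1/2)^2, in line with the
(1/2)^n estimate for n=2.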

The polynomial divisions are done by linear feedback shift registers
either in actual hardware or by software simulations.

Example 1:
P = I+D+D^2, n=2
M=10011 = I+D^3+D^4, k=5
First generate the checking sequence CS

         I+D+  D^4
        ____________________ 
 I+D+D^2|I+D^3+D^4+0D^5+0D^6
         I+D+D^2
         --------
           D+D^2+D^3
           D+D^2+D^3
           -----------------
                D^4+0D^5+0D^6
                D^4+ D^5+ D^6
                -------------
                     D^5+ D^6  This is the remainder, 11

We append this to the end of M to obtain Y=1001111 and transmit this.
The receiver divides Y by P as below (where it is assumed that the
message is uncorrupted).


         I+D+  D^4
        ____________________ 
 I+D+D^2|I+D^3+D^4+D^5+D^6
         I+D+D^2
         --------
           D+D^2+D^3
           D+D^2+D^3
           -----------------
                D^4+D^5+D^6
                D^4+D^5+D^6
                -------------
                          0   No remainder.

The remainder is 0, so it is assumed that no errors occurred.

The division operations can be performed by a linear feedback
shift register (LFSR) as below.  It is described by:

Z=D^nX/P=D^2X/(1+D+D^2) 
Z+DZ+D^2Z=D^2X
Z=D(Z + D(Z + X))
                   y
Z--|---(D|---(+)----(D|-----(+)----X
   |__________|______________|

The generation of CS is as below

         X:   1001100-
 DX+DZ=  y:   01010101
 Dy+DZ=  Z:   00110011

The CS is the sequence Zy FOLLOWING the last input.

So it is generated correctly here as 11.  Note that the quotient bits
(Z), delayed by n=2, are correct.  (We don't use them.)  At the
receiver, the incoming message is checked as below:

         X:   1001111-
 DX+DZ=  y:   01010110
 Dy+DZ=  Z:   00110010

Again, the CS is found as Zy after the last input, or 00.
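
The two tables can be reproduced by a short software simulation of the
circuit.  This is a sketch that follows the equations y=D(X+Z) and
Z=D(y+Z) directly and starts with cleared delays; the name lfsr_trace
is illustrative.

```python
def lfsr_trace(x_bits):
    """Simulate y = D(X+Z), Z = D(y+Z), starting with cleared delays.

    Returns the y and Z traces, one value per time step, including the
    step just after the last input bit.
    """
    y, z = 0, 0
    y_trace, z_trace = [y], [z]
    for x in x_bits:
        y, z = x ^ z, y ^ z        # both delays update at the same time
        y_trace.append(y)
        z_trace.append(z)
    return y_trace, z_trace


for label, x in (("send", "1001100"), ("recv", "1001111")):
    y_trace, z_trace = lfsr_trace([int(b) for b in x])
    print(label,
          "y:", "".join(map(str, y_trace)),
          "Z:", "".join(map(str, z_trace)))
# send y: 01010101 Z: 00110011
# recv y: 01010110 Z: 00110010
```

The last entries of the Z and y traces are the delay contents after the
final input: 11 when generating the CS and 00 when checking.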

Why does this work?  Note that the LFSR is described by Z=D^nX/P, so,
basically its output is X/P, with Z corresponding to the quotient
bits, and, as will be shown below, the remainder, R, being in the
delays at the end.
1. If we set the LFSR input X=P, i.e., to a sequence corresponding to
the polynomial P (in this example 1+D+D^2), then the output will be an
"impulse" at time n (see discussion 1* below) and the storage elements
will be clear at n+1.  They have to be, or else the output would
continue with non-zero values.  The key point here is that applying P
leaves the delay outputs all at 0 at t=n+1.
2. If we apply X=D^iP, i.e., P delayed by i, we likewise end up with
the delay outputs clear at t=n+i+1, and they stay clear as long as X
remains 0.
3.  So if we apply the product of polynomials QP, where Q is any
polynomial of degree j, to our circuit (based on P), the result at
t=n+j+1 is that all delay outputs are clear, since QP is a sum of
terms of the form D^iP with i no greater than j.
4.  If we apply an input corresponding to the product of D^(j+1) and a
polynomial R of degree LESS than n, the contents of the delays
(reading from left to right) will be R at time t=n+j+1.  This is
because (assuming as always that we start with cleared delays) Z must
remain at 0 for the full transmission as the input stream pushes its
way through the n delays, which are effectively in cascade (since the
Z-inputs to the adders are all 0).
5.  So if we apply an input corresponding to QP+R, we can see, by
superposition, that the R sequence winds up at the delay outputs
following the last member of the sequence.  (The key point is that the
length of the remainder R is n, X remains fixed at 0, and D^jZ is 0,
since the only Z=1 signal occurred more than n units of time ago.  So
all delays hold 0 values for time t>n.)
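
The argument can be checked numerically for the circuit of Example 1.
In this sketch Q=I+D+D^4 is the quotient found in Example 1, and the
remainder terms are placed in the last n=2 input positions.  The names
final_state and poly_mul_mod2 are illustrative.

```python
def final_state(x_bits):
    """Return (Z, y) after the last input for the I+D+D^2 circuit."""
    y, z = 0, 0
    for x in x_bits:
        y, z = x ^ z, y ^ z        # y = D(X+Z), Z = D(y+Z)
    return z, y


def poly_mul_mod2(a, b):
    """Multiply two polynomials modulo 2 (index i = coefficient of D^i)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out


P = [1, 1, 1]                    # I+D+D^2
Q = [1, 1, 0, 0, 1]              # I+D+D^4
QP = poly_mul_mod2(Q, P)         # 7 bits: 1001111
R = [0, 0, 0, 0, 0, 1, 0]        # a remainder term D^5 in the last 2 slots

print(final_state(P))            # (0, 0): applying P clears the delays
print(final_state(QP))           # (0, 0): so does any multiple of P
print(final_state(R))            # (1, 0): R alone just shifts in
X = [a ^ b for a, b in zip(QP, R)]
print(final_state(X))            # (1, 0): QP+R leaves R in the delays
```

Reading the delays from left to right (Z, then y) gives 10, the two
bits of R in the order they were applied.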

The degree of the polynomial determines the extent to which a wide
range of errors can be detected.  Degree 16 and degree 32 polynomials
are commonly used. Message lengths should be greater than the degree
of the polynomial and are often many kilobytes.  (The polynomials used
here are very small in order to make the computations simple.)

Example 2: Let P=I+D+D^3  (so n=3) and let the message bits be
M=100101.  
The filter is shown below:
                  y2     y1
Z--|---(D|---(+)----(D|----(D|---(+)----X
   |__________|___________________|

y1=D(X+Z), y2=D(y1), Z=D(y2+Z), so we can generate the relevant
signals to produce the 3-bit check sequence as below (starting by
appending 3 0's to the end of M):

 X: 1 0 0 1 0 1 0 0 0 -
y1: 0 1 0 0 0 1 0 1 0 0
y2: 0 0 1 0 0 0 1 0 1 0
 Z: 0 0 0 1 1 1 1 0 0 1

We get the check sequence C=Zy2y1=100 from the rightmost column.

We append C to the message and thus transmit 1 0 0 1 0 1 1 0 0.

If there are no errors, the receiver computes as below, using the same
filter and verifies that the transmission is error-free by noting
that, in the last column, Zy2y1=000.

 X: 1 0 0 1 0 1 1 0 0 -
y1: 0 1 0 0 0 1 0 0 0 0
y2: 0 0 1 0 0 0 1 0 0 0
 Z: 0 0 0 1 1 1 1 0 0 0
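
Both filters in these notes follow one pattern, which the sketch below
generalizes to any P that contains the terms I and D^n.  The tap
placement is our inference from the two circuits shown, not something
stated in general in these notes, and the name lfsr_remainder is
illustrative.

```python
def lfsr_remainder(x_bits, p_bits):
    """Run the n-delay LFSR for P and return the delay contents, read
    from left to right (Z first), after the last input bit.

    p_bits[i] is the coefficient of D^i in P.  state[0] is the rightmost
    delay (fed by X), state[-1] is Z.
    """
    n = len(p_bits) - 1
    state = [0] * n
    for x in x_bits:
        z = state[-1]
        new_state = [x ^ (p_bits[n] & z)]          # rightmost: D(X + Z)
        for m in range(1, n):
            # each later delay takes its right neighbour, plus Z if tapped
            new_state.append(state[m - 1] ^ (p_bits[n - m] & z))
        state = new_state
    return state[::-1]


P = [1, 1, 0, 1]                         # I+D+D^3
M = [1, 0, 0, 1, 0, 1]                   # 100101
C = lfsr_remainder(M + [0, 0, 0], P)
print(C)                                 # [1, 0, 0]
print(lfsr_remainder(M + C, P))          # [0, 0, 0]
```

For P=I+D+D^3 the update rules reduce to y1=D(X+Z), y2=D(y1) and
Z=D(y2+Z), and the two results match the last columns of the tables
above.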