Sparse Polynomials and Matrices

CS321 Lecture: Sparse Polynomials and Matrices          revised 10/15/99

Objectives:

1. To illustrate a situation in which alternative data structures might be
   used depending on specifics.
2. To illustrate designing and implementing an ADT in C++
3. To lay groundwork for programming project

Materials:

1. Handout of Polynomial class - .h, .cc, tester

I. Sparse Polynomials
-  ------ -----------

   A. As an illustration of a situation in which alternative data structures 
      might be used, we consider the problem of representing a polynomial.

      1. A simple but often good representation: array of coefficients 
         (frequently stored in descending order of exponent) - e.g.

                  3    2
                4x + 3x - 5x + 4 could be represented by { 4, 3, -5, 4 };

         In C++

                const int degree = ...
                float coefficient[degree + 1];

         (Note that we must know the degree up front)

         This allows common operations such as evaluation at a point, addition,
         subtraction, comparison etc. to be done in O(degree) time.

         Note that exponents are NOT explicitly stored.  The exponent associated
         with a given coefficient is implicit in its position in the array.

      2. This representation is not good when the polynomial is SPARSE - e.g. 
         has relatively few non-zero terms compared to its degree:

                             1000
         Example: to store x     + 1 would take 1001 slots, and operations would
                  require 1001 steps, even though there are only two terms.

      3. In such cases, we might prefer one of several sparse representations, 
         in which we explicitly store both coefficients and exponents, but only 
         for terms where the coefficient is non-zero.

         ASK CLASS FOR POSSIBILITIES

         a. Parallel arrays:

                const int numTerms = ...
                float coefficient[numTerms];
                int exponent[numTerms];

         b. Array of structs:

                const int numTerms = ...
                stuct
                  { float coefficient;
                    int exponent;
                  } poly[numTerms];

         c. Both of the above suffer from a serious difficulty:

            ASK CLASS

            We must know up-front what the number of terms will be.  This is 
            especially problematical if we intend to read the polynomial in 
            from the user.  It may be reasonable to ask the user to tell us the 
            degree, but asking the user to count non-zero terms is a bit much!

   B. For this reason, we may consider representing a sparse polynomial by
      using a linked list: each cell holds one coefficient/exponent pair (plus
      a link to the next cell.)

      DISTRIBUTE/GO OVER HANDOUT OF POLYNOMIAL CLASS

      Key concepts (in addition to polynomial itself):

      1. Illustration of general approach to implementing an ADT using a C++
         class - note kinds of operations provided.
        
         a. Operations that almost all ADT's should provide

            i. Constructors: including default, copy, type-specific

           ii. Operator =

          iii. Operator == / !=

           iv. >> and <<

         b. Type specific

            i. Arithmetic

           ii. operator []

               (Note two versions here - one an accessor that does not allow
                the coefficient to be changed, and one that can be used as a
                mutator.  The latter is a bit problematic, since it may
                involve adding a term if the coefficient is currently zero,
                and may result in a zero term being stored, since it breaks
                the wall of encapsulation to allow direct access to coefficient;
                however, there is no alternative if we want to implement the
                full semantics of [])   

          iii. operator () - object can behave like a function!

      2. Illustration of using STL list template.

      3. Use of a nested class to encapsulate a representation

   C. Analysis of space usage - assume float, int, and pointer all take
      same amount of storage

      Let d = degree of polynomial
          n = # of non-zero terms in polynomial

      1. For ordinary array of coefficients representation

         d + 1

      2. For linked list representation

         3n + 1 if list is singly linked; 4(n + 1) + 1 if doubly linked with
         header (as here)

      3. Note that sparse representation saves space if n < 4/d

   D. Analysis of operations:

      In each case, compare list representation with ordinary array of
      coefficients representation

      1. Polynomial(int, float[])               O(d) for both

      2. Polynomial(const Polynomial &)         O(n) vs O(d)

      3. operators (), *= float, * float        O(n) vs O(d)

      4. operators =, +=, -=, +, -              O(max(n1,n2) vs O(max(d1,d2))

      5. operator *= Polynomial, * Polynomial   O(n1*n2) vs O(d1*d2)

      6. operator []                            O(n) vs O(1) -
                                                the one place where sparse
                                                representation loses

II. Sparse Matrices
--  ------ --------

   A. Matrices have numerous uses, especially in scientific computation.  Thus, 
      data structures and algorithms for working with matrices have been 
      extensively studied.

   B. In many cases, a good representation for a matrix is a two-dimensional
      array - e.g.

        float m[ROWS][COLS];

   C. But many applications involve working with matrices where many of the
      elements are zero.  In these cases, as was the case with sparse
      polynomials, we seek an alternate representation.

   D. Some special cases can make use of standard arrays, accessed in a
      non-standard way, if the non-zero elements are always known to
      lie in certain positions in the array.

      1. Example: A LOWER-TRIANGULAR matrix has the property that all the
         non-zero elements lie on or below the main diagonal. Ex:

                1   0   0   0   0
                3  -1   0   0   0
                4   7   5   0   0
                6  -8  -4   3   0
                -2  0   3   1   4

         Such a matrix might be stored in sequential locations in memory as 
         follows:

                a[1,1]
                a[2,1]
                a[2,2]
                a[3,1]
                a[3,2]
                a[3,3]
                a[4,1]
                a[4,2]
                a[4,3]
                ...

         With an addressing function (if subscripts start at 1):

         Slot occupied by element [i,j] (if it exists) is i*(i-1)/2 + j-1

         Example: C++ function to return value of element i,j of a lower
                  triangular matrix mapped onto a sequential array a

         float value(int i, int j)
           {
             if (j > i)
                return 0.0;
             else
                return a[i*(i-1)/2 + j-1];
           }

         (Another similar case is an UPPER-TRIANGULAR matrix).

      2. Example: A TRI-DIAGONAL matrix has the property that all elements are
         0 except those on the main diagonal, just above the main diagonal, and
         just below it (though some of these can be zero, too).  Ex:

                1   2   0   0   0
                2  -1   7   0   0
                0   3   0   5   0
                0   0   1   4   2
                0   0   0   3  -2

         Such a matrix might be stored in sequential locations in memory as 
         follows:

                a[1,1]
                a[1,2]
                a[2,1]
                a[2,2]
                a[2,3]
                a[3,2]
                a[3,3]
                a[3,4]
                a[4,3]
                a[4,4]
                a[4,5]
                a[5,4]
                a[5,5]

         With an addressing function (if subscripts start at 1):

               Slot occupied by element [i,j] (if it exists) is 3*(i-1) + (j-i)

         Example: C++ function to return value of element i,j of a tridiagonal
                  matrix mapped onto a sequential array a

         float value(int i, int j)
           {
             if (abs(i-j) > 1)
                return 0.0;
             else
                return a[3*(i-1) + (j-i)]'
           }

   E. In the case of a sparse matrix that lacks such a regular structure,
      we have choices similar to those we faced with sparse polynomials.
      In any case, we store the non-zero elements as triples of the form
        
         (row, column, value)

      1. In parallel arrays

      2. In an array of structs

      3. In a linked structure.  Here there are two alternatives to consider:
        
         a. A singly linked list, in which the triples are kept in row
            major order - e.g. corresponding to the order in which elements
            of a dense representation are kept in memory.

         b. A MULTILIST in which each node appears on two lists - one linking 
            all the elements in the same row, and one linking all the elements 
            in the same column.

        Example: for the matrix         0  0  3  0  0 
                                        2  0  7  0  5
                                        0  0  0  0  4
                                        8  0  1  0  0
                                        0  0  0  0  0

        we have the following singly linked, row major list representation:
        (assuming we number the rows and columns 1, 2, ...)

    _______    _______    _______    _______    _______    _______    _______   
    | 1 3 |    | 2 1 |    | 2 3 |    | 2 5 |    | 3 5 |    | 4 1 |    | 4 3 |
    |  3  |    |  2  |    |  7  |    |  5  |    |  4  |    |  8  |    |  1  |
    |_____|--> |_____|--> |_____|--> |_____|--> |_____|--> |_____|--> |_____|--
                                                                              | 

        and we have the following multilist representation: (where * = maxint)

_______         _______         _______         _______         _______ 
| * * |      -> | * * |         | * * |      -> | * * |         | * * |<--
|     |      |  |     |         |     |      |  |     |         |     |  |
|     |--    |  |     |-> to    |     |-> to |  |     |-> to    |     |--
|_____|  \   |  |_____|   row 2 |_____|row 3 |  |_____|   row 4 |_____|
   |      \  |_____|               |         |_____|               |
   |       \                       v                               |
   |        \                   _______     to                     |
   |         -----------------> | 1 3 | --> hdr 1                  |
   |                            |  3  |                            |
   |                            |     |                            |
   |                            |_____|                            |
   |                               |                               |
   v                               v                               v
_______                         _______                         _______
| 2 1 |                         | 2 3 |                         | 2 5 |
|  2  |                         |  7  |                         |  5  |    to
|     |-----------------------> |     |-----------------------> |     |--> hdr 2
|_____|                         |_____|                         |_____|
   |                               |                               |
   |                               |                               v
   |                               |                            _______
   |                               |                            | 3 5 |
   |                               |                            |  4  |
   |                               |                            |     |--> to
   |                               |                            |_____|    hdr 3
   |                               |                               |
   v                               v                               v
_______                         _______                           to hdr 5
| 4 1 |                         | 4 3 |
|  8  |                         |  1  |
|     |-----------------------> |     |-> to
|_____|                         |_____|   hdr 4
   |                               |
   v                               v
  to hdr 1                       to hdr 3

   F. Some remarks about operations on either such representation - note that
      actually implementing the first one will be your first project:

      1. Storing a new value at a certain position - e.g.

         store(float newValue, int atRow, int atCol);

         Such a procedure needs to consider four cases:
                               Current value
                                =======================================
                                Zero (no node           Non-zero
            Value to be         for this element        (a node for this
            stored              exists)                 element already exists.)
            ===========                                 
            Zero                Do nothing              Delete existing
                                                        node
            Non-zero            Create new node         Modify existing node

      2. In principle, most other operations could be defined in terms of
         the procedures for accessing and storing a value.  For example,
         the following pseudo-code would add two conformable matrices:

         Matrix Matrix::operator + (const Matrix & other) const
           { Matrix result;
             for (int row = 0; row < maxrow; row ++)
                for (int col = 0; col < maxcol; col ++)
                    result.store(value(row, col) + other.value(row, col),
                                 row, col);
             return result;
           }

           a. This is very inefficient.  WHY?  (ASK CLASS).

           b. An efficient algorithm would, instead, traverse the 
              list representations of A and B, generating the corresponding
              list representation of the result. 

      4. Note that the destructor for such a matrix must arrange to dispose
         of all nodes used by the matrix.  (If an STL list is used, this
         happens automatically!)

   G. One very important use of sparse matrix techniques is electronic
      spreadsheets:

      1. The first "killer application" in the history of PC's was probably
         Lotus 1-2-3.  

         a. The original Lotus ran on IBM-PC class machines with as little 
            as 256 K of main memory.  Of this, less than 128 K was actually 
            available for storing the spreadsheet grid after allowing space for 
            the MSDOS and Lotus code.

         b. A Lotus spreadsheet was 255 rows x 2048 columns.  If we used
            an ordinary matrix to store the grid, and needed only 1 byte
            per cell (an impossibly low number), we would need 512K just
            to store an empty grid!

      2. Modern computers, of course, have much more memory - but they also
         allow much bigger spreadsheets, so the same technique is still
         useful.

      3. Therefore, spreadsheets can use a sparse matrix technique, 
         in which only cells that are not blank need actually be stored.  (The
         actual technique used by Lotus, however, is not the one we have used
         here, but it could be.)

   H. We will not discuss the analysis of operations on a sparse matrix,
      because you will get to do this for your project!
Copyright ©1999 - Russell C. Bjork