CS122 Lecture: Introduction to Algorithm Analysis - Last revision 4/16/98

OBJECTIVES:

1. To introduce the notion of algorithm analysis in terms of time and
   space (by counting instructions/memory cells)
2. To introduce the O() measure of complexity and show how the O() measure can
   be obtained by inspecting an algorithm.
3. To explain the significance of the O() complexity of an algorithm.

Materials: transparencies of rates of growth table and graph

        Demo program of various solutions to Bentley's problem (in CS122.DEMOS)

I. Introduction to algorithm analysis

   A. One of the things one discovers is that there are often several ways of
      doing the same job on a computer.  Example: An electronic phone directory.
      (Input: name; output: number or message that person is unknown.)

      1. Could be implemented using an unordered array  
      2. Could be implemented using an ordered array
      3. Could be implemented using a linked list       
      4. Could be implemented using a binary tree
      5. Could be implemented using hashing
      6. etc.

   B. One mark of maturity as a Computer Scientist is the ability to choose
      intelligently from among alternative ways of solving a problem.  This
      implies some ability to measure various options to assess their "cost".
      The two most common measures are:

      1. Time

         a. CPU cycles (typically < 10 ns)
         b. Disk accesses (typically ~ 10 ms)

            Note, by the way, the 1 : 10^6 ratio between these two times.

      2. Space

         a. Main storage
         b. Secondary storage - blocks

      3. Also to be considered is programmer effort to write and maintain the
         code, of course.

   C. Often, there is a trade-off between space and time; one can gain speed
      at the expense of more space and vice versa.  (But some bad algorithms
      are hogs at both)

   D. One must also consider the types of operations to be performed.
      Example - in the above:

      1. If searches are very common and insertions/deletions are rare, then
         an ordered array may be the best method, since it is very space
         efficient and allows time-efficient binary search.  (In fact, for
         search it cannot be beaten on either count except by hashing, which
         is time
         efficient but not space efficient.)

      2. But if insertions/deletions are at all common, then we would
         probably reject the ordered array as too time costly 

   E. Therefore, one must often analyze algorithms for performing various tasks
      in order to discover the best method.  Such analyses are generally done
      with a view to measuring time or space as a function of n - some parameter
      measuring the size of a particular instance of the problem  (e.g. the 
      number of names in the phone directory.)

   F. An example: Algorithm analysis applied to time complexity of two
      algorithms for searching a list - one for unordered lists (linear search);
      the other an ordered list (binary search). 

CONST   LSize = 1000;

VAR     L: ARRAY[1..LSize] OF CHAR;     (* List of characters to search *)
        N: 0 .. LSize;                  (* Number of slots currently used *)

FUNCTION LSearch(C: CHAR): BOOLEAN;
(* Searches a global list of characters, L, to see if C is present.
   If so, returns TRUE; else returns FALSE.  Uses a linear search. *)

        VAR     I: 1..LSize;
                Found: BOOLEAN;

        BEGIN
                Found := FALSE; I := 1;

                WHILE (NOT Found) AND (I <= N) DO
                        IF L[I] = C THEN
                                Found := TRUE
                        ELSE
                                I := I + 1;

                LSearch := Found
        END;

FUNCTION BSearch(C: CHAR): BOOLEAN;
(* Searches a global list of characters, L, to see if C is present.
   If so, returns TRUE; else returns FALSE.  Uses a binary search;
   therefore requires that L be sorted in ascending order. *)

        VAR     Lo, Hi, Mid: INTEGER;
                Found: BOOLEAN;

        BEGIN
                Lo := 1; Hi := N; Found := FALSE;

                (* Test Lo <= Hi before examining L[Mid], so Mid never goes
                   out of range when C is not present (or when N = 0) *)
                WHILE (Lo <= Hi) AND NOT Found DO
                    BEGIN
                        Mid := (Lo + Hi) DIV 2;
                        IF L[Mid] = C THEN
                                Found := TRUE
                        ELSE IF L[Mid] < C THEN
                                Lo := Mid + 1
                        ELSE
                                Hi := Mid - 1
                    END;

                BSearch := Found
        END;

      1. The above were compiled using the NBS Pascal compiler on a PDP-11, and
         the number of memory cycles needed for each method was computed as a
         function of the number of items in the list (assuming that all
         values present in the list were equally likely to be searched for):

         a. LSearch: Item found: 28  + 23*N/2
                     Not found:  28  + 23*N  

         b. BSearch: Item found: 167 + 42.5*log2(N)
                     Not found:  (same)

      2. Some values (in estimated memory cycles)

     N   Linear (average)  Linear (worst)   Binary

     2          51          74      209  
     3          63          97      234
     4          74         120      252
     5          86         143      266  
     6          97         166      277
     7         109         189      287
     8         120         212      295
     9         132         235      302
    10         143         258      308  
    20         258         488      351  
    30         373         718      376  
    40         488         948      393  
    50         603        1178      407  
   100        1178        2328      449  
  1000       11528       23028      590  
 10000      115028      230028      731  

       3. Observe that for large N one term in the expression dominates
          (N in LSearch; logN in BSearch).
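
        4. The figures in the table can be reproduced (to within rounding)
           directly from the formulas in F.1.  Here is a small sketch that
           does so, assuming the logarithm in the BSearch formula is base 2:

PROGRAM GrowthTable;
(* Sketch: evaluates the cycle-count estimates from F.1 for the same values
   of N as the table above.  Results are rounded to the nearest cycle, so
   they may differ from the table by 1 here and there. *)

VAR     N: INTEGER;

FUNCTION Log2(X: REAL): REAL;
        BEGIN
                Log2 := LN(X) / LN(2.0)
        END;

BEGIN
        WRITELN('     N', '      LinAvg', '    LinWorst', '   Binary');
        N := 2;
        WHILE N <= 10000 DO
            BEGIN
                WRITELN(N:6,
                        ROUND(28 + 23.0 * N / 2):12,
                        ROUND(28 + 23.0 * N):12,
                        ROUND(167 + 42.5 * Log2(N)):9);
                IF N < 10 THEN N := N + 1
                ELSE IF N < 50 THEN N := N + 10
                ELSE IF N = 50 THEN N := 100
                ELSE N := N * 10
            END
END.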

   G. Another example: Simplest form of bubble sort (Here we will not attempt 
      to measure actual memory cycles but will use constants t1, t2 ... to 
      stand for times for various operations.)

        FOR i := 1 TO n-1 DO                            t1 to setup + t2/loop
                FOR j := 1 TO n-1 DO                    t3 to setup + t4/loop
                        IF x[j] > x[j+1] THEN           t5
                                (* Exchange them *)     t6 with probability p

      1. Time = t1 + (n-1)t2 + (n-1)t3 + (n-1)*(n-1)*(t4+t5+pt6)

              = (t4+t5+pt6)n^2 + (t2+t3-2t4-2t5-2pt6)n + (t1-t2-t3+t4+t5+pt6)

              = c1n^2 + c2n + c3

      2. For large n, the last two terms become arbitrarily small when compared
         to c1n^2.  Therefore, we take c1n^2 as an approximate value for the
         run time. 

      3. Further, we note that c1 is basically determined by the particular
         hardware on which the problem is run.  However, the fact that the
         execution time grows proportionally to n^2 is a fundamental property
         of the algorithm, regardless of hardware.  Therefore, we say that the
         bubble sort is an O(n^2) algorithm.

      4. Since its run time grows as the square of the number of items to
         sort, we can project running times.  In particular, if sorting 100
         items takes (say) 1 ms of CPU time on a certain CPU, then we expect:

         a. 200 items to take about 4ms

         b. 1000 items to take about 100 ms

         c. 10,000 items to take about 10 seconds

            etc.
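
         For reference, here is a minimal runnable version of the fragment
         above, with the exchange step filled in.  (The data values are
         arbitrary, assumed purely for illustration.)

PROGRAM BubbleDemo;
(* Minimal runnable version of the bubble sort fragment above; the sample
   data is arbitrary and used only for illustration. *)

CONST   n = 8;

VAR     x: ARRAY[1..n] OF INTEGER;
        i, j, temp: INTEGER;

BEGIN
        x[1] := 5; x[2] := 3; x[3] := 8; x[4] := 1;
        x[5] := 9; x[6] := 2; x[7] := 7; x[8] := 4;

        FOR i := 1 TO n-1 DO
                FOR j := 1 TO n-1 DO
                        IF x[j] > x[j+1] THEN
                            BEGIN       (* Exchange them *)
                                temp := x[j]; x[j] := x[j+1]; x[j+1] := temp
                            END;

        FOR i := 1 TO n DO
                WRITE(x[i]:3);
        WRITELN
END.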

      5. In the last two examples, we have done rather detailed analysis of
         algorithms.  If we had to do this every time, analysis could be very
         difficult.  However, we will now see that we can arrive at an order
         of magnitude estimate - a "big O" rating - fairly easily.

II. An example of a problem where efficiency of the algorithm makes a big
    difference (taken from Programming Pearls article by Jon Bentley - 
    9/84 CACM):

   A. Consider the following task: given an array x[1..N] of real, find the
      maximum sum in any CONTIGUOUS subvector - e.g.

        31 -41 59 26 -53 58 97 -93 -23 84

      the best sum is x[3] + x[4] + x[5] + x[6] + x[7] = 187

   B. Observe:

      1. If all the numbers are positive, the task is trivial: take all of
         them.

      2. If all the numbers are negative, the task is also trivial: take a
         subvector of length 0, whose sum, therefore, is 0.

      3. The problem is interesting if it includes mixed signs - we include
         a negative number in the sum iff it lets us "get at" one or more
         positive numbers that offset it.

         a. In the above, we included -53 because 59 + 26 on the one side
            or 58 + 97 on the other more than offset it.  The contiguous
            requirement would force us to omit one or the other of these
            subvectors if we omitted -53.

         b. We did not include -41. It would let us get at 31, but that is
            not enough to offset it.  Likewise, we did not include -93.

     C. We will consider and analyze four solutions:

        1. The most immediately obvious - but poorest - method is to form
           all possible sums:

                MaxSoFar := 0;
                for l := 1 to n do
                    for u := l to n do
                      begin
                        sum := 0;
                        for i := l to u do
                            sum := sum + x[i];
                        MaxSoFar := Max(MaxSoFar, sum)
                      end;

           (We assume a function Max of two arguments that returns the bigger
            of the two - a trivial task to code.)

           a. Time complexity?? - ASK

            b. The outer for is done n times.  Each time through the outer
               for, the middle for is done 1 to n times, depending on l (on
               average about n/2 times).  The inner for is done 1 to n times
               each time through the middle for, depending on l and u (again
               about n/2 times on average).  Thus, the sum := sum + x[i]
               statement is done roughly:

                n * (n/2) * (n/2) = n^3/4 = O(n^3) times

               (An exact count gives n(n+1)(n+2)/6 - about n^3/6 - but the
               order of growth is the same.)

           c. Implication: doubling the size of the vector would increase the
              run time by a factor of 8.

           DEMO: Run demo program for n = 100, 500, 1000

        2. A better method is to take advantage of previous work, as follows:

                MaxSoFar := 0;
                for l := 1 to n do
                  begin
                    sum := 0;
                    for u := l to n do
                      begin
                        sum := sum + x[u];
                        MaxSoFar := Max(MaxSoFar, sum)
                      end
                  end;

          a. Complexity?  - ASK

          b. The outer for is done n times; the inner for 1..n for each time
             through the outer (average n/2).  The inner begin..end, then,
             is done:

                n * (n/2) = n^2/2 = O(n^2) times.

             This is much better.

          DEMO: Run demo program for N = 500, 1000, 5000

       3. An even better method is based on divide and conquer:

          a. Divide the array in half.  The best sum will either be:

             - The best sum of the left half
             - The best sum of the right half
             - The best sum that spans the division

        _________________________________________________
        |                       |                       |
        | <-->                <---->         <-->       |
        |                       |                       |
        -------------------------------------------------

          b. We can find the best sum of each half recursively.

          c. There are two trivial cases in the recursion:

             i. A subvector of length 0 has best sum 0.

            ii. A subvector of length 1 has best sum either equal to its
                one element (if that element is positive) or 0 (if the element
                is negative.)

                function MaxSum(L, U: integer): real;
                  var
                    M, i: integer;
                    sum, MaxToLeft, MaxToRight: real;
                  begin
                    if L > U then
                        MaxSum := 0
                    else if L = U then
                        MaxSum := Max(x[L], 0)
                    else
                      begin
                        M := (L + U) div 2;
                        sum := 0; MaxToLeft := 0;
                        for i := M downto L do
                          begin
                            sum := sum + x[i];
                            MaxToLeft := Max(MaxToLeft, Sum)
                          end;
                        sum := 0; MaxToRight := 0;
                        for i := M + 1 to U do
                          begin
                            sum := sum + x[i];
                            MaxToRight := Max(MaxToRight, Sum)
                          end;
                        MaxSum := Max3(MaxToLeft + MaxToRight,
                                       MaxSum(L, M), MaxSum(M+1, U))
                      end
                  end;

            d. Initial call is MaxSum(1,n)
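
               (Max3, like Max earlier, is assumed rather than given; trivial
                versions of both might look like this:)

                function Max(a, b: real): real;
                  begin
                    if a > b then Max := a else Max := b
                  end;

                function Max3(a, b, c: real): real;
                  begin
                    Max3 := Max(Max(a, b), c)
                  end;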

             e. Analysis: Each non-trivial call to MaxSum involves O(U-L+1)
                work in its loops plus two recursive calls, each of which
                faces a problem of half the size.

               i. The time complexity may be analyzed in terms of a recurrence
                  equation.  Let T(n) = the time to solve a problem of size n.
                  Then we have:

                        T(1) = O(1)
                        T(n) [for n > 1] = 2T(n/2) + O(n)

                  It can be shown mathematically that this recurrence has the 
                  solution:

                        T(n) = O(n log n)
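
                   One way to see this (a sketch, assuming the O(n) term is
                   at most c*n and that n is a power of 2) is to unroll the
                   recurrence:

                        T(n) <= 2T(n/2) + c*n
                             <= 4T(n/4) + 2*c*n
                             <= ...
                             <= 2^k * T(n/2^k) + k*c*n

                   Stopping when 2^k = n (i.e. k = log2 n) gives
                   T(n) <= n*T(1) + c*n*log2 n, which is O(n log n).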

              ii. This can be seen intuitively from the following tree 
                  structure:

                                1 .. n

                        1 .. n/2        n/2+1 .. n

                1..n/4  n/4 +1 .. n/2   n/2+1 .. 3n/4  3n/4+1 .. n

                etc.

        1       2       3       4       ....            n-2     n-1     n

                (we start with the problem of finding a solution in the vector
                 x[1] .. x[n], which leads us to two subproblems for 
                 x[1] .. x[n/2], x[n/2 + 1] .. x[n], each of which leads to
                 two subproblems ...  Expansion of the tree stops when we reach
                 n subproblems, each of size 1.)

                 - At each level, the total work is O(n)

                 - The number of levels is O(log n)

                 - Thus, the task done this way is O(n log n)

          DEMO: Run demo program for N = 5000, 50,000, 100,000

      4. The best solution, however, beats even this.  We use the following
         method:

         a. Suppose that, in solving the problem for the vector x[1] .. x[n],
            I first obtain the solution for the vector x[1] .. x[n-1].  Then
            clearly, the solution for x[1] .. x[n] is one of the following:

            - The same as the solution for x[1] .. x[n-1]
         or - A solution which includes x[n] as its last element.  This latter
              solution consists of the sum of the best subvector ending at
              x[n-1] (which may be the empty vector) + x[n].

         b. These observations lead to the following algorithm:

            BestEndingHere := 0;
            BestSoFar := 0;
            for i := 1 to n do
              begin
                BestEndingHere := max(BestEndingHere + x[i], 0);
                BestSoFar := max(BestSoFar, BestEndingHere)
              end;

         c. Clearly, this solution is O(n).  Further, we cannot hope to
            improve upon it, since any algorithm must at least look at each
            element of the vector once, and thus must take time at least
            proportional to n.

          DEMO: Run demo program for N = 5000, 50,000, 100,000
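
          As a self-contained illustration, here is a sketch that applies
          this scan to the sample vector from II.A; it should report the
          answer 187 found there.  (The program is included only to
          illustrate the algorithm - it is not the course demo program.)

PROGRAM ScanDemo;
(* Sketch: the O(n) scanning solution applied to the sample vector from
   II.A.  Expected output: the maximum contiguous sum, 187. *)

CONST   n = 10;

VAR     x: ARRAY[1..n] OF REAL;
        BestEndingHere, BestSoFar: REAL;
        i: INTEGER;

FUNCTION Max(a, b: REAL): REAL;
        BEGIN
                IF a > b THEN Max := a ELSE Max := b
        END;

BEGIN
        x[1] := 31;  x[2] := -41; x[3] := 59;  x[4] := 26;  x[5] := -53;
        x[6] := 58;  x[7] := 97;  x[8] := -93; x[9] := -23; x[10] := 84;

        BestEndingHere := 0;
        BestSoFar := 0;
        FOR i := 1 TO n DO
            BEGIN
                BestEndingHere := Max(BestEndingHere + x[i], 0);
                BestSoFar := Max(BestSoFar, BestEndingHere)
            END;

        WRITELN('Best contiguous sum = ', BestSoFar:6:0)
END.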

III. Methodology for Doing Algorithm Analysis

   A. Formally, we say that a function T(n) is O(f(n)) if there exist positive
      constants c and n0 such that:

        |T(n)| <= c|f(n)| whenever n >= n0.

      We then say that T(n) = O(f(n)) - note that the less precise O function
      appears on the right hand side of the equality.

      In the bubble sort, we are saying that T(n) = c1n^2 + c2n + c3 is O(n^2),
      so f(n) is n^2.  To show that this claim holds, let n0 be an arbitrary
      positive value and let c be c1 + c2/n0 + c3/n0^2.  Then we have cf(n) =

        c1n^2 + c2n^2/n0 + c3n^2/n0^2.

      Clearly c1n^2 +c2n + c3 is <= this whenever n >= n0.

      example: Linear search is O(n).  Binary search is O(log n).

   B. To compute the order of a time or space complexity function, we use the
      following rules:

      a. If some function T(n) is a constant independent of n (T(n) = c), then
         T(n) = O(1).

      b. We say that O(f1(n)) is greater than O(f2(n)) if for any c >= 1
         we can find an n0 such that |f1(n)|/|f2(n)| > c for all
         n > n0.  In particular, we observe the following relationship among
         functions frequently occurring in analysis of algorithms:

        O(1) < O(loglogn) < O(logn) < O(n) < O(nlogn) < O(n^2) < O(n^3) < O(2^n)

      c. Rule of sums:  If a program consists of two sequential steps with
         time complexity f(n) and g(n), then the overall complexity is
         O(max(f(n),g(n))).  That is, O(f(n)) + O(g(n)) = O(max(f(n),g(n))).
         Note that if f(n) >= g(n) for all n >= n0 then this reduces to
         O(f(n)).

         Corollary: O(f(n)+f(n)) = O(f(n)) - NOT O(2f(n))

      d. Rule of products: If a program consists of a step with complexity g(n)
         that is performed f(n) times [i.e. it is embedded in a loop], then
         the overall complexity is O( f(n)*g(n) ), which is equivalent to
         O(f(n)) * O(g(n))

         Corollary: O(c*f(n)) = O(f(n)) since O(c) = O(1)

      e. Example - with bubble sort the comparison (and possible exchange)
         step has complexity O(1) but is embedded in a loop FOR j := 1 to n-1 that
         has complexity O(n). The inner loop as a whole consists of a setup
         step O(1) and overhead O(n).  Therefore, the time complexity of the
         inner loop is O(1)+O(n)+O(n) = O(n).  The outer loop consists of a
         setup step O(1) and overhead O(n) + the inner loop which has O(n)
         complexity done O(n) times and hence is O(n^2).  Therefore, the time
         for the outer loop - and the overall program - is O(1) + O(n) +
         O(n^2) = O(n^2).
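
         As a further sketch of the two rules, here is a small fragment with
         one simple loop followed by one nested loop.  (The computation and
         data are arbitrary, chosen only to illustrate the bookkeeping.)

PROGRAM RulesDemo;
(* Sketch: an arbitrary computation used only to illustrate the rule of
   sums and the rule of products from III.B. *)

CONST   n = 10;

VAR     a: ARRAY[1..n] OF INTEGER;
        i, j, total: INTEGER;

BEGIN
        (* Step 1: an O(1) body done O(n) times - rule of products: O(n) *)
        FOR i := 1 TO n DO
                a[i] := i;

        (* Step 2: an O(1) body inside two nested O(n) loops - O(n^2) *)
        total := 0;
        FOR i := 1 TO n DO
                FOR j := 1 TO n DO
                        total := total + a[i] * a[j];

        (* Rule of sums: O(n) + O(n^2) = O(max(n, n^2)) = O(n^2) overall *)
        WRITELN(total)
END.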

   C. It is often useful to calculate two separate time or space complexity
      measures for a given algorithm - one for the average case and one for
      the worst case.  For example, some sorting methods are O(nlogn) in the
      average case but O(n^2) for certain pathological input data.

   D. The O() measure of a function's complexity gives us an upper bound on
      its rate of growth.  Less frequently we speak of a lower bound omega,
      saying that T(n) is omega(f(n)) if there exists a c such that
      T(n) >= c*f(n) for infinitely many values of n.

      Example: The subvector sum PROBLEM is omega(n) - any solution must
      look at every element, and so must use time at least proportional to n.

   E. While the O measure of an algorithm describes the way that its time or
      space utilization grows with problem size, it is not necessarily the
      case that if f1(n) < f2(n) then an algorithm that is O(f1(n)) is better
      than one that is O(f2(n)).  If it is known ahead of time that the
      problem is of limited size (e.g. searching a list that will never
      contain more than ten items), then the algorithm with worse behavior
      for large size may actually be better because it is simpler and thus
      has a smaller constant of proportionality.

      Example: For searching a list, using the particular values computed
      for the PDP-11 above, simple linear search is actually preferred for
      small lists (up to roughly a dozen items comparing worst cases, or
      about 30 items on average); binary search wins above that.
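
      These crossover points can be found directly from the cycle-count
      formulas in I.F - here is a small sketch (it simply searches for the
      first N at which the binary-search estimate is lower):

PROGRAM SearchCrossover;
(* Sketch: uses the PDP-11 cycle estimates from I.F to find the first N at
   which binary search is estimated to be cheaper than linear search. *)

VAR     N: INTEGER;

FUNCTION Log2(X: REAL): REAL;
        BEGIN
                Log2 := LN(X) / LN(2.0)
        END;

FUNCTION Binary(N: INTEGER): REAL;
        BEGIN
                Binary := 167 + 42.5 * Log2(N)
        END;

BEGIN
        N := 1;
        REPEAT N := N + 1 UNTIL 28 + 23.0 * N > Binary(N);
        WRITELN('Binary beats worst-case linear search from N = ', N);

        N := 1;
        REPEAT N := N + 1 UNTIL 28 + 23.0 * N / 2 > Binary(N);
        WRITELN('Binary beats average-case linear search from N = ', N)
END.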

IV. Why O() figures are important: rates of increase of functions

   A. TRANSPARENCY FROM DALE/LILLY

   B. Graph of the above (partial): TRANSPARENCY

   C. If we know the time needed by a particular algorithm for one value of
      n and we know its time complexity, we can project its time for other,
      higher values of n - e.g.

      1. An O(n^2) sort - if it needs 100 ms for 1000 items, it will need
         400 ms for 2000, 900 ms for 3000, 1.6 sec for 4000 etc.

      2. An O(nlogn) sort - if it needs 100 ms for 1000 items, it will need
         (2000 log 2000)/(1000 log 1000)*100 = 2*1.1*100 = 220 ms for 2000,
         (3000 log 3000)/(1000 log 1000)*100 = 3*1.15*100 = 345 ms for 3000,
         (4000 log 4000)/(1000 log 1000)*100 = 4*1.2*100 = 480 ms for 4000 etc.

   D. Observe: if an O(nlogn) algorithm had a constant of proportionality 10
      times bigger than that of an O(n^2) algorithm, it would still beat the
      O(n^2) algorithm for all n > 60 or so.  If its constant were 100 times
      bigger, it would beat the O(n^2) algorithm for all n > 1000 or so.
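
      The "60 or so" and "1000 or so" figures can be checked numerically -
      a sketch (taking the logarithm as base 2 and treating the constants
      literally):

PROGRAM CrossoverDemo;
(* Sketch: finds where c*n*log2(n) first drops below n^2, for constant
   factors c = 10 and c = 100. *)

VAR     c, n: INTEGER;

BEGIN
        c := 10;
        WHILE c <= 100 DO
            BEGIN
                n := 2;
                WHILE c * (LN(n) / LN(2.0)) * n >= 1.0 * n * n DO
                        n := n + 1;
                WRITELN('Constant factor ', c:3,
                        ': nlogn wins for all n >= ', n:4);
                c := c * 10
            END
END.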

Copyright ©1999 - Russell C. Bjork