CS122 Lecture: Implementing Stacks with Dynamic Variables       2/9/96

Objectives:

1. To introduce the notion of linked lists.
2. To show how lists can be realized in Pascal by using dynamic variables.
3. To show how to realize stacks using lists.

Materials:

1. Transparency of linked stacks.

I. Introduction.
-  ------------

   A. We have now covered a number of different data structures that are
      linear in structure - that is, each member of the structure has at most
      one neighbor on each side:

      1. Arrays
      2. Character strings
      3. Stacks

   B. Of these linear structures, only one is standard to the Pascal language:
      the array structure.  However, we have seen that the other linear 
      structures can be implemented in practice by using an array.  That is,
      the LOGICAL adjacency of two items in the structure is modelled by
      PHYSICAL adjacency within an array.

   C. Using an array to implement linear structures as we have done does lead
      to some problems, though:

      1. We have to decide ahead of time what the maximum number of items
         in the structure will be.

      2. Suppose we have a program in which multiple occurrences of a data
         structure must share the same space.  We could, of course, allocate
         a fixed amount of space to each.  But if sizes could vary widely from
         one run of the program to another, this could lead to problems.  E.g.
         suppose we had two stacks and allocated 100 elements to each.  If
         the first had only ten elements and the second needed to grow to 101,
         then the program would crash for lack of room, even though ninety
         elements of the first stack's space sit unused.  Alternatively, we
         could have several structures share the same array:

         a. Example: two stacks.  These can share the same array as follows:

                     top1                      top2
                      |                         |
                STACK 1 --> growth              growth <-- STACK 2

            So long as the sum total of the sizes of the two stacks does not
            exceed the available space, all is well.  (But we do have to
            write separate versions of push and pop for each stack, since in
            one case push increments top and in the other it decrements it,
            etc.)

         b. Even this technique breaks down quickly, however.  What if we
            wanted three stacks to share the same array?  We could only handle
            this by being prepared to occasionally pick up a structure and move
            it - a cumbersome and time-consuming process.

      3. We will see later that some structures are best implemented using a
         structure that is conceptually circular - i.e. the last element in
         the structure is considered to be adjacent to the first.  This cannot
         be directly modelled by physical adjacency in memory.
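
      A minimal sketch of the two-stack arrangement of item 2a, assuming an
      array of ten integers.  All of the names used here (slots, top1, top2,
      RoomLeft, Push1, Push2, Pop1, Pop2) are illustrative only.  Stack 1
      grows upward from the low end of the array and stack 2 grows downward
      from the high end; a push fails only when the two tops meet.

          program TwoStacks(output);
          const
              MaxSize = 10;
          var
              slots: array[1..MaxSize] of integer;
              top1, top2: integer;    (* tops of stack 1 and stack 2 *)
              v: integer;

          function RoomLeft: boolean;
          begin
              (* there is a free slot while the tops are not adjacent *)
              RoomLeft := top1 + 1 < top2
          end;

          procedure Push1(x: integer);
          begin
              if RoomLeft then begin
                  top1 := top1 + 1;           (* stack 1 grows upward *)
                  slots[top1] := x
              end
              else writeln('no room')
          end;

          procedure Push2(x: integer);
          begin
              if RoomLeft then begin
                  top2 := top2 - 1;           (* stack 2 grows downward *)
                  slots[top2] := x
              end
              else writeln('no room')
          end;

          procedure Pop1(var x: integer);
          (* precondition: top1 > 0 *)
          begin
              x := slots[top1];
              top1 := top1 - 1
          end;

          procedure Pop2(var x: integer);
          (* precondition: top2 <= MaxSize *)
          begin
              x := slots[top2];
              top2 := top2 + 1
          end;

          begin
              top1 := 0;                      (* stack 1 empty *)
              top2 := MaxSize + 1;            (* stack 2 empty *)
              Push1(1);  Push1(2);  Push2(99);
              Pop1(v);  writeln(v);           (* writes 2 *)
              Pop2(v);  writeln(v)            (* writes 99 *)
          end.

      Note that the push and pop routines really do have to be written
      twice, once per direction of growth.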

II. Introduction to Linked Lists and Dynamic Variables in Pascal
--  ------------ -- ------ ----- --- ------- --------- -- ------

   A. The above problems can be handled if we can somehow find a way to relax
      the requirement that LOGICAL ADJACENCY be modelled by PHYSICAL adjacency.
      One way to do this is with a linked structure.

      1. In a linked structure, each element contains both a value and a
         special field called a link which points to its neighbor(s).  

      2. The link can be implemented in one of two ways:

         a. The entire structure can be represented by an array of elements,
            whose physical order is irrelevant.  Links can be represented by
            indices into this array.  Thus, instead of finding the index of
            the successor of an element by adding one to its index, we will
            use an explicitly stored link index.

         b. In a number of languages (including Pascal), actual machine memory 
            addresses can be used as links.  These links are commonly called
            "pointers" and the objects they point to are called "dynamic
            variables".
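
      The index-link approach of item 2a can be sketched as follows.  This is
      only an illustration under assumed names (pool, first, NoLink); the
      value 0 is assumed here as the marker for "no successor", since it is
      not a legal index into the array.  The three values are stored in
      arbitrary physical slots, but following the links visits them in
      their logical order.

          program IndexLinks(output);
          const
              MaxNodes = 5;
              NoLink = 0;             (* assumed marker: no successor *)
          type
              element = record
                  info: integer;
                  link: integer       (* index of the successor, or NoLink *)
              end;
          var
              pool: array[1..MaxNodes] of element;
              first, i: integer;
          begin
              (* logical order 10, 20, 30 kept in physical slots 4, 1, 3 *)
              pool[4].info := 10;  pool[4].link := 1;
              pool[1].info := 20;  pool[1].link := 3;
              pool[3].info := 30;  pool[3].link := NoLink;
              first := 4;

              (* follow the stored links instead of adding one to the index *)
              i := first;
              while i <> NoLink do begin
                  writeln(pool[i].info);      (* writes 10, then 20, then 30 *)
                  i := pool[i].link
              end
          end.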

   B. Now we look at a mechanism available in Pascal and many other languages
      (but not all) which allows a flexible and efficient approach to
      implementing various data structures.

      1. Instead of allocating storage for all of our data ahead of time (by
         means of var declarations), we will allocate some storage this way
         and some as we need it.  We call this DYNAMIC STORAGE ALLOCATION, and
         we call the variables created this way DYNAMIC VARIABLES.

      2. This means that a dynamic structure can grow to any size needed as
         the program runs - subject, of course, to total available memory on
         the computer.

   C. Dynamic variables involve several language features that we have not
      seen before.

      1. Pascal has a mechanism whereby a variable can contain the actual 
         physical memory ADDRESS of another data item.  We call such a
         variable a POINTER.

      2. In Pascal we may declare a variable to be a POINTER to an object of
         a certain type.  This means that it will store the actual memory
         ADDRESS of an object of that type.  Example:

        var
            a: ^real;
            b: ^integer;
            p: ^node;

         declares that a will contain the address of an object of type real,
         b the address of an object of type integer, and p the address of an
         object of type node (where we assume that the type node has already
         been declared).

         a. Note that each pointer variable is constrained to point to an object
            of a certain type.  For example, it would not be possible, given the
            above declarations, to make p point to an object of type real.

         b. But any type can be used in declaring a pointer - built in or
            programmer-defined.

         c. A pattern like the following is often encountered in programs using
            dynamic variables.  (Here, we are setting up the structure we will
            need for a dynamic implementation of a stack of integers).

         type
             nodeptr = ^node;
             node = record
                info: integer;
                link: nodeptr
             end;

             i. Note that this represents an exception to a pattern we have
                always seen up until now in Pascal declarations.  What is it?
                (Ask)

            ii. We have here a case where a type is used BEFORE it is declared.
                Note that the type node is used in declaring nodeptr before the
                type node is itself declared.  This forms an exception to the
                normal rule that declaration precedes use, and is necessary
                so that a node can contain pointer fields that point to other
                nodes of the same type.

           iii. The language definition allows for this one exception: a type 
                name may be used in declaring a pointer type before the named
                type is itself declared - provided that a later declaration
                for the type in question is given in the same block.

            iv. The use of names like "node" and "nodeptr" is traditional; but
                there is nothing special about these names - they are just
                Pascal identifiers.  We could have used "aardvark" and "zebra"
                if we wanted to - though the program would be harder to read
                this way!

         d. The basic idea we will use is this: each object pushed on the stack
            will be represented by a node, containing the value pushed and a
            pointer to the previous item on the stack.  A single ordinary
            variable will contain an EXTERNAL POINTER to the top of the stack.

         Example: The stack resulting from pushing 1, 42, and -3 (in that order)
                  onto an empty stack would be:

                _____   _____   _____
                |-3 |   |42 |   | 1 |
         o----->| o-|-->| o-|-->| o-|--
                -----   -----   -----  |
                                     -----
                                      ---

      3. The standard procedure NEW takes a pointer variable as its argument,
         creates a new object of the type the variable is declared to point to,
         and stores the address of the newly created object in that variable.

         Example:

                Given the declaration
        
                    var p: nodeptr;

                We could execute the following statement:

                new(p)

                This would cause 8 bytes of storage to be allocated, and would
                set the variable p to point to the first byte.  (The size 8
                bytes represents the size of an object of type node on our VAX.)

         a. In memory, this is accomplished as follows.  Pascal divides
            memory into three major regions:

            i. A fixed-size region, holding code and global variables.
           ii. A stack, holding stack-frames of procedures and functions.
          iii. A heap, from which space is taken for dynamic variables.

         b. e.g.

                -----------
                | CODE    |
                | DATA    |
                |---------|
                | Heap    | <-- new obtains space from this region
                |         |
                |---------|
                |         |
                | HW Stack| <-- used by hardware to keep track of procedure
                -----------     calls/returns and for local variables

         c. When new is executed, the Pascal system carves a chunk of storage
            of the appropriate size out of the heap, and returns its address in
            the variable passed as a parameter to new.  The "appropriate size"
            is determined by the declaration of the pointer variable.  Recall
            that a pointer variable is always declared to point to an object
            of a certain type, so the size is the size of the type of object
            the pointer points to.

         d. Each time we push an item on our stack, we will use new to obtain
            space for it.

      4. Corresponding to new is a standard procedure DISPOSE, which takes
         a pointer variable and releases the object it points to, returning
         its storage to the heap for possible reallocation later.  (Not all
         dialects of Pascal implement this, though.)

         a. Example:

                new(p);
                (* p now points to an object of type node. *)
                ...
                dispose(p);
                (* that object no longer exists.  Any further attempt to
                   refer to it is an error. *)

         b. Each time we pop an item from the stack, we will use dispose to
            free up the space it occupied.

      5. Sometimes, it is necessary to give a pointer a value that denotes
         that it is null - i.e. it does not point to anything.  Pascal supplies 
         the reserved word nil, which denotes a pointer that points to nothing.

         a. When we create an empty stack, we will set the external pointer to
            it to nil.

         b. The bottom node on the stack will have a link of nil.
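
      The life cycle of a single dynamic variable, using new, dispose and
      nil together, might look like the sketch below.  It assumes a Pascal
      dialect that implements dispose.  The notation p^ means "the object p
      points to"; it is discussed under D.  Setting p back to nil after the
      dispose is a defensive habit assumed here, not something the language
      requires.

          program NewAndDispose(output);
          type
              nodeptr = ^node;
              node = record
                  info: integer;
                  link: nodeptr
              end;
          var
              p: nodeptr;
          begin
              p := nil;               (* p points to nothing *)
              if p = nil then
                  writeln('p points to nothing');

              new(p);                 (* carve one node out of the heap *)
              p^.info := 37;
              p^.link := nil;         (* this node has no neighbor *)
              writeln(p^.info);       (* writes 37 *)

              dispose(p);             (* hand the node back to the heap *)
              p := nil                (* old address is meaningless now *)
          end.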

   D. Some observations about Pascal dynamic (pointer) variables:

      1. A pointer variable actually has two types:

         a. It is of type pointer - it contains a machine memory address.

         b. It points to an object of some other type (e.g. node).

      2. In addition to new and dispose, there are two basic operations on 
         pointers.

         a. Dereferencing. If p is a pointer, then p^ is the object it points
            to.  For example, if p is of type ^node, then p^ is of type node.
            In the above example, then, we could refer to p^.info, since p^
            is a record and info is one of its fields.  Thus, we could do
            things like:

                p^.info := 37;
            or
                writeln(p^.info);

         b. Assignment.  Example:

            Given the declaration

                var
                    p,q: ^node;

            Assume that p and q have been made to point, respectively, to nodes
            containing 42 and 25 in their info fields - i.e. we have

                p --> [ 42 ]            q --> [ 25 ]

            Now suppose we execute the statement:

                q := p;

            Then the situation we have is that p and q BOTH point to the SAME
            object (and nothing points to the object q used to point to.)

                p ----> [ 42 ]
                q -/    [ 25 ] (* garbage *)

         c. Note the difference between: q := p  and q^ := p^

            i. In the first case, after the assignment q and p point to the
               same object.  In the second case, q and p still point to
               different objects, but the object q points to is made to have
               the same value as that which p pointed to - e.g. given the same
               initial situation as before, the result of q^ := p^ would be:

                p --> [ 42 ]
                q --> [ 42 ]

            ii. In the first case, the value of the variable q is altered -
                it now contains a different address.  In the second case, q
                is not altered; the object it points to is modified.

         d. Note, too, that since nil points to nothing, dereferencing a nil
            pointer is an error:

                p := nil;               (* ok *)
                ... p^ ...              (* no! *)
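
      The difference between copying an object and copying an address can be
      checked with a short program.  This is a sketch using the node type
      from C above; the values 7 and 99 are arbitrary test values.  Unlike
      the diagram in 2b, it disposes of q's node before re-aiming q, so that
      no garbage is left behind.

          program PointerAssign(output);
          type
              nodeptr = ^node;
              node = record
                  info: integer;
                  link: nodeptr
              end;
          var
              p, q: nodeptr;
          begin
              new(p);  p^.info := 42;  p^.link := nil;
              new(q);  q^.info := 25;  q^.link := nil;

              q^ := p^;               (* copy the object: two nodes hold 42 *)
              p^.info := 7;           (* changes p's node only *)
              writeln(q^.info);       (* writes 42 *)

              dispose(q);             (* release q's node first *)
              q := p;                 (* copy the address: one node *)
              p^.info := 99;          (* visible through q as well *)
              writeln(q^.info);       (* writes 99 *)

              dispose(p)              (* q dangles too; use neither now *)
          end.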

III. Linked Stacks
---  ------ ------

   A. With these preliminaries taken care of, we can now look in detail at how
      we can implement a stack by using a linked list of dynamic variables.
      Let's again summarize what we have said about how we will approach this.

      1. The basic idea is this:

         a. The stack will be represented by a single external pointer,
            pointing to the "top" node of the stack.

         b. Each node - except the bottom - will contain a link to the one
            pushed just before it.

         c. The bottom node's link will be nil.

         Example: The stack resulting from pushing 1, 42, and -3 (in that order)
                  onto an empty stack would be:

                _____   _____   _____
                |-3 |   |42 |   | 1 |
         o----->| o-|-->| o-|-->| o-|--
                -----   -----   -----  |
                                     -----
                                      ---

      2. An empty stack will be represented by a nil external pointer.

      3. To push an item, we create a new node for it, make that node point
         to the top item on the stack, and reset the external pointer to
         the new node.

         Example: The above after pushing 7:

                _____   _____   _____   _____
                | 7 |   |-3 |   |42 |   | 1 |
         o----->| o-|-->| o-|-->| o-|-->| o-|--
                -----   -----   -----   -----  |
                                             -----
                                              ---

      4. To pop an item, we save a pointer to the top node, reset the external
         pointer to the link of the top node, and then dispose of the old top
         node.  (Note how this would restore the above to its state before the
         push.)

      5. Code: TRANSPARENCY
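
         For readers without the transparency, the steps in items 1-4 can be
         sketched as below.  This is not necessarily the code shown in class:
         the routine names (InitStack, Empty, Push, Pop) and the decision to
         treat popping an empty stack as a precondition violation are
         assumptions made for this sketch.

             program LinkedStack(output);
             type
                 nodeptr = ^node;
                 node = record
                     info: integer;
                     link: nodeptr
                 end;
                 stack = nodeptr;        (* the external pointer *)
             var
                 st: stack;
                 v: integer;

             procedure InitStack(var s: stack);
             begin
                 s := nil                (* empty stack: nil pointer *)
             end;

             function Empty(s: stack): boolean;
             begin
                 Empty := s = nil
             end;

             procedure Push(var s: stack; x: integer);
             var
                 p: nodeptr;
             begin
                 new(p);                 (* create a node for the item *)
                 p^.info := x;
                 p^.link := s;           (* point it at the old top *)
                 s := p                  (* it becomes the new top *)
             end;

             procedure Pop(var s: stack; var x: integer);
             (* precondition: not Empty(s) *)
             var
                 p: nodeptr;
             begin
                 p := s;                 (* remember the top node *)
                 x := p^.info;
                 s := p^.link;           (* external pointer moves down *)
                 dispose(p)              (* free the old top node *)
             end;

             begin
                 InitStack(st);
                 Push(st, 1);  Push(st, 42);  Push(st, -3);  Push(st, 7);
                 while not Empty(st) do begin
                     Pop(st, v);
                     writeln(v)          (* writes 7, -3, 42, 1 in turn *)
                 end
             end.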

   B. We have now seen two totally different ways of implementing stacks:

      1. Using an array with a "top" index.

         a. Advantage: simple.

         b. Disadvantage: maximum size is fixed when stack is declared.

      2. Using dynamic variables with a "top" pointer.

         a. Advantage: flexible.

         b. Disadvantages: may use more total storage; somewhat slower running
                           due to overhead of storage allocation/deallocation.

      One of the advantages of adhering to proper use of data abstraction (the
      "wall of abstraction") is that it should be possible to develop a program 
      that uses stacks without regard to the method used to implement them -
      indeed, it should be possible to pull out one implementation and plug in
      the other without changing anything else.
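
      As an illustration of this point, here is a sketch of an array-based
      stack hidden behind four routines.  The routine names (InitStack,
      Empty, Push, Pop) and the limit of 100 items are assumptions made for
      this sketch.  The main program touches the stack only through those
      routines, so replacing the type stack and the four routine bodies with
      a linked version (an external pointer, new in Push, dispose in Pop)
      would leave the main program unchanged.

          program ArrayStack(output);
          const
              MaxStack = 100;
          type
              stack = record
                  top: integer;       (* index of top item; 0 if empty *)
                  item: array[1..MaxStack] of integer
              end;
          var
              st: stack;
              v: integer;

          procedure InitStack(var s: stack);
          begin
              s.top := 0
          end;

          function Empty(s: stack): boolean;
          begin
              Empty := s.top = 0
          end;

          procedure Push(var s: stack; x: integer);
          (* precondition: s.top < MaxStack *)
          begin
              s.top := s.top + 1;
              s.item[s.top] := x
          end;

          procedure Pop(var s: stack; var x: integer);
          (* precondition: not Empty(s) *)
          begin
              x := s.item[s.top];
              s.top := s.top - 1
          end;

          begin
              (* only the four stack routines are used from here on *)
              InitStack(st);
              Push(st, 1);  Push(st, 42);  Push(st, -3);
              while not Empty(st) do begin
                  Pop(st, v);
                  writeln(v)          (* writes -3, 42, 1 in turn *)
              end
          end.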

Copyright ©1999 - Russell C. Bjork