CS122 Lecture: Implementing Stacks with Dynamic Variables 2/9/96
Objectives:
1. To introduce the notion of linked lists.
2. To show how lists can be realized in Pascal by using dynamic variables.
3. To show how to realize stacks using lists.
Materials:
1. Transparency of linked stacks.
I. Introduction.
- ------------
A. We have now covered a number of different data structures that are
linear in structure - that is, each member of the structure has at most
one neighbor on each side:
1. Arrays
2. Character strings
3. Stacks
B. Of these linear structures, only one is standard to the Pascal language:
the array structure. However, we have seen that the other linear
structures can be implemented in practice by using an array. That is,
the LOGICAL adjacency of two items in the structure is modelled by
PHYSICAL adjacency within an array.
C. Using an array to implement linear structures as we have done does lead
to some problems, though:
1. We have to decide ahead of time what the maximum number of items
in the structure will be
2. Suppose we have a program in which multiple occurrences of a data
structure must share the same space. We could, of course, allocate
a fixed amount of space to each. But if sizes could vary widely from
one run of the program to another, this could lead to problems. E.g.
suppose we had two stacks and allocated 100 elements to each. If
the first had only ten elements and the second needed to grow to 101,
then the program would crash for lack of room, even though plenty of
space is available. Alternatively, we could have several structures
share the same array:
a. Example: two stacks. These can share the same array as follows:
                   top1              top2
                    |                 |
    STACK 1 --> growth              growth <-- STACK 2
So long as the sum total of the sizes of the two stacks does not
exceed the available space, all is well. (But we do have to
write separate versions of push and pop for each stack, since in
one case push increments top and in the other it decrements it,
etc.)
b. Even this technique breaks down quickly, however. What if we
wanted three stacks to share the same array? We could only handle
this by being prepared to occasionally pick up a structure and move
it - a cumbersome and time-consuming process.
3. We will see later that some structures are best implemented using a
structure that is conceptually circular - i.e. the last element in
the structure is considered to be adjacent to the first. This cannot
be directly modelled by physical adjacency in memory.
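The two-stack arrangement described in 2a can be sketched as follows. This
is only an illustration: the names space, top1, top2, push1, push2, pop1 and
pop2, and the capacity of 10, are assumptions made for the example and do
not come from earlier course code.

```pascal
program twostacks(output);
(* Hypothetical sketch: two integer stacks sharing one array.
   Stack 1 grows upward from index 1; stack 2 grows downward from maxsize. *)
const
  maxsize = 10;          (* assumed capacity, chosen only for illustration *)
var
  space: array[1..maxsize] of integer;
  top1, top2: integer;   (* top1 = 0 and top2 = maxsize + 1 mean "empty" *)
  x: integer;

procedure push1(item: integer);
begin
  if top1 + 1 = top2 then
    writeln('overflow: shared array is full')
  else
    begin
      top1 := top1 + 1;          (* stack 1: push increments top *)
      space[top1] := item
    end
end;

procedure push2(item: integer);
begin
  if top2 - 1 = top1 then
    writeln('overflow: shared array is full')
  else
    begin
      top2 := top2 - 1;          (* stack 2: push decrements top *)
      space[top2] := item
    end
end;

procedure pop1(var item: integer);
begin
  (* caller is assumed to have checked that top1 > 0 *)
  item := space[top1];
  top1 := top1 - 1
end;

procedure pop2(var item: integer);
begin
  (* caller is assumed to have checked that top2 <= maxsize *)
  item := space[top2];
  top2 := top2 + 1
end;

begin
  top1 := 0;
  top2 := maxsize + 1;
  push1(1); push1(2);
  push2(100);
  pop1(x); writeln(x);   (* most recent item of stack 1 *)
  pop2(x); writeln(x)    (* most recent item of stack 2 *)
end.
```

Note that overflow occurs only when the two tops meet, so either stack may
use whatever space the other leaves free.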
II. Introduction to Linked Lists and Dynamic Variables in Pascal
-- ------------ -- ------ ----- --- ------- --------- -- ------
A. The above problems can be handled if we can somehow find a way to relax
the requirement that LOGICAL ADJACENCY be modelled by PHYSICAL adjacency.
One way to do this is with a linked structure.
1. In a linked structure, each element contains both a value and a
special field called a link which points to its neighbor(s).
2. The link can be implemented in one of two ways:
a. The entire structure can be represented by an array of elements,
whose physical order is irrelevant. Links can be represented by
indices into this array. Thus, instead of finding the index of
the successor of an element by adding one to its index, we will
use an explicitly stored link index.
b. In a number of languages (including Pascal), actual machine memory
addresses can be used as links. These links are commonly called
"pointers" and the objects they point to are called "dynamic
variables".
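Approach (a), links stored as array indices, might look like the sketch
below. The names pool, first, cur and nolink, and the use of index 0 to mean
"no successor", are assumptions made for this example.

```pascal
program indexlinks(output);
(* Hypothetical sketch: a linked list stored in an array, with links that
   are array indices instead of machine addresses. *)
const
  maxnodes = 5;
  nolink = 0;            (* assumed marker for "no successor" *)
type
  node = record
    info: integer;
    link: 0..maxnodes    (* index of the successor, or nolink *)
  end;
var
  pool: array[1..maxnodes] of node;
  first, cur: 0..maxnodes;
begin
  (* Physical order in the array is 42, 1, -3.
     Logical order, found by following the links, is -3, 42, 1. *)
  pool[1].info := 42;  pool[1].link := 2;
  pool[2].info := 1;   pool[2].link := nolink;
  pool[3].info := -3;  pool[3].link := 1;
  first := 3;

  (* Visit the elements in logical order by following stored links
     rather than by adding one to the index. *)
  cur := first;
  while cur <> nolink do
    begin
      writeln(pool[cur].info);
      cur := pool[cur].link
    end
end.
```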
B. Now we look at a mechanism available in Pascal and many other languages
(but not all) which allows a flexible and efficient approach to
implementing various data structures.
1. Instead of allocating storage for all of our data ahead of time (by
means of var declarations), we will allocate some storage this way
and some as we need it. We call this DYNAMIC STORAGE ALLOCATION, and
we call the variables created this way DYNAMIC VARIABLES.
2. This means that a dynamic structure can grow to any size needed as
the program runs - subject, of course, to total available memory on
the computer.
C. Dynamic variables involve several language features that we have not
seen before.
1. Pascal has a mechanism whereby a variable can contain the actual
physical memory ADDRESS of another data item. We call such a
variable a POINTER.
2. In Pascal we may declare a variable to be a POINTER to an object of
a certain type. This means that it will store the actual memory
ADDRESS of an object of that type. Example:
var
a: ^real;
b: ^integer;
p: ^node;
declares that a will contain the address of an object of type real,
b the address of an object of type integer, and p the address of an object
of type node (where we assume that the type node has already been
declared).
a. Note that each pointer variable is constrained to point to an object
of a certain type. For example, it would not be possible, given the
above declarations, to make p point to an object of type real.
b. But any type can be used in declaring a pointer - built in or
programmer-defined.
c. A pattern like the following is often encountered in programs using
dynamic variables. (Here, we are setting up the structure we will
need for a dynamic implementation of a stack of integers).
type
nodeptr = ^node;
node = record
info: integer;
link: nodeptr
end;
i. Note that this represents an exception to a pattern we have
always seen up until now in Pascal declarations. What is it?
(Ask)
ii. We have here a case where a type is used BEFORE it is declared.
Note that the type node is used in declaring nodeptr before the
type node is itself declared. This forms an exception to the
normal rule that declaration precedes use, and is necessary
so that a node can contain pointer fields that point to other
nodes of the same type.
iii. The language definition allows for this one exception: a type
name may be used in declaring a pointer type before the named
type is itself declared - provided that a later declaration
for the type in question is given in the same block.
iv. The use of names like "node" and "nodeptr" is traditional; but
there is nothing special about these names - they are just
Pascal identifiers. We could have used "aardvark" and "zebra"
if we wanted to - though the program would be harder to read
this way!
d. The basic idea we will use is this: each object pushed on the stack
will be represented by a node, containing the value pushed and a
pointer to the previous item on the stack. A single ordinary
variable will contain an EXTERNAL POINTER to the top of the stack.
Example: The stack resulting from pushing 1, 42, and -3 (in that order)
onto an empty stack would be:
       _____   _____   _____
       |-3 |   |42 |   | 1 |
o----->| o-|-->| o-|-->| o-|--
       -----   -----   ----- |
                           -----
                            ---
3. The standard procedure NEW takes a pointer variable as its argument,
creates a new object of the type the variable is declared to point to,
and stores the address of the newly created object in that variable.
Example:
Given the declaration
var p:nodeptr
We could execute the following statement:
new(p)
This would cause 8 bytes of storage to be allocated, and would
set the variable p to point to the first byte. (The size 8
bytes represents the size of an object of type node on our VAX.)
a. In memory, this is accomplished as follows. Pascal divides
memory into three major regions:
i. A fixed-size region, holding code and global variables.
ii. A stack, holding stack-frames of procedures and functions.
iii. A heap, from which space is taken for dynamic variables
b. e.g.
-----------
|  CODE   |
|  DATA   |
|---------|
|  Heap   | <-- new obtains space from this region
|         |
|---------|
|         |
| HW Stack| <-- used by hardware to keep track of procedure
-----------     calls/returns and for local variables
c. When new is executed, the Pascal system carves a chunk of storage
of the appropriate size out of the heap, and returns its address in
the variable passed as a parameter to new. The "appropriate size"
is determined by the declaration of the pointer variable. Recall
that a pointer variable is always declared to point to an object
of a certain type, so the size is the size of the type of object
the pointer points to.
d. Each time we push an item on our stack, we will use new to obtain
space for it.
4. Corresponding to new is a standard procedure DISPOSE which takes
a pointer variable and returns the object it points to to the heap
for possible reallocation later. (Not all dialects of Pascal implement
this, though.)
a. Example:
new(p);
(* p now points to an object of type node. *)
...
dispose(p);
(* that object no longer exists. Any further attempt to
refer to it is an error. *)
b. Each time we pop an item from the stack, we will use dispose to
free up the space it occupied.
5. Sometimes, it is necessary to give a pointer a value that denotes
that it is null - i.e. it does not point to anything. Pascal supplies
the reserved word nil, which denotes a pointer that points to nothing.
a. When we create an empty stack, we will set the external pointer to
it to nil.
b. The bottom node on the stack will have a link of nil.
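As an illustration of new, dispose and nil working together, the sketch
below builds by hand the three-node list pictured earlier (1, 42, -3 pushed
in that order), prints it from the top down, and then returns every node to
the heap. The variable names top and p are assumptions made for the example.
The notation p^ means "the object p points to".

```pascal
program buildlist(output);
(* Hypothetical sketch: building and tearing down a linked list by hand. *)
type
  nodeptr = ^node;
  node = record
    info: integer;
    link: nodeptr
  end;
var
  top, p: nodeptr;
begin
  top := nil;                    (* empty list: external pointer is nil *)

  (* Each group: allocate a node, fill it in, link it in front of the
     current top, and make it the new top. The first node created gets a
     link of nil and so becomes the bottom node. *)
  new(p);  p^.info := 1;   p^.link := top;  top := p;
  new(p);  p^.info := 42;  p^.link := top;  top := p;
  new(p);  p^.info := -3;  p^.link := top;  top := p;

  (* Follow the links from the top until nil is reached. *)
  p := top;
  while p <> nil do
    begin
      writeln(p^.info);
      p := p^.link
    end;

  (* Return every node to the heap. The link is saved in top BEFORE
     dispose is called, since the node may not be used afterward. *)
  while top <> nil do
    begin
      p := top;
      top := top^.link;
      dispose(p)
    end
end.
```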
D. Some observations about Pascal dynamic (pointer) variables:
1. A pointer variable actually has two types:
a. It is of type pointer - it contains a machine memory address.
b. It points to an object of some other type (e.g. node)
2. In addition to new and dispose, there are two basic operations on
pointers.
a. Dereferencing. If p is a pointer, then p^ is the object it points
to. For example, if p is of type ^node, then p^ is of type node.
In the above example, then, we could refer to p^.info, since p^
is a record and info is one of its fields. Thus, we could do
things like:
p^.info := 37;
or
writeln(p^.info);
b. Assignment. Example:
Given the declaration
var
p,q: ^node;
Assume that p and q have been made to point, respectively, to nodes
containing 42 and 25 in their info fields - i.e. we have
p --> [ 42 ]      q --> [ 25 ]
Now suppose we execute the statement:
q := p;
Then the situation we have is that p and q BOTH point to the SAME
object (and nothing points to the object q used to point to.)
p ----> [ 42 ]
q --/   [ 25 ]   (* garbage *)
c. Note the difference between: q := p and q^ := p^
i. In the first case, after the assignment q and p point to the
same object. In the second case, q and p still point to
different objects, but the object q points to is made to have
the same value as that which p pointed to - e.g. given the same
initial situation as before, the result of q^ := p^ would be:
p --> [ 42 ]
q --> [ 42 ]
ii. In the first case (q := p), the value of the variable q is altered -
it contains a different address. In the second case (q^ := p^), q
is not altered; the object it points to is modified.
d. Note, too, that since nil points to nothing, dereferencing a nil
pointer is an error:
p := nil; (* ok *)
... p^ ... (* no! *)
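The difference between q := p and q^ := p^ can be checked with a short
program such as the one below. It is a sketch only. The extra pointer old is
an assumption added so that the node q originally pointed to can still be
disposed of rather than becoming garbage.

```pascal
program ptrassign(output);
(* Hypothetical sketch: object assignment versus pointer assignment. *)
type
  nodeptr = ^node;
  node = record
    info: integer;
    link: nodeptr
  end;
var
  p, q, old: nodeptr;
begin
  new(p);  p^.info := 42;  p^.link := nil;
  new(q);  q^.info := 25;  q^.link := nil;

  (* Object assignment: the record is copied; q still points to its own node. *)
  q^ := p^;
  p^.info := 7;
  writeln(q^.info);      (* q's node kept its own copy: 42 *)

  (* Pointer assignment: q now holds the same address as p. *)
  old := q;              (* remember the old node so it can be freed *)
  q := p;
  p^.info := 99;
  writeln(q^.info);      (* p and q share one node: 99 *)

  dispose(old);
  dispose(p);            (* q pointed to this node too, so it dangles now *)
  p := nil;
  q := nil;

  (* Test for nil before dereferencing; here the test fails and
     nothing is printed. *)
  if p <> nil then
    writeln(p^.info)
end.
```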
III. Linked Stacks
--- ------ ------
A. With these preliminaries taken care of, we can now look in detail at how
we can implement a stack by using a linked list of dynamic variables.
Let's again summarize what we have said about how we will approach this.
1. The basic idea is this:
a. The stack will be represented by a single external pointer,
pointing to the "top" node of the stack.
b. Each node - except the bottom - will contain a link to the one
pushed just before it.
c. The bottom node's link will be nil.
Example: The stack resulting from pushing 1, 42, and -3 (in that order)
onto an empty stack would be:
       _____   _____   _____
       |-3 |   |42 |   | 1 |
o----->| o-|-->| o-|-->| o-|--
       -----   -----   ----- |
                           -----
                            ---
2. An empty stack will be represented by a nil external pointer.
3. To push an item, we create a new node for it, make that node point
to the top item on the stack, and reset the external pointer to
the new node.
Example: The above after pushing 7:
       _____   _____   _____   _____
       | 7 |   |-3 |   |42 |   | 1 |
o----->| o-|-->| o-|-->| o-|-->| o-|--
       -----   -----   -----   ----- |
                                   -----
                                    ---
4. To pop an item, we reset the external pointer to the link of the top
node. (Note how this would restore the above to its state before the
push.)
5. Code: TRANSPARENCY
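The transparency itself is not reproduced in these notes. A minimal sketch
of what steps 1-4 could look like in code follows. The operation names init,
empty, push and pop, and the type name stack, are assumptions made for this
example and may differ from those on the transparency.

```pascal
program linkedstack(output);
(* Hypothetical sketch: a stack of integers as a linked list of
   dynamic variables. *)
type
  nodeptr = ^node;
  node = record
    info: integer;
    link: nodeptr
  end;
  stack = nodeptr;       (* the external pointer to the top node *)
var
  s: stack;
  x: integer;

procedure init(var s: stack);
begin
  s := nil               (* an empty stack is a nil external pointer *)
end;

function empty(s: stack): boolean;
begin
  empty := s = nil
end;

procedure push(var s: stack; item: integer);
var
  p: nodeptr;
begin
  new(p);                (* create a node for the new item *)
  p^.info := item;
  p^.link := s;          (* it points to the old top (nil if stack was empty) *)
  s := p                 (* the external pointer now points to the new node *)
end;

procedure pop(var s: stack; var item: integer);
var
  p: nodeptr;
begin
  (* caller is assumed to have checked that the stack is not empty *)
  p := s;
  item := p^.info;
  s := p^.link;          (* external pointer moves to the next node down *)
  dispose(p)             (* free the space the popped node occupied *)
end;

begin
  init(s);
  push(s, 1); push(s, 42); push(s, -3); push(s, 7);
  while not empty(s) do
    begin
      pop(s, x);
      writeln(x)         (* items come off in the order 7, -3, 42, 1 *)
    end
end.
```

In pop, the link is copied out of the node before dispose is called, since
the node may not be referred to once it has been returned to the heap.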
B. We have now seen two totally different ways of implementing stacks:
1. Using an array with a "top" index.
a. Advantage: simple.
b. Disadvantage: maximum size is fixed when stack is declared.
2. Using dynamic variables with a "top" pointer.
a. Advantage: flexible.
b. Disadvantages: may use more total storage; somewhat slower running
due to overhead of storage allocation/deallocation.
One of the advantages of adhering to proper use of data abstraction (the
"wall of abstraction") is that it should be possible to develop a program
that uses stacks without regard to the method used to implement them -
indeed, it should be possible to pull out one implementation and plug in
the other without changing anything else.
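To illustrate the point about interchangeable implementations, here is a
sketch of an array-based stack. The operation names init, empty, push and
pop are hypothetical, as are the field names top and elems and the capacity
of 100. The main program below uses only those four operations, so it would
not need to change if the declarations above it were replaced by a linked
implementation offering the same four operations.

```pascal
program arraystack(output);
(* Hypothetical sketch: a stack of integers as an array with a top index. *)
const
  maxstack = 100;        (* assumed maximum size, fixed at declaration *)
type
  stack = record
    top: 0..maxstack;    (* 0 means the stack is empty *)
    elems: array[1..maxstack] of integer
  end;
var
  s: stack;
  x: integer;

procedure init(var s: stack);
begin
  s.top := 0
end;

function empty(s: stack): boolean;
begin
  empty := s.top = 0
end;

procedure push(var s: stack; item: integer);
begin
  if s.top = maxstack then
    writeln('stack overflow')
  else
    begin
      s.top := s.top + 1;
      s.elems[s.top] := item
    end
end;

procedure pop(var s: stack; var item: integer);
begin
  (* caller is assumed to have checked that the stack is not empty *)
  item := s.elems[s.top];
  s.top := s.top - 1
end;

(* The main program relies only on init, empty, push and pop. *)
begin
  init(s);
  push(s, 1); push(s, 42); push(s, -3); push(s, 7);
  while not empty(s) do
    begin
      pop(s, x);
      writeln(x)         (* items come off in the order 7, -3, 42, 1 *)
    end
end.
```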
Copyright ©1999 - Russell C. Bjork