CS320 Lecture: C++ Memory Management 10/2/98
NEED: Video store system code for Customer class
Constructor/destructor demo program, handout of it
I. Introduction
- ------------
A. One word we use frequently is the word "variable". What does this term
mean? In a formal sense, what is a variable?
ASK
Formally, a variable has four components, though one or more may be
unspecified at any given time:
1. A name
2. A type
3. A storage allocation
4. A value
We can depict this as follows
_________________
______ ______ | storage alloc |
( name )------< type >-----| _______ |
------ ------ | ( value ) |
| ------- |
-----------------
e.g. int x = 42;
_________________
______ ______ | some address |
( x )------< int >-----| _______ |
------ ------ | ( 42 ) |
| ------- |
-----------------
B. When we refer to an object by means of a pointer, we are actually
using two variables: the pointer variable and (unless the pointer is
NULL) the thing pointed to.
e.g. Customer * c = new Customer("AARDVARK", "999-9999");
_________________
______ ______ | some address |
( c )------< ptr >-----| _______ |
------ ------ | ( o---)---|---+
| ------- | |
----------------- |
_________________ |
______ ______ |another address|<--+
( )-----<Customer>----| _______ |
------ ------ | (AARDVARK) |
| (999-9999) |
| ------- |
-----------------
The same is true with a reference. References and pointers are
typically identical in terms of implementation, but differ in how they
are used (p-> versus r.) and in whether they can be made to refer
elsewhere (pointers yes, references no).
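The difference in usage can be sketched as follows. This is only an
illustration: the Customer struct here is a hypothetical stand-in for the
video store class (whose real definition is not shown in these notes), and
the second customer's data is made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the lecture's Customer class.
struct Customer
{
    std::string name;
    std::string phone;
    Customer(const std::string & n, const std::string & p)
      : name(n), phone(p) { }
};

std::string pointerVersusReference()
{
    Customer first("AARDVARK", "999-9999");
    Customer second("ZEBRA", "111-1111");     // made-up second customer

    Customer * p = &first;   // a pointer variable holding first's address
    Customer & r = first;    // a reference: another name for first

    p->phone = "555-0000";           // member access through a pointer: ->
    assert(r.phone == "555-0000");   // member access through a reference: .

    p = &second;   // a pointer can be made to refer elsewhere
    r = second;    // NOT a re-aiming: this copies second's fields into first

    return p->name + "/" + first.name;
}
```

After the call, both halves of the returned string name the same customer,
but for different reasons: p was re-aimed, while first was overwritten.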
C. As noted above, at a given point in time, one or more components of a
given variable can be unspecified
1. When do we have variables without names?
- ASK
- Dynamically allocated variables do not have names - e.g. the individual
nodes of a linked list. (But they do have a type, storage
allocation, and value)
2. In languages like C++, one thing all variables always have is a
type, which is fixed when the variable is declared. (Hence C++ is
called a statically typed language.) There are other languages
(called dynamically typed, or loosely "untyped", languages) where the
type of value a variable holds may change - possibly many times -
during program execution.
Examples: LISP, Prolog, APL
3. In C++, variables that are local to a block do not have a storage
allocation when the block in which they are declared is not active -
e.g.
void foo()
{ int x;
...
}
The variable named by x only has a storage allocation during the
time foo() is being executed.
4. A variable that has no storage allocation also has no value. Even if
it does have a storage allocation, its value may be unspecified (e.g.
an uninitialized variable).
D. Two related, but distinct, concepts are important when talking about
a variable: its SCOPE and its LIFETIME.
1. Scope refers to the portion of the program text in which it can be
referred to by its name.
Example: int x, y;
void foo()
{ int y;
...
}
void bar()
{ ...
x ++;
...
}
The scope of global x is the entire program
The scope of global y is the entire program EXCEPT the body of foo
The scope of local y in foo is just the body of foo
Note that scope is a textual, or static concept
2. Lifetime refers to the time in which the variable has a storage
allocation associated with it.
In the above, the lifetime of global x and global y is the entire
execution of the program.
The lifetime of local y in foo is just the time foo is executing
(including the execution of any functions/methods it may call)
Note that lifetime is a temporal, or dynamic concept
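A small sketch of the distinction, filling in the elided bodies of the
example above with made-up statements (the initial values and the use of
the :: qualifier are assumptions added for illustration). Inside foo the
global y cannot be reached by its plain name, yet it is still alive, which
is why the qualified name ::y can still change it.

```cpp
#include <cassert>

int x = 0, y = 100;      // file scope, static lifetime

int foo()
{
    int y = 5;           // local y hides global y within foo
    y++;                 // changes only the local y
    assert(y == 6);
    ::y += 1;            // global y is out of scope by name, but still alive
    return y;            // the local's value
}                        // local y's lifetime ends here

void bar()
{
    x++;                 // global x is in scope here
}
```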
E. In C++, there are several possibilities for scope and several
possibilities for lifetime, which interact with one another.
1. In terms of scope, variables may have
a. file scope - visible throughout an entire source file
variables declared globally - i.e. outside of any class or
function definition - have file scope
b. class scope - visible only within the methods of a class, or
when qualified by the class name or an object of the class
instance and class variables declared within a class have
class scope
c. local scope - visible only within a block (delimited by { } )
local variables of functions and methods have local scope
2. In terms of lifetime, variables may have
a. static lifetime - their lifetime is the entire execution of the
program.
all file scope (global) variables have static lifetime
all class variables (declared as static) have static lifetime
all static local variables (declared as static) have static lifetime
NOTE: When the word static appears as part of the declaration of
a file scope (global) variable, it has nothing to do with
lifetime. (The variable has static lifetime with or without
this modifier). Rather, it has to do with linkage - it gives
the variable internal linkage, making it invisible to the
linker and thus inaccessible from outside the current file.
b. automatic lifetime - their lifetime is just when the block in
which they are declared is executing
all local variables have automatic lifetime unless declared static
(Note: lifetime is identical to scope for these variables,
which can then be said to be "in scope" or "out of scope" to
mean they do/do not have a storage allocation.)
Since block entry/exit exhibits a LIFO discipline, automatic
objects are generally allocated on a runtime stack maintained
"behind the scenes" by the system. Hence, automatic variables
are also called stack-allocated variables.
c. dynamic lifetime - all variables created by new live until they
are destroyed by delete (or the program terminates - whichever
occurs first)
The pool of storage from which these variables are created is
known as the heap - hence these are also called heap-allocated
variables.
Note: the lifetime of instance variables (fields) of an object is the
same as that of the object of which they are a part, and thus can
be any of the above.
Note: any class may be used to declare objects with any lifetime - e.g.
a given class may have some instances that are static, some that
are automatic, some that are dynamic, and some that are fields of
another object!
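The three lifetimes can be seen side by side in a short sketch. The
function and variable names below are hypothetical, chosen only for this
illustration.

```cpp
#include <cassert>

int nextTicket()
{
    static int issued = 0;   // static lifetime: initialized once, kept between calls
    int scratch = 0;         // automatic lifetime: fresh storage on every call
    ++issued;
    ++scratch;
    assert(scratch == 1);    // never remembers anything from an earlier call
    return issued * 10 + scratch;
}

int * makeCounter(int start)
{
    return new int(start);   // dynamic lifetime: lives until someone deletes it
}
```

Note that the int created in makeCounter outlives the call that created
it, so the caller becomes responsible for deleting it.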
F. In the case of objects, lifetime is associated with construction and
destruction, as follows.
1. A constructor for an object is always called at the beginning of
its lifetime.
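a. At program startup (before execution of main() begins) for
objects with static lifetime. (The exception is a static local
object, which is constructed the first time execution reaches
its declaration.)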
b. When the declaration is encountered during execution, for
objects with automatic lifetime.
c. When the object is created by new, for objects with dynamic
lifetime.
d. When the object of which it is a part is constructed, for
instance variables (fields).
2. If an object has a destructor, the destructor is always called at
the end of its lifetime.
a. At program termination (after exit from main()) for objects
with static lifetime.
b. When the block in which it is declared is exited, for objects
with automatic lifetime.
c. When the object is destroyed by delete, for objects with dynamic
lifetime.
d. When the object of which it is a part is destructed, for instance
variables (fields).
Note: If a class does not itself have a destructor, but has fields
of a class that does have a destructor, the compiler will
automatically create a destructor for it that calls those
other destructors.
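The rules above can be traced with a small sketch. This is not the
handout program; the Noisy and Holder classes and the trace log are
hypothetical names invented for this illustration. Holder deliberately has
no destructor of its own, so the compiler-generated one is what destructs
its field.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> trace;     // hypothetical event log for this sketch

class Noisy
{
  public:
    explicit Noisy(const std::string & label) : label_(label)
    { trace.push_back("+" + label_); }
    ~Noisy()
    { trace.push_back("-" + label_); }
  private:
    std::string label_;
};

class Holder                        // no destructor written by the author
{
  public:
    Holder() : part_("field") { trace.push_back("+holder"); }
  private:
    Noisy part_;                    // a field whose class has a destructor
};

void lifetimes()
{
    Noisy a("auto");                // constructed when the declaration is reached
    Noisy * d = new Noisy("dyn");   // constructed by new
    {
        Holder h;                   // the field is constructed before the body runs
        assert(trace.back() == "+holder");
    }                               // generated destructor destructs the field
    delete d;                       // destructed by delete
}                                   // a is destructed at block exit
```

Objects with static lifetime are left out of the sketch because their
destructors run only after main() exits, where the log can no longer be
inspected.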
G. Note that construction and destruction may occur unexpectedly because
of temporary copies:
1. If a function/method has a parameter that is an object (not a pointer
or reference to it), then a copy is made of the actual parameter as
part of the call of that function/method. This temporary, local
copy is used within the function/method, and is then destructed when
it exits.
2. If a function/method returns an object (not a pointer or reference to
it), then a copy is made of the return value into a temporary in
space that belongs to the caller. (Necessary because space that
belongs to the function/method is about to be deallocated). This
temporary is destructed at the end of the full expression containing
the call (some older compilers waited until the caller's block
exited), and compilers are permitted to optimize the copy away.
3. Temporary copies are made by using a special constructor called
a copy constructor.
a. The copy constructor for a class C has signature:
C(const C &)
b. If a class author does not provide a copy constructor, but one
is needed, the compiler will create one using member-wise copy.
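A sketch of the parameter case, using a hypothetical class Tracked that
counts its own copies and live instances. Only pass-by-value is shown,
because compilers are allowed to optimize away the copy made for a return
value, which makes those counts unpredictable.

```cpp
#include <cassert>

class Tracked
{
  public:
    static int copies;     // copy constructions so far
    static int live;       // objects currently alive
    Tracked()                { ++live; }
    Tracked(const Tracked &) { ++copies; ++live; }   // the copy constructor
    ~Tracked()               { --live; }
};
int Tracked::copies = 0;
int Tracked::live = 0;

// The parameter is an object, so each call copy-constructs a temporary.
int byValue(Tracked)
{
    assert(Tracked::live >= 2);   // the caller's object plus the local copy
    return Tracked::live;
}

// The parameter is a reference, so no copy is made.
int byReference(const Tracked &)
{
    return Tracked::live;
}
```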
HANDOUT / DEMO - CONSTRUCTOR DESTRUCTOR DEMO PROGRAM
ASK CLASS TO PREDICT OUTPUT BEFORE RUNNING
II. Memory Management Issues For Dynamic Variables
-- ------ ---------- ------ --- ------- ---------
A. In the case of static and automatic lifetime objects, management of
storage allocation / deallocation and constructor / destructor calls
is handled automatically, and the programmer does not need to be
concerned with it. However, for dynamic variables the programmer is
responsible for handling these issues explicitly.
B. Significant problems can result from the non-deletion or incorrect
deletion of dynamic variables.
1. This becomes especially problematical if dynamic variables are
shared - e.g. several pointers refer to the same object.
Example: In the video store system, a single customer object occurs
in the master customer list, but may also be referred to
by one or more copy objects (that the customer has out or
on hold) and/or by one or more reservation objects.
2. What should happen - but can easily not happen - is that a dynamic
variable should be deleted just when the LAST pointer referring to it
is either made to point elsewhere or is itself destroyed.
Example: in the video store system, the delete customer option does
not allow a customer to be deleted if the customer has one
or more copies rented (because those copies contain pointers
to the customer). Further, the destructor for customer
first cancels all holds and reservations the customer has,
thus eliminating those pointers to the customer. Only
then is deletion of the customer object itself safe.
SHOW CODE FOR Customer::okToDelete(), ~Customer
3. Problems that arise when this does not occur include the following:
a. If the last reference to a dynamic object is removed, but the object
is not deleted, we have "garbage" or a "storage leak".
Systems that contain storage leaks can, when run over an extended
period, experience either slow deterioration in performance or
a crash due to all available memory being used.
Example: Driver's License system I worked on as a tester
b. If a dynamic object is deleted while there are still pointers
referring to it, we have "dangling references" which can lead to
strange and hard to find errors when its storage is reallocated for
some other purpose, and is now potentially used in two different
ways at the same time.
c. If the same dynamic object is deleted more than once, the free
storage mechanism can be corrupted, causing serious errors or a
crash.
d. Really severe problems also arise if the delete operation for
dynamic variables is applied to a pointer to a static or automatic
variable. Hence, using the address-of operator (&) to obtain a pointer
to a local or global variable is generally a dangerous idea!
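Cases (a) through (c) can be sketched without actually corrupting
anything. The class Tape and both functions are hypothetical. The first
function hands the address back through a "rescue" parameter purely so
that this sketch can clean up after itself; in real leaking code no such
copy exists.

```cpp
#include <cassert>

class Tape                  // hypothetical class that counts live instances
{
  public:
    static int live;
    Tape()  { ++live; }
    ~Tape() { --live; }
};
int Tape::live = 0;

// Case (a): the working pointer is overwritten without a delete.
void dropWithoutDelete(Tape *& rescue)
{
    Tape * t = new Tape;
    rescue = t;             // kept only so the sketch can clean up later
    t = nullptr;            // in real code this was the last pointer: garbage
    assert(Tape::live > 0); // the object is still occupying storage
}

// Avoiding cases (b) and (c): delete once, then clear every pointer.
void deleteOnce()
{
    Tape * t = new Tape;
    Tape * alias = t;       // two pointers share one object
    delete t;               // destroy it exactly once
    t = nullptr;            // neither pointer dangles now, and a later
    alias = nullptr;        // "delete alias" would be a harmless no-op
}
```

A live-instance counter like this is one inexpensive way to notice a leak
during testing: if the count is not back to zero at a point where every
object should be gone, something was dropped without being deleted.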
C. Avoiding these problems is a non-trivial exercise - and continues to be
an insidious source of problems in programs written in C++ (and is a
likely reason why even commercial software sometimes "freezes" or
crashes the computer!)
1. One important principle - when pointers are passed around, and
especially when they are stored in containers - make sure that it
is clear who is responsible for deleting the object when the pointer
is no longer used.
2. The book discusses at some length the need to be sure that any
class that has a non-trivial destructor (especially one that deletes
other objects that the object being deleted is pointing to) also
has an appropriate copy constructor and operator =.
3. Likewise, if a class is meant to serve as a base class for other
classes, and if pointers to it will be stored in polymorphic
containers, then it should have a virtual destructor - even if it
does nothing - to allow subclasses to ensure they are properly
destructed.
4. Getting all this right in a complex system is often impossible.
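Points 2 and 3 above can be sketched together. The classes IntArray, Item,
and Movie are hypothetical examples, not taken from the book or the video
store system. IntArray owns heap storage, so it supplies a destructor, a
copy constructor, and operator= as a set; Item has a do-nothing virtual
destructor so that deleting through a base pointer reaches the subclass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class IntArray
{
  public:
    explicit IntArray(std::size_t n) : size_(n), data_(new int[n]()) { }
    IntArray(const IntArray & other)              // deep copy, not member-wise
      : size_(other.size_), data_(new int[other.size_])
    { std::copy(other.data_, other.data_ + size_, data_); }
    IntArray & operator=(const IntArray & other)
    {
        if (this != &other)                       // guard against a = a
        {
            int * fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete [] data_;                      // release the old storage
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }
    ~IntArray() { delete [] data_; }
    int & at(std::size_t i) { assert(i < size_); return data_[i]; }
  private:
    std::size_t size_;
    int * data_;
};

class Item                      // meant to be used as a base class
{
  public:
    virtual ~Item() { }         // does nothing, but must be virtual
};

class Movie : public Item
{
  public:
    static int destroyed;       // counts Movie destructor calls
    ~Movie() { ++destroyed; }
};
int Movie::destroyed = 0;
```

With the compiler-generated member-wise copy instead, two IntArray objects
would end up sharing one array, and the second destructor to run would
delete it a second time.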
D. One way to handle this problem is to use a scheme known as REFERENCE
COUNTING.
1. We associate with every dynamically allocated object a hidden field
known as the reference count.
2. Each operation that makes a pointer point to this object
increments the reference count; and each operation that removes
a pointer to the object decrements the count.
3. When the reference count of any object goes to zero, it is recycled.
4. Management of reference counts can be automated by creating a separate,
"smart pointer" class, as discussed in the text. Such smart pointers
are sometimes called handles.
5. Reference counting generally works well, so long as all pointer
assignment operations can be "captured" and made to maintain the
reference counts. However, there is one situation reference counting
cannot handle: circular lists.
a. Suppose we have the following circular structure, with a single
external pointer. (Reference counts are shown in parentheses)
+-------------------------------------------+
\ (2) (1) (1) |
----------> [ X | o ]-> [ Y | o ]-> [ Z | o ]------+
b. Now suppose the external pointer is dropped. The result is:
+-------------------------------------------+
\ (1) (1) (1) |
-> [ X | o ]-> [ Y | o ]-> [ Z | o ]------+
The list is now garbage, but is not recycled because each node is
referenced by another on the list!
6. One other limitation of reference counting is that it imposes some
space and time overhead, since each object now requires an extra
overhead field for the reference count, each pointer assignment
requires extra work to update the counts, and each access may
require an extra step of indirection through the handle.
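A minimal sketch of such a handle, under some assumptions: the class names
Handle and Node are made up for this illustration (this is not the smart
pointer class from the text), and the count is kept in a separate heap
cell rather than inside the object. The function at the end builds the
three-node ring pictured above and then breaks it by hand, since the
counts alone would never reach zero.

```cpp
#include <cassert>

template <class T>
class Handle
{
  public:
    explicit Handle(T * raw = nullptr)
      : obj_(raw), count_(raw ? new int(1) : nullptr) { }
    Handle(const Handle & other) : obj_(other.obj_), count_(other.count_)
    { if (count_) ++*count_; }
    Handle & operator=(const Handle & other)
    {
        T * newObj = other.obj_;          // read first: other may live inside
        int * newCount = other.count_;    // the object we are about to release
        if (newCount) ++*newCount;
        release();
        obj_ = newObj;
        count_ = newCount;
        return *this;
    }
    ~Handle() { release(); }
    T * operator->() const { return obj_; }
    int count() const { return count_ ? *count_ : 0; }
  private:
    void release()
    {
        T * obj = obj_;
        int * count = count_;
        obj_ = nullptr;
        count_ = nullptr;
        if (count && --*count == 0) { delete obj; delete count; }
    }
    T * obj_;
    int * count_;
};

struct Node                    // hypothetical list node; counts live instances
{
    static int live;
    Handle<Node> next;
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds the ring X -> Y -> Z -> X and returns X's count at that moment.
int countSeenInRing()
{
    Handle<Node> x(new Node);                 // X: count 1 (external handle)
    x->next = Handle<Node>(new Node);         // Y: count 1 (held by X)
    x->next->next = Handle<Node>(new Node);   // Z: count 1 (held by Y)
    x->next->next->next = x;                  // Z points back at X
    int seen = x.count();
    assert(Node::live == 3);
    // Dropping x now would lower X's count only to 1, so no node in the
    // ring would ever be recycled. Break the ring by hand before leaving.
    x->next->next->next = Handle<Node>();
    return seen;
}   // x is destroyed here: X reaches 0, then Y, then Z are recycled
```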
E. An alternative to reference counting is GARBAGE COLLECTION.
1. The basic idea is this: initially, we allocate storage upon request
with no attempt to recycle previously-allocated nodes. This continues
until a request is received that cannot be satisfied because there is
no more memory available to satisfy it.
2. At this point, a special routine called the GARBAGE COLLECTOR is run.
a. The garbage collector makes use of one or two overhead bits in
each object (or in a separate table). One of these bits is known
as the MARK BIT, and is initially clear.
b. The garbage collector runs in two phases - a MARK PHASE and a SWEEP
PHASE.
c. In the mark phase, a search is made for all objects that are
currently accessible from some external pointer, or from some other
already accessible object. The mark bit of each such object is set.
d. In the sweep phase, we go through the heap from start to finish,
examining the mark bit of each object.
i. If the bit is set (meaning the object was marked during the
mark phase), we clear the bit in preparation for the next
garbage collection.
ii. If the bit is clear (meaning the object was not marked), we
recycle it.
3. Garbage collection algorithms have been the subject of intense
study, because they play a key role in the performance of certain
systems. One critical issue is that garbage collection, though done
infrequently, consumes considerable processing time. Thus, a
system that does garbage collection will be observed to
periodically "freeze-up" for an interval whenever the garbage
collector is called.
4. In contrast to C++, Java uses garbage collection - which makes it
much easier to develop stable, robust systems. One consequence,
though, is occasional pauses in execution. Thus, the
documentation for Java explicitly says that it should not be used for
life critical systems (in which a pause for garbage collection
might have fatal consequences.)
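The mark and sweep phases described in point 2 above can be sketched on a
toy heap. Everything here is an assumption made for illustration: objects
are slots in a vector, "pointers" are slot indices, and recycling a slot
just resets it. A real collector works on raw storage and must find the
external pointers itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Cell                          // hypothetical heap object
{
    bool inUse;                      // slot currently holds an object
    bool marked;                     // the MARK BIT, initially clear
    std::vector<std::size_t> refs;   // objects this object points to
    Cell() : inUse(false), marked(false) { }
};

// Mark phase: set the mark bit of everything accessible from 'start'.
void mark(std::vector<Cell> & heap, std::size_t start)
{
    assert(start < heap.size());
    std::vector<std::size_t> pending(1, start);
    while (!pending.empty())
    {
        std::size_t i = pending.back();
        pending.pop_back();
        if (heap[i].marked) continue;            // already visited
        heap[i].marked = true;
        pending.insert(pending.end(),
                       heap[i].refs.begin(), heap[i].refs.end());
    }
}

// Sweep phase: go through the heap from start to finish.
// Returns the number of objects recycled.
std::size_t sweep(std::vector<Cell> & heap)
{
    std::size_t recycled = 0;
    for (std::size_t i = 0; i < heap.size(); ++i)
    {
        if (heap[i].marked)
            heap[i].marked = false;              // clear for the next collection
        else if (heap[i].inUse)
        {
            heap[i] = Cell();                    // not marked: recycle it
            ++recycled;
        }
    }
    return recycled;
}

// One full collection, given the external pointers ("roots").
std::size_t collect(std::vector<Cell> & heap,
                    const std::vector<std::size_t> & roots)
{
    for (std::size_t r = 0; r < roots.size(); ++r)
        mark(heap, roots[r]);
    return sweep(heap);
}
```

Because marking starts from the external pointers, a circular structure
that nothing outside refers to is never marked and is therefore recycled.
This is the case reference counting cannot handle.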
Copyright ©1999 - Russell C. Bjork