Go forward to Representations.
Go backward to OK.
Go up to Top.

Introduction to container class prototypes
******************************************

   As a temporary mechanism enabling the support of generic classes,
the GNU C++ Library distribution contains a directory (`g++-include')
of files designed to serve as the basis for generating container
classes of specified elements.  These files can be used to generate
`.h' and `.cc' files in the current directory via a supplied shell
script program that performs simple textual substitution to create
specific classes.

   While these classes are generated independently, and thus share no
code, it is possible to create versions that do share code among
subclasses. For example, using `typedef void* ent', and then generating
a `entList' class, other derived classes could be created using the
`void*' coercion method described in Stroustrup, pp204-210.

   This very simple class-generation facility is useful enough to serve
current purposes, but will be replaced with a more coherent mechanism
for handling C++ generics in a way that minimally disrupts current
usage.  Without knowing exactly when or how parametric classes might be
added to the C++ language, provision of this simplest possible
mechanism, textual substitution, appears to be the safest strategy,
although it does require certain redundancies and awkward constructions.

   Specific classes may be generated via the `genclass' shell script
program. This program has arguments specifying the kinds of base
types(s) to be used. Specifying base types requires two arguments. The
first is the name of the base type, which may be any named type, like
`int' or `String'. Only named types are supported; things like `int*'
are not accepted. However, pointers like this may be used by supplying
the appropriate typedefs (e.g., editing the resulting files to include
`typedef int* intp;'). The type name must be followed by one of the
words `val' or `ref', to indicate whether the base elements should be
passed to functions by-value or by-reference.

   You can specify basic container classes using `genclass base
[val,ref] proto', where `proto' is the name of the class being
generated.  Container classes like dictionaries and maps that require
two types may be specified via `genclass -2 keytype [val, ref],
basetype [val, ref] proto', where the key type is specified first and
the contents type second.  The resulting classnames and filenames are
generated by prepending the specified type names to the prototype names,
and separating the filename parts with dots.  For example, `genclass
int val List' generates class `intList' residing in files `int.List.h'
and `int.List.cc'. `genclass -2 String ref int val VHMap' generates
(the awkward, but unavoidable) class name `StringintVHMap'. Of course,
programmers may use `typedef' or simple editing to create more
appropriate names.  The existence of dot seperators in file names
allows the use of GNU make to help automate configuration and
recompilation. An example Makefile exploiting such capabilities may be
found in the `libg++/proto-kit' directory.

   The `genclass' utility operates via simple text substitution using
`sed'. All occurrences of the pseudo-types `<T>' and `<C>' (if there
are two types) are replaced with the indicated type, and occurrences of
`<T&>' and `<C&>' are replaced by just the types, if `val' is
specified, or types followed by "&" if `ref' is specified.

   Programmers will frequently need to edit the `.h' file in order to
insert additional `#include' directives or other modifications.  A
simple utility, `prepend-header' to prepend other `.h' files to
generated files is provided in the distribution.

   One dubious virtue of the prototyping mechanism is that, because
sources files, not archived library classes, are generated, it is
relatively simple for programmers to modify container classes in the
common case where slight variations of standard container classes are
required.

   It is often a good idea for programmers to archive (via `ar')
generated classes into `.a' files so that only those class functions
actually used in a given application will be loaded.  The test
subdirectory of the distribution shows an example of this.

   Because of `#pragma interface' directives, the `.cc' files should be
compiled with `-O' or `-DUSE_LIBGXX_INLINES' enabled.

   Many container classes require specifications over and above the base
class type. For example, classes that maintain some kind of ordering of
elements require specification of a comparison function upon which to
base the ordering.  This is accomplished via a prototype file `defs.hP'
that contains macros for these functions. While these macros default to
perform reasonable actions, they can and should be changed in
particular cases. Most prototypes require only one or a few of these.
No harm is done if unused macros are defined to perform nonsensical
actions. The macros are:

`DEFAULT_INITIAL_CAPACITY'
     The initial capacity for containers (e.g., hash tables) that
     require an initial capacity argument for constructors.  Default:
     100

`<T>EQ(a, b)'
     return true if a is considered equal to b for the purposes of
     locating, etc., an element in a container.  Default: (a == b)

`<T>LE(a, b)'
     return true if a is less than or equal to b Default: (a <= b)

`<T>CMP(a, b)'
     return an integer < 0 if a<b, 0 if a==b, or > 0 if a>b.  Default:
     (a <= b)? (a==b)? 0 : -1 : 1

`<T>HASH(a)'
     return an unsigned integer representing the hash of a.  Default:
     hash(a) ; where extern unsigned int hash(<T&>).  (note: several
     useful hash functions are declared in builtin.h and defined in
     hash.cc)

   Nearly all prototypes container classes support container traversal
via `Pix' pseudo indices, as described elsewhere.

   All object containers must perform either a `X::X(X&)' (or `X::X()'
followed by `X::operator =(X&)') to copy objects into containers.  (The
latter form is used for containers built from C++ arrays, like
`VHSets'). When containers are destroyed, they invoke `X::~X()'.  Any
objects used in containers must have well behaved constructors and
destructors. If you want to create containers that merely reference
(point to) objects that reside elsewhere, and are not copied or
destroyed inside the container, you must use containers of pointers,
not containers of objects.

   All prototypes are designed to generate *HOMOGENOUS* container
classes.  There is no universally applicable method in C++ to support
heterogenous object collections with elements of various subclasses of
some specified base class. The only way to get heterogenous structures
is to use collections of pointers-to-objects, not collections of objects
(which also requires you to take responsibility for managing storage for
the objects pointed to yourself).

   For example, the following usage illustrates a commonly encountered
danger in trying to use container classes for heterogenous structures:

     class Base { int x; ...}
     class Derived : public Base { int y; ... }
     
     BaseVHSet s; // class BaseVHSet generated via something like
                  // `genclass Base ref VHSet'
     
     void f()
     {
       Base b;
       s.add(b); // OK
     
       Derived d;
       s.add(d);  // (CHOP!)
     }

   At the line flagged with `(CHOP!)', a `Base::Base(Base&)' is called
inside `Set::add(Base&)'--*not* `Derived::Derived(Derived&)'.
Actually, in `VHSet', a `Base::operator =(Base&)', is used instead to
place the element in an array slot, but with the same effect.  So only
the Base part is copied as a `VHSet' element (a so-called
chopped-copy). In this case, it has an `x' part, but no `y' part; and a
Base, not Derived, vtable. Objects formed via chopped copies are rarely
sensible.

   To avoid this, you must resort to pointers:

     typedef Base* BasePtr;
     
     BasePtrVHSet s; // class BaseVHSet generated via something like
                     // `genclass BasePtr val VHSet'
     
     void f()
     {
       Base* bp = new Base;
       s.add(b);
     
       Base* dp = new Derived;
       s.add(d);  // works fine.
     
       // Don't forget to delete bp and dp sometime.
       // The VHSet won't do this for you.
     }

Example
=======

   The prototypes can be difficult to use on first attempt. Here is an
example that may be helpful. The utilities in the `proto-kit' simplify
much of the actions described, but are not used here.

   Suppose you create a class `Person', and want to make an Map that
links the social security numbers associated with each person. You start
off with a file `Person.h'


     #include <String.h>
     
     class Person
     {
       String nm;
       String addr;
       //...
     public:
       const String& name() { return nm; }
       const String& address() { return addr; }
       void          print() { ... }
       //...
     }

   And in file `SSN.h',

     typedef unsigned int SSN;

   Your first decision is what storage/usage strategy to use. There are
several reasonable alternatives here: You might create an "object
collection" of Persons, a "pointer collection" of pointers-to-Persons,
or even a simple String map, housing either copies of pointers to the
names of Persons, since other fields are unused for purposes of the
Map. In an object collection, instances of class Person "live" inside
the Map, while in a pointer collection, the instances live elsewhere.
Also, as above, if instances of subclasses of Person are to be used
inside the Map, you must use pointers. In a String Map, the same
difference holds, but now only for the name fields. Any of these
choices might make sense in particular applications.

   The second choice is the Map implementation strategy. Either a tree
or a hash table might make sense. Suppose you want an AVL tree Map.
There are two things to now check. First, as an object collection, the
AVLMap requires that the elsement class contain an `X(X&)' constructor.
In C++, if you don't specify such a constructor, one is constructed for
you, but it is a very good idea to always do this yourself, to avoid
surprises. In this example, you'd use something like
     class Person
     { ...;
         Person(const Person& p) :nm(p.nm), addr(p.addr) {}
     };

   Also, an AVLMap requires a comparison function for elements in order
to maintain order. Rather than requiring you to write a particular
comparison function, a `defs' file is consulted to determine how to
compare items. You must create and edit such a file.

   Before creating `Person.defs.h', you must first make one additional
decision. Should the Map member functions like `m.contains(p)' take
arguments (`p') by reference (i.e., typed as `int Map::contains(const
Person& p)' or by value (i.e., typed as `int Map::contains(const Person
p)'. Generally, for user-defined classes, you want to pass by
reference, and for builtins and pointers, to pass by value. SO you
should pick by-reference.

   You can now create `Person.defs.h' via `genclass Person ref defs'.
This creates a simple skeleton that you must edit. First, add `#include
"Person.h"' to the top. Second, edit the `<T>CMP(a,b)' macro to compare
on name, via

     #define <T>CMP(a, b) ( compare(a.name(), b.name()) )

which invokes the `int compare(const String&, const String&)' function
from `String.h'. Of course, you could define this in any other way as
well. In fact, the default versions in the skeleton turn out to be OK
(albeit inefficient) in this particular example.

   You may also want to create file `SSN.defs.h'. Here, choosing
call-by-value makes sense, and since no other capabilities (like
comparison functions) of the SSNs are used (and the defaults are OK
anyway), you'd type

     genclass SSN val defs

and then edit to place `#include "SSN.h"' at the top.

   Finally, you can generate the classes. First, generate the base
class for Maps via

     genclass -2 Person ref SSN val Map

This generates only the abstract class, not the implementation, in file
`Person.SSN.Map.h' and `Person.SSN.Map.cc'.  To create the AVL
implementation, type

     genclass -2 Person ref SSN val AVLMap

This creates the class `PersonSSNAVLMap', in `Person.SSN.AVLMap.h' and
`Person.SSN.AVLMap.cc'.

   To use the AVL implementation, compile the two generated `.cc'
files, and specify `#include "Person.SSN.AVLMap.h"' in the application
program.  All other files are included in the right ways automatically.

   One last consideration, peculiar to Maps, is to pick a reasonable
default contents when declaring an AVLMap. Zero might be appropriate
here, so you might declare a Map,

     PersonSSNAVLMap m((SSN)0);

   Suppose you wanted a `VHMap' instead of an `AVLMap' Besides
generating different implementations, there are two differences in how
you should prepare the `defs' file. First, because a VHMap uses a C++
array internally, and because C++ array slots are initialized
differently than single elements, you must ensure that class Person
contains (1) a no-argument constructor, and (2) an assignment operator.
You could arrange this via

     class Person
     { ...;
         Person() {}
       void operator = (const Person& p) { nm = p.nm; addr = p.addr; }
     };

   (The lack of action in the constructor is OK here because `Strings'
possess usable no-argument constructors.)

   You also need to edit `Person.defs.h' to indicate a usable hash
function and default capacity, via something like

     #include <builtin.h>
     #define <T>HASH(x)  (hashpjw(x.name().chars()))
     #define DEFAULT_INITIAL_CAPACITY 1000

   Since the `hashpjw' function from `builtin.h' is appropriate here.
Changing the default capacity to a value expected to exceed the actual
capacity helps to avoid "hidden" inefficiencies when a new VHMap is
created without overriding the default, which is all too easy to do.

   Otherwise, everything is the same as above, substituting `VHMap' for
`AVLMap'.