Name Encoding in GNU C++
************************
To support its strong typing rules and function overloading, the C++
programming language "encodes" information about functions and objects,
so that conflicts across object files can be detected during linking.
(1) These rules tend to be unique to each individual implementation of
C++.
The commentary for section 7.2.1 of `The Annotated Reference Manual'
(the ARM) describes a possible implementation, which happens to closely
resemble that of the `cfront' compiler. The design used in GNU C++
differs from this model in a number of ways:
* In addition to the basic types `void', `char', `short', `int',
`long', `float', `double', and `long double', GNU C++ supports two
additional types: `wchar_t', the wide character type, and `long
long' (if the host supports it). The encodings for these are `w'
and `x' respectively.
* According to the ARM, qualified names (e.g., `foo::bar::baz') are
encoded with a leading `Q', followed by the number of qualifications
(in this case, three) and the respective names; this example would be
encoded as `Q33foo3bar3baz'. GNU C++ adds a leading underscore,
producing `_Q33foo3bar3baz'.
* The operator `*=' is encoded as `__aml', not `__amu', to match the
normal `*' operator, which is encoded as `__ml'.
* In addition to the normal operators, GNU C++ also offers the
maximum and minimum operators `>?' and `<?', encoded as `__mx' and
`__mn', and the conditional operator `?:', encoded as `__cn'.
* Constructors are encoded as simply `__NAME', where NAME is the
encoded name (e.g., `3foo' for the `foo' class constructor).
Destructors are encoded as two leading underscores separated by
either a period or a dollar sign, depending on the capabilities of
the local host, followed by the encoded name. For example, the
destructor `foo::~foo' is encoded as `_$_3foo'.
* Virtual tables are encoded with a prefix of `_vt', rather than
`__vtbl'. The names of their classes are separated by dollar signs
(or periods), and not encoded as normal: the virtual table for
`foo' is `__vt$foo', and the table for `foo::bar' is named
`__vt$foo$bar'.
* Static members are encoded as a leading underscore, followed by the
encoded name of the class in which they appear, a separating
dollar sign or period, and finally the unencoded name of the
variable. For example, if the class `foo' contains a static
member `bar', its encoding would be `_3foo$bar'.
* GNU C++ is not as aggressive as other compilers when it comes to
always generating `Fv' for functions with no arguments. In
particular, the compiler does not add the sequence to conversion
operators. The function `foo::bar()' is encoded as `bar__3foo',
not `bar__3fooFv'.
* The argument list for methods is not prefixed by a leading `F'; it
is considered implied.
* GNU C++ approaches the task of saving space in encodings
differently from that noted in the ARM. It does use the `TN' and
`NXY' codes to signify copying the Nth argument's type, and making
the next X arguments be the type of the Yth argument,
respectively. However, the values for N and Y begin at zero with
GNU C++, whereas the ARM describes them as starting at one. For
the function `foo (bartype, bartype)', GNU C++ uses
`foo__7bartypeT0', while compilers following the ARM example
generate `foo__7bartypeT1'.
* GNU C++ does not bother using the space-saving methods for types
whose encoding is a single character (like an integer, encoded as
`i'). Skipping them helps in the most common cases: a repeat code
for two `int's would take three characters, whereas writing both
out takes only two (`ii').
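
Several of the rules above can be illustrated with a small sketch. The
helper names below (`encode_name', `encode_args', and so on) are
hypothetical and are not taken from the GNU C++ sources; the sketch
assumes argument types arrive already encoded, handles only the `TN'
repeat code (not `NXY'), and assumes fewer than ten qualifications and
fewer than ten arguments so that every count and index is a single
digit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Length-prefixed name: "foo" -> "3foo".
std::string encode_name(const std::string& name) {
    assert(!name.empty());
    return std::to_string(name.size()) + name;
}

// Qualified name in the GNU style: leading underscore, `Q', the number
// of qualifications, then each length-prefixed name.
std::string encode_qualified(const std::vector<std::string>& parts) {
    assert(parts.size() >= 2 && parts.size() <= 9);  // sketch limit
    std::string out = "_Q" + std::to_string(parts.size());
    for (const auto& p : parts) out += encode_name(p);
    return out;
}

// Constructor: two underscores followed by the encoded class name.
std::string encode_constructor(const std::string& cls) {
    return "__" + encode_name(cls);
}

// Destructor: two underscores separated by `$' or `.', then the
// encoded class name. The joiner depends on the host.
std::string encode_destructor(const std::string& cls, char joiner = '$') {
    return std::string("_") + joiner + "_" + encode_name(cls);
}

// Static member: underscore, encoded class name, joiner, plain name.
std::string encode_static_member(const std::string& cls,
                                 const std::string& member,
                                 char joiner = '$') {
    return "_" + encode_name(cls) + joiner + member;
}

// Argument list built from already-encoded types. A multi-character
// type that repeats an earlier argument becomes T<index>; first_index
// is 0 for the GNU numbering and 1 for the ARM numbering.
// Single-character encodings are always written out in full.
std::string encode_args(const std::vector<std::string>& types,
                        int first_index = 0) {
    std::string out;
    for (std::size_t i = 0; i < types.size(); ++i) {
        std::size_t prev = i;  // i means "no earlier match found"
        if (types[i].size() > 1) {
            for (std::size_t j = 0; j < i; ++j) {
                if (types[j] == types[i]) { prev = j; break; }
            }
        }
        if (prev < i) {
            out += "T" + std::to_string(prev + first_index);
        } else {
            out += types[i];
        }
    }
    return out;
}

// Method: name, two underscores, encoded class, then the argument
// list with no leading `F' and no `Fv' for an empty list.
std::string encode_method(const std::string& name,
                          const std::string& cls,
                          const std::vector<std::string>& arg_types) {
    return name + "__" + encode_name(cls) + encode_args(arg_types);
}
```

Under these assumptions, `encode_qualified({"foo", "bar", "baz"})'
yields `_Q33foo3bar3baz', `encode_destructor("foo")' yields `_$_3foo',
`encode_method("bar", "foo", {})' yields `bar__3foo', and
`encode_args({"7bartype", "7bartype"})' yields `7bartypeT0' (or
`7bartypeT1' when `first_index' is 1, matching the ARM numbering).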
---------- Footnotes ----------
(1) This encoding is also sometimes called, whimsically enough,
"mangling"; the corresponding decoding is sometimes called "demangling".