Name Encoding in GNU C++
************************

   To support its strong typing rules and function overloading, the
C++ programming language "encodes" information about functions and
objects in their names, so that conflicts across object files can be
detected during linking. (1) The encoding rules tend to be specific to
each implementation of C++.
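
   As a small illustration of why such encoding is needed, the sketch
below defines two overloads that share one source-level name.  The
names `describe' and `check_overloads' are hypothetical and chosen only
for this example; the symbols a compiler actually emits for the two
functions depend on its own encoding scheme and are not shown here.

```cpp
#include <cassert>
#include <string>

// Two overloads with the same source-level name.  The linker needs a
// distinct symbol for each one, so the compiler folds the parameter
// types into the names it emits.
std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }

// Overload resolution selects a function by argument type.
void check_overloads() {
    assert(describe(1) == "int");
    assert(describe(1.0) == "double");
}
```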

   The commentary for section 7.2.1 of `The Annotated Reference
Manual' (the ARM) describes a possible implementation that closely
resembles the one used by the `cfront' compiler.  The design used in
GNU C++ differs from this model in a number of ways:

   * In addition to the basic types `void', `char', `short', `int',
     `long', `float', `double', and `long double', GNU C++ supports two
     additional types: `wchar_t', the wide character type, and `long
     long' (if the host supports it).  The encodings for these are `w'
     and `x' respectively.

   * According to the ARM, qualified names (e.g., `foo::bar::baz') are
     encoded with a leading `Q', followed by the number of
     qualifications (in this case, three) and the respective names, so
     this example might be encoded as `Q33foo3bar3baz'.  GNU C++ adds
     a leading underscore, producing `_Q33foo3bar3baz'.

   * The operator `*=' is encoded as `__aml', not `__amu', to match the
     normal `*' operator, which is encoded as `__ml'.

   * In addition to the normal operators, GNU C++ also offers the
     minimum and maximum operators `>?' and `<?', encoded as `__mx' and
     `__mn', and the conditional operator `?:', encoded as `__cn'.

   * Constructors are encoded as simply `__NAME', where NAME is the
     encoded name of the class (e.g., `3foo' for the class `foo', so
     its constructor is `__3foo').  Destructors are encoded as two
     leading underscores separated by either a period or a dollar
     sign, depending on the capabilities of the local host, followed
     by the encoded name.  For example, the destructor `foo::~foo' is
     encoded as `_$_3foo'.

   * Virtual tables are encoded with a prefix of `_vt', rather than
     `__vtbl'.  The names of their classes are separated by dollar signs
     (or periods), and not encoded as normal: the virtual table for
     `foo' is `__vt$foo', and the table for `foo::bar' is named
     `__vt$foo$bar'.

   * Static members are encoded as a leading underscore, followed by the
     encoded name of the class in which they appear, a separating
     dollar sign or period, and finally the unencoded name of the
     variable.  For example, if the class `foo' contains a static
     member `bar', its encoding would be `_3foo$bar'.

   * GNU C++ is not as aggressive as other compilers when it comes to
     always generating `Fv' for functions with no arguments.  In
     particular, the compiler does not add the sequence to conversion
     operators.  The function `foo::bar()' is encoded as `bar__3foo',
     not `bar__3fooFv'.

   * The argument list for methods is not prefixed by a leading `F'; it
     is considered implied.

   * GNU C++ approaches the task of saving space in encodings
     differently from the ARM.  It does use the `TN' and `NXY' codes,
     which respectively copy the type of the Nth argument and give
     the next X arguments the type of the Yth argument.  However, the
     values for N and Y begin at zero with GNU C++, whereas the ARM
     describes them as starting at one.  For the function
     `foo (bartype, bartype)', GNU C++ uses `foo__7bartypeT0', while
     compilers following the ARM example generate `foo__7bartypeT1'.

   * GNU C++ does not bother using the space-saving methods for types
     whose encoding is a single character (like an integer, encoded as
     `i').  This pays off in the most common cases: applying the
     scheme to two `int' arguments would take three characters,
     whereas spelling them out takes only the two in `ii'.
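
   Several of the rules above can be sketched as small string-building
helpers.  This is only an illustration under stated assumptions, not
code from GNU C++: every function name below is hypothetical, the
separator defaults to a dollar sign, only the `TN' code is modelled
(not `NXY'), and counts or indices of ten or more are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Length-prefixed identifier, e.g. "foo" -> "3foo".
std::string encode_name(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// Qualified name in the GNU C++ style described above:
// "_Q", the number of qualifications, then each encoded part.
std::string encode_qualified(const std::vector<std::string>& parts) {
    std::string out = "_Q" + std::to_string(parts.size());
    for (const std::string& part : parts) out += encode_name(part);
    return out;
}

// Constructor: two underscores followed by the encoded class name.
std::string encode_constructor(const std::string& cls) {
    return "__" + encode_name(cls);
}

// Destructor: two underscores separated by '$' or '.', then the class.
std::string encode_destructor(const std::string& cls, char sep = '$') {
    return std::string("_") + sep + "_" + encode_name(cls);
}

// Static member: "_", encoded class, separator, unencoded member name.
std::string encode_static_member(const std::string& cls,
                                 const std::string& member,
                                 char sep = '$') {
    return "_" + encode_name(cls) + sep + member;
}

// Argument list built from already-encoded type codes.  A repeated
// multi-character code becomes "T" plus the zero-based index of its
// first occurrence; single-character codes are always spelled out.
std::string encode_args(const std::vector<std::string>& codes) {
    std::string out;
    for (std::size_t i = 0; i < codes.size(); ++i) {
        std::size_t first = i;
        if (codes[i].size() > 1) {
            for (std::size_t j = 0; j < i; ++j) {
                if (codes[j] == codes[i]) { first = j; break; }
            }
        }
        if (first < i) {
            out += "T" + std::to_string(first);
        } else {
            out += codes[i];
        }
    }
    return out;
}

// Reproduces the examples given in the list above.
void check_examples() {
    assert(encode_qualified({"foo", "bar", "baz"}) == "_Q33foo3bar3baz");
    assert(encode_constructor("foo") == "__3foo");
    assert(encode_destructor("foo") == "_$_3foo");
    assert(encode_static_member("foo", "bar") == "_3foo$bar");
    assert("foo__" + encode_args({"7bartype", "7bartype"})
           == "foo__7bartypeT0");
    assert(encode_args({"i", "i"}) == "ii");
}
```

   Calling `check_examples' exercises each helper against the
encodings quoted in the text; passing a period as the separator
argument models hosts that cannot use a dollar sign in symbol names.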

   ---------- Footnotes ----------

   (1)  This encoding is also sometimes called, whimsically enough,
"mangling"; the corresponding decoding is sometimes called "demangling".