Extending SWIG

Caution: This chapter is being rewritten! (11/25/01)

Introduction

This chapter describes SWIG's internal organization and the process by which new target languages can be developed. First, a brief word of warning---SWIG has been undergoing a massive redevelopment effort that has focused extensively on its internal organization. The information in this chapter is mostly up to date, but changes are ongoing. Expect to find a few inconsistencies.

Prerequisites

In order to extend SWIG, it is useful to have the following background:

Since SWIG is essentially a specialized C++ compiler, prior experience with compiler design (perhaps even a compilers course) helps in understanding certain parts of the system. A number of books are also useful. For example, "The C Programming Language" by Kernighan and Ritchie (a.k.a. "K&R") and "The Annotated C++ Reference Manual" by Ellis and Stroustrup (a.k.a. the "ARM") will be of great use.

High Level Overview

When you run SWIG on an interface, processing is handled in stages by a few different system components:

Preprocessing

The preprocessor plays a critical role in the SWIG implementation because much of SWIG's processing and internal configuration is managed not by code written in C, but by configuration files in the SWIG library. This also explains the cryptic error messages that new users sometimes get when SWIG is misconfigured or installed incorrectly. In fact, when you run SWIG, parsing starts with a small interface file like this:
%include "swig.swg"             // Global SWIG configuration
%include "langconfig.swg"       // Language specific configuration
%include "yourinterface.i"      // Your interface file
The swig.swg file contains global configuration information. In addition, this file defines many of SWIG's standard directives as macros. For instance, part of swig.swg looks like this:
...
/* Code insertion directives such as %wrapper %{ ... %} */

#define %init        %insert("init")
#define %wrapper     %insert("wrapper")
#define %header      %insert("header")
#define %runtime     %insert("runtime")

/* Access control directives */

#define %readonly    %pragma(swig) readonly;
#define %readwrite   %pragma(swig) readwrite;

/* Directives for callback functions */

#define %callback(x) %pragma(swig) callback=`x`;
#define %nocallback  %pragma(swig) nocallback;

/* Directives for attribute functions */

#define %attributefunc(_x,_y)  %pragma(swig)   attributefunction=`_x`":"`_y`;
#define %noattributefunc       %pragma(swig)   noattributefunction;

/* %ignore directive */

#define %ignore         %rename($ignore)
#define %ignorewarn(x)  %rename("$ignore:" x)

/* Generation of default constructors/destructors */

#define %nodefault     %pragma nodefault
#define %makedefault   %pragma makedefault

...
The fact that most of the standard SWIG directives are macros is intended to simplify the implementation of the parser. For instance, rather than having to support dozens of special grammar rules, it is easier to have a few basic primitives such as %pragma or %insert.
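
The idea of reducing many directives to a few primitives can be pictured as a simple lookup table. The following is only a standalone sketch, not SWIG's preprocessor: the table and the expand_directive function are hypothetical names, and only the four code insertion macros from the swig.swg excerpt above are modeled.

```cpp
#include <map>
#include <string>

// Hypothetical table modeling the code insertion macros shown in the
// swig.swg excerpt. The real preprocessor handles these as ordinary macros.
static const std::map<std::string, std::string> kDirectiveMacros = {
    {"%init",    "%insert(\"init\")"},
    {"%wrapper", "%insert(\"wrapper\")"},
    {"%header",  "%insert(\"header\")"},
    {"%runtime", "%insert(\"runtime\")"},
};

// Replace a leading directive with its primitive form. Lines that do not
// start with a known directive are returned unchanged.
std::string expand_directive(const std::string &line) {
    std::string::size_type end = line.find_first_of(" \t");
    std::string head = line.substr(0, end);
    auto it = kDirectiveMacros.find(head);
    if (it == kDirectiveMacros.end()) return line;
    std::string rest = (end == std::string::npos) ? std::string() : line.substr(end);
    return it->second + rest;
}
```

With this table, a line beginning with %wrapper is rewritten to begin with %insert("wrapper"), so a parser built on top of it only needs a rule for %insert.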

The langconfig.swg file is supplied by the target language. This file contains language-specific configuration information. More often than not, this file provides run-time wrapper support code (e.g., the type-checker) as well as a collection of typemaps that define the default wrapping behavior. Note: the name of this file depends on the target language and is usually something like python.swg or perl5.swg.

Although the SWIG preprocessor is intended to mimic the behavior of the C preprocessor, it is not meant to be a direct replacement. Instead, its behavior is adapted for use with SWIG and it provides a number of non-standard extensions.

As a debugging aid, the text that SWIG feeds to its C++ parser can be obtained by running swig -E interface.i. This output probably isn't too useful in general, but it will show how macros have been expanded as well as everything else that goes into the low-level construction of the wrapper code.

Parsing

The current C++ parser handles a subset of C++. Most incompatibilities with C are due to subtle aspects of how SWIG parses declarations. Specifically, SWIG expects all C/C++ declarations to follow this general form:
storage type declarator initializer;
storage is a keyword such as extern, static, typedef, or virtual. type is a primitive datatype such as int or void. type may be optionally qualified with a qualifier such as const or volatile. declarator is a name with additional type-construction modifiers attached to it (pointers, arrays, references, functions, etc.). Examples of declarators include *x, **x, x[20], and (*x)(int,double). The initializer may be a value assigned using = or a body of code enclosed in braces { ... }.
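
As an illustration, the declarations below are laid out in columns that follow the storage, type, declarator, initializer order. The names (counter, scale, callback, buffer, cursor, twice) are made up for this sketch and do not come from SWIG.

```cpp
// storage   type           declarator                 initializer
static       int            counter                    = 0;
extern       const double   scale;                     // no initializer
typedef      int            (*callback)(int, double);  // pointer-to-function declarator
static       char           buffer[20]                 = "swig";
             double         *cursor                    = nullptr;  // no storage keyword
static       int            twice(int v)               { return 2 * v; }  // body in braces

// Definition matching the extern declaration above.
const double scale = 2.5;
```

Every line puts the storage keyword (if any) first and the type second, which is the ordering the parser expects.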

This declaration format covers most common C++ declarations. However, the C++ standard is somewhat more flexible in the placement of the pieces. For example, it is technically legal, although unusual, to write something like int typedef const a in your program. SWIG simply doesn't bother to deal with this (although it could probably be modified if there is sufficient demand).

The other significant difference between C++ and SWIG is in the treatment of typenames. In C++, if you have a declaration like this,

int blah(Foo *x, Bar *y);
it won't parse correctly unless Foo and Bar have been previously defined as types either using a class definition or a typedef. The reasons for this are subtle, but this treatment of typenames is normally integrated at the level of the C tokenizer---when a typename appears, a different token is returned to the parser instead of an identifier.

SWIG does not operate in this manner--any legal identifier can be used as a type name. This is primarily motivated by the use of SWIG with partially defined data. Specifically, SWIG is supposed to be easy to use on interfaces with missing type information. On a more practical level, however, the introduction of typenames would greatly complicate other parts of SWIG such as the parsing of SWIG directives (many of which also rely upon identifier names).
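
The difference between the two approaches can be sketched with a toy token classifier. This is a simplified model with hypothetical names (Token, classify_conventional, classify_swig_style). It is not taken from the SWIG scanner or from any real compiler.

```cpp
#include <set>
#include <string>

enum class Token { Identifier, TypeName };

// Conventional C/C++ front end: the tokenizer consults the set of type names
// declared so far and reports a known type name with a distinct token.
Token classify_conventional(const std::string &word,
                            const std::set<std::string> &known_types) {
    return known_types.count(word) ? Token::TypeName : Token::Identifier;
}

// SWIG-style treatment as described above: every legal identifier is reported
// the same way, so an undeclared name such as Bar can still appear as a type.
Token classify_swig_style(const std::string &) {
    return Token::Identifier;
}
```

In the conventional model, a declaration such as int blah(Foo *x, Bar *y) only parses if Foo and Bar are already in known_types. In the second model no such table is needed.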

Because of the different treatment of typenames, the most serious limitation of the SWIG parser is that it can't process type declarations in which an extra (and unnecessary) grouping operator is used. For example:

int (x);         /* A variable x */
int (y)(int);    /* A function y */
The placing of extra parentheses in type declarations like this is already recognized by the C++ community as a potential source of strange programming errors. For example, Scott Meyers' "Effective STL" discusses this problem in a section on avoiding C++'s "most vexing parse."
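
For reference, a C++ compiler accepts both forms and treats the parentheses as redundant. The short sketch below (with an initializer and a function body added so that it can be compiled) shows declarations that are legal C++ but that the SWIG parser described here cannot process.

```cpp
// The parentheses around the declared name are redundant but legal.
int (x) = 5;       // same as: int x = 5;
int (y)(int);      // same as: int y(int);

// Ordinary definition of the function declared above.
int y(int v) { return v + x; }
```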

The parser is also unable to handle declarations with no return type or bare argument names. For example, in an old C program, you might see things like this:

foo(a,b) {
...
}
In this case, the C compiler takes the return type and the types of the arguments to be int. However, SWIG interprets the above code as an abstract declarator for a function returning a foo and taking types a and b as arguments.

Parse Trees

The SWIG parser produces a complete parse tree of the input file before any wrapper code is actually generated. Each item in the tree is known as a "Node". Each node is identified by a symbolic tag. Furthermore, a node may have an arbitrary number of children. The parse tree structure and tag names of an interface can be displayed using swig -dump_tags. For example:
$ swig -c++ -python -dump_tags example.i
 . top (example.i:1)
 . top . include (example.i:1)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
 . top . include (example.i:4)
 . top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:7)
 . top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:8)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:20)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:20)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:20)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:21)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:21)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:21)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:22)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:22)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:22)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:23)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:23)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:24)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:24)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:25)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:25)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:26)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:26)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:29)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:29)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:32)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:32)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:42)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:42)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:42)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:42)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:45)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:45)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:46)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:46)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:49)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:49)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:59)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:61)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:61)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:61)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:62)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:62)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:63)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:63)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:66)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:66)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:66)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:66)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:69)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:69)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:72)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:72)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:75)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:75)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:75)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:84)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:84)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:105)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:114)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:114)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:114)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:124)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:124)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:137)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:137)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:154)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:154)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:164)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:164)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:173)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:173)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:173)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:182)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:182)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:191)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:191)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:200)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:200)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:205)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:208)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:208)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:208)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:211)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:211)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:211)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:214)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:214)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:214)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:214)
 . top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:217)
 . top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/python/python.swg:217)
 . top . include (example.i:6)
 . top . include . module (example.i:2)
 . top . include . insert (example.i:6)
 . top . include . include (example.i:9)
 . top . include . include . class (example.h:3)
 . top . include . include . class . access (example.h:4)
 . top . include . include . class . constructor (example.h:7)
 . top . include . include . class . destructor (example.h:10)
 . top . include . include . class . cdecl (example.h:11)
 . top . include . include . class . cdecl (example.h:11)
 . top . include . include . class . cdecl (example.h:12)
 . top . include . include . class . cdecl (example.h:13)
 . top . include . include . class . cdecl (example.h:14)
 . top . include . include . class . cdecl (example.h:15)
 . top . include . include . class (example.h:18)
 . top . include . include . class . access (example.h:19)
 . top . include . include . class . cdecl (example.h:20)
 . top . include . include . class . access (example.h:21)
 . top . include . include . class . constructor (example.h:22)
 . top . include . include . class . cdecl (example.h:23)
 . top . include . include . class . cdecl (example.h:24)
 . top . include . include . class (example.h:27)
 . top . include . include . class . access (example.h:28)
 . top . include . include . class . cdecl (example.h:29)
 . top . include . include . class . access (example.h:30)
 . top . include . include . class . constructor (example.h:31)
 . top . include . include . class . cdecl (example.h:32)
 . top . include . include . class . cdecl (example.h:33)
Even for the simplest interface, the parse tree structure is larger than you might expect. For example, in the above output, a substantial number of nodes are actually generated by the python.swg configuration file which defines typemaps and other directives. The contents of the user-supplied input file don't appear until the end of the output.
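
The tag-and-children structure behind this listing can be modeled in a few lines. The sketch below is a standalone illustration: TagNode and dump_tags are hypothetical names, SWIG's own node type is not shown here, and the file and line information printed by -dump_tags is left out.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parse tree node: a symbolic tag plus children.
struct TagNode {
    std::string tag;
    std::vector<std::unique_ptr<TagNode>> children;

    explicit TagNode(std::string t) : tag(std::move(t)) {}

    // Append a child with the given tag and return a pointer to it.
    TagNode *add(const std::string &child_tag) {
        children.push_back(std::unique_ptr<TagNode>(new TagNode(child_tag)));
        return children.back().get();
    }
};

// Collect one line per node, depth first, in the style of the listing above,
// for example " . top . include . class".
void dump_tags(const TagNode &node, const std::string &prefix,
               std::vector<std::string> &out) {
    std::string path = prefix + " . " + node.tag;
    out.push_back(path);
    for (const auto &child : node.children) dump_tags(*child, path, out);
}
```

Each output line is the path of tags from the root to one node, which is why the nodes of deeply nested includes produce such long lines.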

The contents of each parse tree node consist of a collection of attribute/value pairs. Internally, the nodes are simply stored as a hash table. A display of the parse-tree structure can be obtained using swig -dump_tree. For example:

$ swig -c++ -python -dump_tree example.i
...
      +++ include ----------------------------------------
      | name         - "example.i"

            +++ module ----------------------------------------
            | name         - "example"
            | 
            +++ insert ----------------------------------------
            | code         - "\n#include \"example.h\"\n"
            | 
            +++ include ----------------------------------------
            | name         - "example.h"

                  +++ class ----------------------------------------
                  | abstract     - "1"
                  | sym:name     - "Shape"
                  | name         - "Shape"
                  | kind         - "class"
                  | symtab       - 0x40194140
                  | sym:symtab   - 0x40191078

                        +++ access ----------------------------------------
                        | kind         - "public"
                        | 
                        +++ constructor ----------------------------------------
                        | sym:name     - "Shape"
                        | name         - "Shape"
                        | decl         - "f()."
                        | code         - "{\n    nshapes++;\n  }"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ destructor ----------------------------------------
                        | sym:name     - "~Shape"
                        | name         - "~Shape"
                        | storage      - "virtual"
                        | code         - "{\n    nshapes--;\n  }"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "x"
                        | name         - "x"
                        | decl         - ""
                        | type         - "double"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "y"
                        | name         - "y"
                        | decl         - ""
                        | type         - "double"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "move"
                        | name         - "move"
                        | decl         - "f(double,double)."
                        | parms        - double ,double 
                        | type         - "void"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "area"
                        | name         - "area"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | value        - "0"
                        | type         - "double"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "perimeter"
                        | name         - "perimeter"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | value        - "0"
                        | type         - "double"
                        | sym:symtab   - 0x40194140
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "nshapes"
                        | name         - "nshapes"
                        | decl         - ""
                        | storage      - "static"
                        | type         - "int"
                        | sym:symtab   - 0x40194140
                        | 
                  +++ class ----------------------------------------
                  | sym:name     - "Circle"
                  | name         - "Circle"
                  | kind         - "class"
                  | bases        - 0x40194510
                  | symtab       - 0x40194538
                  | sym:symtab   - 0x40191078

                        +++ access ----------------------------------------
                        | kind         - "private"
                        | 
                        +++ cdecl ----------------------------------------
                        | name         - "radius"
                        | decl         - ""
                        | type         - "double"
                        | 
                        +++ access ----------------------------------------
                        | kind         - "public"
                        | 
                        +++ constructor ----------------------------------------
                        | sym:name     - "Circle"
                        | name         - "Circle"
                        | parms        - double 
                        | decl         - "f(double)."
                        | code         - "{ }"
                        | sym:symtab   - 0x40194538
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "area"
                        | name         - "area"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | type         - "double"
                        | sym:symtab   - 0x40194538
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "perimeter"
                        | name         - "perimeter"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | type         - "double"
                        | sym:symtab   - 0x40194538
                        | 
                  +++ class ----------------------------------------
                  | sym:name     - "Square"
                  | name         - "Square"
                  | kind         - "class"
                  | bases        - 0x40194760
                  | symtab       - 0x40194788
                  | sym:symtab   - 0x40191078

                        +++ access ----------------------------------------
                        | kind         - "private"
                        | 
                        +++ cdecl ----------------------------------------
                        | name         - "width"
                        | decl         - ""
                        | type         - "double"
                        | 
                        +++ access ----------------------------------------
                        | kind         - "public"
                        | 
                        +++ constructor ----------------------------------------
                        | sym:name     - "Square"
                        | name         - "Square"
                        | parms        - double 
                        | decl         - "f(double)."
                        | code         - "{ }"
                        | sym:symtab   - 0x40194788
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "area"
                        | name         - "area"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | type         - "double"
                        | sym:symtab   - 0x40194788
                        | 
                        +++ cdecl ----------------------------------------
                        | sym:name     - "perimeter"
                        | name         - "perimeter"
                        | decl         - "f(void)."
                        | parms        - void 
                        | storage      - "virtual"
                        | type         - "double"
                        | sym:symtab   - 0x40194788

Attribute namespaces

When attributes are added to parse tree nodes, their names may be prepended with a namespace qualifier. For example, the attributes sym:name and sym:symtab are attributes related to symbol table management and are prefixed with sym:. As a general rule, only very general attributes such as types, names, and so forth appear without a prefix.

Target language modules may add additional attributes to nodes to assist the generation of wrapper code. The convention for doing this is to place these attributes in a namespace that matches the name of the target language. For example, python:foo or perl:foo.
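
The naming convention can be sketched with an ordinary string-keyed map standing in for a node's attribute table. This is an assumption-laden model: the Attributes alias and the two helper functions are hypothetical, and SWIG's internal hash table type is not used.

```cpp
#include <map>
#include <string>

// Hypothetical model of a node's attribute/value pairs.
using Attributes = std::map<std::string, std::string>;

// Return the namespace prefix of an attribute name ("sym" for "sym:name"),
// or an empty string for unqualified names such as "name" or "type".
std::string attribute_namespace(const std::string &key) {
    std::string::size_type colon = key.find(':');
    return colon == std::string::npos ? std::string() : key.substr(0, colon);
}

// Select the attributes that belong to one namespace, e.g. "python".
Attributes attributes_in(const Attributes &attrs, const std::string &ns) {
    Attributes result;
    for (const auto &entry : attrs) {
        if (attribute_namespace(entry.first) == ns) result.insert(entry);
    }
    return result;
}
```

A language module following the convention would store its private data under a key such as python:foo, so that it cannot collide with the general attributes or with attributes of another module.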

Symbol Tables

During parsing, all symbols are managed in the space of the target language. The sym:name attribute of each node contains the symbol name selected by the parser. Normally, sym:name and name are the same. However, the %rename directive can be used to change the value of sym:name. You can see the effect of %rename by trying it on a simple interface and dumping the parse tree. For example:
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);

void foo(int);
void foo(double);
void foo(Bar *b);
Now, running SWIG:
$ swig -dump_tree example.i
...
            +++ cdecl ----------------------------------------
            | sym:name     - "foo_i"
            | name         - "foo"
            | decl         - "f(int)."
            | parms        - int 
            | type         - "void"
            | sym:symtab   - 0x40165078
            | 
            +++ cdecl ----------------------------------------
            | sym:name     - "foo_d"
            | name         - "foo"
            | decl         - "f(double)."
            | parms        - double 
            | type         - "void"
            | sym:symtab   - 0x40165078
            | 
            +++ cdecl ----------------------------------------
            | sym:name     - "foo"
            | name         - "foo"
            | decl         - "f(p.Bar)."
            | parms        - Bar *
            | type         - "void"
            | sym:symtab   - 0x40165078
All symbol-related conflicts and complaints about overloading are based on sym:name values. For instance, the following example uses %rename in reverse to generate a name clash.
%rename(foo) foo_i(int);
%rename(foo) foo_d(double);

void foo_i(int);
void foo_d(double);
void foo(Bar *b);
When you run SWIG on this you now get:
$ ./swig example.i
example.i:6. Overloaded declaration ignored.  foo_d(double )
example.i:5. Previous declaration is foo_i(int )
example.i:7. Overloaded declaration ignored.  foo(Bar *)
example.i:5. Previous declaration is foo_i(int )
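
The reason for the clash can be modeled with a table keyed on sym:name rather than on the C/C++ name. The following is a simplified sketch, not SWIG's symbol table code: Decl and SymbolTable are hypothetical types, and real overload handling is more involved than a simple "first one wins" rule.

```cpp
#include <map>
#include <string>

// Hypothetical record for a declaration.
struct Decl {
    std::string name;      // name as written in the C/C++ source
    std::string sym_name;  // name selected for the target language
};

// Toy symbol table: entries are keyed on sym_name, so two declarations
// clash only when their target-language names are identical.
class SymbolTable {
    std::map<std::string, Decl> symbols_;
public:
    // Returns true if the symbol was added, false if sym_name is taken
    // (in which case the new declaration is ignored).
    bool add(const Decl &d) {
        return symbols_.insert({d.sym_name, d}).second;
    }

    // Returns the declaration that owns sym_name, or nullptr if none.
    const Decl *lookup(const std::string &sym_name) const {
        auto it = symbols_.find(sym_name);
        return it == symbols_.end() ? nullptr : &it->second;
    }
};
```

In this model, adding three declarations that all carry the sym_name "foo" succeeds only for the first one, which mirrors the two "Overloaded declaration ignored" messages above.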

The %feature directive

A number of SWIG directives such as %exception are implemented using the lower-level %feature directive. For example:
%feature("except") getitem(int) {
  try {
     $action
  } catch (badindex) {
     ...
  }
}

...
class Foo {
public:
    Object *getitem(int index) throw(badindex);
    ...
};
The behavior of %feature is very easy to describe--it simply attaches a new attribute to any parse tree node that matches the given prototype. When a feature is added, it shows up in the feature: namespace. You can see this when running with the -dump_tree option. For example:
 +++ cdecl ----------------------------------------
 | sym:name     - "getitem"
 | name         - "getitem"
 | decl         - "f(int).p."
 | parms        - int 
 | type         - "Object"
 | feature:except - "{\n    try {\n       $action\n    } catc..."
 | sym:symtab   - 0x40168ac8
 | 
Feature names are completely arbitrary and a target language module can be programmed to respond to any name that it wishes. The data stored in a feature attribute is usually just a raw unparsed string. For example, the exception code above is simply stored without any modifications.
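
As a rough illustration of "attach an attribute to every matching node", consider the sketch below. It is a toy model under stated assumptions: Node and apply_feature are hypothetical, and the matching rule used here (same name, and a decl that begins with the given parameter part) is a simplification chosen for the example rather than SWIG's actual matching logic.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse tree node: attribute name -> value.
using Node = std::map<std::string, std::string>;

// Attach "feature:<feature>" = value to every node whose "name" equals
// `name` and whose "decl" starts with `decl_prefix` (e.g. "f(int).").
// The value is stored verbatim, as a raw unparsed string.
// Returns the number of nodes that were tagged.
int apply_feature(std::vector<Node> &nodes, const std::string &feature,
                  const std::string &name, const std::string &decl_prefix,
                  const std::string &value) {
    int tagged = 0;
    for (Node &n : nodes) {
        auto nm = n.find("name");
        auto dc = n.find("decl");
        if (nm == n.end() || dc == n.end()) continue;
        if (nm->second != name) continue;
        if (dc->second.compare(0, decl_prefix.size(), decl_prefix) != 0) continue;
        n["feature:" + feature] = value;
        ++tagged;
    }
    return tagged;
}
```

A language module that cares about the feature would later look for the feature:except attribute on each node it wraps; modules that do not know the name simply never read it.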

Code Generation

Language modules work by defining handler functions that know how to respond to different types of parse-tree nodes. These handlers simply look at the attributes of each node in order to produce low-level code.

In reality, the generation of code is somewhat more subtle than simply invoking handler functions. This is because parse-tree nodes might be transformed. For example, suppose you are wrapping a class like this:

class Foo {
public:
    virtual int *bar(int x);
};
When the parser constructs a node for the member bar, it creates a raw "cdecl" node with the following attributes:
nodeType    : cdecl
name        : bar
type        : int
decl        : f(int).p
parms       : int x
storage     : virtual
sym:name    : bar
To produce wrapper code, this "cdecl" node undergoes a number of transformations. First, the node is recognized as a function declaration. This adjusts some of the type information--specifically, the declarator is joined with the base datatype to produce this:
nodeType    : cdecl
name        : bar
type        : p.int        <-- Notice change in return type
decl        : f(int).p
parms       : int x
storage     : virtual
sym:name    : bar
Next, the context of the node indicates that the node is really a member function. This produces a transformation to a low-level accessor function like this:
nodeType    : cdecl
name        : bar
type        : p.int
decl        : f(int).p
parms       : Foo *self, int x            <-- Added parameter
storage     : virtual
wrap:action : result = (arg1)->bar(arg2)  <-- Action code added
sym:name    : Foo_bar                     <-- Symbol name changed
In this transformation, notice how an additional parameter was added to the parameter list and how the symbol name of the node has suddenly changed into an accessor using the naming scheme described in the "SWIG Basics" chapter. A small fragment of "action" code has also been generated--notice how the wrap:action attribute defines the access to the underlying method. The data in this transformed node is then used to generate a wrapper.
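
The two transformations can be sketched as plain functions over an attribute map. This is an illustrative model only: Node, join_declarator, and to_member_accessor are hypothetical names, and the string handling is deliberately naive (for instance, parameters are counted by commas).

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical stand-in for a parse tree node: attribute name -> value.
using Node = std::map<std::string, std::string>;

// Step 1: fold the return-type part of the declarator into "type".
// For decl "f(int).p" and type "int" this yields type "p.int".
// The decl attribute itself is left unchanged.
void join_declarator(Node &n) {
    const std::string decl = n["decl"];
    std::string::size_type pos = decl.find(").");
    if (pos == std::string::npos) return;     // nothing after the parameter list
    std::string quals = decl.substr(pos + 2); // e.g. "p" or "p."
    if (quals.empty()) return;
    if (quals.back() != '.') quals += '.';
    n["type"] = quals + n["type"];
}

// Step 2: turn a member function node into a low-level accessor.
// Adds the "self" parameter, the action code, and the Class_member name.
void to_member_accessor(Node &n, const std::string &cls) {
    const std::string parms = n["parms"];
    const int count = parms.empty()
        ? 0
        : 1 + static_cast<int>(std::count(parms.begin(), parms.end(), ','));
    std::string args;
    for (int i = 0; i < count; ++i) {
        if (i > 0) args += ",";
        args += "arg" + std::to_string(i + 2);  // arg1 is reserved for self
    }
    n["parms"] = parms.empty() ? cls + " *self" : cls + " *self, " + parms;
    n["wrap:action"] = "result = (arg1)->" + n["name"] + "(" + args + ")";
    n["sym:name"] = cls + "_" + n["sym:name"];
}
```

Applying join_declarator and then to_member_accessor with the class name Foo to the raw node for bar reproduces the attribute values listed in the tables above.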

Language modules work by registering handler functions for dealing with various types of nodes at different stages of transformation. This is done by inheriting from a special Language class and defining a collection of virtual methods. For example, the Python module defines a class as follows:

class PYTHON : public Language {
protected:
public :
  virtual void main(int, char *argv[]);
  virtual int  top(Node *); 
  virtual int  functionWrapper(Node *);
  virtual int  constantWrapper(Node *);
  virtual int  variableWrapper(Node *);
  virtual int  nativeWrapper(Node *);
  virtual int  membervariableHandler(Node *);
  virtual int  memberconstantHandler(Node *);
  virtual int  memberfunctionHandler(Node *);
  virtual int  constructorHandler(Node *);
  virtual int  destructorHandler(Node *);
  virtual int  classHandler(Node *);
  virtual int  classforwardDeclaration(Node *);
  virtual int  insertDirective(Node *);
  virtual void import_start(char *);
  virtual void import_end();
};
The role of these functions is described shortly.
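
The general pattern, a base class with default virtual handlers and a driver that dispatches on the kind of node, can be sketched in a few lines. Everything here is a toy: ToyNode, ToyLanguage, ToyModule, the emit driver, and the "kind" strings are assumptions made for illustration, and only the handler names are borrowed from the class shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified node.
struct ToyNode {
    std::string kind;  // "function", "variable", or "class" (assumed values)
    std::string name;
};

// Toy base class: every handler has a do-nothing default that returns 0.
class ToyLanguage {
public:
    virtual ~ToyLanguage() {}
    virtual int functionWrapper(const ToyNode &) { return 0; }
    virtual int variableWrapper(const ToyNode &) { return 0; }
    virtual int classHandler(const ToyNode &) { return 0; }

    // Driver: pick the handler that matches the node kind.
    // Returns the handler's result, or -1 for an unknown kind.
    int emit(const ToyNode &n) {
        if (n.kind == "function") return functionWrapper(n);
        if (n.kind == "variable") return variableWrapper(n);
        if (n.kind == "class")    return classHandler(n);
        return -1;
    }
};

// A module overrides only the handlers it cares about.
class ToyModule : public ToyLanguage {
public:
    std::vector<std::string> log;  // stands in for generated wrapper code

    int functionWrapper(const ToyNode &n) override {
        log.push_back("wrap " + n.name);
        return 1;
    }
};
```

Because unhandled node kinds fall through to the base class defaults, a new module can start with a single overridden handler and grow from there.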

SWIG and XML

Much of SWIG's current parser design was originally motivated by interest in using XML to represent SWIG parse trees. Although XML is not currently used in any direct manner, the parse tree structure, use of node tags, attributes, and attribute namespaces are all influenced by aspects of XML parsing. Therefore, in trying to understand SWIG's internal data structures, it may be useful to keep XML in the back of your mind as a model.

Summary so far

SWIG is a multi-pass compiler that works by building a complete parse tree of input files. These parse trees are structured as a hierarchy of nodes with arbitrary attributes. Language modules are created by writing special handlers for different types of parse tree nodes.

The rest of this chapter describes some of the internal data structures and various code generation tasks in more detail.


SWIG 1.3 - Last Modified : January 22, 2002