Using SIP

Bindings are generated by the SIP code generator from a number of specification files, typically with a .sip extension. Specification files look very similar to C and C++ header files, but often with additional information (in the form of a directive or an annotation) and code so that the bindings generated can be finely tuned.

A Simple C++ Example

We start with a simple example. Let’s say you have a (fictional) C++ library that implements a single class called Word. The class has one constructor that takes a \0 terminated character string as its single argument. The class has one method called reverse() which takes no arguments and returns a \0 terminated character string. The interface to the class is defined in a header file called word.h which might look something like this:

// Define the interface to the word library.

class Word {
    const char *the_word;

public:
    Word(const char *w);

    char *reverse() const;
};

The corresponding SIP specification file would then look something like this:

// Define the SIP wrapper to the word library.

%Module word

class Word {

%TypeHeaderCode
#include <word.h>
%End

public:
    Word(const char *w);

    char *reverse() const;
};

Obviously a SIP specification file looks very much like a C++ (or C) header file, but SIP does not include a full C++ parser. Let’s look at the differences between the two files.

  • The %Module directive has been added [1]. This is used to name the Python module that is being created, word in this example.

  • The %TypeHeaderCode directive has been added. The text between this and the following %End directive is included literally in the code that SIP generates. Normally it is used, as in this case, to #include the corresponding C++ (or C) header file [2].

  • The declaration of the private variable the_word has been removed. SIP does not support access to either private or protected instance variables.

If we want to we can now generate the C++ code in the current directory by running the following command:

sip -c . word.sip

However, that still leaves us with the task of compiling the generated code and linking it against all the necessary libraries. It’s much easier to use the SIP build system to do the whole thing.

Using the SIP build system is simply a matter of writing a small Python script. In this simple example we will assume that the word library we are wrapping and its header file are installed in standard system locations and will be found by the compiler and linker without having to specify any additional flags. In a more realistic example your Python script may take command line options, or search a set of directories to deal with different configurations and installations.

This is the simplest script (conventionally called configure.py):

import os
import sipconfig

# The name of the SIP build file generated by SIP and used by the build
# system.
build_file = "word.sbf"

# Get the SIP configuration information.
config = sipconfig.Configuration()

# Run SIP to generate the code.
os.system(" ".join([config.sip_bin, "-c", ".", "-b", build_file, "word.sip"]))

# Create the Makefile.
makefile = sipconfig.SIPModuleMakefile(config, build_file)

# Add the library we are wrapping.  The name doesn't include any platform
# specific prefixes or extensions (e.g. the "lib" prefix on UNIX, or the
# ".dll" extension on Windows).
makefile.extra_libs = ["word"]

# Generate the Makefile itself.
makefile.generate()

Hopefully this script is self-documenting. The key parts are the Configuration and SIPModuleMakefile classes. The build system contains other Makefile classes, for example to build programs or to call other Makefiles in sub-directories.
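The script above builds the SIP command line by joining strings for os.system(). As a minimal sketch of an alternative, the same arguments can be assembled as a list, which avoids quoting problems when a path contains spaces. The helper name build_sip_command() and its parameters are hypothetical and are not part of the SIP build system; only the -c, -b and -I flags shown in this document are used.

```python
import shlex


def build_sip_command(sip_bin, build_file, spec_file, include_dirs=(), extra_flags=""):
    """Return the SIP command line as a list of arguments.

    Hypothetical helper: it only assembles the flags used in this document.
    """
    command = [sip_bin, "-c", ".", "-b", build_file]

    # Each directory to search for imported specification files gets its
    # own -I flag.
    for directory in include_dirs:
        command += ["-I", directory]

    # Extra flags (for example the versioning flags published by another
    # module) usually arrive as one space-separated string.
    command += shlex.split(extra_flags)

    command.append(spec_file)
    return command


print(build_sip_command("sip", "word.sbf", "word.sip"))
```

The resulting list can be passed to subprocess.call() in place of the os.system() call. Note that shlex.split() follows POSIX quoting rules, so flag strings containing backslashes may need different handling on Windows.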

After running the script (using the Python interpreter the extension module is being created for) the generated C++ code and Makefile will be in the current directory.

To compile and install the extension module, just run the following commands [3]:

make
make install

That’s all there is to it.

See Building Your Extension with distutils for an example of how to build this example using distutils.

[1] All SIP directives start with a % as the first non-whitespace character of a line.

[2] SIP includes many code directives like this. They differ in where the supplied code is placed by SIP in the generated code.

[3] On Windows you might run nmake or mingw32-make instead.

A Simple C Example

Let’s now look at a very similar example of wrapping a fictional C library:

/* Define the interface to the word library. */

struct Word {
    const char *the_word;
};

struct Word *create_word(const char *w);
char *reverse(struct Word *word);

The corresponding SIP specification file would then look something like this:

/* Define the SIP wrapper to the word library. */

%Module(name=word, language="C")

struct Word {

%TypeHeaderCode
#include <word.h>
%End

    const char *the_word;
};

struct Word *create_word(const char *w) /Factory/;
char *reverse(struct Word *word);

Again, let’s look at the differences between the two files.

  • The %Module directive specifies that the library being wrapped is implemented in C rather than C++. Because we are now supplying an optional argument to the directive we must also specify the module name as an argument.

  • The %TypeHeaderCode directive has been added.

  • The Factory annotation has been added to the create_word() function. This tells SIP that a newly created structure is being returned and it is owned by Python.

The configure.py build system script described in the previous example can be used for this example without change.

A More Complex C++ Example

In this last example we will wrap a fictional C++ library that contains a class that is derived from a Qt class. This will demonstrate how SIP allows a class hierarchy to be split across multiple Python extension modules, and will introduce SIP’s versioning system.

The library contains a single C++ class called Hello which is derived from Qt’s QLabel class. It behaves just like QLabel except that the text in the label is hard coded to be Hello World. To make the example more interesting we’ll also say that the library only supports Qt v4.2 and later, and also includes a function called setDefault() that is not implemented in the Windows version of the library.

The hello.h header file looks something like this:

// Define the interface to the hello library.

#include <qlabel.h>
#include <qwidget.h>
#include <qstring.h>

class Hello : public QLabel {
    // This is needed by the Qt Meta-Object Compiler.
    Q_OBJECT

public:
    Hello(QWidget *parent = 0);

private:
    // Prevent instances from being copied.
    Hello(const Hello &);
    Hello &operator=(const Hello &);
};

#if !defined(Q_OS_WIN)
void setDefault(const QString &def);
#endif

The corresponding SIP specification file would then look something like this:

// Define the SIP wrapper to the hello library.

%Module hello

%Import QtGui/QtGuimod.sip

%If (Qt_4_2_0 -)

class Hello : public QLabel {

%TypeHeaderCode
#include <hello.h>
%End

public:
    Hello(QWidget *parent /TransferThis/ = 0);

private:
    Hello(const Hello &);
};

%If (!WS_WIN)
void setDefault(const QString &def);
%End

%End

Again we look at the differences, but we’ll skip those that we’ve looked at in previous examples.

  • The %Import directive has been added to specify that we are extending the class hierarchy defined in the file QtGui/QtGuimod.sip. This file is part of PyQt4. The build system will take care of finding the file’s exact location.

  • The %If directive has been added to specify that everything [4] up to the matching %End directive only applies to Qt v4.2 and later. Qt_4_2_0 is a tag defined in QtCoremod.sip [5] using the %Timeline directive. %Timeline is used to define a tag for each version of a library’s API you are wrapping, allowing you to maintain all the different versions in a single SIP specification. The build system provides support to configure.py scripts for working out the correct tags to use according to which version of the library is actually installed.

  • The TransferThis annotation has been added to the constructor’s argument. It specifies that if the argument is not 0 (i.e. the Hello instance being constructed has a parent) then ownership of the instance is transferred from Python to C++. It is needed because Qt maintains objects (i.e. instances derived from the QObject class) in a hierarchy. When an object is destroyed all of its children are also automatically destroyed. It is important, therefore, that the Python garbage collector doesn’t also try to destroy them. This is covered in more detail in Ownership of Objects. SIP provides many other annotations that can be applied to arguments, functions and classes. Multiple annotations are separated by commas. Annotations may have values.

  • The = operator has been removed. This operator is not supported by SIP.

  • The %If directive has been added to specify that everything up to the matching %End directive does not apply to Windows. WS_WIN is another tag defined by PyQt4, this time using the %Platforms directive. Tags defined by the %Platforms directive are mutually exclusive, i.e. only one may be valid at a time [6].
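The way the two kinds of tag select parts of a specification can be modelled in a few lines of Python. This is a sketch only and does not use SIP itself. The timeline below is hypothetical apart from Qt_4_2_0, the platform tag WS_X11 is assumed for illustration, and the treatment of an upper bound as exclusive is an assumption that the text above does not confirm.

```python
# Hypothetical timeline, oldest first.  Only Qt_4_2_0 appears in the text.
TIMELINE = ["Qt_4_1_0", "Qt_4_2_0", "Qt_4_3_0"]


def in_version_range(selected, lower=None, upper=None, timeline=TIMELINE):
    """Return True if the selected tag lies in the range (lower - upper).

    A missing bound leaves that end of the range open, as in
    "%If (Qt_4_2_0 -)".  Assumption: the lower bound is inclusive and the
    upper bound is exclusive.
    """
    position = timeline.index(selected)
    if lower is not None and position < timeline.index(lower):
        return False
    if upper is not None and position >= timeline.index(upper):
        return False
    return True


def platform_enabled(selected_platform, tag, negate=False):
    """Evaluate a platform tag, optionally negated as in "%If (!WS_WIN)".

    Platform tags are mutually exclusive, so a single selected platform is
    enough to decide every platform test.
    """
    enabled = (selected_platform == tag)
    return not enabled if negate else enabled


# Qt 4.3 installed on a non-Windows platform: both blocks are included.
print(in_version_range("Qt_4_3_0", lower="Qt_4_2_0"))
print(platform_enabled("WS_X11", "WS_WIN", negate=True))
```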

One question you might have at this point is why bother to define the private copy constructor when it can never be called from Python? The answer is to prevent the automatic generation of a public copy constructor.

We now look at the configure.py script. This is a little different to the script in the previous examples for two related reasons.

Firstly, PyQt4 includes a pure Python module called pyqtconfig that extends the SIP build system for modules, like our example, that build on top of PyQt4. It deals with the details of which version of Qt is being used (i.e. it determines what the correct tags are) and where it is installed. This is called a module’s configuration module.

Secondly, we generate a configuration module (called helloconfig) for our own hello module. There is no need to do this, but if there is a chance that somebody else might want to extend your C++ library then it would make life easier for them.

Now we have two scripts. First the configure.py script:

import os
import sipconfig
from PyQt4 import pyqtconfig

# The name of the SIP build file generated by SIP and used by the build
# system.
build_file = "hello.sbf"

# Get the PyQt4 configuration information.
config = pyqtconfig.Configuration()

# Get the extra SIP flags needed by the imported PyQt4 modules.  Note that
# this normally only includes those flags (-x and -t) that relate to SIP's
# versioning system.
pyqt_sip_flags = config.pyqt_sip_flags

# Run SIP to generate the code.  Note that we tell SIP where to find the qt
# module's specification files using the -I flag.
os.system(" ".join([config.sip_bin, "-c", ".", "-b", build_file, "-I", config.pyqt_sip_dir, pyqt_sip_flags, "hello.sip"]))

# We are going to install the SIP specification file for this module and
# its configuration module.
installs = []

installs.append(["hello.sip", os.path.join(config.default_sip_dir, "hello")])

installs.append(["helloconfig.py", config.default_mod_dir])

# Create the Makefile.  The QtGuiModuleMakefile class provided by the
# pyqtconfig module takes care of all the extra preprocessor, compiler and
# linker flags needed by the Qt library.
makefile = pyqtconfig.QtGuiModuleMakefile(
    configuration=config,
    build_file=build_file,
    installs=installs
)

# Add the library we are wrapping.  The name doesn't include any platform
# specific prefixes or extensions (e.g. the "lib" prefix on UNIX, or the
# ".dll" extension on Windows).
makefile.extra_libs = ["hello"]

# Generate the Makefile itself.
makefile.generate()

# Now we create the configuration module.  This is done by merging a Python
# dictionary (whose values are normally determined dynamically) with a
# (static) template.
content = {
    # Publish where the SIP specifications for this module will be
    # installed.
    "hello_sip_dir":    config.default_sip_dir,

    # Publish the set of SIP flags needed by this module.  As these are the
    # same flags needed by the qt module we could leave it out, but this
    # allows us to change the flags at a later date without breaking
    # scripts that import the configuration module.
    "hello_sip_flags":  pyqt_sip_flags
}

# This creates the helloconfig.py module from the helloconfig.py.in
# template and the dictionary.
sipconfig.create_config_module("helloconfig.py", "helloconfig.py.in", content)

Next we have the helloconfig.py.in template script:

from PyQt4 import pyqtconfig

# These are installation specific values created when Hello was configured.
# The following line will be replaced when this template is used to create
# the final configuration module.
# @SIP_CONFIGURATION@

class Configuration(pyqtconfig.Configuration):
    """The class that represents Hello configuration values.
    """
    def __init__(self, sub_cfg=None):
        """Initialise an instance of the class.

        sub_cfg is the list of sub-class configurations.  It should be None
        when called normally.
        """
        # This is all standard code to be copied verbatim except for the
        # name of the module containing the super-class.
        if sub_cfg:
            cfg = sub_cfg
        else:
            cfg = []

        cfg.append(_pkg_config)

        pyqtconfig.Configuration.__init__(self, cfg)

class HelloModuleMakefile(pyqtconfig.QtGuiModuleMakefile):
    """The Makefile class for modules that %Import hello.
    """
    def finalise(self):
        """Finalise the macros.
        """
        # Make sure our C++ library is linked.
        self.extra_libs.append("hello")

        # Let the super-class do what it needs to.
        pyqtconfig.QtGuiModuleMakefile.finalise(self)

Again, we hope that the scripts are self-documenting.
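To make the template mechanism more concrete, the following is a simplified stand-in for the merge step, not the actual implementation of sipconfig.create_config_module(). It assumes only what the template above implies: the line containing @SIP_CONFIGURATION@ is replaced by a definition of the _pkg_config dictionary. The function name fill_config_template() is hypothetical.

```python
import pprint


def fill_config_template(template_text, content):
    """Replace the marker line with a _pkg_config dictionary definition.

    Simplified, hypothetical stand-in for the merge of a template and a
    dictionary; the real build system may do more than this.
    """
    lines = []
    for line in template_text.splitlines():
        if "@SIP_CONFIGURATION@" in line:
            # pprint.pformat() produces a valid Python literal for a dict
            # of strings.
            line = "_pkg_config = " + pprint.pformat(content)
        lines.append(line)
    return "\n".join(lines) + "\n"


template = (
    "# Installation specific values.\n"
    "# @SIP_CONFIGURATION@\n"
    "flags = _pkg_config['hello_sip_flags']\n"
)

module_source = fill_config_template(template, {"hello_sip_flags": "-t Qt_4_2_0"})

# Execute the generated source to check that it is importable Python.
namespace = {}
exec(module_source, namespace)
print(namespace["flags"])
```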

[4] Some parts of a SIP specification aren’t subject to version control.

[5] Actually in versions.sip. PyQt4 uses the %Include directive to split the SIP specification for Qt across a large number of separate .sip files.

[6] Tags can also be defined by the %Feature directive. These tags are not mutually exclusive, i.e. any number may be valid at a time.

Wrapping Enums

New in version 4.19.4.

SIP wraps C/C++ enums using a dedicated Python type and implements behaviour that mimics the C/C++ behaviour regarding the visibility of the enum’s members. In other words, an enum’s members have the same visibility as the enum itself. For example:

class MyClass
{
public:
    enum MyEnum
    {
        Member
    };
};

In Python the Member member is referenced as MyClass.Member. This behaviour makes it easier to translate C/C++ code to Python.

In more recent times C++11 has introduced scoped enums and Python has introduced the enum module. In both cases a member is only visible in the scope of the enum. In other words, the Member member is referenced as MyClass.MyEnum.Member.

This version of SIP adds support for wrapping C++11 scoped enums and implements them as Python enum.Enum objects. For versions of Python that don’t include the enum module in the standard library (i.e. versions earlier than v3.4), the enum34 package must be installed from PyPI.

New in version 4.19.9.

A disadvantage of the above is that the Python programmer needs to know the nature of the C/C++ enum in order to access its members. In order to avoid this, this version of SIP makes the members of traditional C/C++ enums visible from the scope of the enum as well.

It is recommended that Python code should always specify the enum scope when referencing an enum member.
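The two ways of referencing a member can be illustrated with a pure Python stand-in. This sketch does not use SIP: the real wrapper for a traditional enum is a dedicated SIP type rather than enum.Enum, and the class below only mirrors the names from the C++ example above.

```python
import enum


class MyClass:
    """Pure Python stand-in for the wrapped MyClass."""

    class MyEnum(enum.Enum):
        Member = 0

    # Members of a traditional C/C++ enum are also visible in the
    # enclosing scope.  This alias mimics that behaviour.
    Member = MyEnum.Member


# The recommended form always names the enum scope.
scoped = MyClass.MyEnum.Member

# The traditional form relies on the enum being unscoped.
unscoped = MyClass.Member

print(scoped is unscoped)
```

Code written in the scoped form keeps working whether the underlying C++ enum is traditional or scoped, which is the reason for the recommendation above.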

Ownership of Objects

When a C++ instance is wrapped a corresponding Python object is created. The Python object behaves as you would expect in regard to garbage collection - it is garbage collected when its reference count reaches zero. What then happens to the corresponding C++ instance? The obvious answer might be that the instance’s destructor is called. However the library API may say that when the instance is passed to a particular function, the library takes ownership of the instance, i.e. responsibility for calling the instance’s destructor is transferred from the SIP generated module to the library.

Ownership of an instance may also be associated with another instance. The implication being that the owned instance will automatically be destroyed if the owning instance is destroyed. SIP keeps track of these relationships to ensure that Python’s cyclic garbage collector can detect and break any reference cycles between the owning and owned instances. The association is implemented as the owning instance taking a reference to the owned instance.

The TransferThis, Transfer and TransferBack annotations are used to specify where, and in what direction, transfers of ownership happen. It is very important that these are specified correctly to avoid crashes (where both Python and C++ call the destructor) and memory leaks (where neither Python nor C++ calls the destructor).

This applies equally to C structures where the structure is returned to the heap using the free() function.

See also sipTransferTo(), sipTransferBack() and sipTransferBreak().
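The consequences of a wrong annotation can be shown with a toy model that simply counts destructor calls. It is an illustration of the reasoning above, not of SIP's implementation. The function and parameter names are hypothetical.

```python
def destructor_calls(library_takes_ownership, annotated_with_transfer):
    """Count how often the destructor of one instance would be called.

    library_takes_ownership: what the C++ library really does.
    annotated_with_transfer: what the SIP specification claims it does.
    """
    calls = 0

    # Without a transfer annotation the generated module still believes
    # that Python owns the instance and destroys it on garbage collection.
    python_owns = not annotated_with_transfer
    if python_owns:
        calls += 1

    # The library destroys whatever it has actually taken ownership of.
    if library_takes_ownership:
        calls += 1

    return calls


for takes, annotated in [(True, True), (True, False), (False, True), (False, False)]:
    print(takes, annotated, destructor_calls(takes, annotated))
```

A count of 1 is the only safe outcome. A count of 2 corresponds to the crash case (a missing annotation) and a count of 0 to the leak case (an annotation that should not be there).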

Types and Meta-types

Every Python object (with the exception of the object object itself) has a meta-type and at least one super-type. By default an object’s meta-type is the meta-type of its first super-type.

SIP implements two super-types, sip.simplewrapper and sip.wrapper, and a meta-type, sip.wrappertype.

sip.simplewrapper is the super-type of sip.wrapper. The super-type of sip.simplewrapper is object.

sip.wrappertype is the meta-type of both sip.simplewrapper and sip.wrapper. The super-type of sip.wrappertype is type.

sip.wrapper supports the concept of object ownership described in Ownership of Objects and, by default, is the super-type of all the types that SIP generates.

sip.simplewrapper does not support the concept of object ownership but SIP generated types that are sub-classed from it have Python objects that take less memory.

SIP allows a class’s meta-type and super-type to be explicitly specified using the Metatype and Supertype class annotations.

SIP also allows the default meta-type and super-type to be changed for a module using the %DefaultMetatype and %DefaultSupertype directives. Unlike the default super-type, the default meta-type is inherited by importing modules.

If you want to use your own meta-type or super-type then they must be sub-classed from one of the SIP provided types. Your types must be registered using sipRegisterPyType(). This is normally done in code specified using the %InitialisationCode directive.

As an example, PyQt4 uses %DefaultMetatype to specify a new meta-type that handles the interaction with Qt’s own meta-type system. It also uses %DefaultSupertype to specify that the smaller sip.simplewrapper super-type is normally used. Finally it uses Supertype as an annotation of the QObject class to override the default and use sip.wrapper as the super-type so that the parent/child relationships of QObject instances are properly maintained.
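The relationships between the SIP provided types can be mirrored with ordinary Python classes. The classes below are stand-ins that reuse the names simplewrapper, wrapper and wrappertype for readability; they are not the real sip types, and Hello is a hypothetical generated type.

```python
class wrappertype(type):
    """Stand-in for sip.wrappertype.  Its super-type is type."""


class simplewrapper(metaclass=wrappertype):
    """Stand-in for sip.simplewrapper.  Its super-type is object."""


class wrapper(simplewrapper):
    """Stand-in for sip.wrapper.  Its super-type is simplewrapper."""


class Hello(wrapper):
    """A hypothetical generated type using the default super-type."""


# An object's meta-type defaults to the meta-type of its first super-type,
# so Hello picks up wrappertype without naming it.
print(type(Hello).__name__)
print([base.__name__ for base in Hello.__mro__])
```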

Note

It is not possible to define new super-types or meta-types if the limited Python API is enabled.

Lazy Type Attributes

Instead of populating a wrapped type’s dictionary with its attributes (or descriptors for those attributes) SIP only creates objects for those attributes when they are actually needed. This is done to reduce the memory footprint and start up time when used to wrap large libraries with hundreds of classes and tens of thousands of attributes.

SIP allows you to extend the handling of lazy attributes to your own attribute types by allowing you to register an attribute getter handler (using sipRegisterAttributeGetter()). This will be called just before a type’s dictionary is accessed for the first time.
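The general idea of creating attribute objects on demand can be sketched in pure Python. This is an analogy only: SIP does this at the C level for a type's dictionary, whereas the hypothetical LazyNamespace class below uses __getattr__ on an instance.

```python
class LazyNamespace:
    """Creates attribute objects only when they are first accessed."""

    def __init__(self, factories):
        # Maps an attribute name to a zero-argument callable that builds
        # the attribute's value.
        self._factories = dict(factories)
        # Names of the attributes that have actually been created.
        self.created = []

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup fails, i.e. the
        # attribute has not been created yet.
        factories = self.__dict__.get("_factories", {})
        if name not in factories:
            raise AttributeError(name)

        value = factories[name]()
        self.created.append(name)

        # Cache the value so that later lookups bypass __getattr__.
        setattr(self, name, value)
        return value


ns = LazyNamespace({"table": lambda: [1, 2, 3], "unused": lambda: object()})
print(ns.created)
print(ns.table)
print(ns.created)
```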

Overflow Checking

By default SIP does not check for overflow when converting Python number objects to C/C++ types. Overflowed values are undefined - it cannot be assumed that upper bits are simply discarded.

SIP v4.19.4 allowed overflow checking to be enabled and disabled by the wrapper author (using sipEnableOverflowChecking()) or by the application developer (using sip.enableoverflowchecking()).

It is recommended that wrapper authors should always enable overflow checking by default.
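The difference between an unchecked and a checked conversion can be demonstrated with the standard ctypes module, which also performs no overflow checking when constructing integer types. This is an illustration that does not involve SIP. In this particular case the upper bits happen to be discarded, which, as noted above, cannot be assumed of SIP's own conversions. The two helper functions are hypothetical.

```python
import ctypes


def to_int8_unchecked(value):
    """Convert to a signed 8-bit integer without any overflow check."""
    return ctypes.c_int8(value).value


def to_int8_checked(value):
    """Convert to a signed 8-bit integer, rejecting values out of range."""
    if not -128 <= value <= 127:
        raise OverflowError("%d does not fit in a signed 8-bit integer" % value)
    return value


print(to_int8_unchecked(100))
print(to_int8_unchecked(200))   # silently produces a different number

try:
    to_int8_checked(200)
except OverflowError as exc:
    print("rejected:", exc)
```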

Support for Python’s Buffer Interface

SIP supports Python’s buffer interface in that whenever C/C++ requires a char or char * type then any Python type that supports the buffer interface (including ordinary Python strings) can be used.

If a buffer is made up of a number of segments then all but the first will be ignored.
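Which Python objects qualify can be checked with memoryview(), which accepts exactly those objects that support the buffer interface. The helper below is a hypothetical illustration and is not part of SIP. Note that in Python 3 the str type does not support the buffer interface, so "ordinary Python strings" corresponds to bytes objects there.

```python
def buffer_contents(obj):
    """Return the bytes exposed by an object that supports the buffer
    interface, or raise TypeError if it does not support it."""
    with memoryview(obj) as view:
        return view.tobytes()


print(buffer_contents(b"hello"))
print(buffer_contents(bytearray(b"world")))

try:
    buffer_contents("text")
except TypeError:
    print("str does not support the buffer interface in Python 3")
```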

Support for Wide Characters

SIP v4.6 introduced support for wide characters (i.e. the wchar_t type). Python’s C API includes support for converting between unicode objects and wide character strings and arrays. When converting from a unicode object to wide characters SIP creates the string or array on the heap (using memory allocated using sipMalloc()). This then raises the problem of how this memory is subsequently freed.

The following describes how SIP handles this memory in the different situations where this is an issue.

  • When a wide string or array is passed to a function or method then the memory is freed (using sipFree()) after that function or method returns.

  • When a wide string or array is returned from a virtual method then SIP does not free the memory until the next time the method is called.

  • When an assignment is made to a wide string or array instance variable then SIP does not first free the instance’s current string or array.

The Python Global Interpreter Lock

Python’s Global Interpreter Lock (GIL) must be acquired before calls can be made to the Python API. It should also be released when a potentially blocking call to a C/C++ library is made in order to allow other Python threads to be executed. In addition, some C/C++ libraries may implement their own locking strategies that conflict with the GIL, causing application deadlocks. SIP provides ways of specifying when the GIL is released and acquired to ensure that locking problems can be avoided.

SIP always ensures that the GIL is acquired before making calls to the Python API. By default SIP does not release the GIL when making calls to the C/C++ library being wrapped. The ReleaseGIL annotation can be used to override this behaviour when required.

If SIP is given the -g command line option then the default behaviour is changed and SIP releases the GIL every time it makes calls to the C/C++ library being wrapped. The HoldGIL annotation can be used to override this behaviour when required.
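The effect of releasing the GIL around a blocking call can be observed from pure Python, without SIP. The sketch below is an analogy: time.sleep(), which releases the GIL in CPython while it waits, stands in for a blocking C/C++ library call wrapped with ReleaseGIL. The function name is hypothetical.

```python
import threading
import time


def progress_during_blocking_call(duration=0.2):
    """Return how many loop iterations another thread completes while the
    calling thread is inside a blocking call that releases the GIL."""
    counter = [0]
    stop = threading.Event()

    def worker():
        # A pure Python loop that needs the GIL for every iteration.
        while not stop.is_set():
            counter[0] += 1

    thread = threading.Thread(target=worker)
    thread.start()

    before = counter[0]
    time.sleep(duration)   # blocking call during which the GIL is released
    after = counter[0]

    stop.set()
    thread.join()
    return after - before


print("iterations during the blocking call:", progress_during_blocking_call())
```

If the blocking call held the GIL for its whole duration, the worker thread could make no progress until the call returned.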

Managing Incompatible APIs

New in version 4.9.

Sometimes it is necessary to change the way something is wrapped in a way that introduces an incompatibility. For example a new feature of Python may suggest that something may be wrapped in a different way to exploit that feature.

SIP’s %Feature directive could be used to provide two different implementations. However this would mean that the choice between the two implementations would have to be made when building the generated module, potentially causing all sorts of deployment problems. It may also require applications to work out which implementation was available and to change their behaviour accordingly.

Instead SIP provides limited support for providing multiple implementations (of classes, mapped types and functions) that can be selected by an application at run-time. It is then up to the application developer how they want to manage the migration from the old API to the new, incompatible API.

This support is implemented in three parts.

Firstly the %API directive is used to define the name of an API and its default version number. The default version number is the one used if an application doesn’t explicitly set the version number to use.

Secondly the API class, mapped type or function annotation is applied accordingly to specify the API and range of version numbers that a particular class, mapped type or function implementation should be enabled for.

Finally the application calls sip.setapi() to specify the version number of the API that should be enabled. This call must be made before any module that has multiple implementations is imported for the first time.

Note this mechanism is not intended as a way of providing equally valid alternative APIs. For example:

%API(name=MyAPI, version=1)

class Foo
{
public:
    void bar();
};

class Baz : Foo
{
public:
    void bar() /API=MyAPI:2-/;
};

If the following Python code is executed then an exception will be raised:

b = Baz()
b.bar()

This is because when version 1 of the MyAPI API (the default) is enabled there is no Baz.bar() implementation and Foo.bar() will not be called instead as might be expected.
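The selection rule behind this example can be sketched as a small Python model. It does not call sip.setapi() or any other part of SIP. The helper names are hypothetical, and treating an upper bound as exclusive is an assumption, since the example above only uses an open-ended range.

```python
def parse_api_range(annotation):
    """Split a value such as "MyAPI:2-" into (name, lower, upper).

    A missing bound is returned as None.
    """
    name, _, bounds = annotation.partition(":")
    lower_text, _, upper_text = bounds.partition("-")
    lower = int(lower_text) if lower_text.strip() else None
    upper = int(upper_text) if upper_text.strip() else None
    return name.strip(), lower, upper


def is_enabled(annotation, selected_versions):
    """Return True if an implementation with this API annotation is enabled
    for the selected version of the API it names."""
    name, lower, upper = parse_api_range(annotation)
    version = selected_versions[name]
    if lower is not None and version < lower:
        return False
    # Assumption: the upper bound, when given, is exclusive.
    if upper is not None and version >= upper:
        return False
    return True


# With the default version 1, Baz.bar() is not enabled.
print(is_enabled("MyAPI:2-", {"MyAPI": 1}))

# After the application selects version 2, it is.
print(is_enabled("MyAPI:2-", {"MyAPI": 2}))
```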

Writing %ConvertToSubClassCode

When SIP needs to wrap a C++ class instance it first checks to make sure it hasn’t already done so. If it has then it just returns a new reference to the corresponding Python object. Otherwise it creates a new Python object of the appropriate type. In C++ a function may be defined to return an instance of a certain class, but can often return a sub-class instead.

The %ConvertToSubClassCode directive is used to specify handwritten code that exploits any available run-time type information (RTTI) to see if there is a more specific Python type that can be used when wrapping the C++ instance. The RTTI may be provided by the compiler or by the C++ instance itself.

The directive is included in the specification of one of the classes that the handwritten code handles the type conversion for. It doesn’t matter which one, but a sensible choice would be the one at the root of that class hierarchy in the module.

Note

In a future version of SIP this use of the directive will be deprecated and it will instead be placed outside any class specification.

If a class hierarchy extends over a number of modules then this directive should be used in each of those modules to handle the part of the hierarchy defined in that module. SIP will ensure that the different pieces of code are called in the right order to determine the most specific Python type to use.

A class has at least one convertor if it or any super-class defines %ConvertToSubClassCode. A convertor has a base class. If a class that defines %ConvertToSubClassCode does not have a super-class that defines %ConvertToSubClassCode then that class is the base class. Otherwise the base class is that of the right-most super-class that has a convertor. In this case the %ConvertToSubClassCode extends all other convertors with the same base class.

Consider the following class hierarchy:

A
  \
   B*     C*
     \  /   \
      D      E
    /   \
  F       G*

The classes marked with an asterisk define %ConvertToSubClassCode.

Classes A to F are implemented in module X. Class G is implemented in module Y.

We can say the following:

A convertor is invoked when SIP needs to wrap a C++ instance and the type of that instance is a sub-class of the convertor’s base class. The convertor is passed a pointer to the instance cast to the base class. The convertor then, if possible, casts that pointer to an instance of a sub-class of its original class. It also returns a pointer to the corresponding generated type object.

It is possible for a convertor to switch to another convertor. This can avoid duplication of convertor code where there is multiple inheritance.

When more than one convertor may be invoked they are invoked in the order that reflects the module hierarchy. When the convertors are defined in the same module then the order is undefined. Convertors must be written with this in mind.

Given the class hierarchy shown above, let’s say that SIP needs to wrap an instance known to be of class D but which is actually of class F. We want the conversion mechanism to recognise that fact and return a Python object of type F. The following steps are taken:

  • G’s %ConvertToSubClassCode is invoked and passed the pointer to D cast to C. This convertor only recognises instances of class G and so returns a value that indicates it was unable to perform a conversion.

  • SIP will now invoke either B’s %ConvertToSubClassCode or C’s %ConvertToSubClassCode. As they are defined in the same module, which one is chosen is undefined. Let’s assume it is the C convertor that is invoked.

  • The convertor recognises that the instance is of class D (rather than C or E). It must also determine whether this really is D or whether it is actually F. Of course B’s %ConvertToSubClassCode must also make the same distinction. Rather than possibly duplicating the required code in both convertors, the C convertor switches to the B convertor. It does this by casting the pointer it is trying to convert to B and returning B’s generated type object.
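The steps above can be traced with a toy model written in pure Python. It is a sketch of the dispatch logic only: real convertors are C++ code working with pointers and generated type objects, whereas here isinstance() stands in for the RTTI and a returned class stands in for a generated type object. All function names are hypothetical.

```python
class A: pass
class B(A): pass
class C: pass
class D(B, C): pass
class E(C): pass
class F(D): pass
class G(D): pass


def convert_g(instance):
    """Module Y's convertor: only recognises instances of G."""
    return G if isinstance(instance, G) else None


def convert_b(instance):
    """Module X's convertor with base class B."""
    for candidate in (F, D):          # most specific class first
        if isinstance(instance, candidate):
            return candidate
    return None


def convert_c(instance):
    """Module X's convertor with base class C."""
    if isinstance(instance, D):
        return B                      # switch to the B convertor
    if isinstance(instance, E):
        return E
    return None


CONVERTORS_BY_BASE = {B: convert_b, C: convert_c}


def most_specific_type(instance, trace):
    """Return the class to wrap the instance as, recording in trace the
    convertors that were invoked."""
    # The convertor defined in module Y is tried first.
    trace.append("G")
    found = convert_g(instance)
    if found is not None:
        return found

    # The order within module X is undefined.  Following the text, assume
    # the C convertor is chosen.
    base = C
    while True:
        trace.append(base.__name__)
        found = CONVERTORS_BY_BASE[base](instance)
        # Returning the base class of another convertor means "switch".
        if found in CONVERTORS_BY_BASE and found is not base:
            base = found
            continue
        return found


trace = []
print(most_specific_type(F(), trace).__name__, trace)
```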