Design overview¶
pygccxml consists of the following packages:
- declarations: this package defines classes that describe C++ declarations and types.
- parser: this package defines classes that parse GCC-XML or CastXML generated files. It also defines a few classes that help you avoid unnecessary parsing of C++ source files.
- utils: this package defines a few functions useful for the whole project, but which are mainly used internally by pygccxml.
declarations package¶
Please take a look at the UML diagram. It describes almost all classes defined in the package and their relationships. The declarations package defines two class hierarchies:
types hierarchy - used to represent a C++ type
declarations hierarchy - used to represent a C++ declaration
Types hierarchy¶
The types hierarchy is used to represent an arbitrary C++ type. The class type_t is the base class.
type_traits¶
Are you aware of the boost::type_traits library? It contains a set of very specific traits classes, each of which encapsulates a single trait from the C++ type system; for example, is a type a pointer or a reference? Or does a type have a trivial constructor, or a const-qualifier?
pygccxml implements a lot of functionality from the library:
Many of its algorithms are implemented:
is_same
is_enum
is_void
is_const
is_array
is_pointer
is_volatile
is_integral
is_reference
is_arithmetic
is_convertible
is_fundamental
is_floating_point
is_base_and_derived
is_unary_operator
is_binary_operator
remove_cv
remove_const
remove_alias
remove_pointer
remove_volatile
remove_reference
has_trivial_copy
has_trivial_constructor
has_any_non_copyconstructor
For a full list of implemented algorithms, please consult the API documentation.
Many unit tests have been written, based on the unit tests of the boost::type_traits library.
If you are going to build a code generator, you will find type_traits very handy.
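To make the idea of a type trait concrete, here is a minimal toy model. It is not pygccxml's implementation: the classes int_t, compound_t, pointer_t and const_t are simplified stand-ins written for this sketch, and only the function names echo the list above.

```python
class type_t:
    """Base class of the toy types hierarchy."""


class int_t(type_t):
    """Stand-in for a fundamental type."""


class compound_t(type_t):
    """A type built on top of another type (hypothetical helper)."""

    def __init__(self, base):
        self.base = base


class pointer_t(compound_t):
    pass


class const_t(compound_t):
    pass


def is_pointer(t):
    return isinstance(t, pointer_t)


def is_const(t):
    return isinstance(t, const_t)


def remove_pointer(t):
    """Return the pointee type, or the type itself if it is not a pointer."""
    return t.base if is_pointer(t) else t


def remove_const(t):
    """Return the type without a top-level const, or the type itself."""
    return t.base if is_const(t) else t


# Toy representation of `const int*`: a pointer to a const-qualified int.
t = pointer_t(const_t(int_t()))
pointee = remove_pointer(t)
print(is_pointer(t), is_const(pointee))
```

A code generator typically chains such traits, for example stripping the pointer and the qualifier to find out which underlying type a function argument refers to.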
Declarations hierarchy¶
A declaration hierarchy is used to represent an arbitrary C++ declaration. Basically, most of the classes defined in this package are just a “set of properties”.
declaration_t is the base class of the declaration hierarchy. Every declaration has a parent property. This property keeps a reference to the scope declaration instance in which this declaration is defined.
The scopedef_t class derives from declaration_t. This class is used to say - “I may have other declarations inside”. The “composite” design pattern is used here. The class_t and namespace_t declaration classes derive from the scopedef_t class.
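The relationship between the parent property and the composite pattern can be sketched as follows. This is a simplified illustration, not the real classes: the adopt and full_name methods and the variable_t leaf class are assumptions made for this example.

```python
class declaration_t:
    """Toy base class: every declaration knows its enclosing scope."""

    def __init__(self, name):
        self.name = name
        self.parent = None  # scope declaration in which this one is defined

    def full_name(self):
        # Walk up the parent chain to build a qualified name.
        parts = []
        decl = self
        while decl is not None:
            parts.append(decl.name)
            decl = decl.parent
        return "::".join(reversed(parts))


class scopedef_t(declaration_t):
    """Toy composite: a declaration that may contain other declarations."""

    def __init__(self, name):
        super().__init__(name)
        self.declarations = []

    def adopt(self, decl):
        # Hypothetical helper: register a child and set its parent.
        decl.parent = self
        self.declarations.append(decl)
        return decl


class namespace_t(scopedef_t):
    pass


class class_t(scopedef_t):
    pass


class variable_t(declaration_t):
    """Leaf declaration: it cannot contain other declarations."""


ns = namespace_t("ns1")
cls = ns.adopt(class_t("widget"))
var = cls.adopt(variable_t("size"))
print(var.full_name())  # ns1::widget::size
```

Because scopes and leaves share one base class, code that walks the tree can treat every node uniformly and only descend where a declarations list exists.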
parser package¶
Please take a look at the parser package UML diagram. Classes defined in this package implement parsing and linking functionality. There are a few kinds of classes defined by the package:
classes that implement the parsing algorithms for GCC-XML generated XML files
parser configuration classes
cache - classes that help you to eliminate unnecessary parsing
patchers - classes that fix GCC-XML generated declarations. (Yes, sometimes GCC-XML generates a wrong description of a C++ declaration.)
Parser classes¶
source_reader_t - the only class that has detailed knowledge of GCC-XML. It has a single responsibility: it calls GCC-XML on a source file specified by the user and creates a declarations tree. The implementation of this class is split into 2 classes:
scanner_t - this class scans the XML file generated by GCC-XML and creates instances of the pygccxml declaration and type classes. After the XML file has been processed, declaration and type class instances refer to each other by GCC-XML generated ids.
linker_t - this class contains the logic for replacing GCC-XML generated ids with references to declaration or type class instances.
Both classes are implementation details and should not be used directly.
Performance note: the scanner_t class uses the Python xml.sax package to parse XML. As a result, scanner_t is able to parse even big XML files pretty quickly.
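The scan-then-link split can be illustrated with a small two-pass sketch built on xml.sax. The XML fragment is a heavily simplified stand-in for real GCC-XML output, and the Scanner class and link function are hypothetical names for this example, not the actual scanner_t and linker_t.

```python
import xml.sax

# Simplified stand-in for a GCC-XML file describing `int* p;`.
XML = """<GCC_XML>
  <FundamentalType id="_1" name="int"/>
  <PointerType id="_2" type="_1"/>
  <Variable id="_3" name="p" type="_2"/>
</GCC_XML>"""


class Scanner(xml.sax.ContentHandler):
    """Pass 1: record every element by id; cross-references stay id strings."""

    def __init__(self):
        super().__init__()
        self.by_id = {}

    def startElement(self, tag, attrs):
        if "id" in attrs:
            node = dict(attrs.items())
            node["kind"] = tag
            self.by_id[attrs["id"]] = node


def link(by_id):
    """Pass 2: replace each 'type' id with a reference to the node it names."""
    for node in by_id.values():
        if "type" in node:
            node["type"] = by_id[node["type"]]


scanner = Scanner()
xml.sax.parseString(XML.encode("utf-8"), scanner)
link(scanner.by_id)

var = scanner.by_id["_3"]
print(var["type"]["kind"], var["type"]["type"]["name"])  # PointerType int
```

Linking in a second pass means the scanner never has to care about the order in which elements appear in the file.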
project_reader_t - think of this class as a linker. In most cases you work with a few source files. GCC-XML does not support this mode of work, so pygccxml implements all the functionality needed to parse a few source files at once. project_reader_t implements 2 different algorithms that solve the problem:
project_reader_t creates a temporary source file, which includes all the source files.
project_reader_t parses every source file separately, using the source_reader_t class, and then joins the resulting declarations trees into a single declarations tree.
Both approaches have different trade-offs. The first approach does not allow you to reuse information from already parsed source files, while the second one allows you to set up a cache.
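The trade-off between the two approaches can be sketched with a toy model. Everything here is an assumption made for illustration: declarations are plain strings, fake_parse stands in for a real parser, and the function names are invented rather than taken from project_reader_t.

```python
def make_all_at_once_source(headers):
    """Approach 1: text of a temporary source file that includes every header."""
    return "".join('#include "%s"\n' % h for h in headers)


def parse_file_by_file(headers, parse_one, cache=None):
    """Approach 2: parse each header separately, then join the results."""
    cache = {} if cache is None else cache
    merged = []
    for header in headers:
        if header not in cache:
            cache[header] = parse_one(header)  # only parse what is not cached
        for decl in cache[header]:
            if decl not in merged:  # drop duplicates coming from shared includes
                merged.append(decl)
    return merged


calls = []


def fake_parse(header):
    # Stand-in for a real parser: records the call, returns declaration names.
    calls.append(header)
    return {"a.hpp": ["std", "A"], "b.hpp": ["std", "B"]}[header]


cache = {}
source = make_all_at_once_source(["a.hpp", "b.hpp"])
first = parse_file_by_file(["a.hpp", "b.hpp"], fake_parse, cache)
second = parse_file_by_file(["a.hpp", "b.hpp"], fake_parse, cache)
print(first, calls)  # ['std', 'A', 'B'] ['a.hpp', 'b.hpp']
```

The second call does not invoke the parser again, which is the reuse that the single temporary file cannot offer.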
Parser configuration classes¶
gccxml_configuration_t - a class that accumulates all the settings needed to invoke GCC-XML.
file_configuration_t - a class that contains some data and a description of how to treat the data. file_configuration_t can contain a reference to the following types of data:
path to C++ source file
path to GCC-XML generated XML file
path to C++ source file and path to GCC-XML generated XML file
In this case, if the XML file does not exist, it will be created. The next time you ask to parse the source file, the XML file will be used instead.
Small tip: you can set up your makefile to delete the XML file every time the relevant source file changes.
Python string, that contains valid C++ code
There are a few functions that will help you to construct a file_configuration_t object:
def create_source_fc( header )
header contains path to C++ source file
def create_gccxml_fc( xml_file )
xml_file contains path to GCC-XML generated XML file
def create_cached_source_fc( header, cached_source_file )
header contains path to C++ source file
cached_source_file contains path to GCC-XML generated XML file
def create_text_fc( text )
text - Python string, that contains valid C++ code
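The behaviour of the "source file plus XML file" mode described above (generate the XML file once, then reuse it) can be sketched like this. It is only an illustration of the idea: xml_for and fake_generator are hypothetical names, and the generator is a stub rather than a real GCC-XML invocation.

```python
import os
import tempfile


def xml_for(header, xml_file, generate_xml):
    """Return the XML text for `header`, creating `xml_file` only if missing."""
    if not os.path.exists(xml_file):
        with open(xml_file, "w", encoding="utf-8") as f:
            f.write(generate_xml(header))
    with open(xml_file, encoding="utf-8") as f:
        return f.read()


calls = []


def fake_generator(header):
    # Stub standing in for the expensive XML generation step.
    calls.append(header)
    return "<GCC_XML/>"


with tempfile.TemporaryDirectory() as tmp:
    xml_path = os.path.join(tmp, "demo.xml")
    first = xml_for("demo.hpp", xml_path, fake_generator)
    second = xml_for("demo.hpp", xml_path, fake_generator)  # reuses the file

print(len(calls))  # 1
```

Deleting the XML file, for instance from a makefile rule, is what forces the generator to run again.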
Cache classes¶
There are a few cache classes, which implement different cache strategies.
file_configuration_t class, which keeps the path to a C++ source file and the path to the GCC-XML generated XML file.
file_cache_t class, which saves all declarations from all files within a single binary file.
directory_cache_t class, which stores one index file called “index.dat” that is always read by the cache when the cache object is created. Each header file has a corresponding *.cache file that stores the declarations found in that header file. The index file is used to determine whether a *.cache file is still valid, by checking whether any of the dependent files (i.e. the header file itself and all included files) has been modified since the last run.
In some cases, the directory_cache_t class gives much better performance than file_cache_t. Many thanks to Matthias Baas for its implementation.
Warning: when pygccxml writes information to files using the cache classes, it does not write any version information. This means that when you upgrade pygccxml, you have to delete all your cache files. Otherwise you will get very strange errors, for example a missing attribute.
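The dependency check behind an index file can be sketched as follows. This is a toy model under stated assumptions: the index is an in-memory dict rather than an “index.dat” file, content hashes stand in for whatever modification check the real class performs, and signature, record and is_valid are names invented for this example.

```python
import hashlib
import os
import tempfile


def signature(path):
    """Content hash used here to detect that a file has changed."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def record(index, header, dependencies):
    """Remember the signature of every file the header depends on."""
    index[header] = {dep: signature(dep) for dep in dependencies}


def is_valid(index, header):
    """A cache entry is valid only if no dependent file has changed."""
    entry = index.get(header)
    if entry is None:
        return False
    return all(
        os.path.exists(dep) and signature(dep) == sig
        for dep, sig in entry.items()
    )


with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "a.hpp")
    included = os.path.join(tmp, "common.hpp")
    for path, text in ((header, '#include "common.hpp"\n'),
                       (included, "struct s {};\n")):
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)

    index = {}
    record(index, header, [header, included])
    fresh = is_valid(index, header)

    # Changing an included file invalidates the entry of the including header.
    with open(included, "a", encoding="utf-8") as f:
        f.write("struct t {};\n")
    stale = is_valid(index, header)

print(fresh, stale)  # True False
```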
Patchers¶
Well, GCC-XML has a few bugs that could not be fixed in GCC-XML itself. For example:
namespace ns1{ namespace ns2{
enum fruit{ apple, orange };
} }
void fix_enum( ns1::ns2::fruit arg=ns1::ns2::apple );
GCC-XML will report the default value of arg as apple. Obviously this is an error. pygccxml knows how to fix this bug.
This is not the only bug that can be fixed; there are a few of them. pygccxml introduces a few classes, each of which knows how to deal with a specific bug. Moreover, those bugs are fixed only if I am 101% sure that this is the right thing to do.
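The enum default value fix from the example above could look roughly like this. It is a sketch of the idea only, not the actual patcher code: fix_enum_default and its parameters are hypothetical, and the enum's scope and values are passed in by hand instead of being read from a declarations tree.

```python
def fix_enum_default(default_value, enum_scope, enum_values):
    """Qualify a bare enumerator name with the scope of its enum.

    Values that are not bare enumerator names are returned unchanged, so the
    patch is applied only when it is clearly the right thing to do.
    """
    if default_value in enum_values:
        return enum_scope + "::" + default_value
    return default_value


values = ["apple", "orange"]
print(fix_enum_default("apple", "ns1::ns2", values))  # ns1::ns2::apple
```

An already qualified value such as ns1::ns2::apple, or an unrelated default such as 0, passes through untouched.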
utils package¶
Used internally by pygccxml. Some methods/classes may still be useful: loggers, find_xml_generator.
Summary¶
That’s all. I hope I was clear; at least I tried. Anyway, pygccxml is an open source project, so you can always take a look at the source code. If you need more information, please read the API documentation.