Open the PoD bay doors

Step 1 - Open the PoD bay doors

This is the first step: basic types and calls. Each step adds or improves a small feature, and this text just highlights a few details along the way. The best way to work through it is to build and run each step, reading the code with this description alongside. Reading it next to the diff for each step will also tell you more than either the text or the code alone.
That's what works for me anyway, YMMV.

In C++26 with the std::meta features it's possible to interrogate language metadata at compile time. The plan is to use this to build up a metadata dictionary for a library that can be interrogated dynamically at runtime through an external C API. With that, it will be possible to dynamically generate a Python binding - no pybind, no SWIG, no hand-written glue - to interface with the library. Our library can remain a plain C++ library with no need for any knowledge of, or special cases for, each target user.

TL;DR - std::meta lets you query type information directly. std::meta::members_of gets class members, std::meta::parameters_of gets method parameters. Previously this required recursive templates, variadic arguments and SFINAE; now I can just iterate the data directly.

Using the new C++26 syntax

The syntax - reflect operator, expansion statement and splicer:

  • ^^T - the reflect operator. ^^Demo gives you a std::meta::info handle for the type Demo. Think of it as a compile-time pointer to the metadata.
  • template for (...) - the expansion statement. This is a compile-time loop over a constexpr range of those std::meta::info handles. Since it's a compile time loop, unlike a regular for, each iteration can produce different types.
  • [:expr:] - the splicer. It takes a std::meta::info handle and splices the thing it refers to back into the code. typename[:std::meta::type_of(Param):] means "the type of this parameter as a usable C++ type".

And yeah, I read those descriptions and had no idea what they meant until I used them.

What we're building

In libdemo we have a basic class with functions taking and returning "plain old data" types, which we're going to expose to Python:

class Basic {
  public:
    int    getInt();
    double getFloat();
    bool   compareString(const std::string &lhs, const std::string &rhs);
};

With one macro, this can then be exposed for our dynamic binding:

XCLASS_REFLECT(Basic);

That macro

The macro expands to a struct with a constructor, plus a static instance of it:

namespace {
struct Basic_ReflectedRegistrar {
    Basic_ReflectedRegistrar() {
        xplat::registerClassReflect<Basic>("Basic");
    }
};
static Basic_ReflectedRegistrar Basic_reflected_reg_inst;
}

This results in our metadata builder registerClassReflect<Basic> being executed on library load before main runs, populating our metadata dictionary.

Filling the registry with C++26

registerClassReflect uses the new reflection features to walk the class:

template <typename C>
void registerClassReflect(const char *name) { //*1
    registerClass<C>(name); // Creates an entry for the class

    template for (constexpr auto Method : std::meta::members_of(^^C)) { //*2
        if constexpr (std::meta::is_function(Method) && !std::meta::is_special_member(Method)) {

            MethodInfo m;
            m.name = std::string(std::meta::identifier_of(Method));

            auto params = std::define_static_array(std::meta::parameters_of(Method)); //*3
            template for (constexpr auto Param : params) {
                ParameterInfo p;
                p.name = std::string(std::meta::identifier_of(Param));
                p.type = makeTypeInfo<typename[:std::meta::type_of(Param):]>(); //*4
                m.parameters.push_back(p); //*5
            }
            Registry::instance().addMethod<C>(std::move(m)); //*6
        }
    }
}

OK, so a chunk of alien C++ code. The numbered comments mark what's happening:

  1. we call our template with the class type (C) and its name (a const char*)
  2. the expansion statement iterates over the class members, and the if constexpr skips anything that isn't a function, along with special members such as constructors
  3. for each method we do the same again to iterate over its parameters
  4. the splicer turns each parameter's reflected type back into a real C++ type for makeTypeInfo
  5. each parameter's name and type is appended to the method object, and then
  6. the method object is added to our registry.

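std::define_static_array converts the parameter list into a static constexpr array that we can then loop over with template for. It's needed because std::meta::parameters_of returns a std::vector of std::meta::info handles, which only exists during compile-time evaluation. A plain runtime loop won't do either: each handle is spliced into a potentially different type, so Param has to be a compile-time constant on every iteration.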

Our variant

To pass data between languages I want a generic variant type. I'm going to be "passing by value" for arguments and returns, so a basic tagged union will do the trick:

union XData {
    int64_t     intValue;
    double      doubleValue;
    const char *stringValue;
    int         boolValue;
};

struct XPlatValue {
    XPlatType type;  // which union member is active
    XData     data;
};

In other words, we support just a basic subset of PoD types, and we can hold them all in a single shared value (for now), although strings and their lifetimes will need a little careful footwork. These map onto what our external languages need: an "integer", a "floating point", a "string" and a "boolean".

Function arguments will be an array of these variants, and we'll return one.
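To make the layout concrete, here is a rough sketch of how the same tagged union could be mirrored on the Python side with ctypes. The XPLAT_* tag values and the int-sized type field are my assumptions, since the real XPlatType enum isn't shown here.

```python
import ctypes

# Assumed tag values: the real XPlatType enum isn't shown in this post.
XPLAT_INT, XPLAT_DOUBLE, XPLAT_STRING, XPLAT_BOOL = range(4)


class XData(ctypes.Union):
    # Same members, in the same order, as the C++ union.
    _fields_ = [
        ("intValue", ctypes.c_int64),
        ("doubleValue", ctypes.c_double),
        ("stringValue", ctypes.c_char_p),
        ("boolValue", ctypes.c_int),
    ]


class XPlatValue(ctypes.Structure):
    _fields_ = [
        ("type", ctypes.c_int),  # assumes XPlatType is an int-sized enum
        ("data", XData),
    ]


value = XPlatValue()
value.type = XPLAT_DOUBLE
value.data.doubleValue = 1.0

# All members share the same bytes, so only the tagged member is meaningful.
print(value.data.doubleValue)    # the active member
print(hex(value.data.intValue))  # the same bits misread as an integer
```

Reading intValue after writing doubleValue just reinterprets the bits, which is why the type tag has to travel with the data.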

The C API

Our C API starts simply:

Registry    *registry_get();
size_t       XPLAT_getNumClasses(const Registry *registry);
const char  *XPLAT_getClassName(const Registry *registry, size_t index);
void        *XPLAT_createInstance(const char *className);
void         XPLAT_destroyInstance(const char *className, void *instance);
int          XPLAT_invoke(const char *className, void *instance,
                          const char *methodName,
                          const XPlatValue *args, size_t argCount,
                          XPlatValue *returnValue);

This will do for now. Our binding will load our C++ library and ask it "how many classes do you have?", then "what's class 1 called?" and so on. We can improve this later, but for now we simply pass in a string to say "create this class" and, similarly, a function name for "call this method". This keeps it really simple and makes debugging straightforward.

The Python code

bind_library loads the shared library via ctypes, wires up the C API signatures, queries the registry for class names, and builds a Python module populated with stub class objects:

registry = lib.registry_get()
num_classes = lib.XPLAT_getNumClasses(registry)

module = types.ModuleType(f"xplat.{lib_name}")
for i in range(num_classes):
    class_name = lib.XPLAT_getClassName(registry, i).decode('utf-8')
    setattr(module, class_name, PythonClass(lib, class_name))

sys.modules[f'xplat.{lib_name}'] = module

As you can see, C++ pointers are treated as opaque handles to be passed in and out, which keeps the C interface really simple. This could be enhanced with a handle management system, but it's ideal for now.
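The "wires up the C API signatures" part might look something like the sketch below. declare_api is a hypothetical helper name, and the args and returnValue pointers are declared as plain c_void_p to keep it short. The one detail worth getting right is that every pointer-returning function needs restype set to c_void_p, because ctypes otherwise assumes a C int return value and can truncate a 64-bit address.

```python
import ctypes


def declare_api(lib):
    """Attach argtypes/restype to the C API functions of a loaded library."""
    c = ctypes
    lib.registry_get.argtypes = []
    lib.registry_get.restype = c.c_void_p        # opaque Registry*
    lib.XPLAT_getNumClasses.argtypes = [c.c_void_p]
    lib.XPLAT_getNumClasses.restype = c.c_size_t
    lib.XPLAT_getClassName.argtypes = [c.c_void_p, c.c_size_t]
    lib.XPLAT_getClassName.restype = c.c_char_p  # arrives as bytes
    lib.XPLAT_createInstance.argtypes = [c.c_char_p]
    lib.XPLAT_createInstance.restype = c.c_void_p  # opaque instance handle
    lib.XPLAT_destroyInstance.argtypes = [c.c_char_p, c.c_void_p]
    lib.XPLAT_destroyInstance.restype = None     # void
    # Simplification: args/returnValue are untyped pointers here; a
    # POINTER(...) to the variant struct would be stricter.
    lib.XPLAT_invoke.argtypes = [c.c_char_p, c.c_void_p, c.c_char_p,
                                 c.c_void_p, c.c_size_t, c.c_void_p]
    lib.XPLAT_invoke.restype = c.c_int
    return lib


# Usage, assuming the shared library is built as ./libdemo.so:
#   lib = declare_api(ctypes.CDLL("./libdemo.so"))
```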

There are two Python classes involved. PythonClass is the factory - the thing you import and call like a constructor. Calling it creates a PythonClassInstance, which calls XPLAT_createInstance on construction and XPLAT_destroyInstance in __del__.
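A minimal sketch of that factory/instance pair is below. The bodies are my guess at the shape rather than the actual code: FakeLib is a stand-in so the sketch runs without libdemo, and the null-pointer check assumes XPLAT_createInstance returns null on failure.

```python
class PythonClass:
    """Factory: calling it creates an instance of the named C++ class."""

    def __init__(self, lib, class_name):
        self.lib = lib
        self.class_name = class_name

    def __call__(self):
        return PythonClassInstance(self.lib, self.class_name)


class PythonClassInstance:
    """Owns one opaque C++ instance pointer for its whole lifetime."""

    def __init__(self, lib, class_name):
        self.lib = lib
        self.class_name = class_name
        self.instance_ptr = lib.XPLAT_createInstance(class_name.encode("utf-8"))
        if not self.instance_ptr:  # assumption: null means "unknown class"
            raise RuntimeError(f"could not create {class_name}")

    def __del__(self):
        # __del__ can run on a half-built object, so guard the attribute.
        ptr = getattr(self, "instance_ptr", None)
        if ptr:
            self.lib.XPLAT_destroyInstance(self.class_name.encode("utf-8"), ptr)
            self.instance_ptr = None


class FakeLib:
    """Hypothetical stand-in for the ctypes library; tracks live handles."""

    def __init__(self):
        self.live = set()
        self.next_handle = 1

    def XPLAT_createInstance(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.live.add(handle)
        return handle

    def XPLAT_destroyInstance(self, name, ptr):
        self.live.discard(ptr)


lib = FakeLib()
Basic = PythonClass(lib, "Basic")
obj = Basic()
print(lib.live)  # the handle owned by obj
del obj          # CPython runs __del__ here, releasing the handle
print(lib.live)
```

Relying on __del__ ties the C++ object's lifetime to Python's garbage collector, which is fine for a demo but worth remembering when debugging teardown order.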

In this first version PythonClassInstance uses __getattr__ to intercept any method call and forward it to XPLAT_invoke without checking whether the method actually exists. Obviously this isn't ideal, but, simple!

def __getattr__(self, name: str):
    if name.startswith('_'):
        return super().__getattribute__(name)
    return Method(self.lib, self.class_name, self.instance_ptr, name)

Method.__call__ packs arguments into an XVar array by inspecting their Python types, calls XPLAT_invoke, then unpacks the return value from the tagged union. Again, this works, but it's a footgun: it'll happily forward a typo'd method name and only fail at the C++ boundary. We can improve this later, but for now our "final" Python application calling our native C++ code looks like this:

Usage

from xplat import bind_library
bind_library('demo')
from xplat import demo

basic = demo.Basic()
print(basic.getInt())                          # 42
print(basic.getFloat())                        # 3.14
print(basic.compareString("hello", "hello"))   # True
print(basic.compareString("hello", "world"))   # False

Here our C++ class looks and feels like a native Python object, and we can use and call it just as we would any Python function.
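Going back to Method.__call__ for a moment, the pack and unpack steps it performs could be sketched like this. The T_* tag values, the pack/unpack/pack_args helper names and the keepalive list are all illustrative assumptions, and XVar is assumed to mirror the C++ XPlatValue layout.

```python
import ctypes

# Assumed tag values; the real XPlatType enum isn't shown in this post.
T_INT, T_DOUBLE, T_STRING, T_BOOL = range(4)


class XData(ctypes.Union):
    _fields_ = [("intValue", ctypes.c_int64), ("doubleValue", ctypes.c_double),
                ("stringValue", ctypes.c_char_p), ("boolValue", ctypes.c_int)]


class XVar(ctypes.Structure):
    _fields_ = [("type", ctypes.c_int), ("data", XData)]


def pack(value, keepalive):
    """Convert one Python value into a tagged XVar."""
    var = XVar()
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        var.type = T_BOOL
        var.data.boolValue = int(value)
    elif isinstance(value, int):
        var.type = T_INT
        var.data.intValue = value
    elif isinstance(value, float):
        var.type = T_DOUBLE
        var.data.doubleValue = value
    elif isinstance(value, str):
        encoded = value.encode("utf-8")
        keepalive.append(encoded)  # the bytes must outlive the C call
        var.type = T_STRING
        var.data.stringValue = encoded
    else:
        raise TypeError(f"unsupported argument type: {type(value).__name__}")
    return var


def unpack(var):
    """Read back whichever union member the tag says is active."""
    if var.type == T_BOOL:
        return bool(var.data.boolValue)
    if var.type == T_INT:
        return var.data.intValue
    if var.type == T_DOUBLE:
        return var.data.doubleValue
    if var.type == T_STRING:
        raw = var.data.stringValue
        return raw.decode("utf-8") if raw is not None else None
    raise TypeError(f"unknown type tag: {var.type}")


def pack_args(args):
    """Build the contiguous XVar array that XPLAT_invoke expects."""
    keepalive = []
    packed = [pack(a, keepalive) for a in args]
    return (XVar * len(packed))(*packed), keepalive


array, keepalive = pack_args([42, 3.14, "hello", True])
print([unpack(v) for v in array])
```

The bool-before-int ordering matters: without it True would be sent as the integer 1, and the C++ side would see the wrong tag.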

(Use make run-python to run the example demo. The easiest way to follow the internals is to step through in the Python debugger.)
