Open the PoD bay doors
Step 1 - Open the PoD bay doors
In C++26 with the std::meta features it's possible to interrogate language metadata at compile time. The plan is to use this to build up a metadata dictionary for a library that can be interrogated dynamically at runtime through an external C API. With that, it will be possible to dynamically generate a Python binding - no pybind, no SWIG, no hand-written glue - to interface with the library. Our library can remain a plain C++ library with no need to have any knowledge or special cases for each target user.
TL;DR - std::meta lets you query type information directly. std::meta::members_of gets class members, std::meta::parameters_of gets method parameters.
Previously this required recursive templates, variadic arguments and SFINAE; now I can just iterate the data directly.
Using the new C++26 syntax
The syntax - reflect operator, expansion statement and splicer:
- ^^T - the reflect operator. ^^Demo gives you a std::meta::info handle for the type Demo (a compile-time pointer to the metadata).
- template for (...) - the expansion statement. This is a compile-time loop over a constexpr range of those std::meta::info handles. Since it's a compile-time loop, unlike a regular for, each iteration can produce different types.
- [:expr:] - the splicer. It takes a std::meta::info handle and splices the thing it refers to back into the code. typename[:std::meta::type_of(Param):] means "the type of this parameter as a usable C++ type".
And yeah, I read those descriptions and had no idea what they meant until I used them.
What we're building
In libdemo we have a basic class with functions taking and returning "plain old data" types, which we're going to expose to Python:
class Basic {
public:
    int getInt();
    double getFloat();
    bool compareString(const std::string &lhs, const std::string &rhs);
};
With one macro, this can then be exposed for our dynamic binding:
XCLASS_REFLECT(Basic);
That macro
The macro expands to a struct with a constructor, plus a static instance of it:
namespace {
    struct Basic_ReflectedRegistrar {
        Basic_ReflectedRegistrar() {
            xplat::registerClassReflect<Basic>("Basic");
        }
    };
    static Basic_ReflectedRegistrar Basic_reflected_reg_inst;
}
This results in our metadata builder registerClassReflect<Basic> being executed on library load before main runs, populating our metadata dictionary.
Filling the registry with C++26
registerClassReflect uses the new reflection features to walk the class:
template <typename C>
void registerClassReflect(const char *name) { //*1
    registerClass<C>(name); // Creates an entry for the class
    template for (constexpr auto Method : std::meta::members_of(^^C)) { //*2
        if constexpr (std::meta::is_function(Method) && !std::meta::is_special_member(Method)) {
            MethodInfo m;
            m.name = std::string(std::meta::identifier_of(Method));
            auto params = std::define_static_array(std::meta::parameters_of(Method)); //*3
            template for (constexpr auto Param : params) {
                ParameterInfo p;
                p.name = std::string(std::meta::identifier_of(Param));
                p.type = makeTypeInfo<typename[:std::meta::type_of(Param):]>(); //*4
                m.parameters.push_back(p); //*5
            }
            Registry::instance().addMethod<C>(std::move(m)); //*6
        }
    }
}
Ok so a chunk of alien C++ code... what's happening here is:
- we've called our template with the type of the class (C) and name (char*)
- our expansion statement iterates the class members
- for each method we then do the same to iterate over the method parameters
- we build an array of the parameter names and types.
- we add the array to a method object and then
- add the method object into our registry.
std::define_static_array converts the parameter list (a std::vector<std::meta::info>) into a static constexpr array that we can then loop over with template for. A plain runtime loop won't do here: each Param has to be a constant expression so that it can be spliced, and each iteration splices a potentially different type.
Our variant
To pass data between languages I want a generic variant type. I'm going to be "passing by value" for arguments and returns, so a basic union will do the trick:
union XData {
    int64_t intValue;
    double doubleValue;
    const char *stringValue;
    int boolValue;
};
struct XPlatValue {
    XPlatType type; // which union member is active
    XData data;
};
In other words, we're going to support just a subset of basic PoD types, and we can hold them all in a single shared value (for now), although strings and their lifetimes will need a little careful footwork. These map onto what our external languages need: an "integer", a "floating point", a "string" and a "boolean".
Function arguments will be an array of these variants, and we'll return one.
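On the Python side the same layout can be mirrored with ctypes. This is a minimal sketch rather than the binding's actual code: the XPlatType tag values are assumptions (the enum isn't shown here), and the tag is assumed to be an int-sized enum.

```python
import ctypes

# Hypothetical tag values; the real XPlatType enum is not shown in this post.
XPLAT_INT, XPLAT_DOUBLE, XPLAT_STRING, XPLAT_BOOL = range(4)


class XData(ctypes.Union):
    """Mirror of the C++ XData union: every member shares the same storage."""
    _fields_ = [
        ("intValue", ctypes.c_int64),
        ("doubleValue", ctypes.c_double),
        ("stringValue", ctypes.c_char_p),
        ("boolValue", ctypes.c_int),
    ]


class XPlatValue(ctypes.Structure):
    """Mirror of the C++ XPlatValue struct: a type tag plus the union."""
    _fields_ = [
        ("type", ctypes.c_int),  # assumption: XPlatType is an int-sized enum
        ("data", XData),
    ]


# Build one value by setting the tag and the matching union member.
value = XPlatValue()
value.type = XPLAT_DOUBLE
value.data.doubleValue = 3.14
```

The tag is what makes the union safe to read: only the member named by type holds a meaningful value.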
The C API
Our C API starts simply:
Registry *registry_get();
size_t XPLAT_getNumClasses(const Registry *registry);
const char *XPLAT_getClassName(const Registry *registry, size_t index);
void *XPLAT_createInstance(const char *className);
void XPLAT_destroyInstance(const char *className, void *instance);
int XPLAT_invoke(const char *className, void *instance,
                 const char *methodName,
                 const XPlatValue *args, size_t argCount,
                 XPlatValue *returnValue);
This will do for now. Our binding will load our C++ library and ask it "how many classes do you have?", then "what's class 1 called?" and so on. We can improve this later, but for now we can simply pass in a string to say "create this class", and similarly pass a function name to say "call this method". This keeps it really simple and makes debugging straightforward.
The Python code
bind_library loads the shared library via ctypes, wires up the C API signatures, queries the registry for class names, and builds a Python module populated with stub class objects:
registry = lib.registry_get()
num_classes = lib.XPLAT_getNumClasses(registry)
module = types.ModuleType(f"xplat.{lib_name}")
for i in range(num_classes):
    class_name = lib.XPLAT_getClassName(registry, i).decode('utf-8')
    setattr(module, class_name, PythonClass(lib, class_name))
sys.modules[f'xplat.{lib_name}'] = module
As you can see, C++ pointers are treated as opaque handles to be passed in and out, which keeps the C interface really simple. This could be enhanced with a handle management system later, but it's ideal for now.
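The "wires up the C API signatures" step isn't shown in the snippet. Here is a rough sketch of what it could look like with ctypes; declare_api is a hypothetical helper name, not necessarily what bind_library does internally. The important detail is that ctypes assumes a C int return type by default, which would truncate 64-bit pointers, so every handle is declared as c_void_p.

```python
import ctypes


def declare_api(lib):
    """Attach argument and return types to the C API functions on `lib`.

    `lib` is expected to be a ctypes.CDLL handle for the C++ library.
    """
    lib.registry_get.argtypes = []
    lib.registry_get.restype = ctypes.c_void_p  # opaque Registry*

    lib.XPLAT_getNumClasses.argtypes = [ctypes.c_void_p]
    lib.XPLAT_getNumClasses.restype = ctypes.c_size_t

    lib.XPLAT_getClassName.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    lib.XPLAT_getClassName.restype = ctypes.c_char_p  # returned as bytes

    lib.XPLAT_createInstance.argtypes = [ctypes.c_char_p]
    lib.XPLAT_createInstance.restype = ctypes.c_void_p  # opaque instance

    lib.XPLAT_destroyInstance.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
    lib.XPLAT_destroyInstance.restype = None  # void

    # The XPlatValue pointers are left untyped to keep this sketch short;
    # ctypes.POINTER(XPlatValue) would be the stricter choice.
    lib.XPLAT_invoke.argtypes = [
        ctypes.c_char_p,   # className
        ctypes.c_void_p,   # instance
        ctypes.c_char_p,   # methodName
        ctypes.c_void_p,   # args
        ctypes.c_size_t,   # argCount
        ctypes.c_void_p,   # returnValue
    ]
    lib.XPLAT_invoke.restype = ctypes.c_int
    return lib
```

It would be called once, right after loading the library, for example declare_api(ctypes.CDLL(path)) where path is wherever the shared library lives on your system.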
There are two Python classes involved. PythonClass is the factory - the thing you import and call like a constructor. Calling it creates a PythonClassInstance, which calls XPLAT_createInstance on construction and XPLAT_destroyInstance in __del__.
In this first version PythonClassInstance uses __getattr__ to intercept any method call and forward it to XPLAT_invoke without checking whether the method actually exists. Obviously this isn't ideal, but it's simple!
def __getattr__(self, name: str):
    if name.startswith('_'):
        return super().__getattribute__(name)
    return Method(self.lib, self.class_name, self.instance_ptr, name)
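To see how the factory, the instance and __getattr__ fit together without a compiled library, here is a self-contained sketch. FakeLib is a made-up stand-in that records calls instead of crossing into C++, and the class bodies are my guess at the general shape, not the actual implementation.

```python
class FakeLib:
    """Stand-in for the ctypes library handle; records calls for inspection."""

    def __init__(self):
        self.calls = []

    def XPLAT_createInstance(self, class_name):
        self.calls.append(("create", class_name))
        return 0x1234  # pretend opaque pointer

    def XPLAT_destroyInstance(self, class_name, instance):
        self.calls.append(("destroy", class_name, instance))


class Method:
    """Callable bound to one method name; the real one calls XPLAT_invoke."""

    def __init__(self, lib, class_name, instance_ptr, name):
        self.lib = lib
        self.class_name = class_name
        self.instance_ptr = instance_ptr
        self.name = name

    def __call__(self, *args):
        self.lib.calls.append(("invoke", self.class_name, self.name, args))


class PythonClassInstance:
    def __init__(self, lib, class_name):
        self.lib = lib
        self.class_name = class_name
        self.instance_ptr = lib.XPLAT_createInstance(class_name.encode("utf-8"))

    def __getattr__(self, name: str):
        # Only reached when normal lookup has already failed, so for
        # underscore names this call raises AttributeError.
        if name.startswith("_"):
            return super().__getattribute__(name)
        return Method(self.lib, self.class_name, self.instance_ptr, name)

    def __del__(self):
        self.lib.XPLAT_destroyInstance(
            self.class_name.encode("utf-8"), self.instance_ptr)


class PythonClass:
    """Factory: calling it creates a PythonClassInstance."""

    def __init__(self, lib, class_name):
        self.lib = lib
        self.class_name = class_name

    def __call__(self):
        return PythonClassInstance(self.lib, self.class_name)


lib = FakeLib()
Basic = PythonClass(lib, "Basic")
basic = Basic()
basic.getInt()
basic.getItn()  # a typo is forwarded just the same
del basic       # on CPython, dropping the last reference runs __del__
```

After this runs, lib.calls holds the create call, both invoke calls (including the misspelled one) and the destroy call, in that order.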
Method.__call__ packs arguments into an XVar array by inspecting their Python types, calls XPLAT_invoke, then unpacks the return value from the tagged union.
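The packing and unpacking could be sketched like this. The ctypes classes mirror the C++ union and struct from earlier; the tag values and the helper names pack_args and unpack_value are assumptions made for illustration.

```python
import ctypes

# Hypothetical tag values; the real XPlatType enum is not shown in this post.
XPLAT_INT, XPLAT_DOUBLE, XPLAT_STRING, XPLAT_BOOL = range(4)


class XData(ctypes.Union):
    _fields_ = [("intValue", ctypes.c_int64),
                ("doubleValue", ctypes.c_double),
                ("stringValue", ctypes.c_char_p),
                ("boolValue", ctypes.c_int)]


class XPlatValue(ctypes.Structure):
    _fields_ = [("type", ctypes.c_int), ("data", XData)]


def pack_args(args):
    """Build a ctypes array of XPlatValue from Python bools, ints, floats, strs."""
    array = (XPlatValue * len(args))()
    keepalive = []  # encoded strings must stay alive while the array is in use
    for i, arg in enumerate(args):
        slot = array[i]  # shares memory with the array, so writes stick
        # bool is tested before int because bool is a subclass of int.
        if isinstance(arg, bool):
            slot.type = XPLAT_BOOL
            slot.data.boolValue = int(arg)
        elif isinstance(arg, int):
            slot.type = XPLAT_INT
            slot.data.intValue = arg
        elif isinstance(arg, float):
            slot.type = XPLAT_DOUBLE
            slot.data.doubleValue = arg
        elif isinstance(arg, str):
            encoded = arg.encode("utf-8")
            keepalive.append(encoded)
            slot.type = XPLAT_STRING
            slot.data.stringValue = encoded
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return array, keepalive


def unpack_value(value):
    """Read the union member selected by the type tag."""
    if value.type == XPLAT_BOOL:
        return bool(value.data.boolValue)
    if value.type == XPLAT_INT:
        return value.data.intValue
    if value.type == XPLAT_DOUBLE:
        return value.data.doubleValue
    if value.type == XPLAT_STRING:
        raw = value.data.stringValue
        return None if raw is None else raw.decode("utf-8")
    raise ValueError(f"unknown type tag: {value.type}")
```

The keepalive list is the Python half of the string lifetime footwork: the encoded bytes have to outlive the XPLAT_invoke call that reads them.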
Again, this works, but it's a footgun - it'll happily forward a typo'd method name and only fail at the C++ boundary. We can improve this later, but for now our "final" Python application calling into our native C++ code looks like this:
Usage
from xplat import bind_library
bind_library('demo')
from xplat import demo
basic = demo.Basic()
print(basic.getInt()) # 42
print(basic.getFloat()) # 3.14
print(basic.compareString("hello", "hello")) # True
print(basic.compareString("hello", "world")) # False
Here our C++ class looks and feels like a native Python object, and we can use and call it just as we would any Python function.
(Run make run-python to try the example demo. The easiest way to follow the internals is stepping through in the Python debugger.)