Containment

Step 5 - Containment

This is the fifth step, adding collections. Each step adds or improves a small feature, and this text just highlights a few details along the way. The best way to work through it is to build and run each step, reading the code with this description alongside. Reading this alongside the diff for each step will also tell you more than either the text or the code alone.
That's what works for me anyway, YMMV.

So far every type has been a scalar - a single value that fits in one slot of the XPlatValue union. Containers like std::vector and std::map are different: they hold an arbitrary number of elements, and that count isn't known at compile time. We'll need some more helpers in the API to pass that information across the boundary.

The ArrayBuilder

The solution is another builder: a set of C API calls that construct the array on the heap inside the library and hand it around as one of our opaque 'handle' pointers. *Footnote: again, this is something that could be optimised with additional smarts, but for now this simple solution is easy to understand and debug.

void  *XPLAT_createArrayBuilder();
void   XPLAT_destroyArrayBuilder(void *builder);
size_t XPLAT_getArraySize(void *builder);
void   XPLAT_setArraySize(void *builder, size_t size);
void   XPLAT_setArrayElement(void *builder, size_t index, const XPlatValue *value);
int    XPLAT_getArrayElement(void *builder, size_t index, XPlatValue *value);
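
Since these calls will be driven from Python through ctypes, it is worth declaring their signatures up front. By default ctypes treats return values and integer arguments as C int, which can truncate a 64-bit handle. The sketch below is one way to do that, not necessarily how the project does it: declare_array_builder_api is a hypothetical helper name, the XPlatValue pointer is declared as a plain c_void_p because the struct layout isn't shown here, and fake_lib stands in for the real loaded library so the snippet runs on its own.

```python
import ctypes
from types import SimpleNamespace


def declare_array_builder_api(lib):
    """Attach restype/argtypes to the six ArrayBuilder entry points."""
    handle = ctypes.c_void_p     # opaque builder handle (void *)
    value_ptr = ctypes.c_void_p  # stand-in for POINTER(XPlatValue)

    lib.XPLAT_createArrayBuilder.restype = handle
    lib.XPLAT_createArrayBuilder.argtypes = []

    lib.XPLAT_destroyArrayBuilder.restype = None
    lib.XPLAT_destroyArrayBuilder.argtypes = [handle]

    lib.XPLAT_getArraySize.restype = ctypes.c_size_t
    lib.XPLAT_getArraySize.argtypes = [handle]

    lib.XPLAT_setArraySize.restype = None
    lib.XPLAT_setArraySize.argtypes = [handle, ctypes.c_size_t]

    lib.XPLAT_setArrayElement.restype = None
    lib.XPLAT_setArrayElement.argtypes = [handle, ctypes.c_size_t, value_ptr]

    lib.XPLAT_getArrayElement.restype = ctypes.c_int
    lib.XPLAT_getArrayElement.argtypes = [handle, ctypes.c_size_t, value_ptr]
    return lib


# Assumption: a fake object stands in for ctypes.CDLL(...) so this runs
# without the real shared library being present.
_NAMES = [
    "XPLAT_createArrayBuilder", "XPLAT_destroyArrayBuilder",
    "XPLAT_getArraySize", "XPLAT_setArraySize",
    "XPLAT_setArrayElement", "XPLAT_getArrayElement",
]
fake_lib = SimpleNamespace(**{name: SimpleNamespace() for name in _NAMES})
declare_array_builder_api(fake_lib)
```

With the real library you would pass the ctypes.CDLL object instead of fake_lib. A c_void_p restype comes back to Python as a plain int (or None for NULL), and passing it through a c_void_p argtype sends it back at full pointer width.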

The arrayValue field in XPlatValue's union holds a raw pointer to one of these builders. Maps reuse the same mechanism - keys and values are interleaved as alternating elements (key₀, val₀, key₁, val₁, …). *Another footnote: an enhancement would be to hold two-dimensional blocks of data, but again, we keep it simple here.
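
To make the interleaved layout concrete, here is a small sketch that uses a plain Python list in place of the builder. flatten_map and unflatten_map are hypothetical names for illustration only; the real code goes through the XPLAT_* calls.

```python
def flatten_map(mapping):
    """Return [key0, val0, key1, val1, ...] for a dict."""
    flat = []
    for key, value in mapping.items():
        flat.append(key)
        flat.append(value)
    return flat


def unflatten_map(flat):
    """Rebuild a dict from an interleaved key/value list."""
    if len(flat) % 2 != 0:
        raise ValueError("interleaved map needs an even number of elements")
    # Step through the list two at a time: even index = key, odd = value.
    return {flat[i]: flat[i + 1] for i in range(0, len(flat), 2)}


flat = flatten_map({"one": 1, "two": 2})
print(flat)                 # ['one', 1, 'two', 2]
print(unflatten_map(flat))  # {'one': 1, 'two': 2}
```

A map of N entries therefore occupies 2N array slots, and an odd element count is a sign that something went wrong on the way across the boundary.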

Type system additions

We add Type::Vector and Type::Map to the enum, and we need two trait structs so makeTypeInfo can inspect the element and key/value types at compile time:

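// Primary templates: any type that is not a std::vector / std::map.
template <typename T> struct is_vector : std::false_type {};
template <typename T> struct is_map : std::false_type {};

template <typename T, typename Alloc> struct is_vector<std::vector<T, Alloc>> : std::true_type {
    using element_type = T;
};

template <typename K, typename V, typename Compare, typename Alloc>
struct is_map<std::map<K, V, Compare, Alloc>> : std::true_type {
    using key_type    = K;
    using mapped_type = V;
};

// Shorthand for the enable_if overloads (is_map_v assumed by symmetry with is_vector_v).
template <typename T> inline constexpr bool is_vector_v = is_vector<T>::value;
template <typename T> inline constexpr bool is_map_v    = is_map<T>::value;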

The TypeMap specialisations follow the same pattern as before:

template <typename T, typename Alloc> struct TypeMap<std::vector<T, Alloc>> {
    static constexpr Type value = Type::Vector;
};
template <typename K, typename V, typename Compare, typename Alloc>
struct TypeMap<std::map<K, V, Compare, Alloc>> {
    static constexpr Type value = Type::Map;
};

Recursive TypeInfo

Now we can (and need to) make our type storage richer: TypeInfo now carries typeArgs, a list of child TypeInfo nodes, and makeTypeInfo<T>() fills this in recursively. The outer type might be a vector, and its typeArgs then hold the type of the contents (int, double, etc.). Equally, we could have a vector holding a vector holding an X.

template <typename T> TypeInfo makeTypeInfo()
{
    // Keep the tag in a constexpr local: `if constexpr` needs a constant
    // expression, and a member of the runtime `info` object is not one.
    constexpr Type type = TypeMap<T>::value;

    TypeInfo info;
    info.type = type;

    if constexpr (type == Type::Vector) {
        using ElemT = typename xplat::is_vector<T>::element_type;
        info.typeArgs.push_back(std::make_shared<TypeInfo>(makeTypeInfo<ElemT>()));
    }
    else if constexpr (type == Type::Map) {
        using KeyT = typename xplat::is_map<T>::key_type;
        using ValT = typename xplat::is_map<T>::mapped_type;
        info.typeArgs.push_back(std::make_shared<TypeInfo>(makeTypeInfo<KeyT>()));
        info.typeArgs.push_back(std::make_shared<TypeInfo>(makeTypeInfo<ValT>()));
    }
    return info;
}

A std::vector<double> gives TypeInfo{Vector, [TypeInfo{Double}]}, a std::map<std::string, long> gives TypeInfo{Map, [TypeInfo{String}, TypeInfo{Int}]}, and so on. The Python side can walk the tree and know how to convert each element.
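
As a rough illustration of walking that tree from Python, the sketch below turns a nested type description into a readable string. The dict shape ("type" and "typeArgs" keys) and the scalar type codes are assumptions made for this example. Only the vector and map codes (7 and 8) come from this step.

```python
META_TYPE_VECTOR = 7  # value used in this step
META_TYPE_MAP = 8     # value used in this step

# Hypothetical scalar codes, for illustration only.
META_TYPE_INT = 1
META_TYPE_DOUBLE = 2
META_TYPE_STRING = 3

SCALAR_NAMES = {
    META_TYPE_INT: "Int",
    META_TYPE_DOUBLE: "Double",
    META_TYPE_STRING: "String",
}


def describe(type_info):
    """Recursively render a type tree such as Vector[Double]."""
    kind = type_info["type"]
    args = type_info.get("typeArgs", [])
    if kind == META_TYPE_VECTOR:
        # One child: the element type.
        return "Vector[" + describe(args[0]) + "]"
    if kind == META_TYPE_MAP:
        # Two children: key type, then value type.
        return "Map[" + describe(args[0]) + ", " + describe(args[1]) + "]"
    return SCALAR_NAMES[kind]


vector_of_double = {"type": META_TYPE_VECTOR,
                    "typeArgs": [{"type": META_TYPE_DOUBLE}]}
map_of_vectors = {"type": META_TYPE_MAP,
                  "typeArgs": [{"type": META_TYPE_STRING},
                               {"type": META_TYPE_VECTOR,
                                "typeArgs": [{"type": META_TYPE_INT}]}]}

print(describe(vector_of_double))  # Vector[Double]
print(describe(map_of_vectors))    # Map[String, Vector[Int]]
```

The recursion has the same shape as makeTypeInfo: one child for a vector, two for a map, none for a scalar.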

Conversion specialisations

The invoker gains generic valueToNative / nativeToValue overloads, enabled via the is_vector / is_map trait structs:

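template <typename T>
std::enable_if_t<is_vector_v<T>, T>
valueToNative(const XPlatValue &val)
{
    using ElemT = typename is_vector<T>::element_type;
    // arrayValue is the opaque handle; inside the library it is an ArrayBuilder.
    auto *builder = static_cast<xplat::ArrayBuilder *>(val.data.arrayValue);
    T result;
    result.reserve(builder->elements.size());
    // Convert element by element; nested containers recurse through valueToNative.
    for (size_t i = 0; i < builder->elements.size(); ++i) {
        result.push_back(valueToNative<ElemT>(builder->elements[i]));
    }
    return result;
}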

Maps consume pairs of elements from the builder.

Python side

On the Python side we'll add META_TYPE_VECTOR = 7 and META_TYPE_MAP = 8, then two conversion helpers that mirror the C++ ArrayBuilder protocol:

def _python_to_c_vector(self, python_obj, element_type_info):
    builder_handle = self.lib.XPLAT_createArrayBuilder()
    items = list(python_obj)
    self.lib.XPLAT_setArraySize(builder_handle, len(items))
    for idx, item in enumerate(items):
        val = self._python_to_c_value(item, element_type_info)
        self.lib.XPLAT_setArrayElement(builder_handle, idx, ctypes.byref(val))
    return builder_handle

def _python_to_c_map(self, python_obj, key_type_info, value_type_info):
    builder_handle = self.lib.XPLAT_createArrayBuilder()
    items = list(python_obj.items())
    self.lib.XPLAT_setArraySize(builder_handle, len(items) * 2)
    for i, (key, value) in enumerate(items):
        key_val = self._python_to_c_value(key, key_type_info)
        val_val = self._python_to_c_value(value, value_type_info)
        self.lib.XPLAT_setArrayElement(builder_handle, i * 2,     ctypes.byref(key_val))
        self.lib.XPLAT_setArrayElement(builder_handle, i * 2 + 1, ctypes.byref(val_val))
    return builder_handle

Note the XPLAT_setArraySize call before filling in elements - this pre-allocates the storage so that string pointers written by setArrayElement remain stable.

The typeArgs we built with makeTypeInfo reach Python as a nested dict that _python_to_c_vector uses to know how to convert each element. Containers of containers (e.g. vector<vector<double>>) 'just work' because both helpers call _python_to_c_value recursively.
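
To see why the recursion handles nesting for free, here is a self-contained sketch of the same idea without the C library. FakeBuilder, to_c and to_python are hypothetical stand-ins (the real helpers call the XPLAT_* functions), and the "type" / "typeArgs" dict keys and the double type code are assumptions for illustration.

```python
META_TYPE_VECTOR = 7  # value used in this step
META_TYPE_MAP = 8     # value used in this step
META_TYPE_DOUBLE = 2  # hypothetical scalar code, for illustration only


class FakeBuilder:
    """Pure-Python stand-in for the C ArrayBuilder handle."""

    def __init__(self, size):
        self.elements = [None] * size  # mirrors XPLAT_setArraySize


def to_c(obj, type_info):
    """Convert a Python object into (possibly nested) fake builders."""
    kind = type_info["type"]
    args = type_info.get("typeArgs", [])
    if kind == META_TYPE_VECTOR:
        items = list(obj)
        builder = FakeBuilder(len(items))
        for i, item in enumerate(items):
            builder.elements[i] = to_c(item, args[0])
        return builder
    if kind == META_TYPE_MAP:
        items = list(obj.items())
        builder = FakeBuilder(len(items) * 2)
        for i, (key, value) in enumerate(items):
            builder.elements[i * 2] = to_c(key, args[0])
            builder.elements[i * 2 + 1] = to_c(value, args[1])
        return builder
    return obj  # scalars pass through unchanged in this sketch


def to_python(value, type_info):
    """Reverse of to_c: unpack builders back into lists and dicts."""
    kind = type_info["type"]
    args = type_info.get("typeArgs", [])
    if kind == META_TYPE_VECTOR:
        return [to_python(e, args[0]) for e in value.elements]
    if kind == META_TYPE_MAP:
        elems = value.elements
        return {to_python(elems[i], args[0]): to_python(elems[i + 1], args[1])
                for i in range(0, len(elems), 2)}
    return value


DOUBLE = {"type": META_TYPE_DOUBLE}
VEC_DOUBLE = {"type": META_TYPE_VECTOR, "typeArgs": [DOUBLE]}
VEC_VEC_DOUBLE = {"type": META_TYPE_VECTOR, "typeArgs": [VEC_DOUBLE]}

nested = [[1.0, 2.0], [3.5]]
handle = to_c(nested, VEC_VEC_DOUBLE)
print(len(handle.elements))               # 2
print(to_python(handle, VEC_VEC_DOUBLE))  # [[1.0, 2.0], [3.5]]
```

The outer builder holds two elements, each of which is itself a builder. Neither function needs to know how deep the nesting goes, because each level just hands the child type info down to the next call.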

Usage

From Python, C++ vectors and maps look exactly like Python lists and dicts:

obj = demo.Demo()
print(obj.getVector())          # [1.0, 2.0, 3.5]
print(obj.getMap())             # {'one': 1, 'two': 2}

obj.putVector([10.0, 20.5, 30.25])
obj.putMap({'alpha': 100, 'beta': 200})
