Inlined rust functions

xenoform-rs

Write and execute superfast rust inside your Python code! Here's how...

Write a type-annotated function or method definition in python, add the rust decorator and put the rust implementation in a docstring:

from xenoform_rs import rust

@rust(py=False)
def vector_sum(v: list[int]) -> int:  # ty: ignore[empty-body]
    """
    Ok(v.iter().sum())
    """

When Python loads this module, all functions using this decorator have their function signatures translated to rust, and the source for an extension module with python bindings is generated. The first time any of these functions is called, the module is built and the attribute corresponding to the (empty) Python function is replaced with the rust implementation in the module.

Subsequent calls to these functions incur minimal overhead, as the attribute corresponding to the (dummy) python function already points to the rust implementation.
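
The replace-on-first-call behaviour described above can be sketched in plain python. This is an illustration only, not the code xenoform-rs actually uses: lazy_native and _fake_build are hypothetical names, and the "build" step simply returns an ordinary python function standing in for the compiled one.

```python
import functools
from typing import Any, Callable


def _fake_build(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in for building and importing the extension module."""

    def native_impl(v: list[int]) -> int:
        return sum(v)

    return native_impl


def lazy_native(func: Callable[..., Any]) -> Callable[..., Any]:
    """Swap the module attribute for `func` with a built implementation on first call."""

    @functools.wraps(func)
    def stub(*args: Any, **kwargs: Any) -> Any:
        impl = _fake_build(func)
        # Rebind the name in the defining module so later lookups skip this stub
        func.__globals__[func.__name__] = impl
        return impl(*args, **kwargs)

    return stub


@lazy_native
def vector_sum(v: list[int]) -> int:  # body intentionally empty
    ...


print(vector_sum([1, 2, 3]))  # first call: "builds", rebinds the name, then runs
print(vector_sum([4, 5]))     # later calls look up the rebound name
```

The point of rebinding the name, rather than dispatching inside the stub on every call, is that subsequent calls pay no wrapper overhead.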

Each module stores a hash of its source code (and Cargo.toml). Modules are checked on load and automatically rebuilt when any changes are detected.

By default, the binaries, source code and build logs for the compiled modules can be found in the ext subfolder (this location can be changed).

Features

  • Supports numpy arrays (via the numpy crate) for customised "vectorised" operations.
  • Using annotated types, you can override the default mapping of python types to rust types.
  • Supports positional and keyword arguments with defaults, including positional-only and keyword-only markers (/,*)
  • Supports *args and **kwargs, mapped (respectively) to &Bound<'py, PyTuple> and Option<&Bound<'py, PyDict>>. NB type annotations for these types are still useful for python type checkers. See test_kwargs.py
  • Supports custom dependencies and imports.
  • Callable types are supported both as arguments and return values. See below.
  • Optional (T | None) types are supported, mapping to Option<T>
  • Can link to separate rust sources, see test_modules.py for details.
  • By default, supports parallel execution when the python interpreter is free-threaded.
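
To see what a signature translator has to distinguish, the standard inspect module reports the kind of every parameter, including the positional-only and keyword-only markers and *args/**kwargs. This sketch only illustrates the python side; describe_parameters and example are hypothetical names, and nothing here is claimed about how xenoform-rs itself reads signatures.

```python
import inspect


def describe_parameters(func) -> list[tuple[str, str, bool]]:
    """Return (name, kind, has_default) for each parameter of func."""
    sig = inspect.signature(func)
    return [
        (p.name, p.kind.name, p.default is not inspect.Parameter.empty)
        for p in sig.parameters.values()
    ]


def example(a: int, /, b: float = 1.0, *args: int, c: bool = False, **kwargs: str) -> None:
    ...


for row in describe_parameters(example):
    print(row)
```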

Caveats & points to note:

  • callable types (more detail under Callable Types below):
    • only generic (untyped) functions/closures are supported.
    • a type override is necessary to pass functions as arguments. The default works for return values.
  • complex: 128 bit support only (i.e. not np.complex64)
  • if additional modules are specified, the files are copied into the crate. Modifications to additional modules will trigger a rebuild.
  • no support for compound types, other than optional (T | None) (This would require building support for rust enums). Use a type override to a generic python type e.g. Annotated[int | float, "&Bound<'py, PyAny>"] or coerce to a single rust type e.g. Annotated[int | float, "f64"].
  • no support currently for linking to external prebuilt binaries
  • due to restrictions arising from linguistic differences, xenoform-rs will likely never be as functionally complete as its C++ sister, xenoform

Getting started

Install the package

uv add xenoform-rs  # or pip install xenoform-rs

Simply decorate your rust-implemented functions with the rust decorator factory - it handles all the configuration and compilation. Here's a function that counts the elements in a multidimensional array:

from typing import Annotated

import numpy as np
import numpy.typing as npt

from xenoform_rs import rust, rust_dependency


@rust(
    py=False,  # we don't require the python context as the first argument (we aren't constructing any python objects or calling any python APIs)
    dependencies=[rust_dependency("numpy", version="0.28")],  # declare we need the numpy crate
    imports=["numpy::PyReadonlyArrayDyn"],  # import the type we need
)
def array_nelems(a: npt.NDArray[np.int64]) -> Annotated[int, "usize"]:
    # npt.NDArray[np.int64] maps by default to numpy::PyReadonlyArrayDyn<i64>
    # the return type is a rust usize which gets converted to a python int
    """
    Ok(a.as_array().shape().iter().product())
    """

if __name__ == "__main__":
    print(array_nelems(np.empty([2, 3, 5, 7], dtype=np.int64)))

The @rust decorator factory parameters

name          type                     default  description
py            bool                     True     Pass the python context as the first argument. Necessary when (e.g.) creating python objects.
dependencies  list[str] | None         None     Rust package dependencies, the rust_dependency convenience function can be used to specify dependency parameters, e.g. dependencies=[rust_dependency("numpy", version="0.28")].
imports       list[str] | None         None     Additional imports, e.g. imports=["numpy::{PyArray2, PyArrayMethods, PyReadonlyArray2}"]
modules       list[Path | str] | None  None     Sources for additional modules
edition       str                      "2024"   The rust edition.
profile       dict[str, str] | None    None     Overrides to (release mode) profile, e.g. optimisation level, strip symbols, etc.
help          str | None               None     Docstring for the function
verbose       bool                     False    enable debug logging

Performance

Rust can offer very significant performance enhancements over python, especially where vectorised * operations are not available, but even when they are.

* "vectorisation" in this sense means implementing loops in compiled - rather than interpreted - code. In fact, the compiler also has various optimisations available to it including but by no means limited to "true" vectorisation (meaning hardware SIMD instructions).

The first example deals with an operation on a pandas Series that must be done sequentially, and the second shows that significant performance gains can be had even when a vectorised python implementation is available. Running these examples requires the "examples" optional dependencies (and of course rust):

uv add xenoform-rs --extra examples  # or pip install xenoform-rs[examples]

Loop

This is a Rust vs python comparison of a non-vectorisable sequential operation on a pd.Series. First a python implementation...

def calc_balances_py(data: pd.Series, rate: float) -> pd.Series:
    """Cannot vectorise, since each value is dependent on the previous value"""
    result = pd.Series(index=data.index)
    # Directly access the underlying numpy array for performance. pandas>=3 returns a read only array, so make it writeable
    result_np = result.to_numpy()
    result_np.flags.writeable = True
    current_value = 0.0
    for i, value in data.items():
        current_value = (current_value + value) * (1 - rate)
        result_np[i] = current_value
    return result

...and the equivalent rust implementation. Note that pyo3/rust knows nothing about pandas, but can still work with such objects via their python API:

@rust(
    dependencies=[rust_dependency("numpy", version="0.28")],
    imports=["numpy::{PyArray1, PyArrayMethods}", "pyo3::types::{PyDict, PyAnyMethods}"],
    module_name="loop_rs",  # override as "loop" is a rust keyword
    profile={"strip": "symbols"},
)
def calc_balances_rust(
    data: Annotated[pd.Series, "Bound<'py, PyAny>"], rate: float
) -> Annotated[pd.Series, "Bound<'py, PyAny>"]:  # ty: ignore[empty-body]
    """
    // extract numpy arrays from the series. Note input is i64, output is f64
    let data_obj = data.call_method0("to_numpy")?;
    let data_np: &Bound<'py, PyArray1<i64>> = data_obj.cast()?;
    let n = data_np.len()?;

    // use the pattern from the numpy documentation
    let result_np = unsafe {
        let r = PyArray1::<f64>::zeros(py, [n], false);
        let mut current_value = 0.0;

        for i in 0..n {
            current_value = (current_value + *data_np.uget([i]) as f64) * (1.0 - rate);
            *r.uget_mut([i]) = current_value;
        }
        r
    };

    // Construct a pd.Series with the same index as the input
    let pd = py.import("pandas")?;
    let kwargs = PyDict::new(py);
    kwargs.set_item("index", data.getattr("index")?)?;
    pd.getattr("Series")?.call((result_np,), Some(&kwargs))
    """

Performance comparison:

N py (ms) rust (ms) speedup
1000 0.5 1.2 -60%
10000 2.0 0.1 2235%
100000 18.7 0.5 3654%
1000000 192.8 2.7 7131%
10000000 1894.8 22.8 8214%
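
The speedup column appears to be the ratio of the two timings minus one, expressed as a percentage. That formula is inferred from the table rather than stated by the project, and speedup_percent is a hypothetical helper. Recomputing from the rounded timings shown gives values close to, but not identical to, the published ones, presumably because those were calculated from unrounded measurements.

```python
def speedup_percent(py_ms: float, rust_ms: float) -> float:
    """Relative speedup of rust over python as a percentage (inferred formula)."""
    return (py_ms / rust_ms - 1.0) * 100.0


# Last row of the table (N = 10000000), using the rounded timings shown
print(round(speedup_percent(1894.8, 22.8)))
# First row (N = 1000): a negative value means rust was slower
print(round(speedup_percent(0.5, 1.2)))
```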

Full code is in examples/loop.py.

Distance Matrix

In this example we compute a distance matrix between $N$ points in $D$ dimensions. An efficient numpy implementation could be:

def calc_dist_matrix_py(p: npt.NDArray[np.float64]) -> npt.NDArray[np.float64]:
    "Compute distance matrix from points, using numpy"
    return np.sqrt(((p[:, np.newaxis, :] - p[np.newaxis, :, :]) ** 2).sum(axis=2))

Bear in mind that there is some redundancy here, as the resulting matrix is symmetric; in python, however, vectorisation with redundancy generally wins the tradeoff against loops with no redundancy. A rust implementation is nevertheless significantly faster, partly because it can avoid the redundant computations:

@rust(
    dependencies=[rust_dependency("numpy", version="0.28")],
    imports=["numpy::{PyArray2, PyArrayMethods, PyReadonlyArray2}"],
)
def calc_dist_matrix_rust(
    points: Annotated[npt.NDArray[np.float64], "PyReadonlyArray2<f64>"],
) -> Annotated[npt.NDArray[np.float64], "Bound<'py, PyArray2<f64>>"]:
    """
    let points = points.as_array();
    let shape = points.shape();
    let (n, d) = (shape[0], shape[1]);

    let result = PyArray2::zeros(py, [n, n], false);
    let mut r = unsafe { result.as_array_mut() };

    for i in 0..n {
        for j in i + 1..n {
            let mut sum = 0.0;
            for k in 0..d {
                let diff = points.get([i, k]).unwrap() - points.get([j, k]).unwrap();
                sum += diff * diff;
            }
            let dist = sum.sqrt();
            if let Some(x) = r.get_mut([i, j]) {
                *x = dist;
            }
            if let Some(x) = r.get_mut([j, i]) {
                *x = dist;
            }
        }
    }
    Ok(result)
    """

Performance comparison:

N py (ms) rust (ms) speedup
100 0.4 1.3 -68%
300 3.6 0.2 1907%
1000 28.7 2.3 1162%
3000 208.1 20.8 902%
10000 2270.2 236.2 861%

Full code is in examples/distance_matrix.py.
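
The loop structure of the rust version (visit only pairs with j > i and mirror each result) can be sketched in plain python, using lists instead of numpy arrays. dist_matrix_upper is a hypothetical name; the sketch shows the technique only and is far slower than either implementation above.

```python
import math


def dist_matrix_upper(points: list[list[float]]) -> list[list[float]]:
    """Euclidean distance matrix, computing each pair once and mirroring it."""
    n = len(points)
    result = [[0.0] * n for _ in range(n)]  # the diagonal stays at zero
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            result[i][j] = dist
            result[j][i] = dist  # symmetry: d(j, i) == d(i, j)
    return result


print(dist_matrix_upper([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]))
```

This evaluates N(N-1)/2 distances, whereas the broadcast numpy expression evaluates all N² of them.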

Type Translations

Default mapping

Basic Python types are recursively mapped to rust types, like so:

Python rust
None ()
int i32
np.int32 i32
np.int64 i64
bool bool
float f64
np.float32 f32
np.float64 f64
complex Bound<'py, PyComplex>
np.complex128 Bound<'py, PyComplex>
str String
np.ndarray PyReadonlyArrayDyn
bytes &'py [u8]
bytearray Bound<'py, PyByteArray>
list Vec
set HashSet
frozenset HashSet
dict HashMap
tuple (...)
slice Bound<'py, PySlice>
Any Bound<'py, PyAny>
Self Bound<'py, PyAny>
type Bound<'py, PyType>
*args &Bound<'py, PyTuple>
**kwargs Option<&Bound<'py, PyDict>>
T | None Option<T>
Callable Bound<'py, PyCFunction>
... Bound<'py, PyEllipsis>

Thus, dict[str, list[float]] becomes - by default - HashMap<String, Vec<f64>>.
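
The recursive nature of the mapping can be sketched with typing.get_origin and typing.get_args. The sketch covers only a small subset of the table above, and to_rust, SCALARS and CONTAINERS are hypothetical names; it is not the library's implementation.

```python
from typing import get_args, get_origin

# Small subset of the default table above
SCALARS = {int: "i32", float: "f64", bool: "bool", str: "String"}
CONTAINERS = {list: "Vec", set: "HashSet", frozenset: "HashSet", dict: "HashMap"}


def to_rust(tp) -> str:
    """Translate a (possibly nested) python type hint to a rust type string."""
    if tp in SCALARS:
        return SCALARS[tp]
    origin = get_origin(tp)
    if origin in CONTAINERS:
        # Recurse into the type arguments, e.g. the str and list[float] of a dict
        inner = ", ".join(to_rust(arg) for arg in get_args(tp))
        return f"{CONTAINERS[origin]}<{inner}>"
    if origin is tuple:
        parts = [to_rust(arg) for arg in get_args(tp)]
        # A one-element rust tuple needs a trailing comma
        return f"({parts[0]},)" if len(parts) == 1 else "(" + ", ".join(parts) + ")"
    raise TypeError(f"no mapping in this sketch for {tp!r}")


print(to_rust(dict[str, list[float]]))
print(to_rust(tuple[int, set[str]]))
```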

The only type mapped to something mutable is npt.NDArray (PyReadonlyArrayDyn elements are mutable). To modify a dict, list, set or bytearray in place, override to the corresponding pyo3 type, e.g. PyList (see test_inplace.py).

Callable Types

Passing functions to rust and returning them from rust are both supported, and rust functions can be used interchangeably with python functions and lambdas. Annotate types using Callable[...]; by default this gets mapped to Bound<'py, PyCFunction>.

When returning functions, note that pyo3's PyCFunction type does not intrinsically contain information about the function's argument and return types.

The default mapping doesn't allow for python functions/lambdas to be passed into rust. When passing functions (either python or rust) as arguments, the generic override Bound<'py, PyAny> should be used instead: it has the necessary API (PyAnyMethods implements the call... methods), e.g.:

@rust(py=False)
def use_modulo(f: Annotated[Callable[[int], int], "Bound<'py, PyAny>"], i: int) -> int:
    """
    f.call1((i,))?.extract::<i32>()
    """

See the examples in test_callable.py for more detail.

Configuration

pyo3 version

The pyo3 version can be overridden with the environment variable XENOFORM_RS_PYO3_VERSION. The default - and only supported version - is currently 0.28. Using a different version is not guaranteed to work, and will probably require overrides for all argument and return types.

Location of Extension Modules

By default, compiled modules are placed in an ext subdirectory of your project's root. If this location is unsuitable, it can be overridden using the environment variable XENOFORM_RS_EXTMODULE_ROOT. NB avoid using characters in paths (e.g. space, hyphen) that would not be valid in a python module name.

Free-threaded Interpreter

By default, if the interpreter is free-threaded, extension modules will be built without the GIL. This requires the extension code to be threadsafe. If xenoform-rs detects an environment variable XENOFORM_RS_DISABLE_FT, free-threading is disabled.
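
A sketch of setting these variables from python rather than from the shell. It assumes the variables are read when xenoform_rs is imported or when a function is decorated, so they are set first; the directory name and the value given to XENOFORM_RS_DISABLE_FT are arbitrary choices for illustration. The check for a free-threaded build uses only the standard sysconfig module.

```python
import os
import sysconfig

# Assumption: set these before importing xenoform_rs so they are seen in time
os.environ["XENOFORM_RS_EXTMODULE_ROOT"] = "build_ext"  # hypothetical directory name
os.environ["XENOFORM_RS_DISABLE_FT"] = "1"  # arbitrary value; the text only says the variable is detected


def interpreter_is_free_threaded() -> bool:
    """True when this CPython build was compiled with the GIL disabled."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


print(interpreter_is_free_threaded())
```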

Troubleshooting

The generated module source code is written to src/lib.rs in a module-specific folder (e.g. ext/my_module_ext). Cargo build output is redirected to build.log in that folder. The actual binary will be found in the target/release subfolder.
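
When a build fails, the end of build.log is usually the interesting part. A small helper for printing it might look like the sketch below; tail_build_log is a hypothetical name and the folder is the example path from the paragraph above.

```python
from pathlib import Path


def tail_build_log(module_dir: Path, lines: int = 20) -> list[str]:
    """Return the last `lines` lines of build.log in a module folder, or [] if absent."""
    log = module_dir / "build.log"
    if not log.is_file():
        return []
    return log.read_text(errors="replace").splitlines()[-lines:]


for line in tail_build_log(Path("ext") / "my_module_ext"):
    print(line)
```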

Adding verbose=True to the rust(...) decorator logs the steps taken, with timings, e.g.:

$ uv run examples/loop.py
08:34:22.535 INFO     registering loop_rs_ext.loop_rs.calc_balances_rust (in ext)
08:34:22.597 INFO     module is up-to-date (d4c7165ade6f52c0aa2ef748c4d6e7c4edce201788a65e7b6e29ebde0d480e3e)
08:34:22.597 INFO     imported compiled module loop_rs
08:34:22.598 INFO     redirected calc_balances_rust to compiled function loop_rs._calc_balances_rust
