How to bind (Python + NumPy) with (Rust + Ndarray)
Content
This article will address:
- PyO3 bindings (Rust to Python)
- Python NumPy to Rust Ndarray (no copy, i.e. fast)
- Rust Ndarray to Python NumPy
- Mutable and immutable examples
If you just want to read how to implement the binding, skip the “Background” section.
Background
Plain Python code is not the fastest (yet), which can sometimes make Python the wrong choice for a task. However, Python is still a good choice for building larger data (or other) projects because of all the good packages that exist.
So, how do we deal with this issue of speed? The common way today is to write extensions in C, and most major packages (such as NumPy and PyTorch) are implemented in C, C++ or even CUDA. For a lot of Python people, who are used to working with simple Python code, going all the way to C or C++ is usually not a fun experience.
Luckily, other lower level languages do exist, and one good example of this is the relatively new language Rust. The features of Rust that make it such a killer (IMHO) are:
- Excellent package manager (Cargo).
- No garbage collection for predictable performance.
- Safe way of handling memory (good for newcomers, who make fewer mistakes).
- Easy to get started with thanks to its incredible tooling (e.g. rust-analyzer for VSCode). Though, fighting the compiler can be tedious, if educational, at times.
Getting started
Packages
Writing Rust bindings for Python is surprisingly simple (after a few times of desperately trying). For starters, we are going to rely on a few packages:
Rust
- pyo3
- numpy
- ndarray
Python
- numpy
- maturin
Rust Setup
Start off by creating a new project (you may name it whatever you like). Note, the name chosen will be the name you import in Python.
cargo new --lib rust_numpy_ext
Then add the packages to your Cargo.toml (these were the newest versions when writing this article).
[package]
name = "rust_numpy_ext"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
ndarray = "0.15.3"
numpy = "0.15"
rand = "0.8.5" # Specific for this example
ordered-float = "2.10.0" # Specific for this example

[dependencies.pyo3]
version = "0.15.1"
features = ["extension-module"]
Python Setup
If you don’t know what a Python virtual environment is, please look it up. There are many great tutorials for it online, and the details may vary depending on OS.
We need a virtual environment because the Rust package/module is installed into it. You may create a local environment with the following command (depending on what your Python executable is called).
python3 -m venv ./test_venv
The next step is to activate the environment, which can be done by calling
source ./test_venv/bin/activate
Once activated, you should see the name of your venv in your terminal, typically as a (test_venv) prefix on the prompt.
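If the prompt gives no hint, the interpreter itself can confirm it. The snippet below is only a minimal sketch and not part of the original setup; the helper name in_virtual_env is made up for illustration. It relies on the standard library behaviour that sys.prefix and sys.base_prefix differ inside a venv.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
    print("interpreter:", sys.executable)
```

Running this with the activated python should report the interpreter path inside ./test_venv.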
Now we also need to install our dependencies in our venv. Note, once the virtual environment is activated, your Python may be called by a different name (python3 -> python).
python -m pip install numpy
python -m pip install maturin
python -m pip install matplotlib
Note, matplotlib is only for visualization and is not required for the bindings.
Code
Now we are going to loosely follow the example from PyO3, which covers most things of interest. Since their example already exists, I chose to make a few new “flashy” functions for testing.
A fantastic part of Rust and its community is the number of great packages, which is increasing every day, and they are super duper easy to install. We are going to use the Rust package “numpy”, which lets us easily convert Python “numpy” data into native Rust “ndarray” data. Spectacular!
Let’s dump all of the code and then go through it!
Rust — Creating the Python Module
The Rust side may look scary at first glance. But what we are doing is specifying the Python module that must have the same name as the package. The empty module will look like
#[pymodule]
fn rust_numpy_ext(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    Ok(())
}
Not too scary. The function takes Python and a reference to a PyModule as input and returns a PyResult. No need to think too much about this. You can just copy this and rename the module to the package name. Standard procedure.
Rust — Populating the Python Module
The next step is to create the first function in our Python module. Let’s start with the absolute simplest one of the functions.
#[pyfn(m)]
fn eye<'py>(py: Python<'py>, size: usize) -> &PyArray2<f64> {
    let array = ndarray::Array::eye(size);
    array.into_pyarray(py)
}
This function is defined inside the pymodule function, full code below. The attribute “#[pyfn(m)]” takes our PyModule m as input and adds the function to that module.
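use numpy::{IntoPyArray, PyArray2};
use pyo3::prelude::*;

#[pymodule]
fn rust_numpy_ext(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    // Register `eye` as a function of the Python module `m`.
    #[pyfn(m)]
    fn eye<'py>(py: Python<'py>, size: usize) -> &PyArray2<f64> {
        let array = ndarray::Array::eye(size);
        array.into_pyarray(py)
    }

    Ok(())
}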
Here the packages will handle a lot for us. PyO3 will try and automatically convert the Python arguments into the Rust types. In this case:
Python int -> Rust usize
Every pyfn needs to have Python as input, then the rest of the types. Again, we can see this as a standard procedure for creating our Rust/Python functions.
Next, we use the Rust package ndarray to create the identity matrix (i.e. ones on the diagonal). To then send it back to Python, we need to convert the ndarray into a Python numpy array. Here we are using the package “numpy” in Rust with the trait “IntoPyArray”. The good part of having this as a trait is that the trait can be implemented for other Rust packages as well, such as “nalgebra”. That makes it possible to standardize the procedure of converting Rust data into Python numpy data. Great!
The other functions we are going to write follow the same logic, only with a few more operations behind the scenes. One of the functions mutates the numpy array in place, which can be more efficient in some situations. The other takes the numpy array as read only, making it impossible to mutate the input data.
Rust — Populating the Python Module with NumPy inputs
Read only
First example is to take a numpy array as input that is not mutable. If you are following pure functional programming this is great! Now we can ensure no mutations to our array (different from Python/C/C++).
#[pyfn(m)]
fn max_min<'py>(py: Python<'py>, x: PyReadonlyArrayDyn<f64>) -> &'py PyArray1<f64> {
    let array = x.as_array();
    let result_array = rust_fn::max_min(&array);
    result_array.into_pyarray(py)
}
We convert the input numpy array “x” into an ndarray “ArrayView” using “.as_array()”, meaning the “array” variable will not be mutable. Then we can call our Rust function max_min, which takes an ndarray as input, finds the maximum and minimum values and returns them as an ndarray [max, min].
Great! Then, as already discussed, we can convert it back into Python numpy array using the trait “IntoPyArray”. Now we can return the result back to Python!
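To have something to compare the binding against, here is a NumPy-only sketch of the behaviour described above. The name max_min_reference is hypothetical and this is not the Rust implementation; it only mirrors the stated contract of returning [max, min] as a 1-D float64 array. The input is also marked read-only on the Python side to illustrate the same "no mutation" idea.

```python
import numpy as np


def max_min_reference(x: np.ndarray) -> np.ndarray:
    """NumPy stand-in for the described behaviour: return [max, min] as float64."""
    return np.array([x.max(), x.min()], dtype=np.float64)


x = np.array([[3.0, -1.5], [7.25, 0.0]])
x.setflags(write=False)  # mimic read-only access on the Python side

result = max_min_reference(x)
print(result)

# Writing to a read-only NumPy array is rejected with a ValueError.
try:
    x[0, 0] = 100.0
except ValueError as err:
    print("write rejected:", err)
```

Once the module is built, the output of the Rust max_min can be compared with this reference using np.allclose.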
Mutable
In the next example, we will take a mutable numpy array, loop over it, double every value and add a random perturbation to it. It is very similar to the previous example. The difference is that when converting the numpy array into an ndarray, we must use “as_array_mut()” to make it mutable. This is an unsafe operation, because the Rust compiler cannot check that nothing else is reading or writing the same numpy data at the same time.
#[pyfn(m)]
fn double_and_random_perturbation(
    _py: Python<'_>,
    x: &PyArrayDyn<f64>,
    perturbation_scaling: f64,
) {
    let mut array = unsafe { x.as_array_mut() };
    rust_fn::double_and_random_perturbation(&mut array, perturbation_scaling);
}
We can now call our Rust function that mutates the ndarray. We do not need to return the array, because the ndarray and the numpy array share the same data, so the mutations on the ndarray happen directly on the numpy array.
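The same "shared data, nothing returned" pattern can be illustrated in plain NumPy. The sketch below is a stand-in and not the Rust code: the function name is hypothetical, and the noise distribution (uniform in [0, 1), scaled by perturbation_scaling) is an assumption, since the Rust implementation is not shown here.

```python
import numpy as np


def double_and_perturb_reference(
    x: np.ndarray, perturbation_scaling: float, rng: np.random.Generator
) -> None:
    """In-place stand-in: x <- 2 * x + scaling * noise. Returns nothing."""
    # ASSUMPTION: uniform noise in [0, 1); the Rust function may use another distribution.
    x *= 2.0
    x += perturbation_scaling * rng.random(x.shape)


rng = np.random.default_rng(seed=0)
data = np.linspace(0.0, 1.0, 5)
view = data[:]          # a view on the same buffer, like the ndarray view in Rust
original = data.copy()  # an independent copy, kept for comparison

double_and_perturb_reference(data, 0.1, rng)

print("shares memory:", np.shares_memory(data, view))
print("view sees the update:", np.array_equal(view, data))
print("copy is unchanged:", np.array_equal(original, np.linspace(0.0, 1.0, 5)))
```

The view observes every change made through data, while the explicit copy does not. This is the behaviour to expect from the numpy array after calling the Rust function.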
Building/Installing the Package
Make sure you have your venv activated before building
In order to be able to import the package from Python, we must build it. We will use “maturin” for this. In the root dir of your Rust package run
maturin develop
Once you are happy with the package, you can run with the release flag to get a much more optimized build (i.e. your Rust code runs much faster).
maturin develop --release
Running the maturin command will make the module importable in Python.
The module can be imported from any Python file as long as you use the same venv when running Python. Meaning, your Python project can be separated from your Rust module, which is great!
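A quick way to confirm that the build landed in the interpreter you are actually running is to ask the import system. This is a small standard-library sketch; the helper name module_is_importable is hypothetical.

```python
import importlib.util
import sys


def module_is_importable(name: str) -> bool:
    """Return True if the current interpreter can find a top-level module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    name = "rust_numpy_ext"  # the module name used in this article
    if module_is_importable(name):
        print(f"{name} is importable from {sys.executable}")
    else:
        print(f"{name} not found; activate the venv and run `maturin develop`")
```

If the module is reported as missing, the usual cause is running a Python from outside the venv that maturin installed into.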
Python — Test our Module
Make sure you have your venv activated before running
Importing the Package
From the Python side, things behave as expected. We can import our module as
import rust_numpy_ext as RNE
or, if we only want one function we can do
from rust_numpy_ext import double_and_random_perturbation
The only issue we have is that the Python linter (at least in VSCode) will not autocomplete our functions, and we cannot see what argument(s) a function wants. To solve this, we can wrap our functions in Python functions. These Python side functions can also handle our type conversions if we were to have non-default types for the numpy array (e.g. u8, f32 etc.).
A wrapper function is a plain Python function with a signature and a docstring that forwards the call to the Rust function. Note, you may wrap all of your functions in the same file. Wrapping the module makes it a first-class citizen of Python that behaves natively, just like numpy.
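As a sketch of this wrapping idea, a wrapper file could look roughly like the following. The file name rne_wrapper.py and the helper _require_module are hypothetical, and the dtype handling is one possible design rather than the only one. The read-only function converts its input to float64, while the mutating function refuses anything that is not already a writeable float64 array, because a converted copy would silently swallow the in-place update.

```python
"""rne_wrapper.py: hypothetical typed wrappers around the compiled module."""
import numpy as np

try:
    import rust_numpy_ext as _rne  # available after `maturin develop`
except ImportError:  # lets editors load this file before the module is built
    _rne = None


def _require_module():
    """Return the compiled module or explain how to build it."""
    if _rne is None:
        raise ImportError("rust_numpy_ext is not built; run `maturin develop` in the venv")
    return _rne


def max_min(x: np.ndarray) -> np.ndarray:
    """Return a 1-D float64 array [max, min] of `x`.

    The input is only read, so converting it to float64 first is harmless.
    """
    return _require_module().max_min(np.asarray(x, dtype=np.float64))


def double_and_random_perturbation(x: np.ndarray, perturbation_scaling: float) -> None:
    """Double `x` in place and add a scaled random perturbation.

    `x` must already be a writeable float64 array: converting it here would
    create a copy, and the caller would never see the result.
    """
    if not isinstance(x, np.ndarray) or x.dtype != np.float64:
        raise TypeError("x must be a float64 numpy array")
    if not x.flags.writeable:
        raise ValueError("x must be writeable")
    _require_module().double_and_random_perturbation(x, float(perturbation_scaling))
```

With such a file in place, `from rne_wrapper import max_min` gives the editor a signature and docstring to show, while the actual work still happens in Rust.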
Running Simple Tests
Now, let’s test that the functions we have written work as they should. The test script (main.py) can be anywhere, as long as the correct venv is activated when it runs. It should run without any problems, and you should get a plot where we can see the doubling and perturbations of the numpy array.
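The original main.py is not reproduced in this text, so the following is only a sketch of what such a smoke test might look like. The helper run_checks is hypothetical; it takes the two functions as arguments so the same checks can be run against the Rust module or against any stand-in. It only assumes what the article states: max_min returns [max, min], and double_and_random_perturbation changes its input in place.

```python
"""main.py: a hypothetical smoke test for the two NumPy-facing functions."""
import numpy as np


def run_checks(max_min, double_and_random_perturbation, scaling=0.1):
    """Exercise both functions and return the array before and after mutation."""
    x = np.linspace(-1.0, 1.0, 50)

    # Read-only path: expect a 1-D array [max, min].
    result = np.asarray(max_min(x))
    assert result.shape == (2,)
    assert result[0] == x.max() and result[1] == x.min()

    # Mutable path: the call returns nothing and changes `x` in place.
    before = x.copy()
    double_and_random_perturbation(x, scaling)
    assert not np.array_equal(x, before)
    return before, x


if __name__ == "__main__":
    try:
        import rust_numpy_ext as rne
    except ImportError:
        print("rust_numpy_ext not found; activate the venv and run `maturin develop`")
    else:
        before, after = run_checks(rne.max_min, rne.double_and_random_perturbation)
        print("all checks passed")

        # Optional visualization, only needed for the plot.
        import matplotlib.pyplot as plt

        plt.plot(before, label="before")
        plt.plot(after, label="after")
        plt.legend()
        plt.show()
```

The plot should show the "after" curve at roughly twice the "before" values, with small random offsets on top.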
Conclusion
Working with Rust and Python is great, with a lot of work being done by PyO3 and other helper packages. From experience, I find working with Rust much more enjoyable than writing C/C++ extensions. Though, PyTorch has done a great job of enabling C++ extensions, so that is a bit of an exception (dependent on pybind11).
In the end, it all comes down to preference, and what expertise the team has. Rust is modern and is getting a lot of popularity recently, but C/C++ are well established in the industry and are hard to compete with.
Best way to support me if you like this article is to clap and/or follow me! If you are super kind you will share it with your friends. But honestly, I’m glad if you just made it this far, enjoyed it, and most importantly learnt something.
Cheers!