🚫🐍: Using Python without using Python - Part 2
Part 2, better late than never! This is a follow-up to “Using Python without using Python - Part 1”; here we look at a few more methods for using Python as a wrapper/orchestrator to integrate code written in other languages.
CFFI: a ctypes alternative
Continuing where we left off, we explore libffi via the CFFI Python library for calling C code using a foreign function interface.
I find the CFFI documentation relatively difficult to follow, but the purported benefit of CFFI is that you avoid learning new syntaxes and APIs for bridging C and Python: you interact directly with C (note: not C++) libraries by essentially copy-pasting their header declarations into Python. Anecdotally, this all proved true in this example, barring the fact that strings in Python 3 need to be explicitly encoded/decoded to/from bytes, as we will see below.
As before, we start by installing a Python library (again assuming use of a conda
environment):
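A minimal install, assuming pip is available inside the active conda environment (CFFI can also be installed with conda directly):

```bash
pip install cffi
```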
In this case, since CFFI supports C but not C++, the header and source files look a little different than in the previous post, mostly because of the need to manage C strings:
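Something along the following lines; the file names (greeter.h / greeter.c) and the greet function are illustrative reconstructions rather than the exact original code:

```c
/* greeter.h */
char *greet(const char *name);

/* greeter.c */
#include <stdio.h>
#include "greeter.h"

char *greet(const char *name) {
    /* Write into a static buffer so the caller does not need to free anything;
       simple, but not reentrant or thread-safe (one of many C-string footguns). */
    static char buffer[256];
    snprintf(buffer, sizeof(buffer), "Hello, %s!", name);
    return buffer;
}
```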
The above is more cumbersome than the equivalent C++, but seems to work (disclaimer: I have fortunately avoided string processing in C in my career, and my understanding is that there are many footguns).
The new part specific to CFFI here is below. The code is relatively self-explanatory barring the fact that the char*
arguments and the return need to be wrangled into the proper formats for Python to interpret them as strings.
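A sketch of what that looks like, using CFFI’s “API mode”; the module name _greeter and the file name greeter_cffi.py are made up for illustration:

```python
# greeter_cffi.py
from cffi import FFI

ffibuilder = FFI()

# Declarations essentially copy-pasted from greeter.h.
ffibuilder.cdef("char *greet(const char *name);")

# How to build the extension module: include the header and compile the C source.
ffibuilder.set_source(
    "_greeter",
    '#include "greeter.h"',
    sources=["greeter.c"],
)

if __name__ == "__main__":
    # Compile the interface (this writes _greeter.c, _greeter.*.so, etc. locally).
    ffibuilder.compile(verbose=True)

    # Import the freshly built module and call into C, encoding/decoding between
    # Python str and bytes at the boundary.
    from _greeter import ffi, lib

    result = lib.greet("world".encode("utf-8"))  # str -> bytes for the char* argument
    print(ffi.string(result).decode("utf-8"))    # returned char* -> bytes -> str
```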
Executing the code directly will compile the interface (generating some files in the local directory), then import it and use it:
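For example (assuming the file is named as above):

```bash
python greeter_cffi.py
```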
As a note, I’m being lazy here but it is of course possible and advisable to compile the interface separately from the time of use.
PyO3: something else
Thus far, we have looked only at using C/C++ from Python, but as all the cool kids will tell you, C/C++ is a dead end for systems programming and in the future we will all be writing our software using Rust or Zig or the blockchain. Luckily, interfacing Python and Rust is maybe even easier than interfacing Python and C++, which we will explore here. Interfacing Python and web3 is left as an exercise for the reader.
We previously tacitly assumed the availability of C/C++ compilers and relevant system libraries; we will do the same here for Rust, which we will use with PyO3, Rust bindings for Python. We follow the advice of the PyO3 user guide, which recommends starting with the maturin
Python library for building such bindings:
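For instance:

```bash
pip install maturin
```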
More-or-less following the guide, we use maturin
to create a new package for our Rust-based greeter:
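One way to do this (the package name greeter-pyo3 is arbitrary):

```bash
mkdir greeter-pyo3 && cd greeter-pyo3
maturin init --bindings pyo3
```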
As part of the init
, this creates a src/lib.rs
in the (now) current working directory with an example function. Actually, conveniently (by design) init
sets up the rest of the relevant build/packaging files as well, but we won’t touch those. We slightly modify the generated Rust code to the following:
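Roughly the following; this is a sketch based on the template maturin generates, with the example function swapped for a greet function. The module name must match the package name from the init step, and the exact #[pymodule] signature may differ slightly depending on your PyO3 version:

```rust
// src/lib.rs
use pyo3::prelude::*;

/// Return a greeting for the given name.
#[pyfunction]
fn greet(name: &str) -> PyResult<String> {
    Ok(format!("Hello, {}!", name))
}

/// The Python module itself; functions added here become importable from Python.
#[pymodule]
fn greeter_pyo3(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(greet, m)?)?;
    Ok(())
}
```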
If you’re not familiar with Rust, the greet
function above probably has some unfamiliar types that warrant further reading, but rest assured: it works.
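To try it out, maturin can build and install the module into the active environment, after which it imports like any other Python package (module name as assumed above):

```bash
maturin develop
python -c "import greeter_pyo3; print(greeter_pyo3.greet('world'))"
```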
gRPC: and now for something completely different
Everything we have looked at so far has centered on compiling libraries for use by Python: data passes from code written in Python to code written in another language, with that code loaded as a library into the original Python interpreter. A fundamentally different approach is a Remote Procedure Call (RPC), with gRPC being the RPC framework I’m most familiar with.
With gRPC, rather than communicating with a library written in another language that is loaded into the same process, the basic idea is to send data outside of the current “client” process to a “server” process that receives the data, performs some computation, and typically returns some result. This post is not at all a full gRPC tutorial, but the highlights here are:
- The client and server can be written in separate programming languages (in fact, multiple clients in different languages can talk to the same server).
- The data sent back and forth has to be completely serialized to bytes when sent and then deserialized when received, which is much slower than passing around data “in-place” in memory.
- The complexity of sending data around is managed behind-the-scenes.
As an example here, we will spin up a simple Rust gRPC server and Python client for communicating with it. Once again, we assume Rust is already installed. For our gRPC Rust library we will use tonic, which requires that the protocol buffer compiler (protoc) be installed (on Mac, with Homebrew: brew install protobuf
).
To begin, we will set up a Rust project to run our server. For simplicity, we will call it greet-server
and run all successive commands within the greet-server
directory:
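For example:

```bash
cargo new greet-server
cd greet-server
```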
Before jumping into the Rust, we set up a proto
subdirectory and a gRPC service definition that defines the method our service provides (Greet
) as well as the required input (GreetRequest
) and output (GreetReply
):
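A sketch of the service definition; the file path (proto/greeter.proto), package name, and service name are illustrative:

```protobuf
// proto/greeter.proto
syntax = "proto3";

package greeter;

// The service exposes a single method, Greet.
service Greeter {
  rpc Greet (GreetRequest) returns (GreetReply);
}

// Input: the name to greet.
message GreetRequest {
  string name = 1;
}

// Output: the assembled greeting.
message GreetReply {
  string message = 1;
}
```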
In a nutshell, the above proto file just defines, in a language-agnostic manner, the input and output such that we can use the protocol buffer compiler to generate language-specific types for Rust (which we will use for our server) and Python (which we will use for our client).
With the proto defined as above, we now set up the Rust project such that it knows how to compile and use the Rust-specific types generated by the protocol buffer compiler:
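One way to pull in the dependencies (exact versions omitted; tonic-build is needed as a build dependency so the proto can be compiled at build time):

```bash
cargo add tonic prost
cargo add tokio --features macros,rt-multi-thread
cargo add --build tonic-build
```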
Here, tokio
and prost
are required additional dependencies to use tonic
to build an asynchronous gRPC server based on the proto file we just defined. To do this, we define a build.rs
file in the current working directory that indicates to tonic that we would like to compile our proto:
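A minimal build.rs along these lines (the path matches the proto file sketched above):

```rust
// build.rs
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Generate Rust types and service traits from the proto at build time.
    tonic_build::compile_protos("proto/greeter.proto")?;
    Ok(())
}
```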
From here, we can implement the main body of our service which consists of two pieces:
- We must define in Rust how we want to transform a GreetRequest into a GreetReply when Greet is called.
- We must spin up a server to run our service.
This can be seen below:
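A sketch of src/main.rs, assuming the proto sketched above (the generated module and type names follow from the package, service, and message names defined there):

```rust
// src/main.rs
use tonic::{transport::Server, Request, Response, Status};

// Include the Rust code that tonic-build generated from proto/greeter.proto.
pub mod greeter {
    tonic::include_proto!("greeter");
}

use greeter::greeter_server::{Greeter, GreeterServer};
use greeter::{GreetReply, GreetRequest};

#[derive(Default)]
pub struct MyGreeter {}

#[tonic::async_trait]
impl Greeter for MyGreeter {
    // Piece 1: turn a GreetRequest into a GreetReply.
    async fn greet(
        &self,
        request: Request<GreetRequest>,
    ) -> Result<Response<GreetReply>, Status> {
        let reply = GreetReply {
            message: format!("Hello, {}!", request.into_inner().name),
        };
        Ok(Response::new(reply))
    }
}

// Piece 2: spin up the server and serve the Greeter service.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "127.0.0.1:50051".parse()?;
    Server::builder()
        .add_service(GreeterServer::new(MyGreeter::default()))
        .serve(addr)
        .await?;
    Ok(())
}
```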
OK! That finishes the implementation of the service itself, which is roughly analogous to what we have done previously for wrappers: we had to define the implementation of the function in the language that we want to implement it in. It remains to actually invoke this function (via gRPC) using Python. To accomplish this, we must compile the Python-language interface for the gRPC service that we defined previously:
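Using the standard Python gRPC tooling against the same proto file (the generated modules will be greeter_pb2.py and greeter_pb2_grpc.py):

```bash
pip install grpcio grpcio-tools
python -m grpc_tools.protoc -Iproto --python_out=. --grpc_python_out=. proto/greeter.proto
```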
At this point, we can tell our Python script where to find the Rust server and invoke Greet
:
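A minimal client sketch, assuming the proto and server above (the script name greet_client.py is arbitrary):

```python
# greet_client.py
import grpc

import greeter_pb2
import greeter_pb2_grpc


def main() -> None:
    # The address must match the one the Rust server binds to.
    with grpc.insecure_channel("127.0.0.1:50051") as channel:
        stub = greeter_pb2_grpc.GreeterStub(channel)
        reply = stub.Greet(greeter_pb2.GreetRequest(name="world"))
        print(reply.message)


if __name__ == "__main__":
    main()
```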
To see this in action, we start up the Rust server that we previously implemented and, while it is running, run the Python script we just wrote:
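For example, in one terminal:

```bash
cargo run
```

and in a second terminal (using the illustrative script name from above):

```bash
python greet_client.py
```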
Ta-da!
Mojo 🔥: deferred, for now
I originally said that we would be looking at Mojo 🔥
as part of this series of posts, but I have struggled to find anything worthwhile to say about it at this level of investigation. This is not a slight against Mojo – it just doesn’t have much to offer for simple string processing helpers.
Parting Thoughts
Compared to the previous post, we’ve seen a lot less new syntax this time around, which is nice – SWIG, Cython, and C extensions just feel pretty cumbersome, whereas CFFI, PyO3, and cppyy (previous post) hide a lot of this difficulty (and gRPC is obviously a completely different beast). That said, we have not done any sort of discussion of computational efficiency or profiling here, so it is not entirely fair to judge these tools based on their syntax alone. At the end of the day, it is generally easiest to minimize the “surface area” between languages when using bindings such as these, so setting up the wrappers for passing information around is ideally a “one-and-done” operation and not something developers are doing every day. As such, it can make a lot of sense to accept the pain of figuring out a gross domain-specific language every once in a while if there are other benefits; I simply don’t know how these different tools compare on performance.
The most flexible but least performant solution here, in my experience, is gRPC, and I have not seen it employed in my career outside of my time at Alphabet. To be fair, I’ve not seen Python seriously employed for any truly performance-critical pieces of code except e.g., to drop into CUDA via cupy
(or pytorch
or tensorflow
). From a usability standpoint, I’ve spent fair amounts of time debugging memory-management issues in vanilla C extensions and more time than I care to admit figuring out ctypes
bindings (not covered in this series) for pre-compiled libraries, and view the complexity of those solutions as an important factor to consider. That said, I don’t think there’s a universal recommendation that can be made for how to best interface Python with other languages without a careful understanding of what “the point” is. I’m open to hearing opinions if anyone feels strongly!