| CXX — safe FFI between Rust and C++ |
| ========================================= |
| |
| [<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx) |
| [<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx) |
| [<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx) |
| [<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster) |
| |
| This library provides a **safe** mechanism for calling C++ code from Rust and |
| Rust code from C++, not subject to the many ways that things can go wrong when |
| using bindgen or cbindgen to generate unsafe C-style bindings. |
| |
| This doesn't change the fact that 100% of C++ code is unsafe. When auditing a |
| project, you would be on the hook for auditing all the unsafe Rust code and |
| *all* the C++ code. The core safety claim under this new model is that auditing |
| just the C++ side would be sufficient to catch all problems, i.e. the Rust side |
| can be 100% safe. |
| |
| ```toml |
| [dependencies] |
| cxx = "1.0" |
| |
| [build-dependencies] |
| cxx-build = "1.0" |
| ``` |
| |
| *Compiler support: requires rustc 1.48+ and C++11 or newer*<br> |
| *[Release notes](https://github.com/dtolnay/cxx/releases)* |
| |
| <br> |
| |
| ## Guide |
| |
| Please see **<https://cxx.rs>** for a tutorial, reference material, and example |
| code. |
| |
| <br> |
| |
| ## Overview |
| |
| The idea is that we define the signatures of both sides of our FFI boundary |
| embedded together in one Rust module (the next section shows an example). From |
| this, CXX receives a complete picture of the boundary to perform static analyses |
| against the types and function signatures to uphold both Rust's and C++'s |
| invariants and requirements. |
| |
| If everything checks out statically, then CXX uses a pair of code generators to |
| emit the relevant `extern "C"` signatures on both sides together with any |
| necessary static assertions for later in the build process to verify |
| correctness. On the Rust side this code generator is simply an attribute |
| procedural macro. On the C++ side it can be a small Cargo build script if your |
| build is managed by Cargo. For other build systems like Bazel or Buck, we |
| provide a command line tool that generates the header and source file and |
| should be easy to integrate. |
| |
| The resulting FFI bridge operates at zero or negligible overhead, i.e. no |
| copying, no serialization, no memory allocation, no runtime checks needed. |
| |
| The FFI signatures are able to use native types from whichever side they please, |
| such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s |
| `std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination. |
| CXX guarantees an ABI-compatible signature that both sides understand, based on |
| builtin bindings for key standard library types to expose an idiomatic API on |
| those types to the other language. For example when manipulating a C++ string |
| from Rust, its `len()` method becomes a call of the `size()` member function |
| defined by C++; when manipulating a Rust string from C++, its `size()` member |
| function calls Rust's `len()`. |
| |
| <br> |
| |
| ## Example |
| |
| In this example we are writing a Rust application that wishes to take advantage |
| of an existing C++ client for a large-file blobstore service. The blobstore |
| supports a `put` operation for a discontiguous buffer upload. For example we |
| might be uploading snapshots of a circular buffer which would tend to consist of |
| 2 chunks, or fragments of a file spread across memory for some other reason. |
| |
| A runnable version of this example is provided under the *demo* directory of |
| this repo. To try it out, run `cargo run` from that directory. |
| |
| ```rust |
| #[cxx::bridge] |
| mod ffi { |
|     // Any shared structs, whose fields will be visible to both languages. |
|     struct BlobMetadata { |
|         size: usize, |
|         tags: Vec<String>, |
|     } |
| |
|     extern "Rust" { |
|         // Zero or more opaque types which both languages can pass around but |
|         // only Rust can see the fields. |
|         type MultiBuf; |
| |
|         // Functions implemented in Rust. |
|         fn next_chunk(buf: &mut MultiBuf) -> &[u8]; |
|     } |
| |
|     unsafe extern "C++" { |
|         // One or more headers with the matching C++ declarations. Our code |
|         // generators don't read it but it gets #include'd and used in static |
|         // assertions to ensure our picture of the FFI boundary is accurate. |
|         include!("demo/include/blobstore.h"); |
| |
|         // Zero or more opaque types which both languages can pass around but |
|         // only C++ can see the fields. |
|         type BlobstoreClient; |
| |
|         // Functions implemented in C++. |
|         fn new_blobstore_client() -> UniquePtr<BlobstoreClient>; |
|         fn put(&self, parts: &mut MultiBuf) -> u64; |
|         fn tag(&self, blobid: u64, tag: &str); |
|         fn metadata(&self, blobid: u64) -> BlobMetadata; |
|     } |
| } |
| ``` |
| |
| Now we simply provide Rust definitions of all the things in the `extern "Rust"` |
| block and C++ definitions of all the things in the `extern "C++"` block, and get |
| to call back and forth safely. |
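
For the `extern "Rust"` half of the bridge, the Rust definitions might look roughly like the sketch below. The field names `chunks` and `pos` are assumptions made for illustration; the demo sources linked in this section are the authoritative version. The code is plain Rust and does not depend on the bridge macro, so it can be read (and compiled) on its own.

```rust
// Hypothetical layout of the opaque Rust type: `chunks` and `pos` are
// illustrative field names, not something the bridge requires.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

// Matches the bridge signature `fn next_chunk(buf: &mut MultiBuf) -> &[u8]`.
// Hands out one chunk per call and an empty slice once all chunks are used.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
```

A caller on either side of the boundary would keep calling `next_chunk` until it receives an empty slice, which is how a discontiguous upload can be streamed without copying the chunks into one buffer.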
| |
| Here are links to the complete set of source files involved in the demo: |
| |
| - [demo/src/main.rs](demo/src/main.rs) |
| - [demo/build.rs](demo/build.rs) |
| - [demo/include/blobstore.h](demo/include/blobstore.h) |
| - [demo/src/blobstore.cc](demo/src/blobstore.cc) |
| |
| To look at the code generated in both languages for the example by the CXX code |
| generators: |
| |
| ```console |
| # run Rust code generator and print to stdout |
| # (requires https://github.com/dtolnay/cargo-expand) |
| $ cargo expand --manifest-path demo/Cargo.toml |
| |
| # run C++ code generator and print to stdout |
| $ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs |
| ``` |
| |
| <br> |
| |
| ## Details |
| |
| As seen in the example, the language of the FFI boundary involves 3 kinds of |
| items: |
| |
| - **Shared structs** — their fields are made visible to both languages. |
| The definition written within cxx::bridge is the single source of truth. |
| |
| - **Opaque types** — their fields are secret from the other language. |
| These cannot be passed across the FFI by value but only behind an indirection, |
| such as a reference `&`, a Rust `Box`, or a `UniquePtr`. An opaque type can be |
| a type alias for an arbitrarily complicated generic language-specific type, |
| depending on your use case. |
| |
| - **Functions** — implemented in either language, callable from the other |
| language. |
| |
| Within the `extern "Rust"` part of the CXX bridge we list the types and |
| functions for which Rust is the source of truth. These all implicitly refer to |
| the `super` module, the parent module of the CXX bridge. You can think of the |
| two items listed in the example above as being like `use super::MultiBuf` and |
| `use super::next_chunk` except re-exported to C++. The parent module will either |
| contain the definitions directly for simple things, or contain the relevant |
| `use` statements to bring them into scope from elsewhere. |
| |
| Within the `extern "C++"` part, we list types and functions for which C++ is the |
| source of truth, as well as the header(s) that declare those APIs. In the future |
| it's possible that this section could be generated bindgen-style from the |
| headers but for now we need the signatures written out; static assertions will |
| verify that they are accurate. |
| |
| Your function implementations themselves, whether in C++ or Rust, *do not* need |
| to be declared with the `extern "C"` ABI or marked `#[no_mangle]`. CXX will put |
| in the right shims where necessary to make it all work. |
| |
| <br> |
| |
| ## Comparison vs bindgen and cbindgen |
| |
| Notice that with CXX there is repetition of all the function signatures: they |
| are typed out once where the implementation is defined (in C++ or Rust) and |
| again inside the cxx::bridge module, though compile-time assertions guarantee |
| these are kept in sync. This is different from [bindgen] and [cbindgen] where |
| function signatures are typed by a human once and the tool consumes them in one |
| language and emits them in the other language. |
| |
| [bindgen]: https://github.com/rust-lang/rust-bindgen |
| [cbindgen]: https://github.com/eqrion/cbindgen/ |
| |
| This is because CXX fills a somewhat different role. It is a lower level tool |
| than bindgen or cbindgen in a sense; you can think of it as being a replacement |
| for the concept of `extern "C"` signatures as we know them, rather than a |
| replacement for a bindgen. It would be reasonable to build a higher level |
| bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module |
| (and/or IDL like Thrift) as source of truth and generates the cxx::bridge, |
| eliminating the repetition while leveraging the static analysis safety |
| guarantees of CXX. |
| |
| But note in other ways CXX is higher level than the bindgens, with rich support |
| for common standard library types. Frequently with bindgen when we are dealing |
| with an idiomatic C++ API we would end up manually wrapping that API in C-style |
| raw pointer functions, applying bindgen to get unsafe raw pointer Rust |
| functions, and replicating the API again to expose those idiomatically in Rust. |
| That's a much worse form of repetition because it is unsafe all the way through. |
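
The manual pattern described above can be sketched as follows. This is an illustration under stated assumptions, not code from this project: `blob_checksum` and `checksum` are hypothetical names, and the C-style function is implemented in Rust here only so that the example is self-contained. In a real project it would be a C++ wrapper that Rust merely declares.

```rust
// Stand-in for a C-style wrapper around an idiomatic C++ API (hypothetical).
// The signature carries no information about ownership or buffer validity.
unsafe extern "C" fn blob_checksum(data: *const u8, len: usize) -> u64 {
    // Nothing verifies that `data` really points at `len` readable bytes.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

// Hand-written safe wrapper that restores an idiomatic Rust signature.
// Every such wrapper has to be written and audited by a human.
pub fn checksum(data: &[u8]) -> u64 {
    // SAFETY: the pointer and length both come from a valid slice.
    unsafe { blob_checksum(data.as_ptr(), data.len()) }
}
```

Each layer repeats the same API, and the correctness of the `unsafe` blocks rests entirely on the author keeping the raw signature and the wrapper in agreement.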
| |
| By using a CXX bridge as the shared understanding between the languages, rather |
| than `extern "C"` C-style signatures as the shared understanding, common FFI use |
| cases become expressible using 100% safe code. |
| |
| It would also be reasonable to mix and match, using CXX bridge for the 95% of |
| your FFI that is straightforward and doing the remaining few oddball signatures |
| the old fashioned way with bindgen and cbindgen, if for some reason CXX's static |
| restrictions get in the way. Please file an issue if you end up taking this |
| approach so that we know what ways it would be worthwhile to make the tool more |
| expressive. |
| |
| <br> |
| |
| ## Cargo-based setup |
| |
| For builds that are orchestrated by Cargo, you will use a build script that runs |
| CXX's C++ code generator and compiles the resulting C++ code along with any |
| other C++ code for your crate. |
| |
| The canonical build script is as follows. The indicated line returns a |
| [`cc::Build`] instance (from the usual widely used `cc` crate) on which you can |
| set up any additional source files and compiler flags as normal. |
| |
| [`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html |
| |
| ```toml |
| # Cargo.toml |
| |
| [build-dependencies] |
| cxx-build = "1.0" |
| ``` |
| |
| ```rust |
| // build.rs |
| |
| fn main() { |
|     cxx_build::bridge("src/main.rs")  // returns a cc::Build |
|         .file("src/demo.cc") |
|         .flag_if_supported("-std=c++11") |
|         .compile("cxxbridge-demo"); |
| |
|     println!("cargo:rerun-if-changed=src/main.rs"); |
|     println!("cargo:rerun-if-changed=src/demo.cc"); |
|     println!("cargo:rerun-if-changed=include/demo.h"); |
| } |
| ``` |
| |
| <br> |
| |
| ## Non-Cargo setup |
| |
| For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of |
| invoking the C++ code generator as a standalone command line tool. The tool is |
| packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the |
| *gen/cmd* directory of this repo. |
| |
| ```console |
| $ cargo install cxxbridge-cmd |
| |
| $ cxxbridge src/main.rs --header > path/to/mybridge.h |
| $ cxxbridge src/main.rs > path/to/mybridge.cc |
| ``` |
| |
| <br> |
| |
| ## Safety |
| |
| Be aware that the design of this library is intentionally restrictive and |
| opinionated! It isn't a goal to be powerful enough to handle arbitrary |
| signatures in either language. Instead this project is about carving out a |
| reasonably expressive set of functionality about which we can make useful safety |
| guarantees today and maybe extend over time. You may find that it takes some |
| practice to use CXX bridge effectively as it won't work in all the ways that you |
| are used to. |
| |
| Some of the considerations that go into ensuring safety are: |
| |
| - By design, our paired code generators work together to control both sides of |
| the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is |
| unsafe because the Rust compiler has no way to know whether the signatures |
| you've written actually match the signatures implemented in the other |
| language. With CXX we achieve that visibility and know what's on the other |
| side. |
| |
| - Our static analysis detects and prevents passing types by value that shouldn't |
| be passed by value from C++ to Rust, for example because they may contain |
| internal pointers that would be invalidated by Rust's move behavior. |
| |
| - To many people's surprise, it is possible to have a struct in Rust and a |
| struct in C++ with exactly the same layout / fields / alignment / everything, |
| and still not the same ABI when passed by value. This is a longstanding |
| bindgen bug that leads to segfaults in absolutely correct-looking code |
| ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the |
| necessary zero-cost workaround transparently where needed, so go ahead and |
| pass your structs by value without worries. This is made possible by owning |
| both sides of the boundary rather than just one. |
| |
| - Template instantiations: for example, to expose a `UniquePtr<T>` type in Rust |
| backed by a real C++ `std::unique_ptr`, we use a Rust trait to connect the |
| behavior back to the template instantiations performed by the other language. |
| |
| [rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778 |
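
The internal-pointer hazard from the second consideration above can be reproduced in plain Rust, without any C++. The sketch below uses a hypothetical `SelfRef` type that mimics a C++ object holding a pointer into its own storage. Rust relocates values with a bitwise copy and never runs a move constructor, so swapping two such values leaves each stored pointer aimed at the other object's storage. The pointers are only compared, never dereferenced.

```rust
/// Hypothetical type mimicking a C++ object with an internal pointer.
struct SelfRef {
    data: [u8; 4],
    /// Intended to always point at `data` of the same object.
    ptr: *const u8,
}

impl SelfRef {
    fn points_at_own_data(&self) -> bool {
        self.ptr == self.data.as_ptr()
    }
}

/// Returns whether the invariant held (for all items before the swap,
/// for any item after the swap).
fn swap_breaks_internal_pointers() -> (bool, bool) {
    let mut pair = [
        SelfRef { data: [1; 4], ptr: std::ptr::null() },
        SelfRef { data: [2; 4], ptr: std::ptr::null() },
    ];
    // Establish the invariant in place, as a C++ constructor would.
    for item in pair.iter_mut() {
        item.ptr = item.data.as_ptr();
    }
    let before = pair.iter().all(SelfRef::points_at_own_data);

    // A Rust move is a plain byte copy: the stored addresses travel with
    // the bytes and now refer to the other element's storage.
    pair.swap(0, 1);
    let after = pair.iter().any(SelfRef::points_at_own_data);

    (before, after)
}
```

Because no hook runs when a value is relocated, a type like this is only sound behind an indirection such as a reference or `UniquePtr`, which is the restriction the static analysis enforces.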
| |
| <br> |
| |
| ## Builtin types |
| |
| In addition to all the primitive types (i32 <=> int32_t), the following |
| common types may be used in the fields of shared structs and the arguments and |
| returns of functions. |
| |
| <table> |
| <tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr> |
| <tr><td>String</td><td>rust::String</td><td></td></tr> |
| <tr><td>&amp;str</td><td>rust::Str</td><td></td></tr> |
| <tr><td>&amp;[T]</td><td>rust::Slice&lt;const T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr> |
| <tr><td>&amp;mut [T]</td><td>rust::Slice&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr> |
| <tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr> |
| <tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr> |
| <tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr> |
| <tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.SharedPtr.html">SharedPtr&lt;T&gt;</a></td><td>std::shared_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr> |
| <tr><td>[T; N]</td><td>std::array&lt;T, N&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr> |
| <tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr> |
| <tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr> |
| <tr><td>*mut T, *const T</td><td>T*, const T*</td><td><sup><i>fn with a raw pointer argument must be declared unsafe to call</i></sup></td></tr> |
| <tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr> |
| <tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr> |
| </table> |
| |
| The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in |
| this repo. You will need to include this header in your C++ code when working |
| with those types. |
| |
| The following types are intended to be supported "soon" but are just not |
| implemented yet. I don't expect any of these to be hard to make work but it's a |
| matter of designing a nice API for each in its non-native language. |
| |
| <table> |
| <tr><th>name in Rust</th><th>name in C++</th></tr> |
| <tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr> |
| <tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr> |
| <tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr> |
| <tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr> |
| <tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr> |
| <tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr> |
| </table> |
| |
| <br> |
| |
| ## Remaining work |
| |
| This is still early days for CXX; I am releasing it as a minimum viable product |
| to collect feedback on the direction and invite collaborators. Please check the |
| open issues. |
| |
| Especially please report issues if you run into trouble building or linking any |
| of this stuff. I'm sure there are ways to make the build aspects friendlier or |
| more robust. |
| |
| Finally, I know more about Rust library design than C++ library design so I |
| would appreciate help making the C++ APIs in this project more idiomatic where |
| anyone has suggestions. |
| |
| <br> |
| |
| #### License |
| |
| <sup> |
| Licensed under either of <a href="LICENSE-APACHE">Apache License, Version |
| 2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option. |
| </sup> |
| |
| <br> |
| |
| <sub> |
| Unless you explicitly state otherwise, any contribution intentionally submitted |
| for inclusion in this project by you, as defined in the Apache-2.0 license, |
| shall be dual licensed as above, without any additional terms or conditions. |
| </sub> |