fish-shell Rust Development Guide
This describes how to get started building fish-shell in its partial Rust state, and how to contribute to the port.
Overview
fish is in the process of transitioning from C++ to Rust. The fish project has a Rust crate embedded at path fish-rust. This crate builds a Rust library, libfish_rust.a, which is linked with the C++ libfish.a. Existing C++ code will be incrementally migrated to this crate; then CMake will be replaced with cargo and other Rust-native tooling.
Important tools used during this transition:
- Corrosion to invoke cargo from CMake.
- cxx for basic C++ <-> Rust interop.
- autocxx for using C++ types in Rust.
We use forks of the last two - see the FFI section below. No special action is required to obtain these packages. They're downloaded by cargo.
Building
Build Dependencies
fish-shell currently depends on Rust 1.67 or later. To install Rust, follow https://rustup.rs.
Build via CMake
It is recommended to build inside fish-shell/build. This will make it easier for Rust to find the config.h file.
Build via CMake as normal (use any generator, here we use Ninja):
$ cd fish-shell
$ mkdir build && cd build
$ cmake -G Ninja ..
$ ninja
This will create the usual fish executables.
Build just libfish_rust.a with Cargo
The directory fish-rust contains the Rust sources. These require that CMake has been run to produce config.h, which is necessary for autocxx to succeed.
Follow the "Build via CMake" steps above, and then:
$ cd fish-shell/fish-rust
$ cargo build
This will build only the library, not a full working fish, but it allows faster iteration for Rust development. That is, after running cmake you can open the fish-rust directory as the root of a Rust crate, and tools like rust-analyzer will work.
Development
The basic development loop for this port:
- Pick a .cpp (or in some cases .h) file to port, say util.cpp.
- Add the corresponding util.rs file to fish-rust/.
- Reimplement it in Rust, along with its dependencies as needed. Match the existing C++ code where practical, including propagating any relevant comments.
  - Do this even if it results in less idiomatic Rust, but avoid being super-dogmatic either way.
  - One technique is to paste the C++ into the Rust code, commented out, and go line by line.
- Decide whether any existing C++ callers should invoke the Rust implementation, or whether we should keep the C++ one.
  - Utility functions may have both a Rust and C++ implementation. An example is FLOG, where interop is too hard.
  - Major components (e.g. builtin implementations) should not be duplicated; instead the Rust should call C++ or vice-versa.
You will likely run into limitations of autocxx and, to a lesser extent, cxx. See the FFI sections below.
Type Mapping
Constants & Type Aliases
The FFI does not support constants (#define or static const) or type aliases (typedef, using). Duplicate them using their Rust equivalents (pub const and type/struct/enum).
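As a rough sketch of what such duplication can look like, the block below mirrors a handful of made-up C++ declarations. All names here (MAX_EXAMPLE_DEPTH, EXAMPLE_SEP, example_id_t, ExampleMode) are hypothetical and are not taken from the fish sources.

```rust
// Hypothetical C++ declarations being mirrored (illustration only):
//   #define MAX_EXAMPLE_DEPTH 128
//   static const wchar_t EXAMPLE_SEP = L':';
//   typedef int example_id_t;
//   enum class example_mode_t { read, write };

/// Rust counterpart of the `#define`.
pub const MAX_EXAMPLE_DEPTH: usize = 128;

/// Rust counterpart of the `static const`.
pub const EXAMPLE_SEP: char = ':';

/// Rust counterpart of the `typedef`; `c_int` matches the width of C++ `int`.
#[allow(non_camel_case_types)]
pub type example_id_t = std::os::raw::c_int;

/// Rust counterpart of the `enum class`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExampleMode {
    Read,
    Write,
}
```

Because the two definitions are maintained by hand, a comment pointing at the C++ original helps keep them in sync until the C++ side is removed.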
Non-POD types
Many types cannot currently be passed across the language boundary by value or occur in shared structs. As a workaround, use references, raw pointers or smart pointers (cxx provides SharedPtr and UniquePtr). Try to keep workarounds on the C++ side and the FFI layer of the Rust code. This ensures we will get rid of the workarounds as we peel off the FFI layer.
Strings
Fish will mostly not use Rust's String/&str types, as these cannot represent non-UTF8 data using the default encoding.
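To illustrate the limitation with the standard library alone: a byte sequence that is not valid UTF-8 is rejected when converted to &str. The helper name fits_in_rust_str is made up for this sketch.

```rust
/// Returns true if `bytes` could be stored in a Rust `String` / `&str` as-is.
/// (Hypothetical helper, for illustration only.)
pub fn fits_in_rust_str(bytes: &[u8]) -> bool {
    // `from_utf8` validates the input and fails on any invalid sequence.
    std::str::from_utf8(bytes).is_ok()
}
```

For example, fits_in_rust_str(b"fish") is true, while a buffer containing the byte 0xFF (never valid in UTF-8) yields false.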
fish's primary string types will come from the widestring crate. The two main string types are WString and &wstr, which are renamed Utf32String and Utf32Str. WString is an owned, heap-allocated UTF32 string; &wstr is a borrowed UTF32 slice.
In general, follow this mapping when porting from C++:
- wcstring -> WString
- const wcstring & -> &wstr
- const wchar_t * -> &wstr
None of the Rust string types are nul-terminated. We're taking this opportunity to drop the nul-terminated aspect of wide string handling.
Creating strings
One may create a &wstr from a string literal using the wchar::L! macro:
use crate::wchar::{wstr, L};
fn get_shell_name() -> &'static wstr {
    L!("fish")
}
There is also a widestrs proc-macro which enables L as a suffix, to reduce the noise. This can be applied to any block, including modules and individual functions:
use crate::wchar::{wstr, widestrs};
#[widestrs]
fn get_shell_name() -> &'static wstr {
    "fish"L // equivalent to L!("fish")
}
Strings for FFI
WString and &wstr are the common strings used by Rust components. At the FFI boundary there are some additional strings for interop. All of these are temporary for the duration of the port.
- CxxWString is the Rust binding of std::wstring. It is the wide-string analog to CxxString and is added in our fork of cxx. This is useful for functions which return e.g. const wcstring &.
- W0String is renamed U32CString. This is basically WString except it is nul-terminated. This is useful for getting a nul-terminated const wchar_t * to pass to C++ implementations.
- wcharz_t is an annoying C++ struct which merely wraps a const wchar_t *, used for passing these pointers from C++ to Rust. We would prefer to use const wchar_t * directly, but autocxx refuses to generate bindings for types such as std::vector<const wchar_t *>, so we wrap it in this silly struct.
Note C++ wchar_t, Rust char, and u32 are effectively interchangeable: you can cast pointers to them back and forth (except we check upon u32->char conversion). However, be aware of which types are nul-terminated.
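The two caveats in this note can be sketched with the standard library alone. The helper names checked_char and strip_nul are hypothetical and are not part of fish's wchar_ffi module. Converting char to u32 is always valid, whereas the reverse direction must reject surrogates and values above U+10FFFF.

```rust
/// Converts a raw UTF-32 code unit to `char`, returning `None` for values
/// that are not Unicode scalar values (surrogates, or above U+10FFFF).
pub fn checked_char(unit: u32) -> Option<char> {
    char::from_u32(unit)
}

/// Returns the part of `buf` before the first nul, or all of `buf` if it
/// contains no nul. This mimics reading a nul-terminated wide string into
/// a length-delimited slice.
pub fn strip_nul(buf: &[u32]) -> &[u32] {
    let len = buf.iter().position(|&c| c == 0).unwrap_or(buf.len());
    &buf[..len]
}
```

For instance, checked_char(0xD800) is None because 0xD800 is a surrogate, while 'f' as u32 always succeeds.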
These types should be confined to the FFI modules, in particular wchar_ffi. They should not "leak" into other modules. See the wchar_ffi module.
Format strings
Rust's builtin std::fmt module does not accept runtime-provided format strings, so we mostly won't use it, except perhaps for FLOG / other non-translated text.
Instead we'll continue to use printf-style strings, with a Rust printf implementation.
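To make the distinction concrete: format! needs its format string as a literal at compile time, whereas a printf-style routine parses the string at runtime. The following is a deliberately tiny sketch that handles only %s and %%, and uses String rather than fish's wide strings. The function sprintf_s is hypothetical and is not fish's printf implementation.

```rust
/// Substitutes each `%s` in `fmt` with the next entry of `args`; `%%` yields
/// a literal `%`. Returns `None` for an unsupported directive, a trailing
/// lone `%`, or a mismatch between directives and arguments.
pub fn sprintf_s(fmt: &str, args: &[&str]) -> Option<String> {
    let mut out = String::new();
    let mut args = args.iter();
    let mut chars = fmt.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        // Inspect the character following '%'.
        match chars.next()? {
            '%' => out.push('%'),
            's' => out.push_str(args.next()?),
            _ => return None,
        }
    }
    // Leftover arguments indicate a mismatched call.
    if args.next().is_some() {
        return None;
    }
    Some(out)
}
```

Because the format string is an ordinary runtime value here, it could come from a translation catalog, which is the case std::fmt cannot cover.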
Vectors
In many cases, autocxx refuses to allow vectors of certain types. For example, autocxx supports std::vector and std::shared_ptr but NOT std::vector<std::shared_ptr<...>>. To work around this one can create a helper (pointer, length) struct. Example:
struct RustFFIJobList {
std::shared_ptr<job_t> *jobs;
size_t count;
};
This is just a POD (plain old data) so autocxx can generate bindings for it. Then it is trivial to convert it to a Rust slice:
pub fn get_jobs(ffi_jobs: &ffi::RustFFIJobList) -> &[SharedPtr<job_t>] {
    // SAFETY: relies on the C++ side keeping `jobs` valid for `count` elements while borrowed.
    unsafe { slice::from_raw_parts(ffi_jobs.jobs, ffi_jobs.count) }
}
Another workaround is to define a struct that contains the shared pointer, and create a vector of that struct.
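The (pointer, length) pattern can be tried out without any C++ at all. The sketch below uses a hypothetical generic FfiList<T> as a stand-in for a struct like RustFFIJobList. The null check is an extra precaution added here, since slice::from_raw_parts requires a non-null pointer even for empty slices.

```rust
use std::slice;

/// Hypothetical stand-in for a (pointer, length) POD struct exposed by C++.
#[repr(C)]
pub struct FfiList<T> {
    pub items: *const T,
    pub count: usize,
}

impl<T> FfiList<T> {
    /// Borrows the elements as a Rust slice.
    ///
    /// # Safety
    /// If `items` is non-null, it must point to `count` initialized values
    /// of `T` that stay alive and unmodified for the returned borrow.
    pub unsafe fn as_slice(&self) -> &[T] {
        if self.items.is_null() || self.count == 0 {
            // Avoid handing a null pointer to `from_raw_parts`.
            return &[];
        }
        unsafe { slice::from_raw_parts(self.items, self.count) }
    }
}
```

A caller that fills the struct from a Rust array (items: data.as_ptr(), count: data.len()) gets back a slice equal to the original data.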
Development Tooling
The autocxx guidance is helpful:
- Install cargo expand (cargo install cargo-expand). Then you can use cargo expand to see the generated Rust bindings for C++. In particular this is useful for seeing failed expansions for C++ types that autocxx cannot handle.
- In rust-analyzer, enable Proc Macro and Proc Macro Attributes.
FFI
The boundary between Rust and C++ is referred to as the FFI.
autocxx and cxx are both designed for long-term interop: C++ and Rust coexisting for years. To this end, both emphasize safety: requiring lots of unsafe, Pin, etc.
fish plans to use them only temporarily, with a focus on getting things working. To this end, both cxx and autocxx have been forked to support fish:
- Relax the requirement that all functions taking pointers are unsafe (this just added noise).
- Add support for wchar_t as a recognized type, and CxxWString analogous to CxxString.
See the Cargo.toml file for the locations of the forks.