mirror of
https://github.com/fish-shell/fish-shell.git
synced 2024-11-24 10:46:59 +08:00
193 lines
9.7 KiB
Markdown
193 lines
9.7 KiB
Markdown
# fish-shell Rust Development Guide
|
|
|
|
This describes how to get started building fish-shell in its partial Rust state, and how to contribute to the port.
|
|
|
|
## Overview
|
|
|
|
fish is in the process of transitioning from C++ to Rust. The fish project has a Rust crate embedded at path `fish-rust`. This crate builds a Rust library `libfish_rust.a` which is linked with the C++ `libfish.a`. Existing C++ code will be incrementally migrated to this crate; then CMake will be replaced with cargo and other Rust-native tooling.
|
|
|
|
Important tools used during this transition:
|
|
|
|
1. [Corrosion](https://github.com/corrosion-rs/corrosion) to invoke cargo from CMake.
|
|
2. [cxx](http://cxx.rs) for basic C++ <-> Rust interop.
|
|
3. [autocxx](https://google.github.io/autocxx/) for using C++ types in Rust.
|
|
|
|
We use forks of the last two - see the [FFI section](#ffi) below. No special action is required to obtain these packages. They're downloaded by cargo.
|
|
|
|
## Building
|
|
|
|
### Build Dependencies
|
|
|
|
fish-shell currently depends on Rust 1.67 or later. To install Rust, follow https://rustup.rs.
|
|
|
|
### Build via CMake
|
|
|
|
It is recommended to build inside `fish-shell/build`. This will make it easier for Rust to find the `config.h` file.
|
|
|
|
Build via CMake as normal (use any generator, here we use Ninja):
|
|
|
|
```shell
|
|
$ cd fish-shell
|
|
$ mkdir build && cd build
|
|
$ cmake -G Ninja ..
|
|
$ ninja
|
|
```
|
|
|
|
This will create the usual fish executables.
|
|
|
|
### Build just libfish_rust.a with Cargo
|
|
|
|
The directory `fish-rust` contains the Rust sources. These require that CMake has been run to produce `config.h` which is necessary for autocxx to succeed.
|
|
|
|
Follow the "Build from CMake" steps above, and then:
|
|
|
|
```shell
|
|
$ cd fish-shell/fish-rust
|
|
$ cargo build
|
|
```
|
|
|
|
This will build only the library, not a full working fish, but it allows faster iteration for Rust development. That is, after running `cmake` you can open the `fish-rust` as the root of a Rust crate, and tools like rust-analyzer will work.
|
|
|
|
## Development
|
|
|
|
The basic development loop for this port:
|
|
|
|
1. Pick a .cpp (or in some cases .h) file to port, say `util.cpp`.
|
|
2. Add the corresponding `util.rs` file to `fish-rust/`.
|
|
3. Reimplement it in Rust, along with its dependencies as needed. Match the existing C++ code where practical, including propagating any relevant comments.
|
|
- Do this even if it results in less idiomatic Rust, but avoid being super-dogmatic either way.
|
|
- One technique is to paste the C++ into the Rust code, commented out, and go line by line.
|
|
4. Decide whether any existing C++ callers should invoke the Rust implementation, or whether we should keep the C++ one.
|
|
- Utility functions may have both a Rust and C++ implementation. An example is `FLOG` where interop is too hard.
|
|
- Major components (e.g. builtin implementations) should _not_ be duplicated; instead the Rust should call C++ or vice-versa.
|
|
5. Remember to run `cargo fmt` and `cargo clippy` to keep the codebase somewhat clean (otherwise CI will fail). If you use rust-analyzer, you can run clippy automatically by setting `rust-analyzer.checkOnSave.command = "clippy"`.
|
|
|
|
You will likely run into limitations of [`autocxx`](https://google.github.io/autocxx/) and to a lesser extent [`cxx`](https://cxx.rs/). See the [FFI sections](#ffi) below.
|
|
|
|
## Type Mapping
|
|
|
|
### Constants & Type Aliases
|
|
|
|
The FFI does not support constants (`#define` or `static const`) or type aliases (`typedef`, `using`). Duplicate them using their Rust equivalent (`pub const` and `type`/`struct`/`enum`).
|
|
|
|
### Non-POD types
|
|
|
|
Many types cannot currently be passed across the language boundary by value or occur in shared structs. As a workaround, use references, raw pointers or smart pointers (`cxx` provides `SharedPtr` and `UniquePtr`). Try to keep workarounds on the C++ side and the FFI layer of the Rust code. This ensures we will get rid of the workarounds as we peel off the FFI layer.
|
|
|
|
### Strings
|
|
|
|
Fish will mostly _not_ use Rust's `String/&str` types as these cannot represent non-UTF8 data using the default encoding.
|
|
|
|
fish's primary string types will come from the [`widestring` crate](https://docs.rs/widestring). The two main string types are `WString` and `&wstr`, which are renamed [Utf32String](https://docs.rs/widestring/latest/widestring/utfstring/struct.Utf32String.html) and [Utf32Str](https://docs.rs/widestring/latest/widestring/utfstr/struct.Utf32Str.html). `WString` is an owned, heap-allocated UTF32 string, `&wstr` a borrowed UTF32 slice.
|
|
|
|
In general, follow this mapping when porting from C++:
|
|
|
|
- `wcstring` -> `WString`
|
|
- `const wcstring &` -> `&wstr`
|
|
- `const wchar_t *` -> `&wstr`
|
|
|
|
None of the Rust string types are nul-terminated. We're taking this opportunity to drop the nul-terminated aspect of wide string handling.
|
|
|
|
#### Creating strings
|
|
|
|
One may create a `&wstr` from a string literal using the `wchar::L!` macro:
|
|
|
|
```rust
|
|
use crate::wchar::prelude::*;
|
|
// This imports wstr, the L! macro, WString, a ToWString trait that supplies .to_wstring() along with other things
|
|
|
|
fn get_shell_name() -> &'static wstr {
|
|
L!("fish")
|
|
}
|
|
```
|
|
|
|
There is also a `widestrs` proc-macro which enables L as a _suffix_, to reduce the noise. This can be applied to any block, including modules and individual functions:
|
|
|
|
```rust
|
|
use crate::wchar::{wstr, widestrs}
|
|
// also imported by the prelude
|
|
|
|
#[widestrs]
|
|
fn get_shell_name() -> &'static wstr {
|
|
"fish"L // equivalent to L!("fish")
|
|
}
|
|
```
|
|
|
|
#### The wchar prelude
|
|
|
|
We have a prelude to make working with these string types a whole lot more ergonomic. In particular `WExt` supplies the null-terminated-compatible `.char_at(usize)`,
|
|
and a whole lot more methods that makes porting C++ code easier. It is also preferred to use char-based-methods like `.char_count()` and `.slice_{from,to}()`
|
|
of the `WExt` trait over directly calling `.len()` and `[usize..]/[..usize]`, as that makes the code compatible with a potential future change to UTF8-strings.
|
|
|
|
```rust
|
|
pub(crate) mod prelude {
|
|
pub(crate) use crate::{
|
|
wchar::{wstr, IntoCharIter, WString, L},
|
|
wchar_ext::{ToWString, WExt},
|
|
wutil::{sprintf, wgettext, wgettext_fmt, wgettext_str},
|
|
};
|
|
pub(crate) use widestring_suffix::widestrs;
|
|
}
|
|
```
|
|
|
|
### Strings for FFI
|
|
|
|
`WString` and `&wstr` are the common strings used by Rust components. At the FII boundary there are some additional strings for interop. _All of these are temporary for the duration of the port._
|
|
|
|
- `CxxWString` is the Rust binding of `std::wstring`. It is the wide-string analog to [`CxxString`](https://cxx.rs/binding/cxxstring.html) and is [added in our fork of cxx](https://github.com/ridiculousfish/cxx/blob/fish/src/cxx_wstring.rs). This is useful for functions which return e.g. `const wcstring &`.
|
|
- `W0String` is renamed [U32CString](https://docs.rs/widestring/latest/widestring/ucstring/struct.U32CString.html). This is basically `WString` except it _is_ nul-terminated. This is useful for getting a nul-terminated `const wchar_t *` to pass to C++ implementations.
|
|
- `wcharz_t` is an annoying C++ struct which merely wraps a `const wchar_t *`, used for passing these pointers from C++ to Rust. We would prefer to use `const wchar_t *` directly but `autocxx` refuses to generate bindings for types such as `std::vector<const wchar_t *>` so we wrap it in this silly struct.
|
|
|
|
Note C++ `wchar_t`, Rust `char`, and `u32` are effectively interchangeable: you can cast pointers to them back and forth (except we check upon u32->char conversion). However be aware of which types are nul-terminated.
|
|
|
|
These types should be confined to the FFI modules, in particular `wchar_ffi`. They should not "leak" into other modules. See the `wchar_ffi` module.
|
|
|
|
### Format strings
|
|
|
|
Rust's builtin `std::fmt` modules do not accept runtime-provided format strings, so we mostly won't use them, except perhaps for FLOG / other non-translated text.
|
|
|
|
Instead we'll continue to use printf-style strings, with a Rust printf implementation.
|
|
|
|
### Vectors
|
|
|
|
See [`Vec`](https://cxx.rs/binding/vec.html) and [`CxxVector`](https://cxx.rs/binding/cxxvector.html).
|
|
|
|
In many cases, `autocxx` refuses to allow vectors of certain types. For example, autocxx supports `std::vector` and `std::shared_ptr` but NOT `std::vector<std::shared_ptr<...>>`. To work around this one can create a helper (pointer, length) struct. Example:
|
|
|
|
```cpp
|
|
struct RustFFIJobList {
|
|
std::shared_ptr<job_t> *jobs;
|
|
size_t count;
|
|
};
|
|
```
|
|
|
|
This is just a POD (plain old data) so autocxx can generate bindings for it. Then it is trivial to convert it to a Rust slice:
|
|
|
|
```
|
|
pub fn get_jobs(ffi_jobs: &ffi::RustFFIJobList) -> &[SharedPtr<job_t>] {
|
|
unsafe { slice::from_raw_parts(ffi_jobs.jobs, ffi_jobs.count) }
|
|
}
|
|
```
|
|
|
|
Another workaround is to define a struct that contains the shared pointer, and create a vector of that struct.
|
|
|
|
## Development Tooling
|
|
|
|
The [autocxx guidance](https://google.github.io/autocxx/workflow.html#how-can-i-see-what-bindings-autocxx-has-generated) is helpful:
|
|
|
|
1. Install cargo expand (`cargo install cargo-expand`). Then you can use `cargo expand` to see the generated Rust bindings for C++. In particular this is useful for seeing failed expansions for C++ types that autocxx cannot handle.
|
|
2. In rust-analyzer, enable Proc Macro and Proc Macro Attributes.
|
|
|
|
## FFI
|
|
|
|
The boundary between Rust and C++ is referred to as the Foreign Function Interface, or FFI.
|
|
|
|
`autocxx` and `cxx` both are designed for long-term interop: C++ and Rust coexisting for years. To this end, both emphasize safety: requiring lots of `unsafe`, `Pin`, etc.
|
|
|
|
fish plans to use them only temporarily, with a focus on getting things working. To this end, both cxx and autocxx have been forked to support fish:
|
|
|
|
1. Relax the requirement that all functions taking pointers are `unsafe` (this just added noise).
|
|
2. Add support for `wchar_t` as a recognized type, and `CxxWString` analogous to `CxxString`.
|
|
|
|
See the `Cargo.toml` file for the locations of the forks.
|