This implements input and input_common FFI pieces in input_ffi.rs, and
simultaneously ports bind.rs. This was done as a single commit because
builtin_bind would have required a substantial amount of work to use the input
ffi.
The "#[bench]" attribute is not allowed in stable Rust, so keep it behind
a new feature flag. Run on nightly Rust with
$ cargo bench --features=bechmark
test tests::encoding::bench::bench_convert_ascii ... bench: 125,988 ns/iter (+/- 1,128) = 1040 MB/s
We rarely attach trait methods to stdlib types so this warning is unlikely to
be a true positive It is a false positive for the methods defined in future.rs.
It's not always obvious which method is selected when it's available in the
stdlib but I haven't seen a build failure yet. So let's disable the warning.
In future we might be able suppress it per method, see Rust issue 48919.
Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
This implements the "postfork" code in Rust, including calling fork(),
exec(), and all the bits that have to happen in between. postfork lives
in the fork_exec module.
It is not yet adopted.
This introduces a new module called fork_exec, which will be for posix_spawn,
postfork, and flog_safe - stuff concerned with actually executing binaries,
and error reporting.
Add a FLOG_SAFE! macro which writes errors to the flog fd in an
async-signal-safe way. This implementation differs from the C++ in that we
allow printing integers directly - no requiring them to be converted to a
buffer first.
- This is untested and unused, string ownership is very much subject to change
- Ports the minimally necessary parts of complete.rs as well
- This should fix an infinite loop in `create_directory` in `path.rs`, the first
`wstat` loop only breaks if it fails with an error that's different from
EAGAIN
We don't change anything about compilation-setup, we just immediately jump to
Rust, making the eventual final swap to a Rust entrypoint very easy.
There are some string-usage and format-string differences that are generally
quite messy.
In CMake this used a `version` file in the CARGO_MANIFEST_DIR, but
relying on that is problematic due to change-detection, as if we add
`cargo-rerun-if-changed:version`, cargo would rerun every time if the file does
not exist, since cargo would expect the file to be generated by the
build-script. We could generate it, but that relies on the output of `git
describe`, whose dependencies we can only limit to anything in the
`.git`-folder, again causing unnecessary build-script runs.
Instead, this reads the `FISH_BUILD_VERSION`-env-variable at compile time
instead of the `version`-file, and falls back to calling git-describe through
the `git_version`-proc-macro. We thus do not need to deal with extraneous
build-script running.
This is not yet used but will take eventually take the place of all (n)curses
access. The curses C library does a lot of header file magic with macro voodoo
to make it easier to perform certain tasks (such as access or override string
capabilities) but this functionality isn't actually directly exposed by the
library's ABI.
The rust wrapper eschews all of that for a more straight-forward implementation,
directly wrapping only the basic curses library calls that are required to
perform the tasks we care about. This should let us avoid the subtle
cross-platform differences between the various curses implementations that
plagued the previous C++ implementation.
All functionality in this module that requires an initialized curses TERMINAL
pointer (`cur_term`, traditionally) has been subsumed by the `Term` instance,
which once initialized with `curses::setup()` can be obtained at any time with
`curses::Term()` (which returns an Option that evaluates to `None` if `cur_term`
hasn't yet been initialized).
Either add rust wrappers for C++ functions called via ffi or port some pure code
from C++ to rust to provide support for the upcoming `env_dispatch` rewrite.
Except for the indent visitor bits.
Tests for parse_util_detect_errors* are not ported yet because they depend
on expand.h (and operation_context.h which depends on env.h).
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
This was added to support signals; however we are unlikely to use this
for anything else. Remove it; just use a u64 to report signals that have
been set.
This optimizes over both the rust rewrite and the original C++ code. The rust
rewrite saw `std::bitset` replaced with `[bool; 65]` which could result in a
lot of memory copy bandwidth each time we checked for and received no signals.
The original C++ code would iterate over all signal slots to see if any were
set. The code now returns a single u64 and only checks slots that are known to
have signals via an intelligent `Iterator` impl.
bool_assert_comparison is stupid, the reason they give is "it's shorter". Well,
`assert!(!foo)` is nowhere near as readable as `assert_eq!(foo, false)` because
of the ! noise from the macro.
Uninlined format args is a stupid lint that Rust actually walked back when they
made it an official warning because you still have to use a mix of inlined and
un-inlined format args (the latter of which won't complain) since only idents
can be inlined.