pcre2-sys includes a vendored copy of PCRE2, which allows for
statically-linked PCRE2. Hook this up to the CMake build variable, and
remove the C++ integration for PCRE2.
Use Rust for executables
Drops the C++ entry points and restructures the Rust package into a
library and three binary crates.
Renames the fish-rust package to fish.
At least on Ubuntu, "fish_indent" is built before "fish".
Make sure export CURSES_LIBRARY_LIST to all binaries to make sure
that "cached-curses-libnames" is populated.
Closes#10198
GNUInstallDirs is what defines CMAKE_INSTALL_FULL_BINDIR and such, so
the setting in Rust.cmake didn't work.
This also makes build.rs error out if any of these aren't defined
This will allow to use "cargo test" for unit tests that depend on our
curses.rs.
This means that Rust.cmake depends on ConfigureChecks, so move that one to
the front.
This implements input and input_common FFI pieces in input_ffi.rs, and
simultaneously ports bind.rs. This was done as a single commit because
builtin_bind would have required a substantial amount of work to use the input
ffi.
Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
This adopts the Rust postfork code, bridging it from C++ exec module.
We use direct function calls for the bridge, rather than cxx/autocxx, so that we
can be sure that no memory allocations or other shenanigans are happening.
- Add test to verify piped string replace exit code
Ensure fields parsing error messages are the same.
Note: C++ relied upon the value of the parsed value even when `errno` was set,
that is defined behaviour we should not rely on, and cannot easilt be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:
```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
To:
```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
This adopts the new function store, replacing the C++ version.
It also reimplements builtin_function in Rust, as these was too coupled to
the function store to handle in a separate commit.
Note this is slightly incomplete - the FD is not moved into the parser, and so
will be freed at the end of each directory change. The FD saved in the parser is
never actually used in existing code, so this doesn't break anything, but will
need to be corrected once the parser is ported.
This allows the rust code to free up C++ resources allocated for a callback even
when the callback isn't executed (as opposed to requiring the callback to run
and at the end of the callback cleaning up all allocated resources).
Also add type-erased destructor registration to callback_t. This allows for
freeing variables allocated by the callback for debounce_t's
perform_with_callback() that don't end up having their completion called due to
a timeout.
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
This is basically a subset of type, so we might as well.
To be clear this is `command -s` and friends, if you do `command grep` that's
handled as a keyword.
One issue here is that we can't get "one path or not" because I don't
know how to translate a maybe_t? Do we need to make it a shared_ptr instead?