There are many places where we want to treat a missing variable the same as
a variable with an empty value.
In C++ we handle this by branching on maybe_t<env_var_t>::missing_or_empty().
If it returns false, we go on to access maybe_t<env_var_t>::value() aka
operator*.
In Rust, Environment::get() will return an Option<EnvVar>.
We could define a MissingOrEmpty trait and implement it for Option<EnvVar>.
However that will still leave us with ugly calls to Option::unwrap()
(by convention Rust does use shorthands like *).
Let's add a variable getter that returns none for empty variables.
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
This is basically a subset of type, so we might as well.
To be clear this is `command -s` and friends, if you do `command grep` that's
handled as a keyword.
One issue here is that we can't get "one path or not" because I don't
know how to translate a maybe_t? Do we need to make it a shared_ptr instead?
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
More ugliness with types that cxx bridge can't recognize as being POD. Using
pointers to get/set `termios` values with an assert to make sure we're using
identical definitions on both sides (in cpp from the system headers and in rust
from the libc crate as exported).
I don't know why cxx bridge doesn't allow `SharedPtr<OpaqueRustType>` but we can
work around it in C++ by converting a `Box<T>` to a `shared_ptr<T>` then convert
it back when it needs to be destructed. I can't find a clean way of doing it
from the cxx bridge wrapper so for now it needs to be done manually in the C++
code.
Types/values that are drop-in ready over ffi are renamed to match the old cpp
names but for types that now differ due to ffi difficulties I've left the `_ffi`
in the function names to indicate that this isn't the "correct" way of using the
types/methods.
Keeps the location of original function definition, and also stores
where it was copied. `functions` and `type` show both locations,
instead of none. It also retains the line numbers in the stack trace.
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:
Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").
Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").
"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
This means cleaning out old universal variables is now just:
```fish
abbr --erase (abbr --list)
```
which makes upgrading much easier.
Note that this erases the currently defined variable and/or any
universal. It doesn't stop at the former because that makes it *easy*
to remove the universals (no running `abbr --erase` twice), and it
doesn't care about globals because, well, they would be gone on
restart anyway.
Fixes#9468.
This now means `abbr --add` has two modes:
```fish
abbr --add name --function foo --regex regex
```
```fish
abbr --add name --regex regex replacement
```
This is because `--function` was seen to be confusing as a boolean flag.
Unfortunately print_hints was true *by default* - so for all builtins
that didn't pass it it would now be false instead.
This resulted in the trailer missing, which includes the line number
and context. So if you ran a script that includes `bind -M` the error
message would now just be "bind: -M: option requires an argument",
with no indication as to where.
This reverts commit 8a50d47a46.
The print_hints variable was always false, so just remove it.
This caused a cascade of other changes where the parser_t variable
becomes unused, so remove it from the call sites.
No functional change expected here.
This would print
```
abbr -a -- dotdot --regex ^\\.\\.+\$ --function multicd
```
which expands "dotdot" to "--regex ^\\.\\.+\$...".
Instead, we move the name to right before the replacement, and move
the `--` before that:
```
abbr -a --regex ^\\.\\.+\$ --function -- dotdot multicd
```
It might be possible to improve that, but this at least round-trips.