The incompatible_msrv one is a false positive because we have polyfills for
is_some_and() and is_ok_or() which are Rust 1.74. I'm not yet sure how to
communicate that to Clippy.
Call fish_should_add_to_history to see if a command should be saved
If it returns 0, it will be saved, if it returns anything else, it
will be ephemeral.
It gets the right-trimmed text as the argument.
If it doesn't exist, we do the historical behavior of checking for a
leading space.
That means you can now turn that off by defining a
`fish_should_add_to_history` that just doesn't check it.
documentation based on #9298
The existing logic did not work because:
- Path::new("/foo/bar").ends_with("/bar") does not return true.
- PathBuf::shrink_to() only (potentially) reallocates the backing
storage, and won't have an effect on the stored value.
As mentioned in the comment the historical behavior is because pressing unknown
control characters like Ctrl+4 inserts confusing characters, so let's back
out that part of b77d1d0e2 (Stop crashing on invalid Unicode input, 2024-02-27).
We still have the code for rendering control characters, for pasted text,
or text recalled from history. It is unclear whether we should strip those.
Some terminals already strip control characters from pasted text -- but not
all of them: see https://codeberg.org/dnkl/foot/pulls/312 for example which
has a follow up called "Don't strip HT when pasting in non-bracketed mode".
Don't force the internal use of `RefCell<T>`, let the caller place that into
`MainThread<>` manually. This lets us remove the reference to `MainThread<>`
from the definition of `Screen` again and reduces the number of
`assert_is_main_thread()` calls.
Fairly straightforward, with the only unfortunate part of this being that
`Screen` isn't as pure and now encodes the facte that we use it with
main-thread-only stdout `Outputter`.
The regex struct is pretty large at 560 bytes, with the entire
Abbreviation being 664 bytes.
If it's an "Option<Regex>", any abbr gets to pay the price. Boxing it
means abbrs without a regex are over 500 bytes smaller.
IfStatement is 680 bytes, much larger than the other
variants (SwitchStatement is next at 232). An enum is as large as its
largest variant, so this saves a bunch, especially since
DecoratedStatement is much more likely than IfStatement.
This will speed up the no-execute benchmark by 1.07x.
Unlike C++, Rust requires "char" to be a valid Unicode code point. As a
workaround, we take the raw (probably UTF-8-encoded) input and convert each
input byte to a char representation from the private use area (see commit
3b15e995e (str2wcs: encode invalid Unicode characters in the private use
area, 2023-04-01)). We convert back whenever we output the string, which
is correct as long as the encoding didn't change since the data was input.
We also need to convert keyboard input; do that.
Quick testing shows that our reader drops PUA characters. Since this patch
converts both invalid Unicode input as well as PUA input into a safe PUA
representation, there's no longer a reason to not add PUA characters to
the commandline, so let's do that to restore traditional behavior.
Render them as � (REPLACEMENT CHARACTER); unfortunately we show one per
input byte instead of one per code point. To fix this we probably need our
own char type.
While at it, remove some special cases that try to prevent insertion of
control characters. I don't think they are necessary. Could be wrong..
This makes it so code like
```fish
echo foo
echo bar
```
is collapsed into
```fish
echo foo
echo bar
```
One empty line is allowed, more is overkill.
We could also allow more than one for e.g. function endings.
We don't need to know that it tried these five before finally getting
one, the list is *right there*.
It is also very unlikely that someone has "xterm" or "ansi" but not "xterm-256color"
For xterm-256color, we don't warn *at all* because we have that one hardcoded.
This allows us to get the terminfo information without linking against curses.
That means we can get by without a bunch of awkward C-API trickery.
There is no global "cur_term" kept by a library for us that we need to invalidate.
Note that it still requires a "unhashed terminfo database", and I don't know how well it handles termcap.
I am not actually sure if there are systems that *can't* have terminfo, everything I looked at
has the ncurses terminfo available to install at least.
We still don't support tabs but as of the parent commit, there are no more
weird glitches, so it should be fine to recall those lines?
This reverts commit cc0e366037.
Inserting Tab or Backspace characters causes weird glitches. Sometimes it's
useful to paste tabs as part of a code block.
Render tabs as "␉" and so on for other ASCII control characters, see
https://unicode-table.com/en/blocks/control-pictures/. This fixes the
width-related glitches.
You can see it in action by inserting some control characters into the
command line:
set chars
for x in (seq 1 0x1F)
set -a chars (printf "%02x\\\\x%02x" $x $x)
end
eval set chars $chars
commandline -i "echo '" $chars
Fixes#6923Fixes#5274Closes#7295
We could extend this approach to display a fallback symbol for every unknown
nonprintable character, not just ASCII control characters.
In future we might want to support tab properly.
This reserved 64, which is *gigantic*.
Over all of share/**.fish, 75% of lists are empty, 99.97% are 16
elements or fewer.
Reducing this to 16 reduces memory usage for a gigantic example
script (git.fish pasted a bunch of times for a total of almost 100k
lines) by ~10% and speeds up "--no-execute" time by the same amount.
For smaller scripts it's less noticeable simply because parse time
matters less.
There are other options, like creating the vec ::with_capacity, or
using 8 instead of 16, or even letting the vec just grow
naturally (rust's vec currently grows from 0 to 4 and then doubles,
which isn't terrible for this use), but the point is that 64 is
wasteful and never comes out on top, always in the last two places
comparing a bunch of choices.
Commit e5b34d5cd (Suppress autosuggesting during backspacing like browsers do,
2012-02-06) disabled autosuggestion when backspacing. Autosuggestions are
re-enabled whenever we insert anything in the command line. Undo uses a
different code path to insert into the command line, which does not re-enable
autosuggestion.
Fix that.
Also re-enable autosuggestion when undo erases from the command line.
This seems like the simplest approach. It's not clear if there's a better
behavior; browsers don't agree on one in any case.
This is the last remnant of the old percent expansion.
It has the downsides of it, in that it is annoying to combine with
anything:
```fish
echo %self/foo
```
prints "%self/foo", not fish's pid.
We have introduced $fish_pid in 3.0, which is much easier to use -
just like a variable, because it is one.
If you need backwards-compatibility for < 3.0, you can use the
following shim:
```fish
set -q fish_pid
or set -g fish_pid %self
```
So we introduce a feature-flag called "remove-percent-self" to turn it
off.
"%self" will simply not be special, e.g. `echo %self` will print
"%self".
This stops you from doing e.g.
```fish
set pager command less
echo foo | $pager
```
Currently, it would run the command *builtin*, which can only do
`--search` and similar, and would most likely end up printing its own
help.
That means it very very likely won't work, and the code is misguided -
it is trying to defeat function resolution in a way that won't do what
the author wants it to.
The alternative would be to make the command *builtin* execute the
command, *but*
1. That would require rearchitecting and rewriting a bunch of it and
the parser
2. It would be a large footgun, in that `set EDITOR command foo` will
only ever work inside fish, but $EDITOR is also used outside.
I don't want to add a feature that we would immediately have to discourage.
Currently, if you enter `echo` and press up-arrow, it might select
e.g. `echo foo`.
You can then enter text, making it `echo foobar` and press up-arrow
again, but the search string is *still* `echo`.
Many *other* input functions will end history search, including e.g.
expand-abbr, so pressing space by default will already end it.
So this ends the history search once you input something.
Incidentally this allows suggestions to work in this case, so it
Fixes#10287
Note that autosuggestions have been disabled while history search is
active since a08450bcb6, I'm not sure
it's actually *needed*, so it would also be possible to enable it in
that case.
But since this is already awkward (history search is *active* but with
the old search string) and I'm not sure if e.g. suggestions during
history search would be too busy, let's do this first.
This reduces the test time by ~33% on my system (23s to 15s)
Given that it takes ~180-240s on Github Actions, if we get a reduction
like that we can save over a minute.
We only use this
1. if we have localeconv_l
2. to get the decimal point / thousands separator for numbers
So we can ignore all this and directly create a purely LC_NUMERIC locale.
This *was* more useful when we were in C++ and the printing functions
all relied on locale, but we only use this in printf and that only
extracts the number stuff.
Commit b768b9d3f (Use fuzzy subsequence completion for options names as well,
2024-01-27) allowed completing "oa" to "--foobar", which is a false positive,
especially because it hides other valid completions of non-option arguments.
Let's at least require a leading dash again before completing option names.
The lines of code I commented on in #10254 were meant to serve only as examples
of the changes I was requesting, not the only instances.
Also just use `Mode::from_bits_truncate()` instead of unsafe or unwrapping since
we know the modes are correct.
* Fix build on NetBSD
Notably:
1. A typo in `f_flag` vs `f_flags` - this was probably never tested
2. Some pointless name differences - `st_mtimensec` vs
`st_mtime_nsec`
3. The big one: This said that LC_GLOBAL_LOCALE() was -1 "everywhere".
Well, not on NetBSD.
* ifdef for macos
Version 2.1.0 introduced subsequence matching for completions but as the
changelog entry mentions, "This feature [...] is not yet implemented for
options (like ``--foobar``)". Add it. Seems like a strict improvement,
pretty much.
.unwrap() is in effect an assert(). If it is applied mistakenly, the
program crashes and there isn't a good error.
I would like it to be used as a last resort. In these cases there are
nicer ways to do it that handle a missing result properly.
Issue #10194 reports Cobra completions do
set -l args (commandline -opc)
eval $args[1] __complete $args[2..] (commandline -ct | string escape)
The intent behind "eval" is to expand variables and tildes in "$args".
Fair enough. Several of our own completions do the same, see the next commit.
The problem with "commandline -o" + "eval" is that the former already
removes quotes that are relevant for "eval". This becomes a problem if $args
contains quoted () or {}, for example this command will wrongly execute a
command substituion:
git --work-tree='(launch-missiles)' <TAB>
It is possible to escape the string the tokens before running eval, but
then there will be no expansion of variables etc. The problem is that
"commandline -o" only unescapes tokens so they end up in a weird state
somewhere in-between what the user typed and the expanded version.
Remove the need for "eval" by introducing "commandline -x" which expands
things like variables and braces. This enables custom completion scripts to
be aware of shell variables without eval, see the added test for completions
to "make -C $var/some/dir ".
This means that essentially all third party scripts should migrate from
"commandline -o" to "commandline -x". For example
set -l tokens
if commandline -x >/dev/null 2>&1
set tokens (commandline -xpc)
else
set tokens (commandline -opc)
end
Since this is mainly used for completions, the expansion skips command
substitutions. They are passed through as-is (instead of cancelling or
expanding to nothing) to make custom completion scripts work reasonably well
in the common case. Of course there are cases where we would want to expand
command substitutions here, so I'm not sure.
Move all qmark tests to `scoped_test()` sections with explicitly set feature
flags. We already test the default qmark behavior in the functionality tests.
It seems the logic for calculating the cursor position was not ported correctly,
because the correct place to insert it is at the cursor_pos regardless of
range.start, going by the parameters submitted to the function and the expected
result.
This allows running a fish built from `cargo build` *and* built via
cmake.
In future, we should make this an optional thing that's removed from
installed builds.
NCurses headers contain this conditional "#define cur_term":
print "#elif @cf_cv_enable_reentrant@"
print "NCURSES_WRAPPED_VAR(TERMINAL *, cur_term);"
print "#define cur_term NCURSES_PUBLIC_VAR(cur_term())"
print "#else"
OpenSUSE Tumbleweed uses this configuration option; For reentrancy, cur_term
is a function. If the NCurses autoconf variable @NCURSES_WRAP_PREFIX@
is not changed from its default, the function is called _nc_cur_term.
I'm not sure if we have a need to support non-default @NCURSES_WRAP_PREFIX@
but if we do there are various ways;
- search for the symbol with the cur_term suffix
- figure out the prefix based on the local curses installation,
for example by looking at the header files.
Fixes#10243
This is run in the current locale, without resetting to en_US.UTF-8
like our integration tests do.
So if you want to check for a specific message you need to check the
localized version.
This was previously limited to Linux predicated on the existence
of certain headers, but Rust just exposes those functions unconditionally. So
remove the check and just perform the mtime hack on Linux and Android.
Make sure to also look for the error part that occurs after the last format
specifier.
Still not great because it won't fail if there's unexpected output at the
beginning or end of the string.
Commit 5f849d0 changed control-C to print an inverted ^C and then a newline.
The original motivation was
> In bash if you type something and press ctrl-c then the content of the line
> is preserved and the cursor is moved to a new line. In fish the ctrl-c just
> clears the line. For me the behaviour of bash is a bit better, because it
> allows me to type something then press ctrl-c and I have the typed string
> in the log for further reference.
This sounds like a valid use case in some scenarios but I think that most
abandoned commands are noise. After all, the user erased them. Also, now that
we have undo that can be used to get back a limited set of canceled commands.
I believe the original motivation for existing behavior (in other shells) was
that TERM=dumb does not support erasing characters. Similarly, other shells
like to leave behind other artifacts, for example when using tab-completion
or in their interactive menus but we generally don't.
Control-C is the obvious way to quickly clear a multi-line commandline.
IPython does the same. For the other behavior we have Alt-# although that's
probably not very well-known.
Restore the old Control-C behavior of simply clearing the command line.
Our unused __fish_cancel_commandline still prints the ^C. For folks who
have explicitly bound ^C to that, it's probably better to keep the existing
behavior, so let's leave this one.
Previous attempt at #4713 fizzled.
Closes#10213
This function is a hotspot, but it has inefficient codegen:
1. For whatever reason, the chars() iterator of wstr is slower
than that of a slice. Use the slice.
2. Unnecessary overflow checks were preventing vectorization.
Switch to a more optimized implementation.
This improves aliases benchmark time by about 9%.
Since none of the compiles(xxx) calls are to particularly complex code, we can
just use `rsconf` directly to test for the presence of the symbols or headers as
needed.
Note that it seems at least some of the previous detection was not working
correctly; in particular HAVE_PIPE2 was evaluating to false on my WSL install
where pipe2(2) was available (caught because it revealed some compilation errors
in that conditional compilation path after porting).
I kept the cfg names and the tests themselves mostly as-is, though we might want
to change that to conform with the rust convention of lowercase cfg names and
decide whether we want to prefix all these with have_, fish_, or nothing at all.
Also the posix_spawn() test should probably check for the symbol `posix_spawn()`
rather than the header `spawn.h` since we don't use it via the header but rather
via the symbol (but in reality they're almost certainly going to give the same
result).
NB: I only encountered this when rewriting the cfg detection, which means that
the previous detection wasn't correct since I have pipe2 on Linux but didn't run
into this build error before.
I had originally created a safe `set_locale()` wrapper and clippy-disallowed
`libc::setlocale()` but almost all our uses of `libc::setlocale()` are in a loop
where it makes much more sense to just obtain the lock outright then call
`setlocale()` repeatedly rather than lock it in the wrapper function each time.
No need to use cfg_attr and have to worry about syncing the preconditions for
the cfg_attr with the preconditions for where `slice_contains_slice()` is used
in the codebase, just mark it as `allow(unused)` with a comment.
Use Rust for executables
Drops the C++ entry points and restructures the Rust package into a
library and three binary crates.
Renames the fish-rust package to fish.
At least on Ubuntu, "fish_indent" is built before "fish".
Make sure export CURSES_LIBRARY_LIST to all binaries to make sure
that "cached-curses-libnames" is populated.
Closes#10198
Keep running tests serially to avoid breaking assumptions.
I think many of these tests can run in parallel and/or don't need test_init().
Use the safe variant everywhere, to get it done faster.
This would misname `\e\x7F` as "backspace":
bind -k backspace 'do something'
bind \e\x7F 'do something'
because it would check if there was any key *in there*.
This was probably meant for continuous mode, but it simply doesn't
work right. It's preferable to not give a key when one would work over
giving one when it's not correct.
This implements input and input_common FFI pieces in input_ffi.rs, and
simultaneously ports bind.rs. This was done as a single commit because
builtin_bind would have required a substantial amount of work to use the input
ffi.
The existing subsequence search commonly returns false positives.
Support globs, to allow searching for disconnected substrings in a better way.
Closes#10143Closes#10131
The "#[bench]" attribute is not allowed in stable Rust, so keep it behind
a new feature flag. Run on nightly Rust with
$ cargo bench --features=bechmark
test tests::encoding::bench::bench_convert_ascii ... bench: 125,988 ns/iter (+/- 1,128) = 1040 MB/s
Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
This reduces noise in the upcoming "Port execution" commit.
I accidentally made IoStreams a "class" instead of a "struct". Would be
easy to correct that but this will be deleted soon, so I don't think we care.
These printed "Unknown error while evaluating command substitution".
Now they print something like
```
fish: for: status: cannot overwrite read-only variable
for status in foo; end
^~~~~^
in command substitution
fish: Invalid arguments
echo (for status in foo; end)
^~~~~~~~~~~~~~~~~~~~~~~^
```
for `echo (for status in foo; end)`
This is, of course, still not *great*. Mostly the `fish: Invalid
arguments` is basically entirely redundant.
An alternative is to simply skip the error message, but that requires some
more scaffolding (describe_with_prefix adds some error messages on its
own, so we can't simply say "don't add the prefix if we don't have a
message")
(cherry picked from commit 1b5eec2af6)
* wildcard: Remove file size from the description
We no longer add descriptions for normal file completions, so this was
only ever reached if this was a command completion, and then it was
only added if the file wasn't a regular file... in which case it can't
be an executable.
So this was dead.
* Make possible_link() a maybe
This gives us the full information, not just "no" or "maybe"
* wildcard: Rationalize file/command completions
This keeps the entry_t as long as possible, and asks it, so especially
on systems with working d_type we can get by without a single stat in
most cases.
Then it guts file_get_desc, because that is only used for command
completions - we have been disabling file descriptions for *years*,
and so this is never called there.
That means we have no need to print descriptions about e.g. broken symlinks, because those are not executable.
Put together, what this means is that we, in most cases, only do
an *access(2)* call instead of a stat, because that might be checking
more permissions.
So we have the following constellations:
- If we have d_type:
- We need a stat() for every _symlink_ to get the type (e.g. dir or regular)
(this is for most symlinks, if we want to know if it's a dir or executable)
- We need an access() for every file for executables
- If we do not have d_type:
- We need a stat() for every file
- We need an lstat() for every file if we do descriptions
(i.e. just for command completion)
- We need an access() for every file for executables
As opposed to the current way, where every file gets one lstat whether
with d_type or not, and an additional stat() for links, *and* an
access.
So we go from two syscalls to one for executables.
* Some more comments
* rust link option
* rust remove size
* rust accessovaganza
* Check for .dll first for WSL
This saves quite a few checks if e.g. System32 is in $PATH (which it
is if you inherit windows paths, IIRC).
Note: Our WSL check currently fails for WSL2, where this would
be *more* important because of how abysmal the filesystem performance
on that is.
Given an env like
foo
bar=baz
we would set "foo" to empty due to a typo.
The typo is pointed out by a PORTING comment.
Luckily I don't think we ever hit this case because that would mean our
parent process has a serious bug. Rust's std::env::vars_os() skips env
lines that don't contain a "=" char. This seems like a reasonable behavior
for us too. Do that.
This makes it so expand_intermediate_segment knows about the case
where it's last, only followed by a "/".
When it is, it can do without the file_id for finding links (we don't
resolve the files we get here), which allows us to remove a stat()
call.
This speeds up the case of `...*/` by quite a bit.
If that last component was a directory with 1000 subdirectories we
could skip 1000 stat calls!
One slight weirdness: We refuse to add links to directories that we already visited, even if they are the last component and we don't actually follow them. That means we can't do the fast path here either, but we do know if something is a link (if we get d_type), so it still works in common cases.
This can be bound like `bind \cl clear-screen`, and is, by default
In contrast to the current way it doesn't need the external `clear`
command that was always awkward.
Also it will clear the screen and first draw the old prompt to remove
flicker.
Then it will immediately trigger a repaint, so the prompt will be overwritten.
This fixes the following deadlock. The C++ functions path_get_config and
path_get_data lazily determine paths and then cache those in a C++ static
variable. The path determination requires inspecting the environment stack.
If these functions are first called while the environment stack is locked
(in this case, when fetching the $history variable) we can get a deadlock.
The fix is to call them eagerly during env_init. This can be removed once
the corresponding C++ functions are removed.
This issue caused fish_config to fail to report colors and themes.
Add a test.
This uses "screen.reset_line" to move the cursor without informing the
reader's machinery (because that deals with positions *in the
commandline*), but then only repainted "if needed" - meaning if the
reader thought anything changed.
That could lead to a situation where the cursor stays at column 0
until you do something, e.g. in
```fish
bind -m insert u undo
```
when you press alt+u - because the *escape* calls repaint-mode, which
puts the cursor in column 0, and then the undo doesn't, which keeps it
there.
Of course this binding should also `repaint-mode`, because it changes
the mode.
Some changes might be ergonomic:
1. Make repaint-mode the default if the mode changed (we would need to
skip it for bracketed-paste)
2. Make triggering the repaint easier - do we need to set
force_exec_prompt_and_repaint to false here as well?
Anyway, this
Fixes#7910
This is a sensible thing to do, and fixes some cases where we're
state-dependent.
E.g. this fixes the case in the pager where some things are bold and
some aren't, because that bolding is (rather awkwardly) implicitly
triggered when we have a background, and so we don't notice we need to
re-do that bolding after we moved to the next line because we think we
still have the same color.
Fixes#9617
This adopts the Rust postfork code, bridging it from C++ exec module.
We use direct function calls for the bridge, rather than cxx/autocxx, so that we
can be sure that no memory allocations or other shenanigans are happening.
This implements the "postfork" code in Rust, including calling fork(),
exec(), and all the bits that have to happen in between. postfork lives
in the fork_exec module.
It is not yet adopted.
- wildcard_match is now closer to the original that is linked in a comment, as
pointer-arithmetic translates very poorly. The act of calling wildcard
patterns wc or wildcard is kinda confusing when wc elsewhere is widechar.
I sometimes find myself doing something like this:
- Look for a commandline that includes "echo" (as an example)
- Type echo, press up a few times
- I can't immediately find what I'm looking for
- Press ctrl-r to open up the history pager
- It uses the current commandline as the search string,
so now I'm looking for "echo foobar"
This makes it so if the search string already is in use, that's what
the history-pager picks as the initial search string.
We don't change anything about compilation-setup, we just immediately jump to
Rust, making the eventual final swap to a Rust entrypoint very easy.
There are some string-usage and format-string differences that are generally
quite messy.
- `libc::setlinebuf` is not available through Rust's libc it appears.
- autocxx fails to generate bindings using `*mut FILE`, instead go through
`void*`
- rust_main needs `parse_util_detect_errors_in_ast`, which is _partially_
ported, instead add FFI interop for C++.
- We need to set the filename if we are sourcing a file
This used to be assigned to the job, but that was removed in
f30ce21aaa.
Since then this was vestigial. It could have technically errored out,
but we should be catching that where we use the actual modes, not here.
Similar to `time`, except that one is more common as a command.
Note that this will also allow `builtin and`, which is somewhat
useless, but then it is also useless outside of a pipeline.
Addition to #9985
This used to print all codepoints outside of the ASCII range (i.e.
above 0x80) in \uXXXX or \UYYYYYYYY notation.
That's quite awkward, considering that this is about keys that are
being pressed, and many keyboards have actual symbols for these on
them - I have an "ö" key, so I would like to use `bind ö` and not
`bind \u00F6`. So we go by iswgraph.
On a slightly different note, `\e` was written as `\c[ (or \e)`. I do
not believe anyone really uses `\c[` (the `[` would need to
be escaped!), and it's confusing and unnecessary to even mention that.
This allows e.g. `foo | command time`, while still rejecting `foo | time`.
(this should really be done in the ast itself, but tbh most of
parse_util kinda should)
Fixes#9985
These are both clearly behind early returns, there is no need to check it again.
This isn't a case where we're doing logic gymnastics to see that it
can't be run without no_exec() being handled, this is
```c++
if (no_exec()) return;
// ..
// ..
// ..
if (no_exec()) foo;
```
We have already run waccess with X_OK. We already *know* the file is
executable.
There is no reason to check again.
Restores some of the speedup from the fast_waccess hack that was
removed to fix#9699.
- Add test to verify piped string replace exit code
Ensure fields parsing error messages are the same.
Note: C++ relied upon the value of the parsed value even when `errno` was set,
that is defined behaviour we should not rely on, and cannot easilt be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:
```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
To:
```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
This adopts the new function store, replacing the C++ version.
It also reimplements builtin_function in Rust, as these was too coupled to
the function store to handle in a separate commit.
We could end up overflowing if we print out something that's a multiple of the
chunk size, which would then finish printing in the chunk-printing, but not
break out early.
This confirmed that a file existed via access(file, F_OK).
But we already *know* that it does because this is the expansion for
the "trailing slash" - by definition all wildcard components up to
here have already been checked.
And it's not checking for directoryness either because it does F_OK.
This will remove one `access()` per result, which will cut the number
of syscalls needed for a glob that ends in a "/" in half.
This brings us on-par with e.g. `ls` (which uses statx while we use
newfstatat, but that should have about the same results)
Fixes#9891.
Remove the following C++ functions/methods, which have no callers:
fallback.cpp:
- wcstod_l
proc.cpp:
- job_t::get_processes
wutil.cpp:
- fish_wcstoll
- fish_wcstoull
Also drop unused configure checks/defines:
- HAVE_WCSTOD_L
- HAVE_USELOCALE
Remove the following C++ functions/methods, which have all been ported to Rust and no longer have any callers in C++:
common.cpp:
- assert_is_locked/ASSERT_IS_LOCKED
path.cpp:
- path_make_canonical
wutil.cpp:
- wreadlink
- fish_iswgraph
- file_id_t::older_than
This makes `fish -c begin` fail with a status of 127 - it already
printed a syntax error so that was weird. (127 was the status for
syntax errors when piping to fish, so we stay consistent with that)
We allow multiple `-c` commands, and this will return the regular
status if the last `-c` succeeded.
This is fundamentally an extremely weird situation but this is the
simple targeted fix - we did nothing, unsuccessfully, so we should
fail.
Things to consider in future:
1. Return something better than 127 - that's the status for "unknown
command"!
2. Fail after a `-c` failed, potentially even checking all of them
before executing the first?
Fixes#9888
This also cleans up and removes unnecessary usage of FFI-oriented `feature_metadata_t`,
which is only used from Rust code after `builtins/status` was ported.
Note this is slightly incomplete - the FD is not moved into the parser, and so
will be freed at the end of each directory change. The FD saved in the parser is
never actually used in existing code, so this doesn't break anything, but will
need to be corrected once the parser is ported.
Prior to this commit, FLOG used the ffi bridge to get the output fd. Invert
this: have fish set the output fd within main. This allows FLOG to be used in
pure Rust tests.
After accidentally running a command that includes a pasted password, I want
to delete command from history. Today we need to recall or type (part of)
that command and type "history delete". Let's maybe add a shortcut to do
this from the history pager.
The current shortcut is Shift+Delete. I don't think that's very discoverable,
maybe we should use Delete instead (but only if the cursor is at the end of
the commandline, otherwise delete a char).
Closes#9454
The tentative binding for the upcoming "history-pager-delete" is
bind -k sdc history-pager-delete or backward-delete-char
When Shift+Delete is pressed while the history pager is active,
"history-pager-delete" succeeds. In this case, the "or" needs to kick the
"backward-delete-char" out of the input queue.
After doing so, it continues reading, but interprets the input as
single-char binding. This breaks when the next key emits a multi-char sequence,
like the arrow keys.
Fix this by reading a full sequence, which means we need to run "read_char()"
instead of "read_ch()" (confusing, right?).
I'm still working on writing a test. Somehow this only reproduces in the
history pager where Shift+Delete followed by down arrow emits "[B" (since
we swallowed the leading escape char). Confusingly, it doesn't do that in
the commandline or the completion search field.
Prior to this change, parser_t exposed an environment_t, and Rust had to go
through that. But because we have implemented Environment in Rust, it is
better to just expose the native Environment from parser_t. Make that
change and update call sites.
We can't just call the Rust version of `fish_setlocale()` without also either
calling the C++ version of `fish_setlocale()` or removing all `src/complete.cpp`
variables that are initialized and aliasing them to their new rust counterparts.
Since we're not interested in keeping the C++ code around, just call the C++
version of the function via ffi until we don't have *any* C++ code referencing
`src/common.h` at all.
Note that *not* doing this and then calling the rust version of
`fish_setlocale()` instead of the C++ version will cause errant behavior and
random segfaults as the C++ code will try to read and use uninitialized values
(including uninitialized pointers) that have only had their rust counterparts
init.
Either add rust wrappers for C++ functions called via ffi or port some pure code
from C++ to rust to provide support for the upcoming `env_dispatch` rewrite.
The global variables are moved (not copied) from C++ to rust and exported as
extern C integers. On the rust side they are accessed only with atomic semantics
but regular int access is preserved from the C++ side (until that code is also
ported).
Historically fish has used the functions `fish_wcstol`, `fish_wcstoi`, and
`fish_wcstoul` (and some long long variants) for most integer conversions.
These have semantics that are deliberately different from the libc
functions, such as consuming trailing whitespace, and disallowing `-` in
unsigned versions.
fish has started to drift away from these semantics; some divergence from
C++ has crept in.
Rename the existing `fish_wcs*` functions in Rust to remove the fish
prefix, to express that they attempt to mirror libc semantics; then
introduce `fish_` wrappers which are ported from C++. Also fix some
miscellaneous bugs which have crept in, such as missing range checks.
This implements the primary environment stack, and other environments such
as the null and snapshot environments, in Rust. These are used to implement
the push and pop from block scoped commands such as `for` and `begin`, and
also function calls.
owning_null_terminated_array is used for environment variables, where we need to
provide envp for child processes. This switches the implementation from C++ to
Rust.
We retain the C++ owning_null_terminated_array_t; it simply wraps the Rust
version now.
init_curses() is/can be called more than once, in which case the previous
ncurses terminal state is leaked and a new one is allocated.
`del_curterm(cur_term)` is supposed to be called prior to calling `setupterm()`
if `setupterm()` is being used to reinit the default `TERMINAL *cur_term`.
The new asan exit handlers are called to get proper ASAN leak reports (as
calling _exit(0) skips the LSAN reporting stage and exits with success every
time).
They are no-ops when not compiled for ASAN.
This ports some signal setup and handling bits to Rust.
The signal handling machinery requires walking over the list of known signals;
that's not supported by the Signal type. Rather than duplicate the list of
signals yet again, switch back to a table, as we had in C++.
This also adds two further pieces which were neglected by the Signal struct:
1. Localize signal descriptions
2. Support for integers as the signal name
This allows the rust code to free up C++ resources allocated for a callback even
when the callback isn't executed (as opposed to requiring the callback to run
and at the end of the callback cleaning up all allocated resources).
Also add type-erased destructor registration to callback_t. This allows for
freeing variables allocated by the callback for debounce_t's
perform_with_callback() that don't end up having their completion called due to
a timeout.
Largely routine but for the trampolines in iothread.h and iothread.cpp which
were a real PITA to get correct w/ all their variants.
Integration is complete with all old code ripped out and the tests using the
rust version of the code.
There are many places where we want to treat a missing variable the same as
a variable with an empty value.
In C++ we handle this by branching on maybe_t<env_var_t>::missing_or_empty().
If it returns false, we go on to access maybe_t<env_var_t>::value() aka
operator*.
In Rust, Environment::get() will return an Option<EnvVar>.
We could define a MissingOrEmpty trait and implement it for Option<EnvVar>.
However that will still leave us with ugly calls to Option::unwrap()
(by convention Rust does use shorthands like *).
Let's add a variable getter that returns none for empty variables.
Except for the indent visitor bits.
Tests for parse_util_detect_errors* are not ported yet because they depend
on expand.h (and operation_context.h which depends on env.h).
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
This is basically a subset of type, so we might as well.
To be clear this is `command -s` and friends, if you do `command grep` that's
handled as a keyword.
One issue here is that we can't get "one path or not" because I don't
know how to translate a maybe_t? Do we need to make it a shared_ptr instead?
Vi visual mode selection highlighting behaves unexpectedly when the selection
foreground and background in the highlight spec don't match. The following
unexpected behaviors are:
* The foreground color is not being applied when defined by the
`fish_color_selection` variable.
* `set_color` options (e.g., `--bold`) would not be applied under the cursor
when selection begins in the middle of the command line or when the cursor
moves forward after visually selecting text backward.
With this change, visual selection respects the foreground color and any
`set_color` options are applied consistently regardless of where visual
selection begins and the position of the cursor during selection.
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
wcs2string converts a wide string to a narrow one. The result is
null-terminated and may also contain interior null-characters.
std::string allows this.
Rust's null-terminated string, CString, does not like interior null-characters.
This means we will need to use Vec<u8> or OsString for the places where we
use interior null-characters.
On the other hand, we want to use CString for places that require a
null-terminator, because other Rust types don't guarantee the null-terminator.
Turns out there is basically no overlap between the two use cases, so make
it two functions. Their equivalents in Rust will have the same name, so
we'll only need to adjust the type when porting.
This can be triggered on linux with:
```js
import { spawn } from 'child_process';
const shell = spawn('/home/alfa/dev/fish-shell/build-c++/fish', []);
```
Under node 19.8.1.
*No clue* how that happens, but since this is a workaround we shall
skip it.
Another from the "why are we asserting instead of doing something
sensible" department.
The alternative is to make exit() and return() compute their own exit
code, but tbh I don't want any *other* builtin to hit this either?
Fixes#9659
This shows some of the ugliness of the rust borrow checker when it comes to
safely implementing any sort of recursive access and the need to be overly
explicit about which types are actually used across threads and which aren't.
We're forced to use an `Arc` for `ItemMaker` (née `item_maker_t`) because
there's no other way to make it clear that its lifetime will last longer than
the FdMonitor's. But once we've created an `Arc<T>` we can't call
`Arc::get_mut()` to get an `&mut T` once we've created even a single weak
reference to the Arc (because that weak ref could be upgraded to a strong ref at
any time). This means we need to finish configuring any non-atomic properties
(such as `ItemMaker::always_exit`) before we initialize the callback (which
needs an `Arc<ItemMaker>` to do its thing).
Because rust doesn't like self-referential types and because of the fact that we
now need to create both the `ItemMaker` and the `FdMonitorItem` separately
before we set the callback (at which point it becomes impossible to get a
mutable reference to the `ItemMaker`), `ItemMaker::item` is dropped from the
struct and we instead have the "constructor" for `ItemMaker` take a reference to
an `FdMonitor` instance and directly add itself to the monitor's set, meaning we
don't need to move the item out of the `ItemMaker` in order to add it to the
`FdMonitor` set later.
CXX does not allow generic types like maybe_t. When porting a C++ function
that returns maybe_t to Rust, we return std::unique_ptr instead. Let's make
the transition more seamless by allowing to convert back to maybe_t implicitly.
* wutil: Rewrite `wrealpath` in Rust
* Reduce use of FFI types in `wrealpath`
* Addressed PR comments regarding allocation
* Replace let binding assignment with regular comparison
More ugliness with types that cxx bridge can't recognize as being POD. Using
pointers to get/set `termios` values with an assert to make sure we're using
identical definitions on both sides (in cpp from the system headers and in rust
from the libc crate as exported).
I don't know why cxx bridge doesn't allow `SharedPtr<OpaqueRustType>` but we can
work around it in C++ by converting a `Box<T>` to a `shared_ptr<T>` then convert
it back when it needs to be destructed. I can't find a clean way of doing it
from the cxx bridge wrapper so for now it needs to be done manually in the C++
code.
Types/values that are drop-in ready over ffi are renamed to match the old cpp
names but for types that now differ due to ffi difficulties I've left the `_ffi`
in the function names to indicate that this isn't the "correct" way of using the
types/methods.
The way cxx bridge works, it doesn't recognize any types from another module as
being shared cxx bridge types with generations native to both C++ and Rust,
meaning every module that was going to use function pointers would have to
define its own `c_void` type (because cxx bridge doesn't recognize any of
libc::c_void, std::ffi::c_void, or autocxx::c_void).
FFI on other platforms has long used the equivalent of `uint8_t *` as an
alternative to `void *` for code where `void` was not available or was
undesirable for some reason. We can join the club - this way we can always use
`* {const|mut} u8` in our rust code and `uint8_t *` in our C++ code to pass
around parameters or values over the C abi.
I needed to rename some types already ported to rust so they don't clash with
their still-extant cpp counterparts. Helper ffi functions added to avoid needing
to dynamically allocate an FdMonitorItem for every fd (we use dozens per basic
prompt).
I ported some functions from cpp to rust that are used only in the backend but
without removing their existing cpp counterparts so cpp code can continue to use
their version of them (`wperror` and `make_detached_pthread`).
I ran into issues porting line-by-line logic because rust inverts the behavior
of `std::remove_if(..)` by making it (basically) `Vec::retain_if(..)` so I
replaced bools with an explict enum to make everything clearer.
I'll port the cpp tests for this separately, for now they're using ffi.
Porting closures was ugly. It's nothing hard, but it's very ugly as now each
capturing lambda has been changed into an explicit struct that contains its
parameters (that needs to be dynamically allocated), a standalone callback
(member) function to replace the lambda contents, and a separate trampoline
function to call it from rust over the shared C abi (not really relevant to
x86_64 w/ its single calling convention but probably needed on other platforms).
I don't like that `fd_monitor.rs` has its own `c_void`. I couldn't find a way to
move that to `ffi.rs` but still get cxx bridge to consider it a shared POD.
Every time I moved it to a different module, it would consider it to be an
opaque rust type instead. I worry this means we're going to have multiple
`c_void1`, `c_void2`, etc. types as we continue to port code to use function
pointers.
Also, rust treats raw pointers as foreign so you can't do `impl Send for * const
Foo` even if `Foo` is from the same module. That necessitated a wrapper type
(`void_ptr`) that implements `Send` and `Sync` so we can move stuff between
threads.
The code in fd_monitor_t has been split into two objects, one that is used by
the caller and a separate one associated with the background thread (this is
made nice and clean by rust's ownership model). Objects not needed under the
lock (i.e. accessed by the background thread exclusively) were moved to the
separate `BackgroundFdMonitor` type.
Keeps the location of original function definition, and also stores
where it was copied. `functions` and `type` show both locations,
instead of none. It also retains the line numbers in the stack trace.
By default, fish does not complete files that have leading dots, unless the
wildcard itself has a leading dot. However this also affected completions;
for example `git add` would not offer `.gitlab-ci.yml` because it has a
leading dot.
Relax this for custom completions. Default file expansion still
suppresses leading dots, but now custom completions can create
leading-dot completions and they will be offered.
Fixes#3707.
When we draw the prompt, we move the cursor to the actual
position *we* think it is by issuing a carriage return (via
`move(0,0)`), and then going forward until we hit the spot.
This helps when the terminal and fish disagree on the width of the
prompt, because we are now definitely in the correct place, so we can
only overwrite a bit of the prompt (if it renders longer than we
expected) or leave space after the prompt. Both of these are benign in
comparison to staircase effects we would otherwise get.
Unfortunately, midnight commander ("mc") tries to extract the last
line of the prompt, and does so in a way that is overly naive - it
resets everything to 0 when it sees a `\r`, and doesn't account for
cursor movement. In effect it's playing a terminal, but not committing
to the bit.
Since this has been an open request in mc for quite a while, we hack
around it, by checking the $MC_SID environment variable.
If we see it, we skip the clearing. We end up most likely doing
relative movement from where we think we are, and in most cases it
should be *fine*.