On the following "Port execution" commit, ASan will complain if we read
beyond a terminating null byte in get_autosuggestion_performer(). This is
actually working as intended but we need to appease ASan somehow.
get_pwd_slash() uses "if var.is_empty()" but it should be "if !var.is_empty()".
This wasn't a problem so far because in practice most code paths use the
get_pwd_slash() override from EnvStackImpl. The generic one is used in the
upcoming unit tests.
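A minimal sketch of the corrected condition, assuming the generic implementation only needs the value of PWD. The free-function signature and the use of `String` instead of fish's wide strings are simplifications for illustration, not the actual code:

```rust
/// Return the working directory with a trailing slash.
/// `pwd` stands in for the value of the PWD variable; `None` means unset.
fn get_pwd_slash(pwd: Option<&str>) -> String {
    let mut result = String::new();
    if let Some(var) = pwd {
        // The corrected check: only use the variable when it is NOT empty.
        if !var.is_empty() {
            result.push_str(var);
        }
    }
    // Append the slash unless the path already ends with one.
    if !result.ends_with('/') {
        result.push('/');
    }
    result
}
```

With the inverted check, a non-empty PWD was ignored and the function always returned a bare "/".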
This adopts the Rust postfork code, bridging it from the C++ exec module.
We use direct function calls for the bridge, rather than cxx/autocxx, so that we
can be sure that no memory allocations or other shenanigans are happening.
This implements the "postfork" code in Rust, including calling fork(),
exec(), and all the bits that have to happen in between. postfork lives
in the fork_exec module.
It is not yet adopted.
This introduces a new module called fork_exec, which will be for posix_spawn,
postfork, and flog_safe - stuff concerned with actually executing binaries,
and error reporting.
Add a FLOG_SAFE! macro which writes errors to the flog fd in an
async-signal-safe way. This implementation differs from the C++ in that we
allow printing integers directly, without requiring them to be converted to a
buffer first.
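As a rough illustration of how an integer can be formatted without allocating, here is a sketch that writes decimal digits into a caller-provided stack buffer. The function name and buffer size are assumptions for this example, not the macro's actual internals:

```rust
/// Format `val` as decimal into `buf` without allocating.
/// Returns the slice of `buf` that holds the digits (and sign).
/// 20 bytes is enough for i64::MIN: 19 digits plus the '-' sign.
fn format_int_safe(buf: &mut [u8; 20], val: i64) -> &[u8] {
    let negative = val < 0;
    // unsigned_abs avoids overflow when val is i64::MIN.
    let mut rest = val.unsigned_abs();
    let mut idx = buf.len();
    // Fill the buffer from the end, least significant digit first.
    loop {
        idx -= 1;
        buf[idx] = b'0' + (rest % 10) as u8;
        rest /= 10;
        if rest == 0 {
            break;
        }
    }
    if negative {
        idx -= 1;
        buf[idx] = b'-';
    }
    &buf[idx..]
}
```

The resulting bytes could then be handed to a plain write(2) call, which is async-signal-safe.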
This makes it so
```fish
if -e foo
# do something
end
```
complains about `-e` not being a command instead of `end` being used
outside of an if-block.
That means both that `-e` could now be used as a command name (it
already can outside of `if`!) *and* that we get a better error!
The only way to get `if` to be a decorated statement now is to use `if
-h` or `if --help` specifically (with a literal option).
The same goes for switch, while and begin.
It would be possible, alternatively, to disallow `if -e` and point
towards using `test` instead, but the "unknown command" message should
already point towards using `test` more than pointing at the
"end" (that might be quite far away).
- This is untested and unused, string ownership is very much subject to change
- Ports the minimally necessary parts of complete.rs as well
- This should fix an infinite loop in `create_directory` in `path.rs`, the first
`wstat` loop only breaks if it fails with an error that's different from
EAGAIN
- wildcard_match is now closer to the original that is linked in a comment, as
pointer-arithmetic translates very poorly. The act of calling wildcard
patterns wc or wildcard is kinda confusing when wc elsewhere is widechar.
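The intended shape of the retry loop in the `create_directory` item can be sketched as follows. This is a generic helper written for illustration, not the code in `path.rs`; it assumes EAGAIN is surfaced as `io::ErrorKind::WouldBlock`, which is how the Rust standard library reports it on Unix:

```rust
use std::io;

/// Retry `op` while it fails with EAGAIN (reported by std as `WouldBlock`).
/// Returns on success or on any other error, so the loop terminates as soon
/// as the operation stops reporting EAGAIN.
fn retry_on_eagain<T>(mut op: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    loop {
        match op() {
            Err(err) if err.kind() == io::ErrorKind::WouldBlock => continue,
            other => return other,
        }
    }
}
```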
This is in regards to a comment on 290d07a833, which resulted in 46c967903d.
Those commits handled the default path when it is unset on startup.
DEFAULT_PATH is used when PATH is unset at runtime as far as I can tell.
As far as I can tell this has had the non-overriding ordering behavior since inception
(or at least 17 years ago ea998b03f2).
We don't change anything about compilation-setup, we just immediately jump to
Rust, making the eventual final swap to a Rust entrypoint very easy.
There are some string-usage and format-string differences that are generally
quite messy.
In CMake this used a `version` file in the CARGO_MANIFEST_DIR, but
relying on that is problematic due to change-detection, as if we add
`cargo-rerun-if-changed:version`, cargo would rerun every time if the file does
not exist, since cargo would expect the file to be generated by the
build-script. We could generate it, but that relies on the output of `git
describe`, whose dependencies we can only limit to anything in the
`.git`-folder, again causing unnecessary build-script runs.
Instead, this reads the `FISH_BUILD_VERSION`-env-variable at compile time
instead of the `version`-file, and falls back to calling git-describe through
the `git_version`-proc-macro. We thus do not need to deal with extraneous
build-script running.
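A sketch of the compile-time lookup, using only the standard `option_env!` macro. The "unknown" fallback is a placeholder assumption standing in for the `git_version` proc-macro call, which needs an external crate; the function name is also illustrative:

```rust
/// Version string baked in at compile time.
/// `FISH_BUILD_VERSION` is read from the environment of the compiler;
/// "unknown" is a placeholder where the real code would call git-describe.
const BUILD_VERSION: &str = match option_env!("FISH_BUILD_VERSION") {
    Some(version) => version,
    None => "unknown",
};

/// Accessor so callers do not depend on how the value was obtained.
fn build_version() -> &'static str {
    BUILD_VERSION
}
```

Because the variable is read by the compiler rather than by a build script, no `cargo-rerun-if-changed` bookkeeping is involved.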
C++ main used getopt (no w!), which appears to internally print
error-messages. The Rust version will use `wgetopter_t`, and therefore needs to
print this itself.
- It is currently never set, but will be set once `main` is ported
- `should_suppress_stderr_for_tests` used to be PROGRAM_NAME !=
TESTS_PROGRAM_NAME, but the equivalent C++ code was
`!std::wcscmp(program_name, TESTS_PROGRAM_NAME)`, and `wcscmp` returns
zero if they are equal, thus is equivalent to `==` in Rust
Similar to `time`, except that one is more common as a command.
Note that this will also allow `builtin and`, which is somewhat
useless, but then it is also useless outside of a pipeline.
Addition to #9985
This allows e.g. `foo | command time`, while still rejecting `foo | time`.
(this should really be done in the ast itself, but tbh most of
parse_util kinda should)
Fixes #9985
- "1.6.0" now supports formatting let-else statements which we use liberally,
and appears to have some fixes in regards to long-indented-lines with macros
like `wgettext_ft!`
- This commit updates the formatting so that devs with the latest stable don't
see random format-fixes upon running `cargo fmt`
Note: This *requires* an argument after the format string:
```rust
FLOGF!(debug, "foo");
```
won't compile. I think that's okay, because in that case you should
just use FLOG.
An alternative is to make it skip the sprintf.
"FLOGF!" is supposed to treat its first argument as a format
string (but doesn't because that part isn't implemented currently).
That means running something like
```rust
FLOGF!(term_support, "curses var", var_name, "=", value);
```
would rightly just print "curses var", ignoring the other
arguments.
By contrast, FLOG! is the literal "just join these as a string"
version.
- Make CMake use the correct target-path
- Make build.rs use the correct target dir
Workspaces place it in the project root by default, the alternative to making
this change is to add a `.cargo/config.toml` file with
```toml
[build]
target-dir = "fish-rust/target"
```
Which I think is unnecessary, as we likely want to use the new location anyways.
- This allows running `cargo fmt/clippy/test/etc` from root
- Ideally the root should be the fish-rust package instead of being virtual, but
that requires changes to CMake/Corrosion. This change should instead be
completely compatible with our existing setup.
- This also means we will only have one `Cargo.lock` for all current and future
crates.
This was "function", needs to be "function*s*".
It was only an issue in the option parsing because we set cmd there
again instead of passing it. Maybe these should just be file-level constants?
This is an alternative to the very common pattern of
```rust
streams.err.append(output);
streams.err.append1('\n');
```
Which has negative performance implications, see https://github.com/fish-shell/fish-shell/pull/9229
It takes `Into<WString>` to hopefully avoid allocating anew when the argument is
a WString with leftover capacity
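A simplified sketch of the idea, using `String` in place of `WString` and a hypothetical stream type that counts how often the underlying append runs. The names are illustrative only:

```rust
/// Minimal stand-in for an output stream; `appends` counts append calls.
struct OutputStream {
    contents: String,
    appends: usize,
}

impl OutputStream {
    fn append(&mut self, s: &str) {
        self.contents.push_str(s);
        self.appends += 1;
    }

    /// Append `s` followed by a newline using a single `append` call.
    /// Taking `impl Into<String>` lets an owned String be reused in place,
    /// so the newline can land in its spare capacity if there is any.
    fn appendln(&mut self, s: impl Into<String>) {
        let mut line: String = s.into();
        line.push('\n');
        self.append(&line);
    }
}
```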
This removes some spurious unsafe and some imports.
Note: We don't use it in `test`, because that can be asked to check
arbitrary file descriptors, while this only checks stdout specifically.
Turns out doing `==` on Enums with values will do a deep comparison,
including the values.
So EventDescription::Signal(SIGTERM) is !=
EventDescription::Signal(SIGWINCH).
That's not what we want here, so this does a bit of a roundabout thing.
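To illustrate the difference, the sketch below compares two values of a cut-down enum with `==` and with `std::mem::discriminant`. The enum, the signal numbers and the helper name are assumptions for this example; the commit does not say that its workaround uses `discriminant`, this is just one way to compare variants while ignoring their payloads:

```rust
use std::mem::discriminant;

// Illustrative signal numbers (the common Linux values).
const SIGTERM: i32 = 15;
const SIGWINCH: i32 = 28;

/// Cut-down stand-in for the real event description type.
#[derive(Debug, PartialEq)]
enum EventDescription {
    Signal(i32),
    Variable(String),
}

/// True if both values are the same variant, regardless of payload.
fn same_kind(a: &EventDescription, b: &EventDescription) -> bool {
    discriminant(a) == discriminant(b)
}
```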
The `impl<T> Hash for &T` hashes the string itself[^1].
It is unclear if that is actually faster than just calling `keyfunc` multiple times (they should all be linear).
For context, Rust by default uses SipHash 1-3 db1b1919ba
An alternative would be to store it as raw pointers aka `*const T`, which have a cheaper hash impl.
That has a more complicated implementation + removes lifetimes.
This commit rather removes the premature optimization.
[^1]: Source: https://doc.rust-lang.org/std/ptr/fn.hash.html
- The Err-variants will be used by e.g. wildcard, so might as well change it
now.
- `create_directory` should now not infinitely loop until it fails with an
error message that isn't `EAGAIN`
Padding with an unprintable character is now disallowed, like it was for other
zero-length characters.
`string shorten` now ignores escape sequences and non-printable characters
when calculating the visible width of the ellipsis used (except for `\b`,
which is treated as a width of -1).
Previously `fish_wcswidth` returned a length of -1 when the ellipsis-str
contained any non-printable character, causing the command to potentially
print a larger width than expected.
This also fixes integer overflows in `string shorten`'s
`max` and `max2`, when the cumulative sum of character widths turned negative
(e.g. with any non-printable characters, or `\b` after the changes above).
The overflow potentially caused strings containing non-printable characters
to be truncated.
This adds tests that verify the fixed behaviour.
- Add test to verify piped string replace exit code
Ensure fields parsing error messages are the same.
Note: C++ relied upon the value of the parsed value even when `errno` was set,
that is defined behaviour we should not rely on, and cannot easily be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:
```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
To:
```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
Empty hash maps muck around with TLS. Per code review, use a boxed slice
of a tuple instead. This has the nice benefit of printing inherited vars
in sorted order.
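One way such a container could look, as a sketch: pairs are sorted once at construction, lookups use binary search, and iteration is naturally in sorted order. The type and method names are hypothetical, and `String` replaces fish's wide strings:

```rust
/// Inherited variables as a sorted boxed slice of (name, value) pairs.
/// An empty boxed slice needs no allocation and no hasher state.
struct InheritedVars(Box<[(String, String)]>);

impl InheritedVars {
    fn new(mut pairs: Vec<(String, String)>) -> Self {
        // Tuples sort by name first, then by value.
        pairs.sort();
        InheritedVars(pairs.into_boxed_slice())
    }

    /// Look up a value by name via binary search on the sorted names.
    fn get(&self, name: &str) -> Option<&str> {
        let idx = self
            .0
            .binary_search_by(|(key, _)| key.as_str().cmp(name))
            .ok()?;
        Some(self.0[idx].1.as_str())
    }

    /// Names in storage order, which is sorted order.
    fn names(&self) -> Vec<&str> {
        self.0.iter().map(|(key, _)| key.as_str()).collect()
    }
}
```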
This adopts the new function store, replacing the C++ version.
It also reimplements builtin_function in Rust, as this was too coupled to
the function store to handle in a separate commit.
DirIter had a serious bug where it would crash on an invalid path. Make it more
robust and rationalize its error handling. Move it into its own module and add
tests.
Prior to this change, we had a silly wrapper type EventDescription which wrapped
EventType, which actually described the event.
Remove this wrapper and rename EventType to EventDescription (since it describes
more than just the type of event).
The RETURN_IN_ORDER argparse mode (enabled via leading '-') causes non-options
(i.e. positionals) to be returned intermixed with options in the original order,
instead of being permuted to the end. Such positionals are identified via the
option sentinel of char code 1. Use a real named constant for this return,
rather than weird stuff like '\u{1}'
This also allows scoped feature tests that make testing feature flags thread-safe.
As in you can guarantee that the test actually has the correct feature flag
value, regardless of which other tests are running in parallel.
This also cleans up and removes unnecessary usage of FFI-oriented `feature_metadata_t`,
which is only used from Rust code after `builtins/status` was ported.
Note this is slightly incomplete - the FD is not moved into the parser, and so
will be freed at the end of each directory change. The FD saved in the parser is
never actually used in existing code, so this doesn't break anything, but will
need to be corrected once the parser is ported.
This shaves about 9 seconds off of the runtime, and makes the test
deterministic.
We do not touch the test_convert test because there is a known failure and we
need to track it down before making it deterministic.
Get some stuff out of the common module, which is growing large.
Also migrate the tests into "native" Rust tests so they will run in parallel.
We have to use an explicit setlocale() call to get a multibyte locale, for the
"crazy" tests.
Prior to this commit, FLOG used the ffi bridge to get the output fd. Invert
this: have fish set the output fd within main. This allows FLOG to be used in
pure Rust tests.
Two small fixes:
1. ParsedSourceRef, if present, should not be None; express that in the type.
2. ParsedSourceRef is intended to be shareable across threads; make it so.
Use as_wstr() instead of from_ffi() in a few places to avoid an allocation,
and make job_control_t work in &wstr instead of &str to reduce complexity at
the call sites.
- Using an option makes it much clearer that the check for empty args is
redundant.
- Also prefer implementing TryFrom only for &str, to not hide the string
conversion and allocation happening.
This was present in the C++ version for command, though never for type.
Checking over all elements of PATH can be slow on some platforms, e.g.
WSL2, so only do that when used with `--all`.
Based on discussion in
https://github.com/fish-shell/fish-shell/pull/9856
This restores the status quo where builtins are like external commands
in that they can't see anything after a 0x00, because that's the c-style
string terminator.
* Make NULs work for builtins
This switches from passing a c-string to output_stream_t::append to
passing a proper string.
That means a builtin that prints a NUL no longer crashes with "thread '' panicked
at 'String contained intermediate NUL character: ".
Instead, it will actually handle the NUL, even as an argument.
That means something like
`echo foo\x00bar` will now actually print a NUL instead of truncating
after the `foo` because we passed c-strings around everywhere.
The former is *necessary* for e.g. `string`, the latter is a change
that on the whole makes dealing with NULs easier, but it is a
behavioral change.
To restore the c-string behavior we would have to truncate arguments
at NUL.
See #9739.
* Use AsRef instead of trait bound
Prior to this change, parser_t exposed an environment_t, and Rust had to go
through that. But because we have implemented Environment in Rust, it is
better to just expose the native Environment from parser_t. Make that
change and update call sites.
The writembs macro was ported from C++, which attempted to detect when a NULL
termcap was used. However we have never gotten a bug report from this. Bravely
remove it.
The outputter code has a lot of checks that string capabilities are non-empty;
just enforce that at the curses layer so we can remove those checks.
Also remove some types and traits, replacing them with simple functions.
Per code review, we think that tparm does nothing when there are no parameters,
and it is safe to remove it, even though this is a break from C++. This
simplifies some code.
This makes some simplifications to scoped_push and ScopeGuard:
1. ScopeGuard no longer uses ManuallyDrop; the memory management is now
trivial and no longer requires `unsafe`.
2. The functions `cancel` and `rollback` have been removed, as
these were unused. They can be added back later if needed.
3. `scoped_push` has been simplified in both signature and implementation.
4. `Projection` is no longer required and has been removed.
Also add some tests.
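For illustration, a guard without `ManuallyDrop` or `unsafe` can be sketched by keeping the cleanup closure in an `Option` and taking it out in `drop`. This is an assumption about the general shape, not the actual fish implementation:

```rust
use std::ops::{Deref, DerefMut};

/// Owns a value and runs `on_drop` on it when the guard goes out of scope.
struct ScopeGuard<T, F: FnOnce(&mut T)> {
    captured: T,
    // Wrapped in Option so `drop` can move the FnOnce out safely.
    on_drop: Option<F>,
}

impl<T, F: FnOnce(&mut T)> ScopeGuard<T, F> {
    fn new(captured: T, on_drop: F) -> Self {
        Self { captured, on_drop: Some(on_drop) }
    }
}

impl<T, F: FnOnce(&mut T)> Deref for ScopeGuard<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.captured
    }
}

impl<T, F: FnOnce(&mut T)> DerefMut for ScopeGuard<T, F> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.captured
    }
}

impl<T, F: FnOnce(&mut T)> Drop for ScopeGuard<T, F> {
    fn drop(&mut self) {
        if let Some(on_drop) = self.on_drop.take() {
            on_drop(&mut self.captured);
        }
    }
}
```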
We can't just call the Rust version of `fish_setlocale()` without also either
calling the C++ version of `fish_setlocale()` or removing all `src/complete.cpp`
variables that are initialized and aliasing them to their new rust counterparts.
Since we're not interested in keeping the C++ code around, just call the C++
version of the function via ffi until we don't have *any* C++ code referencing
`src/common.h` at all.
Note that *not* doing this and then calling the rust version of
`fish_setlocale()` instead of the C++ version will cause errant behavior and
random segfaults as the C++ code will try to read and use uninitialized values
(including uninitialized pointers) that have only had their rust counterparts
init.
This is not yet used but will eventually take the place of all (n)curses
access. The curses C library does a lot of header file magic with macro voodoo
to make it easier to perform certain tasks (such as access or override string
capabilities) but this functionality isn't actually directly exposed by the
library's ABI.
The rust wrapper eschews all of that for a more straightforward implementation,
directly wrapping only the basic curses library calls that are required to
perform the tasks we care about. This should let us avoid the subtle
cross-platform differences between the various curses implementations that
plagued the previous C++ implementation.
All functionality in this module that requires an initialized curses TERMINAL
pointer (`cur_term`, traditionally) has been subsumed by the `Term` instance,
which once initialized with `curses::setup()` can be obtained at any time with
`curses::Term()` (which returns an Option that evaluates to `None` if `cur_term`
hasn't yet been initialized).
Either add rust wrappers for C++ functions called via ffi or port some pure code
from C++ to rust to provide support for the upcoming `env_dispatch` rewrite.
The global variables are moved (not copied) from C++ to rust and exported as
extern C integers. On the rust side they are accessed only with atomic semantics
but regular int access is preserved from the C++ side (until that code is also
ported).
It's not clear whether or not `system_wcwidth()` was picked solely because of
the namespace conflict (which is easily remedied) but using the most obvious
name for this function should be the way to go.
We already have our own overload of `wcwidth()` (`fish_wcwidth()`) so it should
be more obvious which is the bare system call and which isn't.
(I do want to move this w/ some of the other standalone extern C wrappers to the
unix module later.)
Pull in the correct descriptions merged from across the various C++ header and
source files and get rid of the getter function that's only used in one place
but causes us to split the documentation for FISH_EMOJI_WIDTH across multiple
declarations.
This can be used for functions that accept non-Unicode content (i.e. &CStr or
CString) but are often used in our code base with a UTF-8 or UTF-32 string
on-hand.
When such a function is passed a CString, it's passed through as-is and
allocation-free. But when, as is often the case, we have a static string we can
now pass it in directly with all the nice ergonomics thereof instead of having
to manually create and unwrap a CString at the call location.
There's an upstream request to add this functionality to the standard library:
https://github.com/rust-lang/rust/issues/71448
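A sketch of how such a conversion trait might look. The trait and method names, the example function, and the choice to panic on an interior NUL are assumptions made for this illustration:

```rust
use std::ffi::CString;

/// Anything that can be turned into a CString for a C-style API.
trait ToCString {
    fn to_cstring(self) -> CString;
}

/// A CString is passed through as-is, with no new allocation.
impl ToCString for CString {
    fn to_cstring(self) -> CString {
        self
    }
}

/// A string slice is converted at the boundary, so callers need not do it.
impl ToCString for &str {
    fn to_cstring(self) -> CString {
        CString::new(self).expect("string contained an interior NUL")
    }
}

/// Hypothetical consumer: accepts either kind of argument.
fn byte_len(name: impl ToCString) -> usize {
    name.to_cstring().as_bytes().len()
}
```

Call sites can then write `byte_len("xterm")` directly instead of building and unwrapping a `CString` by hand.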
This is more complicated than it needs to be thanks to the presence of CMake and
the C++ ffi in the picture. rsconf can correctly detect the required libraries
and instruct rustc to link against them, but since we generate a static rust
library and have CMake link it against the C++ binaries, we are still at the
mercy of CMake picking up the symbols we want.
Unfortunately, we could detect the gettext symbols but discover at runtime that
they weren't linked in because CMake was compiled with `-DWITH_GETTEXT=0` or
similar (as the macOS CI runner does). This means we also need to pass state
between CMake and our build script to communicate which CMake options were
enabled.
Delegate the `view` and `view_mut` to the newly added `Projection<T>`, which
makes everything oh so much clearer and cleaner. Add comments to clarify what is
happening.
This can be used when you primarily want to return a reference but in order for
that reference to live long enough it must be returned with an object.
i.e. given `Mutex<Foo { bar }>` you want a function to lock the mutex and return
a reference to `bar` but you can't return that reference since it has a lifetime
dependency on `MutexGuard` (which only derefs to all of `Foo` and not just
`bar`). You can return a `Projection` owning the `MutexGuard<Foo>` and set it up
to deref to `&bar`.
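The mutex example can be sketched roughly like this. The struct layout, the use of a plain function pointer for the view, and the helper names are assumptions for illustration and may differ from the real `Projection`:

```rust
use std::ops::Deref;
use std::sync::{Mutex, MutexGuard};

/// Owns `owner` and derefs to a part of it selected by `view`.
struct Projection<Owner, T> {
    owner: Owner,
    view: fn(&Owner) -> &T,
}

impl<Owner, T> Deref for Projection<Owner, T> {
    type Target = T;
    fn deref(&self) -> &T {
        (self.view)(&self.owner)
    }
}

struct Foo {
    bar: String,
    baz: u32,
}

/// Selects the `bar` field through the guard.
fn project_bar<'a>(guard: &'a MutexGuard<'_, Foo>) -> &'a String {
    &guard.bar
}

/// Lock the mutex and hand back something that derefs to `bar` only.
/// The guard travels inside the Projection, keeping the lock held.
fn lock_bar(mutex: &Mutex<Foo>) -> Projection<MutexGuard<'_, Foo>, String> {
    Projection {
        owner: mutex.lock().unwrap(),
        view: project_bar,
    }
}
```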
This wasn't providing a lot of value, and the license compatibility is iffy.
There's a bit of weirdness in that this now uses a `Box<dyn Error>`,
but since currently nothing actually errors out let's punt that for
later.
This is a terrible way of going about things,
and means we're currently broken on any unix that isn't specifically listed.
But at least it'll build and allow us to keep the FreeBSD CI running.
Historically fish has used the functions `fish_wcstol`, `fish_wcstoi`, and
`fish_wcstoul` (and some long long variants) for most integer conversions.
These have semantics that are deliberately different from the libc
functions, such as consuming trailing whitespace, and disallowing `-` in
unsigned versions.
fish has started to drift away from these semantics; some divergence from
C++ has crept in.
Rename the existing `fish_wcs*` functions in Rust to remove the fish
prefix, to express that they attempt to mirror libc semantics; then
introduce `fish_` wrappers which are ported from C++. Also fix some
miscellaneous bugs which have crept in, such as missing range checks.
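To make the semantic differences concrete, here is a sketch of an unsigned parse with the fish-flavored rules described above: trailing whitespace is consumed and a leading `-` is rejected. The function name, the error enum and the base-10-only restriction are assumptions for this example, not the ported code:

```rust
/// Illustrative error type for this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    InvalidChar,
    Overflow,
}

/// Parse a base-10 unsigned integer with fish-style rules.
fn fish_parse_unsigned(src: &str) -> Result<u64, ParseError> {
    // Leading whitespace is skipped as in libc; trailing whitespace is
    // consumed too, which libc would instead leave unparsed.
    let text = src.trim();
    // Unlike libc's strtoul, a minus sign is an error rather than a wraparound.
    if text.starts_with('-') {
        return Err(ParseError::InvalidChar);
    }
    let digits = text.strip_prefix('+').unwrap_or(text);
    if digits.is_empty() {
        return Err(ParseError::Empty);
    }
    let mut value: u64 = 0;
    for c in digits.chars() {
        let digit = c.to_digit(10).ok_or(ParseError::InvalidChar)?;
        // Range check: fail instead of silently wrapping.
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(u64::from(digit)))
            .ok_or(ParseError::Overflow)?;
    }
    Ok(value)
}
```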
This implements the primary environment stack, and other environments such
as the null and snapshot environments, in Rust. These are used to implement
the push and pop from block scoped commands such as `for` and `begin`, and
also function calls.
owning_null_terminated_array is used for environment variables, where we need to
provide envp for child processes. This switches the implementation from C++ to
Rust.
We retain the C++ owning_null_terminated_array_t; it simply wraps the Rust
version now.
The `u64::from(buf.f_flag)` was needed in two places. The existing one handled macOS,
which always has a 32-bit statfs::f_flag, but statvfs::f_flag is an `unsigned
long` which means it needs to be coerced to 64-bits on 32-bit targets.