We often want to format and print a string to a fd, usually stdout/stderr.
In general we can't use "format!", "print!", "eprint!" etc. because they don't
know about our use of WString, where we encode invalid Unicode characters
in the private use area.
Instead we use "wwrite_to_fd()".
Since we unfortunately don't have a "wformat!()" yet, we use "sprintf!()"
to create a formatted wstring to pass to "wwrite_to_fd()".
Add "printf!" and "eprintf!" to stand in for "print!" and "eprint!".
For printing to files other than stdout and stderr, keep "fwprintf!" but
drop the "w" since our "sprintf!" always produces wide strings.
Replace "fputws" with "fprintf" though we could also use "wwrite_to_fd"
if performance matters.
Unlike std::io::stdout(), we don't use locking yet.
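To illustrate why the std macros are not enough, here is a minimal sketch of the
conversion a wide-string writer has to perform before bytes reach the fd. The
names `wide_to_bytes` and `write_wide` are hypothetical, and the private-use
base 0xF600 is an assumption made for this example; this is not the actual
"wwrite_to_fd()" implementation.

```rust
use std::io::{self, Write};

/// Assumed start of the private-use range that carries undecodable bytes.
const ENCODE_DIRECT_BASE: u32 = 0xF600;

/// Convert a wide string back to bytes. Chars inside the assumed private-use
/// range become the single raw byte they stand for; everything else is
/// encoded as ordinary UTF-8.
fn wide_to_bytes(s: &[char]) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    let mut buf = [0u8; 4];
    for &c in s {
        let cp = c as u32;
        if (ENCODE_DIRECT_BASE..ENCODE_DIRECT_BASE + 256).contains(&cp) {
            out.push((cp - ENCODE_DIRECT_BASE) as u8);
        } else {
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        }
    }
    out
}

/// Hypothetical stand-in for a wide-string write: decode, then write all bytes.
fn write_wide(out: &mut impl Write, s: &[char]) -> io::Result<()> {
    out.write_all(&wide_to_bytes(s))
}
```

"print!" would emit the private-use char as a three-byte UTF-8 sequence instead
of the original byte, which is why the formatted string has to go through a
conversion like this one.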
Remaining work:
- There are more places where we use \be?print(ln)?!
Usually we print strings that are guaranteed to be valid UTF-8, but not
always. We should probably make all of them respect our WString semantics
but preferably keep using the native Rust format strings (#9948).
- I think flog.rs currently uses String so it won't handle invalid Unicode
characters. We should probably fix this as well.
Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
This function only ever returns true if target_os=linux, so we need to invert
the OS check.
In the first invocation, this function may allocate heap memory.
Clarify that this is safe.
[ja: I don't have the original commit handy so I made up the log message]
Unlike our C++ tests, our Rust tests fail as soon as an assertion fails.
Whether this is desired is debatable; it seems fine for
most cases and is easier to implement.
This means that Rust tests usually don't need to print anything besides
what assert!/assert_eq! already provide.
One exception is the history merge test. Let's add a simple err!() macro to
support this. Unlike the C++ err() it does not yet print colors.
Currently all of our macros live in common.rs, to keep the import graph simple.
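A minimal sketch of what such a macro could look like. The counter `ERR_COUNT`
and the helper `record_err` are hypothetical names used only for this example;
the real err!() may be shaped differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counter of reported test errors.
static ERR_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Record one error and print it to stderr, without colors.
fn record_err(msg: &str) {
    ERR_COUNT.fetch_add(1, Ordering::Relaxed);
    eprintln!("Error: {}", msg);
}

/// Report a test error using format!-style arguments without aborting the
/// test, unlike assert!.
macro_rules! err {
    ($($arg:tt)*) => {
        record_err(&format!($($arg)*))
    };
}
```

A test that wants to keep going after a mismatch can call
`err!("item {} missing", i)` and check the counter at the end.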
- This is untested and unused, string ownership is very much subject to change
- Ports the minimally necessary parts of complete.rs as well
- This should fix an infinite loop in `create_directory` in `path.rs`, the first
`wstat` loop only breaks if it fails with an error that's different from
EAGAIN
- It is currently never set, but will be set once `main` is ported
- `should_suppress_stderr_for_tests` used to be PROGRAM_NAME !=
TESTS_PROGRAM_NAME, but the equivalent C++ code was
`!std::wcscmp(program_name, TESTS_PROGRAM_NAME)`, and `wcscmp` returns
zero if they are equal, so it is equivalent to `==` in Rust
- Add test to verify piped string replace exit code
Ensure fields parsing error messages are the same.
Note: C++ relied upon the value of the parsed value even when `errno` was set.
That is defined behaviour, but we should not rely on it, and it cannot easily be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:
```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
To:
```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
Get some stuff out of the common module, which is growing large.
Also migrate the tests into "native" Rust tests so they will run in parallel.
We have to use an explicit setlocale() call to get a multibyte locale, for the
"crazy" tests.
This makes some simplifications to scoped_push and ScopeGuard:
1. ScopeGuard no longer uses ManuallyDrop; the memory management is now
trivial and no longer requires `unsafe`.
2. The functions `cancel` and `rollback` have been removed, as
these were unused. They can be added back later if needed.
3. `scoped_push` has been simplified in both signature and implementation.
4. `Projection` is no longer required and has been removed.
Also add some tests.
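For reference, a guard without ManuallyDrop can be sketched as below. This is an
illustration under assumed field names (`captured`, `on_drop`), not necessarily
the exact code in this commit: storing the callback in an `Option` lets `drop`
take it out and call it with no `unsafe`.

```rust
use std::ops::{Deref, DerefMut};

/// Owns a value and runs a callback on it when the guard goes out of scope.
pub struct ScopeGuard<T, F: FnOnce(&mut T)> {
    captured: T,
    on_drop: Option<F>,
}

impl<T, F: FnOnce(&mut T)> ScopeGuard<T, F> {
    pub fn new(captured: T, on_drop: F) -> Self {
        Self { captured, on_drop: Some(on_drop) }
    }
}

impl<T, F: FnOnce(&mut T)> Deref for ScopeGuard<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.captured
    }
}

impl<T, F: FnOnce(&mut T)> DerefMut for ScopeGuard<T, F> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.captured
    }
}

impl<T, F: FnOnce(&mut T)> Drop for ScopeGuard<T, F> {
    fn drop(&mut self) {
        // take() moves the FnOnce out of the Option so it can be called once.
        if let Some(on_drop) = self.on_drop.take() {
            on_drop(&mut self.captured);
        }
    }
}
```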
We can't just call the Rust version of `fish_setlocale()` without also either
calling the C++ version of `fish_setlocale()` or removing all `src/complete.cpp`
variables that are initialized and aliasing them to their new rust counterparts.
Since we're not interested in keeping the C++ code around, just call the C++
version of the function via ffi until we don't have *any* C++ code referencing
`src/common.h` at all.
Note that *not* doing this and then calling the rust version of
`fish_setlocale()` instead of the C++ version will cause errant behavior and
random segfaults as the C++ code will try to read and use uninitialized values
(including uninitialized pointers) that have only had their rust counterparts
initialized.
This can be used for functions that accept non-Unicode content (i.e. &CStr or
CString) but are often used in our code base with a UTF-8 or UTF-32 string
on-hand.
When such a function is passed a CString, it's passed through as-is and
allocation-free. But when, as is often the case, we have a static string we can
now pass it in directly with all the nice ergonomics thereof instead of having
to manually create and unwrap a CString at the call location.
There's an upstream request to add this functionality to the standard library:
https://github.com/rust-lang/rust/issues/71448
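A minimal sketch of the idea, assuming a trait named `ToCString` with a single
`to_cstring` method (both names are illustrative, as is the `path_len` caller).
Only the `CString`, `&CStr` and `&str` cases are shown; a UTF-32 impl would
follow the same pattern.

```rust
use std::ffi::{CStr, CString};

/// Anything that can be turned into an owned, nul-terminated C string.
pub trait ToCString {
    fn to_cstring(self) -> CString;
}

impl ToCString for CString {
    // Already a CString: pass it through without allocating.
    fn to_cstring(self) -> CString {
        self
    }
}

impl ToCString for &CStr {
    fn to_cstring(self) -> CString {
        self.to_owned()
    }
}

impl ToCString for &str {
    // Panics on interior NUL bytes, like an unwrap at the call site would.
    fn to_cstring(self) -> CString {
        CString::new(self).expect("string must not contain interior NUL bytes")
    }
}

/// Hypothetical caller: accepts any of the above without ceremony.
fn path_len(path: impl ToCString) -> usize {
    path.to_cstring().as_bytes().len()
}
```

With this, `path_len("/tmp")` and `path_len(some_cstring)` both compile, and
only the former allocates.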
Delegate the `view` and `view_mut` to the newly added `Projection<T>`, which
makes everything oh so much clearer and cleaner. Add comments to clarify what is
happening.
This can be used when you primarily want to return a reference but in order for
that reference to live long enough it must be returned with an object.
i.e. given `Mutex<Foo { bar }>` you want a function to lock the mutex and return
a reference to `bar` but you can't return that reference since it has a lifetime
dependency on `MutexGuard` (which only derefs to all of `Foo` and not just
`bar`). You can return a `Projection` owning the `MutexGuard<Foo>` and set it up
to deref to `&bar`.
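The Mutex example can be sketched as follows. The layout here (an owner plus two
plain function pointers named `view` and `view_mut`) is an assumption chosen to
keep the example short, and `Foo`, `bar` and `lock_bar` are illustrative names.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

/// Owns `Owner` but dereferences to a `T` reachable through it.
pub struct Projection<Owner, T> {
    owner: Owner,
    view: fn(&Owner) -> &T,
    view_mut: fn(&mut Owner) -> &mut T,
}

impl<Owner, T> Projection<Owner, T> {
    pub fn new(
        owner: Owner,
        view: fn(&Owner) -> &T,
        view_mut: fn(&mut Owner) -> &mut T,
    ) -> Self {
        Self { owner, view, view_mut }
    }
}

impl<Owner, T> Deref for Projection<Owner, T> {
    type Target = T;
    fn deref(&self) -> &T {
        (self.view)(&self.owner)
    }
}

impl<Owner, T> DerefMut for Projection<Owner, T> {
    fn deref_mut(&mut self) -> &mut T {
        (self.view_mut)(&mut self.owner)
    }
}

struct Foo {
    bar: u32,
    other: String,
}

/// Lock the mutex and hand back something that derefs to `bar` only.
/// The MutexGuard lives inside the Projection, so the lock is held for
/// exactly as long as the returned value exists.
fn lock_bar(m: &Mutex<Foo>) -> Projection<MutexGuard<'_, Foo>, u32> {
    Projection::new(m.lock().unwrap(), |g| &g.bar, |g| &mut g.bar)
}
```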
This ports some signal setup and handling bits to Rust.
The signal handling machinery requires walking over the list of known signals;
that's not supported by the Signal type. Rather than duplicate the list of
signals yet again, switch back to a table, as we had in C++.
This also adds two further pieces which were neglected by the Signal struct:
1. Localize signal descriptions
2. Support for integers as the signal name
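A rough sketch of the table approach. The entry layout, the three sample rows
and the function names are assumptions for illustration; the signal numbers are
hard-coded here rather than taken from libc, and the descriptions are shown
untranslated where the real code would pass them through localization.

```rust
/// One row of the signal table.
struct SignalEntry {
    signal: i32,
    name: &'static str,
    desc: &'static str,
}

/// A table can be walked, which a closed Signal type does not allow.
const SIGNAL_TABLE: &[SignalEntry] = &[
    SignalEntry { signal: 1, name: "SIGHUP", desc: "Terminal hung up" },
    SignalEntry { signal: 2, name: "SIGINT", desc: "Quit request from job control (^C)" },
    SignalEntry { signal: 15, name: "SIGTERM", desc: "Polite quit request" },
];

/// Accept "SIGINT", "int" or "2". In this sketch, integers must be in the table.
fn parse_signal(s: &str) -> Option<i32> {
    if let Ok(n) = s.parse::<i32>() {
        return SIGNAL_TABLE.iter().find(|e| e.signal == n).map(|e| e.signal);
    }
    let upper = s.to_ascii_uppercase();
    let bare = upper.strip_prefix("SIG").unwrap_or(upper.as_str());
    SIGNAL_TABLE
        .iter()
        .find(|e| &e.name[3..] == bare)
        .map(|e| e.signal)
}

/// Look up the (untranslated) description for a signal number.
fn signal_description(signal: i32) -> Option<&'static str> {
    SIGNAL_TABLE.iter().find(|e| e.signal == signal).map(|e| e.desc)
}
```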
Like the WSL check, this was incorrectly assuming WSL implies
cfg(windows) when it's actually picked up as Linux.
Also, improve over the C++ code by not relying on the build-time WSL
status to determine if we are running on WSL at runtime since it's often
the case that the fish binaries are built on a non-WSL host (for
packaging) then executed on WSL only at runtime.
(But it's ok to assume if fish has been built for Windows or not Linux
that it will either be run or not run on top of a Win32 character device
system.)
Also, port the comment and the relevant WSL and fish issue links over
from the CPP codebase for posterity.
* Since we already have an allocation of length wstr.len(), it's
probably better to allocate the result (which is strictly less than or
equal to the input length) up-front rather than risk thrashing the Vec
allocation,
* There's no need to compare c2 against '\0' since that will just cause
to_digit(16) to return None anyway,
* Our convert_hex() specialization of to_digit(16) that only checks
capital letters A-F without also checking lowercase a-f isn't
significantly faster than just using to_digit(16), and we already assert
that the input *wasn't* a lowercase a-f before making the call, so
there's no point in using a special function to handle that.
This reverts commit 76dc849fca.
The warning added in that commit is incorrect. The functions
unescape_string_url and unescape_string_var will not panic, because
char_at() returns 0 if the index is equal to the string's length.
This reverts commit f9c92753c4.
This commit attempted to replace exit_without_destructors() with
std::process::exit; however this is wrong for two reasons:
1. std::process::exit() runs Rust runtime cleanup stuff we don't want
2. std::process::exit() invokes destructors, meaning atexit handlers,
which we don't want.
The type system no longer guarantees that the input string is nul-terminated,
meaning accessing beyond the range-checked `i` a char-at-a-time is no longer
safe. (In C++, we would either be using a plain C string which is always
nul-terminated or we would be using (w)string::c_str() which similarly grants
access to its nul-terminated buffer.)
Aside from that, there's no need to explicitly check `if c2 == '\0'` because
'\0' is not a valid hex digit so the `?` tacked on to `convert_hex_digit(c2)?`
will abort and return `None` anyway.
convert_hex_digit() is not appreciably faster than char::to_digit(16) and makes
the code less maintainable since it encodes certain assumptions; since it's also
not used consistently just drop it in favor of the std fn.
Since the output string (per the decode logic) is always shorter than or equal
to the input string, just reserve the input string size upfront to prevent vec
reallocations.
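The points above can be sketched on a byte-oriented version of the decoder. The
function name `unescape_url` and the choice to return `None` on malformed input
are assumptions for this example; the real functions operate on wide strings.

```rust
/// Decode %XX escapes. Returns None if an escape is truncated or not hex.
fn unescape_url(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    // The output is never longer than the input, so reserve once up front.
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // get() is range-checked, so nothing relies on a trailing NUL.
            // to_digit(16) rejects NUL and every other non-hex char, so no
            // separate check for '\0' is needed.
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    Some(out)
}
```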
Somewhat counter-intuitively, this code is active when compiling under *Linux*
and is always false when compiling under Windows. The logic was incorrectly
reversed before (it's easier to reason about when you realize that fish doesn't
even compile under Windows because it uses tons of libc functions).
As the code was actually never compiled, it wasn't actually tested for validity
either and there were some issues that prevented it from compiling that have
since been fixed. The logic has also been adjusted a bit to make it possible to
use the rust-native int parsing instead of `libc::strtod()`.
The code has been changed to use `once_cell::race::OnceBool` instead of
`once_cell::sync::Lazy<T>` which imposes a greater runtime burden with locking
and other overhead. We don't care if the code runs more than once on init (if
calls were to race, though they probably don't) - just that the code isn't
subsequently executed on each call. The `once_cell::race` module is a better fit
here, though it doesn't expose the ergonomic `Lazy<T>` façade around its types.
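To make the trade-off concrete, here is a std-only sketch of the "race"
semantics: no lock, the initializer may run more than once if calls race, and
the first stored value wins. `RaceBool` is a hypothetical stand-in written for
this example, not the once_cell type itself.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const UNSET: u8 = 0;
const FALSE: u8 = 1;
const TRUE: u8 = 2;

/// A lazily initialized bool stored in a single atomic byte.
pub struct RaceBool(AtomicU8);

impl RaceBool {
    pub const fn new() -> Self {
        Self(AtomicU8::new(UNSET))
    }

    /// Return the cached value, computing it with `f` if not yet set.
    /// Racing callers may each run `f`, but all observe the same result.
    pub fn get_or_init(&self, f: impl FnOnce() -> bool) -> bool {
        match self.0.load(Ordering::Acquire) {
            TRUE => true,
            FALSE => false,
            _ => {
                let value = f();
                let encoded = if value { TRUE } else { FALSE };
                match self.0.compare_exchange(
                    UNSET,
                    encoded,
                    Ordering::AcqRel,
                    Ordering::Acquire,
                ) {
                    // We stored our value first.
                    Ok(_) => value,
                    // Another caller won the race; use its value.
                    Err(existing) => existing == TRUE,
                }
            }
        }
    }
}
```

A `static` of this type can cache the WSL check so that the probe runs on the
first call only, without the locking overhead of `Lazy<T>`.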