Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
- `libc::setlinebuf` does not appear to be available through Rust's libc crate.
- autocxx fails to generate bindings using `*mut FILE`; go through `void*`
instead.
- rust_main needs `parse_util_detect_errors_in_ast`, which is only _partially_
ported; add FFI interop for the C++ version instead.
- We need to set the filename if we are sourcing a file
Remove the following C++ functions/methods, which have all been ported to Rust and no longer have any callers in C++:
common.cpp:
- assert_is_locked/ASSERT_IS_LOCKED
path.cpp:
- path_make_canonical
wutil.cpp:
- wreadlink
- fish_iswgraph
- file_id_t::older_than
We can't just call the Rust version of `fish_setlocale()` without also either
calling the C++ version of `fish_setlocale()` or removing all `src/complete.cpp`
variables that are initialized and aliasing them to their new rust counterparts.
Since we're not interested in keeping the C++ code around, just call the C++
version of the function via ffi until we don't have *any* C++ code referencing
`src/common.h` at all.
Note that *not* doing this and then calling the rust version of
`fish_setlocale()` instead of the C++ version will cause errant behavior and
random segfaults as the C++ code will try to read and use uninitialized values
(including uninitialized pointers) that have only had their rust counterparts
initialized.
Largely routine but for the trampolines in iothread.h and iothread.cpp which
were a real PITA to get correct w/ all their variants.
Integration is complete with all old code ripped out and the tests using the
rust version of the code.
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp. types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
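A minimal sketch of the fallback described above. The function name `decode_code_point` is hypothetical and is not the name used in the port; it only shows that a surrogate value cannot become a Rust char:

```rust
/// Turn a code point taken from an escape such as \UDE01 into a `char`.
/// Hypothetical helper: surrogates and out-of-range values are not valid
/// Unicode scalar values, so they are replaced with U+FFFD.
fn decode_code_point(cp: u32) -> char {
    char::from_u32(cp).unwrap_or(char::REPLACEMENT_CHARACTER)
}
```

Here `decode_code_point(0xDE01)` yields the replacement character, while a regular code point such as 0x1F4A9 is returned unchanged.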
wcs2string converts a wide string to a narrow one. The result is
null-terminated and may also contain interior null-characters.
std::string allows this.
Rust's null-terminated string, CString, does not like interior null-characters.
This means we will need to use Vec<u8> or OsString for the places where we
use interior null-characters.
On the other hand, we want to use CString for places that require a
null-terminator, because other Rust types don't guarantee the null-terminator.
Turns out there is basically no overlap between the two use cases, so make
it two functions. Their equivalents in Rust will have the same name, so
we'll only need to adjust the type when porting.
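To illustrate why the two use cases need different types, here is a small sketch. The names `to_bytes` and `to_c_string` are hypothetical stand-ins, not the names of the actual functions, and the input is taken as raw bytes to keep the example short:

```rust
use std::ffi::CString;

/// Hypothetical variant for callers that may have interior NUL bytes:
/// a plain byte vector keeps them, but has no guaranteed terminator.
fn to_bytes(input: &[u8]) -> Vec<u8> {
    input.to_vec()
}

/// Hypothetical variant for callers that need a NUL terminator:
/// CString::new appends one, and rejects input with an interior NUL.
fn to_c_string(input: &[u8]) -> Option<CString> {
    CString::new(input).ok()
}
```

With the input `a\0b`, `to_bytes` keeps all three bytes whereas `to_c_string` returns `None`.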
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:
Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").
Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").
"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
This meant we didn't actually do our weird en/decoding scheme for e.g.
a C locale, which meant that, when you then switch to a proper locale
the previous variables were broken.
I don't know how to test this automatically - none of my attempts seem
to ever *fail* with the old code, here's what you'd do manually:
- Run fish with an actual C locale (LC_ALL=C
fish_allow_singlebyte_locale=1 fish)
- `set -gx foo 💩`
- `set -e LC_ALL`
- `echo $foo` outputs "💩" if it works and "ð⏎" if it's broken.
Fixes #2613
Up to now, in normal locales \x was essentially the same as \X, except
that it errored if given a value > 0x7f.
That's kind of annoying and useless.
A subtle change is that `\xHH` now represents the character (if any)
encoded by the byte value "HH", so even for values <= 0x7f if that's
not the same as the ASCII value we would diverge.
I do not believe anyone has ever run fish on a system where that
distinction matters. It isn't a thing for UTF-8, it isn't a thing for
ASCII, it isn't a thing for UTF-16, it isn't a thing for any extended
ASCII scheme - ISO8859-X, it isn't a thing for SHIFT-JIS.
I am reasonably certain we are making that same assumption in other
places.
Fixes #1352
Closes#9240.
Squash of the following commits (in reverse-chronological order):
commit 03b5cab3dc40eca9d50a9df07a8a32524338a807
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 15:09:04 2022 -0500
Handle differently declared posix_spawnxxx_t on macOS
On macOS, posix_spawnattr_t and posix_spawn_file_actions_t are declared as void
pointers, so we can't use maybe_t's bool operator to test if it has a value.
commit aed83b8bb308120c0f287814d108b5914593630a
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 14:48:46 2022 -0500
Update maybe_t tests to reflect dynamic bool conversion
maybe_t<T> is now bool-convertible only if T _isn't_ already bool-convertible.
commit 2b5a12ca97b46f96b1c6b56a41aafcbdb0dfddd6
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date: Sun Sep 25 14:34:03 2022 -0500
Make maybe_t a little harder to misuse
We've had a few bugs over the years stemming from accidental misuse of maybe_t
with bool-convertible types. This patch disables maybe_t's bool operator if the
type T is already bool convertible, forcing the (barely worth mentioning) need
to use maybe_t::has_value() instead.
This patch both removes maybe_t's bool conversion for bool-convertible types and
updates the existing codebase to use the explicit `has_value()` method in place
of existing implicit bool conversions.
This was recently converted to a while-loop. However, we only
loop in one specific case (by hitting "continue"), so a
loop condition is not necessary.
No functional change.
We forgot to decode (i.e. turn into nice wchar_t codepoints)
"byte_literal" escape sequences. This meant that e.g.
```fish
string match ö \Xc3\Xb6
math 5 \X2b 5
```
didn't work, but `math 5 \x2b 5` did, and would print the wonderful
error:
```
math: Error: Missing operator
'5 + 5'
^
```
So, instead, we decode eagerly.
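As a rough illustration of what decoding the bytes means, assuming a UTF-8 locale (the real code goes through the locale's own conversion, and `decode_bytes` is a hypothetical name):

```rust
/// Decode a run of raw bytes, as produced by consecutive \X escapes,
/// into text. Assumes UTF-8; returns None if the bytes are not valid.
fn decode_bytes(bytes: &[u8]) -> Option<String> {
    std::str::from_utf8(bytes).ok().map(str::to_owned)
}
```

The two bytes 0xc3 0xb6 decode to "ö" and the single byte 0x2b decodes to "+", which is what the two commands above rely on.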
This reverts commit 3d8f98c395.
In addition to the issues mentioned on the GitHub page for this commit,
it also broke the CentOS 7 build.
Note one can locally test the CentOS 7 build via:
./docker/docker_run_tests.sh ./docker/centos7.Dockerfile
Be more careful with sign extension issues stemming from the differences in how
an untyped literal is promoted to an integer vs how a typed (and signed) `char`
is promoted to an integer.
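The class of bug is easy to reproduce in any language with signed 8-bit types. A sketch in Rust, with hypothetical helper names, of the two promotions that disagree for bytes above 0x7f:

```rust
/// Widen a byte the way a signed `char` is promoted: the sign bit is
/// extended, so 0xE9 becomes a negative number.
fn widen_signed(byte: u8) -> i32 {
    (byte as i8) as i32
}

/// Widen a byte the way an untyped literal such as 0xE9 is treated:
/// the value stays positive.
fn widen_unsigned(byte: u8) -> i32 {
    byte as i32
}
```

Comparing `widen_signed(0xE9)` against the literal `0xE9` fails, while `widen_unsigned(0xE9)` compares equal.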
Also convert some `const[expr] static xxx` to `const[expr] xxx` where it makes
sense to let the compiler deduce on its own whether or not to allocate storage
for a constant variable rather than imposing our view that it should have STATIC
storage set aside for it.
A few call sites were not making use of the `XXX_LEN` definitions and were
calling `strlen(XXX)` - these have been updated to use `const_strlen(XXX)`
instead.
I'm not sure if any toolchains will raise any issues with these changes...
CI will tell!
strncpy pads the rest of the destination buffer with NUL bytes.
In this case we have a 128-byte buffer and write "empty" (5 bytes)
into it.
So now instead of writing 6 bytes it'll write 128 bytes. That is especially
wasteful because we already did memset before.
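The byte counts can be checked with a small model of strncpy's documented semantics. This is a sketch written for illustration (`strncpy_model` is a hypothetical name), not code from the tree:

```rust
/// Model of C's strncpy: copy `src` up to its first NUL (or up to the
/// size of `dst`), then pad the remainder of `dst` with NUL bytes.
/// Returns the number of bytes written, which is always dst.len().
fn strncpy_model(dst: &mut [u8], src: &[u8]) -> usize {
    let text_len = src.iter().position(|&b| b == 0).unwrap_or(src.len());
    let n = text_len.min(dst.len());
    dst[..n].copy_from_slice(&src[..n]);
    for b in &mut dst[n..] {
        *b = 0;
    }
    dst.len()
}
```

For a 128-byte destination and the source "empty", the model touches all 128 bytes even though only the first 6 matter.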
Let's hope this doesn't cause build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings: I discovered that we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or appear before common.h gets included, we're
unknowingly working with the system 'assert' macro again, which
may get compiled out (when NDEBUG is defined) or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
The history pager will show multiline commands in single-line cells.
We escape newline characters as \\n but that looks awkward if the next line
starts with a letter. Let's render control characters using their corresponding
symbol from the Control Pictures Unicode block.
This means there is also no need to escape backslashes, which further improves
the history pager - now the rendering has exactly as many backslashes as
the eventual command.
This means that (multiline) commands in the history pager will be rendered
with the same amount of characters as are in the actual command (unless
they contain funny nonprintables). This makes it easy for the next commit
to highlight multiline commands correctly in the history pager.
The font size for these symbols (for example ␉) is quite small, but that's
okay since for the proposed uses it's not so important that they are readable.
The important thing is that they stand out from surrounding text.
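A sketch of the mapping, relying only on the layout of the Control Pictures block (U+2400 plus the control code). The function name `control_picture` is hypothetical, and mapping DEL to U+2421 is an assumption of this sketch rather than something stated above:

```rust
/// Replace an ASCII control character with its symbol from the
/// Control Pictures block; leave every other character untouched.
fn control_picture(c: char) -> char {
    match c as u32 {
        // C0 controls map to U+2400..=U+241F in the same order.
        cp @ 0x00..=0x1F => char::from_u32(0x2400 + cp).unwrap_or(c),
        // Assumption: DEL is shown as U+2421 (SYMBOL FOR DELETE).
        0x7F => '\u{2421}',
        _ => c,
    }
}
```

Because the mapping is one character to one character, a rendered command has exactly as many characters as the original one.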
We use "c > 0" but we actually mean "c != 0". The former looks like the
other code path handles negative c. Yet if c is negative, our code would
print a single escaped byte (\xXY) which is wrong because a negative value
has "sizeof wchar_t" bytes which is at least 2.
I think on platforms with 16-bit wchar_t it's possible that we actually
get a negative value but I haven't checked.
Since the fix for #3892, this escaping style escapes
\n to \\n
as well as
\\ to \\\\
\' to \\'
I believe these two are the only printable characters that are escaped with
ESCAPE_NO_PRINTABLES.
The rationale is probably to keep the encoding unambiguous and reversible.
However that doesn't justify escaping the single quote. Probably this was
an accident, so let's revert that part.
This has the nice effect that single quotes will no longer be escaped
when rendered in the completion pager (which is consistent with other
special characters). Try it:
complete : -a "aaa\'\; aaaa\'\;" -f
Also this makes the error output of builtin bind consistent:
$ bind -e --preset \;
$ bind -e --preset \'
$ bind \;
bind: No binding found for sequence “;”
$ bind \'
bind: No binding found for sequence “'”
the last line is clearly better than the old version:
bind: No binding found for sequence “\'”
In general, the fact that ESCAPE_NO_PRINTABLES escapes the (printable)
backslash is weird but I guess it's fine because it looks more consistent to
users, even though the result is an undocumented subset of the fish language.
ESCAPE_ALL is not really a helpful name. Also it's the most common flag.
Let's make it the default so we can remove this unhelpful name.
While at it, let's add a default value for the flags argument, which helps
most callers.
The absence of ESCAPE_ALL makes it only escape nonprintable characters
(with some exceptions). We use this for displaying strings in the completion
pager as well as for the human-readable output of "set", "set -S", "bind"
and "functions".
No functional change.
Or should we stop using it?
I'm fine with either always or never using auto-formatting but our current
way of using it only sometimes is confusing.
No functional change.
The last remnant of the old debug system, this was only used in
show_stackframe.
Because that's only ever called with an "E" level currently I've
removed the level argument entirely. If it's needed we'd have to pass
a flog category here.