- `libc::setlinebuf` is not available through Rust's libc it appears.
- autocxx fails to generate bindings using `*mut FILE`, instead go through
`void*`
- rust_main needs `parse_util_detect_errors_in_ast`, which is _partially_
ported, instead add FFI interop for C++.
- We need to set the filename if we are sourcing a file
Similar to `time`, except that one is more common as a command.
Note that this will also allow `builtin and`, which is somewhat
useless, but then it is also useless outside of a pipeline.
Addition to #9985
This allows e.g. `foo | command time`, while still rejecting `foo | time`.
(this should really be done in the ast itself, but tbh most of
parse_util kinda should)
Fixes#9985
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:
Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").
Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").
"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
This works around an autocxx limitations where different types cannot
have the same name even if they live in different namespace.
ast::job_t conflicts with job_t.
Inside a comment we offer plain file completions (or command completions if
the comment is in command position). However these completions are broken
because they don't consider any of the surrounding characters. For example
with a command line
echo # comment
^ cursor
we suggest file completions and insert them as
echo # comsomefile ment
Providing completions inside comments does not seem useful and it can be
misleading. Let's remove the completions; this should communicate better that
we are in a free-form comment that's not subject to fish syntax.
Closes#9320
This reverts commit 3d8f98c395.
In addition to the issues mentioned on the GitHub page for this commit,
it also broke the CentOS 7 build.
Note one can locally test the CentOS 7 build via:
./docker/docker_run_tests.sh ./docker/centos7.Dockerfile
Be more careful with sign extension issues stemming from the differences in how
an untyped literal is promoted to an integer vs how a typed (and signed) `char`
is promoted to an integer.
Also convert some `const[expr] static xxx` to `const[expr] xxx` where it makes
sense to let the compiler deduce on its own whether or not to allocate storage
for a constant variable rather than imposing our view that it should have STATIC
storage set aside for it.
A few call sites were not making use of the `XXX_LEN` definitions and were
calling `strlen(XXX)` - these have been updated to use `const_strlen(XXX)`
instead.
I'm not sure if any toolchains will have raise any issues with these changes...
CI will tell!
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
ESCAPE_ALL is not really a helpful name. Also it's the most common flag.
Let's make it the default so we can remove this unhelpful name.
While at it, let's add a default value for the flags argument, which helps
most callers.
The absence of ESCAPE_ALL makes it only escape nonprintable characters
(with some exceptions). We use this for displaying strings in the completion
pager as well as for the human-readable output of "set", "set -S", "bind"
and "functions".
No functional change.
String tokens are subdivided by command substitutions. Some syntax errors
can occur in the gap between two command substitutions. Make the caret point
to the start of that gap, instead of the token start.
When expanding command substitutions, we use a naïve way of detecting whether
the cmdsub has the optional leading dollar. We check if the last character was
a dollar, which breaks if it's an escaped dollar. We wrongly expand
\$(echo "") to the empty string. Fix this by checking if the dollar was escaped.
The parse_util_* functions have a bunch of output parameters. We should
return a parameter bag instead (I think I tried once and failed).
Commit e40eba358 (Treat text following quoted command substitution
as quoted) made parse_util_locate_cmdsubst_range() aware of quoted
command substitutions, by skipping surrounding text via quote_end().
However, it was not quite right. We fail to properly parse
two consecutive command substitutions in the same string,
because we don't maintain the quoting context across calls to
parse_util_locate_cmdsubst_range(). Let's track that bit in a
parameter. This allows us to get rid of the quote_end() hack.
Also apply this to the other place where we call
parse_util_locate_cmdsubst_range() in a loop (highlighting).
Fixes#8500
Commit ec3d3a481 (Support "$(cmd)" command substitution without line
splitting, 2021-07-02) started treating an input string like
"a$()b" as if it were "a"$()"b". Yet, we do not actually insert the
virtual quotes. Instead we just adapted the definition of when quotes
are closed - hence the changes to quote_end().
parse_util_locate_cmdsubst_range() is aware
of the changes to quote_end() but some of its
callers like parse_util_detect_errors_in_argument() and
highlighter_t::color_as_argument() are not. They split strings at
command substitution boundaries without handling the special quoting
rules. (Only the expansion logic did it right.)
Fix this by handling the special quoting rules inside
parse_util_locate_cmdsubst_range(). This is a bit hacky since it
makes it harder for callers to process some substrings in between
command substitutions, but that's okay because current callers only
care about what's inside the command substitutions.
Fixes#8394
Like the $status commit, this would add the offset to already existing
errors, so
```fish
(foo)
(bar)
something
```
would see the "(foo)" error, store the correct error location, then
see the "(bar)" error, and *add the offset of (bar)* to the "(foo)"
error location.
Solve this by making a new error list and appending it to the existing
ones.
There's a few other ways to solve this, including:
- Stopping after the first error (we only display the first anyway, I
think?)
- Making it so the source location has an "absolute" flag that shows
the offset has already been added (but do we ever need to add two offsets?)
I went with the simpler fix.
This would break the location of any prior errors without doing
anything of value.
E.g.
```fish
echo foo | exec grep # this exec is not allowed!
$status
somethingelse # The error might be found here!
```
Would apply the offset of `$status` to the offset of `exec`, locating
the error for `exec` somewhere after $status!
Fixes some regressions from 35ca42413 ("Simplify some parse_util functions").
The tmux tests are not beautiful but I find them easy to write.
Probably a pexpect test would also be enough here?
This is slightly unclean. Even tho it would otherwise be syntactically
valid, using $status as a command is very very very likely to be an
error, like
if not $status
We have reports of this surprisingly regularly, including #2773.
Because $status can only ever be a value from 0 to 255, it is also
very unlikely to be an actual command, and that command is very
unlikely to do what you want.
So we simply point the user towards the "conditions" help section,
that should explain things.
Currently, if a "return" is given outside of a function, we'd just
throw an error.
That always struck me as a bit weird, given that scripts can also
return a value.
So simply let "return" outside also exit the script, kinda like "exit"
does.
However, unlike "exit" it doesn't quit an interactive shell - it seems
weird to have "return" do that as well. It sets $status, so it can be
used to quickly set that, in case you want to test something.