186 Commits

Author SHA1 Message Date
Henrik Hørlück Berg
55302629cd Add various FFI-interop-functions
- `libc::setlinebuf` does not appear to be available through Rust's libc crate.
- autocxx fails to generate bindings using `*mut FILE`; go through
  `void*` instead.
- rust_main needs `parse_util_detect_errors_in_ast`, which is only _partially_
  ported; add FFI interop for the C++ version instead.
- We need to set the filename if we are sourcing a file.
2023-09-05 11:38:59 +02:00
Fabian Boehm
b454b3bc40 Also allow `command and` in a pipeline
Similar to `time`, except that one is more common as a command.

Note that this will also allow `builtin and`, which is somewhat
useless, but then it is also useless outside of a pipeline.

Addition to #9985
2023-08-26 13:45:54 +02:00
Fabian Boehm
482616f101 parse_util: Only reject time in a pipeline without decorator
This allows e.g. `foo | command time`, while still rejecting `foo | time`.

(this should really be done in the ast itself, but tbh most of
parse_util kinda should)

Fixes #9985
2023-08-25 19:45:15 +02:00
Johannes Altmanninger
12ce42a2f9 Rename kw() to keyword() also in C++ 2023-04-19 22:43:36 +02:00
Johannes Altmanninger
09ffac5a0a Port parse_util_compute_indents 2023-04-19 10:35:22 +02:00
Johannes Altmanninger
971d257e67 Port AST to Rust
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.

Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
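
As a rough illustration of the naming scheme described above, here is a minimal sketch. The node types, the `NodeVisitor` trait and the `Collector` are hypothetical stand-ins and are not taken from the actual port; they only show why one method per node type keeps the traits usable as trait objects.

```rust
// Hypothetical node types; the real AST has many more.
pub struct Argument {
    pub text: String,
}
pub struct Redirection {
    pub target: String,
}

// No overloading in Rust, so the node type is part of each method name.
pub trait NodeVisitor {
    fn visit_argument(&mut self, node: &Argument);
    fn visit_redirection(&mut self, node: &Redirection);
}

pub trait Node {
    // A generic method such as `fn accept<V: NodeVisitor>(&self, v: &mut V)`
    // would prevent using `dyn Node`, so a trait object is taken instead.
    fn accept(&self, visitor: &mut dyn NodeVisitor);
}

impl Node for Argument {
    fn accept(&self, visitor: &mut dyn NodeVisitor) {
        visitor.visit_argument(self);
    }
}

impl Node for Redirection {
    fn accept(&self, visitor: &mut dyn NodeVisitor) {
        visitor.visit_redirection(self);
    }
}

// Example visitor that records what it saw, in traversal order.
#[derive(Default)]
pub struct Collector {
    pub seen: Vec<String>,
}

impl NodeVisitor for Collector {
    fn visit_argument(&mut self, node: &Argument) {
        self.seen.push(format!("argument:{}", node.text));
    }
    fn visit_redirection(&mut self, node: &Redirection) {
        self.seen.push(format!("redirection:{}", node.target));
    }
}

// Visit a heterogeneous list of nodes through `dyn Node`.
pub fn collect(nodes: &[Box<dyn Node>]) -> Vec<String> {
    let mut collector = Collector::default();
    for node in nodes {
        node.accept(&mut collector);
    }
    collector.seen
}
```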

Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now.  In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t".  This pattern requires that "fish_indent" be exposed
as an includable header to the CXX bridge.

Alternatively, we could define FFI wrappers for recursive AST traversal.

Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:

The only client that requires mutable access is the populator.  To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.

The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).

Like in the C++ implementation, the AST nodes themselves are largely defined
via macros.  Union fields like "Statement" and "ArgumentOrRedirection"
currently do not use macros but may in the future.

This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList".  To make this work
we need to manually define "ExternType".

There is one annoyance with CXX: functions that take explicit lifetime
parameters must be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.

One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.

Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.

The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:

    $ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
    Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     195.5 ms ±   2.9 ms    [User: 190.1 ms, System: 4.4 ms]
      Range (min … max):   193.2 ms … 205.1 ms    15 runs

    Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     677.5 ms ±  62.0 ms    [User: 665.4 ms, System: 10.0 ms]
      Range (min … max):   611.7 ms … 805.5 ms    10 runs

    Summary
      './fish.old -c 'source ../share/completions/git.fish'' ran
        3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''

Leftovers:
- Enum variants are still snake_case; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snake_case name. This could be
  changed since it's not user visible.
2023-04-16 17:46:56 +02:00
Johannes Altmanninger
05bad5eda1 Port common.{h,cpp} to Rust
Most of it is duplicated, hence untested.

Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp. types) that should
   be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
   more changes for each new libc. We could also use this approach for 1.

Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
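
The substitution described above can be sketched with the standard library alone. The helper name `code_point_to_char` is hypothetical and is not the function used in the port; `char::from_u32` returns `None` for surrogates (and for values above U+10FFFF), so the fallback is a one-liner.

```rust
/// Convert a decoded code point to a `char`, substituting U+FFFD for values
/// that are not Unicode scalar values (surrogates, or anything above U+10FFFF).
pub fn code_point_to_char(code_point: u32) -> char {
    char::from_u32(code_point).unwrap_or(char::REPLACEMENT_CHARACTER)
}
```

With this sketch, `code_point_to_char(0xDE01)` yields the replacement character, while an ordinary value such as `0x41` maps to `'A'`.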
2023-04-02 15:17:06 +02:00
Johannes Altmanninger
39f3c894d7 Port tokenizer.cpp to Rust
In hindsight, I should probably have split this into three different commits.
2023-02-09 00:37:22 +01:00
Johannes Altmanninger
7f8d247211 Port parse_constants.h to Rust 2023-02-09 00:37:22 +01:00
Johannes Altmanninger
9ca160eac2 Convert parse_error_code_t to a scoped enum
This will make the Rust port's diff smaller.
2023-02-08 21:49:54 +01:00
Johannes Altmanninger
4639f7ec40 Follow Rust naming convention for some types
But don't do it for enum variants just yet.
2023-02-08 21:49:54 +01:00
Johannes Altmanninger
83fd7ea7c4 Port future_feature_flags.cpp to Rust
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:

Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").

Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").

"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
2023-02-03 18:55:06 +01:00
ridiculousfish
f38543ccb7 Rename ast::job_t to ast::job_pipeline_t
This works around an autocxx limitation where different types cannot
have the same name even if they live in different namespaces.

ast::job_t conflicts with job_t.
2023-02-02 19:34:48 -07:00
Fabian Boehm
0f8b9699a1 Fix error for {$}
Fixes #9337
2022-11-15 19:02:30 +01:00
Johannes Altmanninger
c4a60feff1 Stop attempting to complete inside comments
Inside a comment we offer plain file completions (or command completions if
the comment is in command position). However these completions are broken
because they don't consider any of the surrounding characters. For example
with a command line

    echo # comment
              ^ cursor

we suggest file completions and insert them as

    echo # comsomefile ment

Providing completions inside comments does not seem useful and it can be
misleading. Let's remove the completions; this should communicate better that
we are in a free-form comment that's not subject to fish syntax.

Closes #9320
2022-11-12 22:37:27 +01:00
Aaron Gyes
daf5e11179 Spelling fixes
Found with scspell
2022-10-28 20:10:09 -07:00
Aaron Gyes
efa2cf0cb6 Replace fallthrough comments with __fallthrough__
Defined in config.h
2022-10-26 21:02:48 -07:00
Aaron Gyes
b2a4a50daf Run include-what-you-use 2022-10-26 19:58:40 -07:00
ridiculousfish
5f4583b52d Revert "Re-implement macro to constexpr transition"
This reverts commit 3d8f98c395.

In addition to the issues mentioned on the GitHub page for this commit,
it also broke the CentOS 7 build.

Note one can locally test the CentOS 7 build via:

    ./docker/docker_run_tests.sh ./docker/centos7.Dockerfile
2022-09-20 11:58:37 -07:00
Mahmoud Al-Qudsi
3d8f98c395 Re-implement macro to constexpr transition
Be more careful with sign extension issues stemming from the differences in how
an untyped literal is promoted to an integer vs how a typed (and signed) `char`
is promoted to an integer.
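
The class of bug mentioned above can be imitated in Rust with explicit casts. This is only an analogy, assuming a platform where the C++ `char` is a signed 8-bit type; the function names are hypothetical and nothing here is taken from the commit itself.

```rust
/// Widen a byte the way a signed 8-bit `char` would be promoted: the high bit
/// is treated as a sign bit and extended.
pub fn widen_as_signed_char(byte: u8) -> i32 {
    byte as i8 as i32
}

/// Widen a byte the way an untyped literal such as 0xFF would be: the value
/// is preserved.
pub fn widen_as_literal(byte: u8) -> i32 {
    byte as i32
}
```

The two agree for bytes below 0x80 and diverge above: 0xFF becomes -1 through the signed path and 255 through the literal path, so a comparison between the two silently fails.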
2022-09-19 18:10:41 -05:00
Mahmoud Al-Qudsi
7c3e4a7ccb Revert "Convert constant macros to constexpr expressions"
This reverts commit e1626818f7.
2022-09-19 17:42:11 -05:00
Mahmoud Al-Qudsi
e1626818f7 Convert constant macros to constexpr expressions
Also convert some `const[expr] static xxx` to `const[expr] xxx` where it makes
sense to let the compiler deduce on its own whether or not to allocate storage
for a constant variable rather than imposing our view that it should have STATIC
storage set aside for it.

A few call sites were not making use of the `XXX_LEN` definitions and were
calling `strlen(XXX)` - these have been updated to use `const_strlen(XXX)`
instead.

I'm not sure if any toolchains will raise any issues with these changes...
CI will tell!
2022-09-19 17:17:09 -05:00
Mahmoud Al-Qudsi
351500e42d Emit more specific error for incomplete escape sequences
This replaces "Invalid token ..." with "Incomplete escape sequence ..." for
bare \c, \u, \U, \x, and \X escapes.
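
A minimal sketch of what "bare" means here, assuming the token text is available as a string. The helper `is_bare_escape` is hypothetical; the actual tokenizer check may be structured quite differently.

```rust
/// Returns true if `token` is one of the escape introducers listed above with
/// nothing following it (no control character or hex digits).
pub fn is_bare_escape(token: &str) -> bool {
    matches!(token, "\\c" | "\\u" | "\\U" | "\\x" | "\\X")
}
```

Under this sketch `\x` on its own counts as incomplete, whereas `\x41` and unrelated escapes such as `\n` do not.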
2022-09-16 15:44:33 -05:00
ridiculousfish
3eae0a9b6a clang-format all C++ files
This mostly re-sorts headers that got desorted after the IWYU
application in 14d2a6d8ff.
2022-08-21 15:02:19 -07:00
Aaron Gyes
14d2a6d8ff IWYU-guided #include rejiggering.
Let's hope this doesn't cause build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.

It's been a long time.

One fix this brings: I discovered that we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unknowingly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.

Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
2022-08-20 23:55:18 -07:00
Fabian Boehm
232ca25ff9 Add length to the parse_util syntax errors 2022-08-12 18:38:47 +02:00
Fabian Boehm
eaf92918e6 Fix error offset for command (foo)
This used the decorated statement offset when the expansion errors
refer to the command without decoration.
2022-08-12 18:38:47 +02:00
Johannes Altmanninger
8729623cec Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES
ESCAPE_ALL is not really a helpful name. Also it's the most common flag.
Let's make it the default so we can remove this unhelpful name.

While at it, let's add a default value for the flags argument, which helps
most callers.

The absence of ESCAPE_ALL makes it only escape nonprintable characters
(with some exceptions). We use this for displaying strings in the completion
pager as well as for the human-readable output of "set", "set -S", "bind"
and "functions".

No functional change.
2022-07-27 11:24:35 +02:00
ridiculousfish
1a4b1c3298 Remove the is_first parameter from tok_is_string_character
This parameter is unused now that carets are no longer special, per
7f905b082.
2022-04-16 10:47:01 -07:00
Johannes Altmanninger
4b5b56452b Make string syntax error location a bit more precise
String tokens are subdivided by command substitutions. Some syntax errors
can occur in the gap between two command substitutions. Make the caret point
to the start of that gap, instead of the token start.
2022-04-03 16:34:46 +02:00
Johannes Altmanninger
e717b13e75 Fix spurious syntax error on escaped $@ inside quoted command substitution
We detect use of unsupported features like $@ by scanning string tokens
as a whole. With quoted command substitution, this has false positives,
as reported in [1]. We already recursively run the same error checks on
command substitutions, so limit the remaining checks to the gaps in-between
command substitutions.

[1]: 5f94dfd094/.config/fish/README/bug.md (cannot-use-dollar-anchor-in-sed-regex-in-quoted-command-substitution)
2022-04-03 16:18:47 +02:00
Johannes Altmanninger
3e3f507012 Fix regression expanding \$()
When expanding command substitutions, we use a naïve way of detecting whether
the cmdsub has the optional leading dollar. We check if the last character was
a dollar, which breaks if it's an escaped dollar.  We wrongly expand
\$(echo "") to the empty string. Fix this by checking if the dollar was escaped.
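
One way to picture the escaped-dollar check is to count the backslashes directly before the dollar. This is a sketch under assumptions: the helper names are hypothetical, `paren_idx` is a character index that must not exceed the string length, and quoting context (for example single quotes) is ignored; the real fix may work differently.

```rust
/// Returns true if the character at `idx` is preceded by an odd number of
/// consecutive backslashes, i.e. it is escaped.
pub fn is_escaped(chars: &[char], idx: usize) -> bool {
    let backslashes = chars[..idx]
        .iter()
        .rev()
        .take_while(|&&c| c == '\\')
        .count();
    backslashes % 2 == 1
}

/// Given the character index of the opening parenthesis, report whether the
/// command substitution has an unescaped leading dollar.
pub fn has_leading_dollar(text: &str, paren_idx: usize) -> bool {
    let chars: Vec<char> = text.chars().collect();
    paren_idx > 0
        && chars[paren_idx - 1] == '$'
        && !is_escaped(&chars, paren_idx - 1)
}
```

Here `$(echo)` has a leading dollar, `\$(echo)` does not because the dollar is escaped, and `\\$(echo)` does again because the two backslashes escape each other.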

The parse_util_* functions have a bunch of output parameters. We should
return a parameter bag instead (I think I tried once and failed).
2022-04-03 15:54:08 +02:00
ridiculousfish
a960a3cde6 Emit an error if time is used past the first command in a pipeline
Fixes #8841
2022-03-31 16:14:59 -07:00
ridiculousfish
247d4b2c8f Rename EXEC_ERR_MSG to INVALID_PIPELINE_CMD_ERR_MSG
This error message was used for more than exec.
No functional change here.
2022-03-31 15:49:15 -07:00
Shay Aviv
2ef12af60e Fix comment parsing inside command substitutions and brackets 2022-02-08 16:20:31 +01:00
Johannes Altmanninger
8208fc4f87 Cleanup comment to match implementation
This was recently changed to return bool.
2021-12-12 18:21:35 +01:00
Aaron Gyes
ce475c0b4c more int -> bool
all the things
2021-12-09 00:52:45 -08:00
Johannes Altmanninger
4a575b26f5 Fix error check for repeated quoted command substitution
Commit e40eba358 (Treat text following quoted command substitution
as quoted) made parse_util_locate_cmdsubst_range() aware of quoted
command substitutions, by skipping surrounding text via quote_end().

However, it was not quite right. We fail to properly parse
two consecutive command substitutions in the same string,
because we don't maintain the quoting context across calls to
parse_util_locate_cmdsubst_range().  Let's track that bit in a
parameter. This allows us to get rid of the quote_end() hack.

Also apply this to the other place where we call
parse_util_locate_cmdsubst_range() in a loop (highlighting).

Fixes #8500
2021-12-04 16:56:07 +01:00
Johannes Altmanninger
c94dec5d0e Fix assertion error trying to highlight cmdsubs inside unbalanced quotes
I initially put this logic + assertion in another function, where we
always get balanced quotes. That is not the case for highlighting.
2021-10-31 14:28:54 +01:00
Johannes Altmanninger
db377385f6 Fix copy paste error 2021-10-31 14:28:54 +01:00
Johannes Altmanninger
e40eba3585 Treat text following quoted command substitution as quoted
Commit ec3d3a481 (Support "$(cmd)" command substitution without line
splitting, 2021-07-02) started treating an input string like
"a$()b" as if it were "a"$()"b". Yet, we do not actually insert the
virtual quotes. Instead we just adapted the definition of when quotes
are closed - hence the changes to quote_end().

parse_util_locate_cmdsubst_range() is aware
of the changes to quote_end() but some of its
callers like parse_util_detect_errors_in_argument() and
highlighter_t::color_as_argument() are not.  They split strings at
command substitution boundaries without handling the special quoting
rules. (Only the expansion logic did it right.)

Fix this by handling the special quoting rules inside
parse_util_locate_cmdsubst_range(). This is a bit hacky since it
makes it harder for callers to process some substrings in between
command substitutions, but that's okay because current callers only
care about what's inside the command substitutions.

Fixes #8394
2021-10-30 18:02:10 +02:00
Aaron Gyes
dcaa9c7959 fix incorrect error message for 'end --foo' 2021-10-01 04:54:02 -07:00
Fabian Homborg
4ffabd44be Don't add expansion error offset twice
Like the $status commit, this would add the offset to already existing
errors, so

```fish
(foo)
(bar)

something
```

would see the "(foo)" error, store the correct error location, then
see the "(bar)" error, and *add the offset of (bar)* to the "(foo)"
error location.

Solve this by making a new error list and appending it to the existing
ones.
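
A minimal sketch of that approach, with a hypothetical `ParseError` type and helper name (the real error type carries more fields): only the freshly collected errors are shifted, and errors already in the main list keep their locations.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ParseError {
    pub source_start: usize,
    pub text: String,
}

/// Shift only the newly found errors by `offset`, then append them.
/// Errors already in `errors` are left untouched.
pub fn append_with_offset(
    errors: &mut Vec<ParseError>,
    mut new_errors: Vec<ParseError>,
    offset: usize,
) {
    for err in &mut new_errors {
        err.source_start += offset;
    }
    errors.append(&mut new_errors);
}
```

Calling this once per command substitution, with that substitution's own offset, leaves the earlier "(foo)" error where it was when the "(bar)" error is added.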

There are a few other ways to solve this, including:

- Stopping after the first error (we only display the first anyway, I
think?)
- Making it so the source location has an "absolute" flag that shows
the offset has already been added (but do we ever need to add two offsets?)

I went with the simpler fix.
2021-09-30 18:09:58 +02:00
Fabian Homborg
6774a514fa Don't set error offset for $status
This would break the location of any prior errors without doing
anything of value.

E.g.

```fish
echo foo | exec grep # this exec is not allowed!

$status

somethingelse # The error might be found here!
```

Would apply the offset of `$status` to the offset of `exec`, locating
the error for `exec` somewhere after $status!
2021-09-30 18:09:58 +02:00
Johannes Altmanninger
3a375c2399 reader: fix regressions when moving between lines
Fixes some regressions from 35ca42413 ("Simplify some parse_util functions").
The tmux tests are not beautiful but I find them easy to write.
Probably a pexpect test would also be enough here?
2021-08-01 17:50:44 +02:00
Fabian Homborg
8939a71ec6 An empty string means we're on the first line
Oops, this broke up-or-search!
2021-07-27 20:11:32 +02:00
Fabian Homborg
35ca42413d Simplify some parse_util functions
Don't just reflexively drop down to wchar_t.
2021-07-27 18:39:56 +02:00
Fabian Homborg
08209b3d9a Forbid $status as a command
This is slightly unclean. Even though it would otherwise be syntactically
valid, using $status as a command is very very very likely to be an
error, like

    if not $status

We have reports of this surprisingly regularly, including #2773.

Because $status can only ever be a value from 0 to 255, it is also
very unlikely to be an actual command, and that command is very
unlikely to do what you want.

So we simply point the user towards the "conditions" help section,
which should explain things.
2021-07-27 18:37:20 +02:00
Fabian Homborg
3359e5d2e9 Let "return" exit a script (#8148)
Currently, if a "return" is given outside of a function, we'd just
throw an error.

That always struck me as a bit weird, given that scripts can also
return a value.

So simply let "return" outside also exit the script, kinda like "exit"
does.

However, unlike "exit" it doesn't quit an interactive shell - it seems
weird to have "return" do that as well. It sets $status, so it can be
used to quickly set that, in case you want to test something.
2021-07-21 22:33:39 +02:00
ridiculousfish
6960a56f29 parse_util_locate_brackets_of_type to only find cmdsubs
Now that we have a separate function for parsing slices, we no longer
need to support parsing slices in the same function as cmdsubs.
2021-07-14 13:59:48 -07:00