Commit Graph

51 Commits

Author SHA1 Message Date
Mahmoud Al-Qudsi
fe63775ec5 string: Also escape new lines with --style=regex
This isn't *required* in the PCRE2 spec but it greatly increases the utility of
escaped regex strings at the commandline.
2024-07-16 17:05:11 -05:00
Mahmoud Al-Qudsi
8b7597913e Add tests for string match/replace --max-matches 2024-06-30 17:51:50 -05:00
Johannes Altmanninger
8d88b4d358 Support quoted escaping also when ' or \ is present
Also, if there are more single quotes than double quotes and dollars, use
double quotes for quoting.
2024-04-13 15:33:05 +02:00
Johannes Altmanninger
1e858eae35 tests: filter control sequences only when interactive
This demonstrates that we only write control sequences when interactive.
2024-04-12 12:28:22 +02:00
Johannes Altmanninger
8bf8b10f68 Extended & human-friendly keys
See the changelog additions for user-visible changes.

Since we enable/disable terminal protocols whenever we pass terminal ownership,
tests can no longer run in parallel on the same terminal.

For the same reason, readline shortcuts in the gdb REPL will not work anymore.
As a remedy, use gdbserver, or lobby for CSI u support in libreadline.

Add sleep to some tests, otherwise they fall (both in CI and locally).

There are two weird failures on FreeBSD remaining, disable them for now
https://github.com/fish-shell/fish-shell/pull/10359/checks?check_run_id=23330096362

Design and implementation borrows heavily from Kakoune.

In future, we should try to implement more of the kitty progressive
enhancements.

Closes #10359
2024-04-02 14:35:16 +02:00
Himadri Bhattacharjee
4e6e897781
string repeat: allow omission of -n (#10282) 2024-02-11 12:19:02 +01:00
Henrik Hørlück Berg
6dd2cd2b20 Fix behaviour in the presence of non-visible width
Padding with an unprintable character is now disallowed, like it was for other
zero-length characters.

`string shorten` now ignores escape sequences and non-printable characters
when calculating the visible width of the ellipsis used (except for `\b`,
which is treated as a width of -1).
Previously `fish_wcswidth` returned a length of -1 when the ellipsis-str
contained any non-printable character, causing the command to poentially
print a larger width than expected.

This also fixes an integer overflows in `string shorten`'s
`max` and `max2`, when the cumulative sum of character widths turned negative
(e.g. with any non-printable characters, or `\b` after the changes above).
The overflow potentially caused strings containing non-printable characters
to be truncated.

This adds test that verify the fixed behaviour.
2023-07-27 22:00:03 -07:00
Henrik Hørlück Berg
20be990fd9 Port builtins/string to Rust
- Add test to verify piped string replace exit code

Ensure fields parsing error messages are the same.

Note: C++ relied upon the value of the parsed value even when `errno` was set,
that is defined behaviour we should not rely on, and cannot easilt be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:

```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```

To:

```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
2023-07-27 22:00:03 -07:00
Henrik Hørlück Berg
6325b3662d Fix #9899 integer overflow in string repeat
We could end up overflowing if we print out something that's a multiple of the
chunk size, which would then finish printing in the chunk-printing, but not
break out early.
2023-07-17 15:41:08 +02:00
Fabian Boehm
41568eb2a8 Move NUL-handling tests to their own file 2023-06-24 21:26:44 +02:00
Fabian Boehm
11c8d9684e
Make NULs work for builtins (#9859)
* Make NULs work for builtins

This switches from passing a c-string to output_stream_t::append to
passing a proper string.

That means a builtin that prints a NUL no longer crashes with "thread '' panicked
at 'String contained intermediate NUL character: ".

Instead, it will actually handle the NUL, even as an argument.

That means something like

`echo foo\x00bar` will now actually print a NUL instead of truncating
after the `foo` because we passed c-strings around everywhere.

The former is *necessary* for e.g. `string`, the latter is a change
that on the whole makes dealing with NULs easier, but it is a
behavioral change.

To restore the c-string behavior we would have to truncate arguments
at NUL.

See #9739.

* Use AsRef instead of trait bound
2023-06-22 20:50:22 +02:00
Fabian Boehm
cb28b39b24 string shorten: Make max of 0 mean no shortening
This makes it easier to just slot in `string shorten` wherever,
without having to do a weird "if test $max -gt 0" check.
2022-10-04 18:44:21 +02:00
Fabian Boehm
41c22d5e60 Add string shorten
This is essentially the inverse of `string pad`.
Where that adds characters to get up to the specified width,
this adds an ellipsis to a string if it goes over a specific maximum width.
The char can be given, but defaults to our ellipsis string.
("…" if the locale can handle it and "..." otherwise)

If the ellipsis string is empty, it just truncates.

For arguments given via argv, it goes line-by-line,
because otherwise length makes no sense.

If "--no-newline" is given, it adds an ellipsis instead and removes all subsequent lines.

Like pad and `length --visible`, it goes by visible width,
skipping recognized escape sequences, as those have no influence on width.

The default target width is the shortest of the given widths that is non-zero.

If the ellipsis is already wider than the target width,
we truncate instead. This is safer overall, so we don't e.g. move into a new line.
This is especially important given our default ellipsis might be width 3.
2022-09-09 18:49:57 +02:00
Fabian Boehm
eac808a819
string repeat: Don't allocate repeated string all at once (#9124)
* string repeat: Don't allocate repeated string all at once

This used to allocate one string and fill it with the necessary
repetitions, which could be a very very large string.

Now, it instead uses one buffer and fills it to a chunk size,
and then writes that.

This fixes:

1. We no longer crash with too large max/count values. Before they
caused a bad_alloc because we tried to fill all RAM.
2. We no longer fill all RAM if given a big-but-not-too-big value. You
could've caused fish to eat *most* of your RAM here.
3. It can start writing almost immediately, instead of waiting
potentially minutes to start.

Performance is about the same to slightly faster overall.
2022-08-09 19:58:56 +02:00
Fabian Homborg
585d1de653
Merge branch 'master' into string-preserve-missing-newline 2022-03-13 11:21:53 +01:00
Fabian Homborg
b48d8188b9 Change our test emoji
The emoji we used wasn't actually widened-in-9, so we now switch to
one that does.
2022-02-14 22:31:30 +01:00
Johannes Altmanninger
745129e825 builtin string: don't print final newline if it's missing from stdin
A command like "printf nonewline | sed s/x/y/" does not print a
concluding newline, whereas "printf nnl | string replace x y" does.
This is an edge case -- usually the user input does have a newline at
the end -- but it seems still better for this command to just forward
the user's data.

Teach most string subcommands to check if stdin is missing the trailing
newline, and stop adding one in that case.
This does not apply when input is read from commandline arguments.

* Most subcommands stop adding the final newline, because they don't
  really care about newlines, so besides their normal processing,
  they just want to preserve user input. They are:
  * string collect
  * string escape/unescape
  * string join¹
  * string lower/upper
  * string pad
  * string replace
  * string repeat
  * string sub
  * string trim

* string match keeps adding the newline, following "grep". Additionally,
  for string match --regex, it's important to output capture groups
  separated by newlines, resulting in multiple output lines for an
  input line. So it is not obvious where to leave out the newline.

* string split/split0 keep adding the newline for the same reason --
  they are meant to output multiple elements for a single input line.

¹) string join0 is not changed because it already printed a trailing
   zero byte instead of the trailing newline. This is consistent
   with other tools like "find -print0".

Closes #3847
2021-11-27 19:11:24 +01:00
Aaron Gyes
fefb913857 Update tests for changed error output 2021-11-03 22:54:55 -07:00
Fabian Homborg
2b8fe280e0 tests: Switch emoji used
widechar_width no longer classifies U+1F41F as widened-in-9, so the
width no longer changes.

Since we're interested in testing the change here, we need a different
emoji.

Just use 🥁, which was introduced in 9 as wide, and therefore widened
in 9.
2021-10-26 18:30:43 +02:00
Fabian Homborg
bb115c847e Handle backspaces for visible width
This makes it so we treat backspaces as width -1, but never go below a
0 total width when talking about *lines*, like in screen or string
length --visible.

Fixes #8277.
2021-09-23 12:58:35 +02:00
Fabian Homborg
2087a3ca63 Let visible length work with CR and LF
Because we are, ultimately, interested in how many cells a string
occupies, we *have* to handle carriage return (`\r`) and line
feed (`\n`).

A carriage return sets the current tally to 0, and only the longest
tally is kept. The idea here is that the last position is the same as
the last position of the longest string. So:

abcdef\r123

ends up looking like

123def

which is the same width as abcdef, 6.

A line feed meanwhile means we flush the current tally and start a new
one. Every line is printed separately, even if it's given as one.

That's because, well, counting the width over multiple lines
doesn't *help*.

As a sidenote: This is necessarily imperfect, because, while we may
know the width of the terminal ($COLUMNS), we don't know the current
cursor position. So we can only give the width, and the user can then
figure something out on their own.

But for the common case of figuring out how wide the prompt is, this
should do.
2021-08-04 21:09:47 +02:00
Fabian Homborg
f3f6e4a982 string: Add "--groups-only" to match
This adds a simple way of picking bits from a string that might be a
bit nicer than having to resort to a full `replace`.

Fixes #6056
2021-07-16 20:27:54 +02:00
Fabian Homborg
0e1f5108ae
string: Allow collect --allow-empty to avoid empty ellision (#8054)
* string: Allow `collect --no-empty` to avoid empty ellision

Currently we still have that issue where

    test -n (thing | string collect)

can return true if `thing` doesn't print anything, because the
collected argument will still be removed.

So, what we do is allow `--no-empty` to be used, in which case we
print one empty argument.

This means

    test -n (thing | string collect -n)

can now be safely used.

"no-empty" isn't the best name for this flag, but string's design
really incentivizes reusing names, and it's not *terrible*.

* Switch to `--allow-empty`

`--no-empty` does the exact opposite for `string split` and split0.

Since `-a`/`--allow-empty` already exists, use it.
2021-07-09 21:20:58 +02:00
Fabian Homborg
daa3cc17c4 Fix crash in string pad
Try:

    string pad -w 8 he \eh
2021-03-09 18:36:02 +01:00
Fabian Homborg
932074f06c escape_string_script: Escape DEL as \x7f
This used to print a literal DEL character in the output for `bind`,
which wouldn't actually show up and made it hard to figure out what
the key was.

So we just escape it back to how we actually used it - `\x7f`.

Fixes #7631.
2021-01-16 12:49:49 +01:00
ridiculousfish
7a0bddfcfa Teach string repeat to handle multiple arguments
Each argument in string repeat is handled independently, except that the
--no-newline option applies only to the last newline.

Fixes #5988
2021-01-11 17:00:06 -08:00
Fabian Homborg
aa895645dd Add string to reserved keywords
Since `string match` now creates variables, wrapping `string`
necessarily breaks things, so we need to disallow it.

See #7459, #7509.
2020-12-06 15:39:49 +01:00
Fabian Homborg
720982a3cb string: Quit early if --quiet is satisfied
E.g. if we do `string match -q`, and we find a match, nothing about
the input can change anything, so we quit early.

This is mainly useful for performance, but it also allows `string`
with `-q` to be used with infinite input (e.g. `yes`).

Alternative to #7495.
2020-12-01 18:55:01 +01:00
Johannes Altmanninger
3002c88f44 Set fish_emoji_width in test to guard against older wcwidth
See #7340
2020-09-28 18:09:39 +02:00
Fabian Homborg
dfcb4b811c tests: Regex the width tests
Some wcwidths are old.

Belongs to #7340.
2020-09-28 17:48:42 +02:00
Johannes Altmanninger
f758d39535 string pad: handle padding characters of width > 1
If the padding is not divisible by the char's width without remainder,
we pad the remainder with spaces, so the total width of the output is correct.

Also add completions, changelog entry, adjust documentation, add examples
with emoji and some tests.  Apply some minor style nitpicks and avoid extra
allocations of the input strings.
2020-09-27 21:59:15 +02:00
Andrew Prokhorenkov
92511b09c4 New command "string pad" to pad text to a given width (#7340)
Pads text to a given width, or the maximum width of all inputs.
2020-09-27 21:59:15 +02:00
Fabian Homborg
7ec57f2c50 string: Handle unmatched capturing groups as empty
Instead of erroring out.

Fixes #7343.
2020-09-20 10:36:17 +02:00
ridiculousfish
5c3571d626 Revert accidental merge of #7340
This reverts back to commit d8e2cac83e.
I accidentally did a 'git push' during code review.
2020-09-19 19:31:44 -07:00
Andrew Prokhorenkov
2afa354c14 builtin_string: implement "width" argument for "string pad" 2020-09-19 19:25:57 -07:00
Andrew Prokhorenkov
efe94344e2 builtin_string: extra tests 2020-09-19 19:25:57 -07:00
Andrew Prokhorenkov
f389bb0e97 tests: added tests for "string pad" 2020-09-19 19:25:57 -07:00
Jason Nader
6a839519b9 string split: add --allow-empty flag to be used with --fields 2020-04-20 22:39:48 +02:00
Jason Nader
3bb86d3a61 string split --fields: handle multi-line/arg input 2020-04-20 22:39:48 +02:00
Jason Nader
21bbd2ecb4 Return 1 if non-existent field is given 2020-04-04 15:30:09 +02:00
Jason Nader
1329a40e87 Allow simple ranges to be specified for --fields 2020-04-04 15:30:08 +02:00
Jason Nader
7cb1d3a646 Add string split --fields 2020-04-04 15:30:08 +02:00
George Christou
a3436110c1
Add string sub --end (#6765) 2020-03-22 15:53:09 +01:00
Fabian Homborg
9367d4ff71 Reindent functions to remove useless quotes
This does not include checks/function.fish because that currently
includes a "; end" in a message that indent would remove, breaking the test.
2020-03-09 19:46:43 +01:00
Fabian Homborg
69b464bc37 Run fish_indent on all our fish scripts
It's now good enough to do so.

We don't allow grid-alignment:

```fish
complete -c foo -s b -l barnanana -a '(something)'
complete -c foo -s z              -a '(something)'
```

becomes

```fish
complete -c foo -s b -l barnanana -a '(something)'
complete -c foo -s z -a '(something)'
```

It's just more trouble than it is worth.

The one part I'd change:

We align and/or'd parts of an if-condition with the in-block code:

```fish
if true
   and false
    dosomething
end
```

becomes

```fish
if true
    and false
    dosomething
end
```

but it's not used terribly much and if we ever fix it we can just
reindent.
2020-01-13 20:34:22 +01:00
Johannes Altmanninger
fdf398e435 show missing argument error only for last flag
closes #6483
2020-01-08 14:59:26 +01:00
Fabian Homborg
d77c465d23 string: Allow -eq again
Instead of forbidding it for both modes, allow it for both and make it
quiet for string.

Fixes #6282
2019-11-04 17:34:37 +01:00
Fabian Homborg
66938d206a string: Error out on match -eq
The `--entire` would enable output even though the `--quiet` should
have silenced it. These two don't make any sense together so print an
error, because the user could have just left off the `-q`.
2019-10-22 22:11:36 +02:00
Aaron Gyes
61f0756fe6 builtins: Use standard builtin.h error macros more 2019-09-17 22:04:33 -07:00
ridiculousfish
554ee240b3 Correct handling of explicitly separated output when all elements are empty
Previously when propagating explicitly separated output, we would early-out
if the buffer was empty, where empty meant contains no characters. However
it may contain one or more empty strings, in which case we should propagate
those strings.

Remove this footgun "empty" function and handle this properly.

Fixes #5987
2019-07-21 14:00:27 -07:00