As mentioned in the comment the historical behavior is because pressing unknown
control characters like Ctrl+4 inserts confusing characters, so let's back
out that part of b77d1d0e2 (Stop crashing on invalid Unicode input, 2024-02-27).
We still have the code for rendering control characters, for pasted text,
or text recalled from history. It is unclear whether we should strip those.
Some terminals already strip control characters from pasted text -- but not
all of them: see https://codeberg.org/dnkl/foot/pulls/312 for example which
has a follow up called "Don't strip HT when pasting in non-bracketed mode".
Don't force the internal use of `RefCell<T>`, let the caller place that into
`MainThread<>` manually. This lets us remove the reference to `MainThread<>`
from the definition of `Screen` again and reduces the number of
`assert_is_main_thread()` calls.
Fairly straightforward, with the only unfortunate part of this being that
`Screen` isn't as pure and now encodes the facte that we use it with
main-thread-only stdout `Outputter`.
The regex struct is pretty large at 560 bytes, with the entire
Abbreviation being 664 bytes.
If it's an "Option<Regex>", any abbr gets to pay the price. Boxing it
means abbrs without a regex are over 500 bytes smaller.
IfStatement is 680 bytes, much larger than the other
variants (SwitchStatement is next at 232). An enum is as large as its
largest variant, so this saves a bunch, especially since
DecoratedStatement is much more likely than IfStatement.
This will speed up the no-execute benchmark by 1.07x.
Unlike C++, Rust requires "char" to be a valid Unicode code point. As a
workaround, we take the raw (probably UTF-8-encoded) input and convert each
input byte to a char representation from the private use area (see commit
3b15e995e (str2wcs: encode invalid Unicode characters in the private use
area, 2023-04-01)). We convert back whenever we output the string, which
is correct as long as the encoding didn't change since the data was input.
We also need to convert keyboard input; do that.
Quick testing shows that our reader drops PUA characters. Since this patch
converts both invalid Unicode input as well as PUA input into a safe PUA
representation, there's no longer a reason to not add PUA characters to
the commandline, so let's do that to restore traditional behavior.
Render them as � (REPLACEMENT CHARACTER); unfortunately we show one per
input byte instead of one per code point. To fix this we probably need our
own char type.
While at it, remove some special cases that try to prevent insertion of
control characters. I don't think they are necessary. Could be wrong..
This makes it so code like
```fish
echo foo
echo bar
```
is collapsed into
```fish
echo foo
echo bar
```
One empty line is allowed, more is overkill.
We could also allow more than one for e.g. function endings.
We don't need to know that it tried these five before finally getting
one, the list is *right there*.
It is also very unlikely that someone has "xterm" or "ansi" but not "xterm-256color"
For xterm-256color, we don't warn *at all* because we have that one hardcoded.
This allows us to get the terminfo information without linking against curses.
That means we can get by without a bunch of awkward C-API trickery.
There is no global "cur_term" kept by a library for us that we need to invalidate.
Note that it still requires a "unhashed terminfo database", and I don't know how well it handles termcap.
I am not actually sure if there are systems that *can't* have terminfo, everything I looked at
has the ncurses terminfo available to install at least.
... even if the file hasn't changed. This addresses an oddity in the following
case:
* Shell is started,
* function `foo` is sourced from foo.fish
* foo.fish is *externally* edited and saved
* <Loaded definition of `foo` is now stale, but fish is unaware>
* `funced foo` loads `type -p foo` showing changed definition, user exits
$EDITOR saving no changes (or with $status 0, more generally).
* Stale definition of `foo` remains
If a hostname starts with a dash `-` character, the prompt_hostname function
fails because the `string` function interprets it as an option instead
of an argument.