Unlike C++, Rust requires "char" to be a valid Unicode code point. As a
workaround, we take the raw (probably UTF-8-encoded) input and convert each
input byte to a char representation from the private use area (see commit
3b15e995e (str2wcs: encode invalid Unicode characters in the private use
area, 2023-04-01)). We convert back whenever we output the string, which
is correct as long as the encoding didn't change since the data was input.
We also need to convert keyboard input; do that.
Quick testing shows that our reader drops PUA characters. Since this patch
converts both invalid Unicode input as well as PUA input into a safe PUA
representation, there's no longer a reason to not add PUA characters to
the commandline, so let's do that to restore traditional behavior.
Render them as � (REPLACEMENT CHARACTER); unfortunately we show one per
input byte instead of one per code point. To fix this we probably need our
own char type.
While at it, remove some special cases that try to prevent insertion of
control characters. I don't think they are necessary. Could be wrong..
Inserting Tab or Backspace characters causes weird glitches. Sometimes it's
useful to paste tabs as part of a code block.
Render tabs as "␉" and so on for other ASCII control characters, see
https://unicode-table.com/en/blocks/control-pictures/. This fixes the
width-related glitches.
You can see it in action by inserting some control characters into the
command line:
set chars
for x in (seq 1 0x1F)
set -a chars (printf "%02x\\\\x%02x" $x $x)
end
eval set chars $chars
commandline -i "echo '" $chars
Fixes#6923Fixes#5274Closes#7295
We could extend this approach to display a fallback symbol for every unknown
nonprintable character, not just ASCII control characters.
In future we might want to support tab properly.
* Fix build on NetBSD
Notably:
1. A typo in `f_flag` vs `f_flags` - this was probably never tested
2. Some pointless name differences - `st_mtimensec` vs
`st_mtime_nsec`
3. The big one: This said that LC_GLOBAL_LOCALE() was -1 "everywhere".
Well, not on NetBSD.
* ifdef for macos