This updates widechar_width.h to one generated from
15e782aa3df9dfef436516f66f745a90b421329.
The change here is a rationalization of doublewide vs widened-in-9.
Many emoji have been moved to widened-in-9 because we now use the
correct version (this uses the *emoji* version, and emoji version 3.0
corresponds to Unicode 9).
This updates widecharwidth to
6d3d55b419db93934517cb568d1a3d95909b4c7b, which includes the same
Hangul Jamo check in a separate table.
This should slightly speed up most width calculation because we no
longer need to do it for most chars, including the overwhelmingly
common ascii ones.
Also the range is increased and should better match reality.
This pulls in widechar_width.h from commit 7e9dfdaf05059b3f. The big change
here is that some characters which were previously marked as widened in 9
are now marked as unconditionally narrow; this includes some randoms like
hot pepper (U+1F336) but more importantly all of the regional indicators,
which affects how flags are rendered.
If you put two regional indicators together, you get a flag emoji. It's
unclear what the width of this flag emoji should be; Terminal and iTerm2
renders it as width 1, while kitty renders it as width 2. This is
unaffected by fish_emoji_width because the flag does not have an assigned
codepoint, it is a pair of codepoints.
The regional indicators are marked as "neutral" in EastAsianWidth.txt which
means they conceptually have width 1. So two of them have width 2. So now
we assume that flags are rendered as width 2.
This fixes#7237, for terminals that render flags as width 2 (but not 1,
unfortunately, which includes iTerm2 and Terminal.app).
This pulls in widechar_width.h from commit d4e75d5bb1930291223d1.
This is a "rebuild with latest data" before we attempt a risky bugfix.
The idea here is that bisecting can separate whether any regression is
due to using the latest Unicode data, or the bug fix.
The C++ spec (as of C++17/n4713) does not specify the sign of `wchar_t`,
saying only (in section 6.7.1: Fundamental Types)
> Type wchar_t shall have the same size, signedness, and alignment
> requirements (6.6.5) as one of the other integral types, called its
> underlying type.
On most *nix platforms on AMD64 architecture, `wchar_t` is a signed type
and can be compared with `int32_t` without incident, but on at least
some platforms (tested: clang under FreeBSD 12.1 on AARCH64), `wchar_t`
appears to be unsigned leading to sign comparison warnings:
```
../src/widecharwidth/widechar_width.h:512:48: warning: comparison of
integers of different signs: 'const wchar_t' and 'int32_t' (aka 'int')
[-Wsign-compare]
return where != std::end(arr) && where->lo <= c;
```
This patch forces the use of wchar_t for the range start/end values in
`widechar_range` and the associated comparison values.