The conversion to usize is used for array accesses, so negative values
would cause crashes either way. Let's do it earlier so we can get rid of
the suspect C-style cast.
This allows us to use the scoped push in more scenarios by appeasing the
borrow checker.
Use it in a couple of places instead of ScopeGuard. Hopefully this is makes
porting easier.
Even though we generally dont' want to use this type (because it's immutable),
it can be advantageous when working with the std::fs API. This is because
it implements "AsRef<Path>" which neither of CString and Vec<u8> do.
This is basically a subset of type, so we might as well.
To be clear this is `command -s` and friends, if you do `command grep` that's
handled as a keyword.
One issue here is that we can't get "one path or not" because I don't
know how to translate a maybe_t? Do we need to make it a shared_ptr instead?
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
wcs2string converts a wide string to a narrow one. The result is
null-terminated and may also contain interior null-characters.
std::string allows this.
Rust's null-terminated string, CString, does not like interior null-characters.
This means we will need to use Vec<u8> or OsString for the places where we
use interior null-characters.
On the other hand, we want to use CString for places that require a
null-terminator, because other Rust types don't guarantee the null-terminator.
Turns out there is basically no overlap between the two use cases, so make
it two functions. Their equivalents in Rust will have the same name, so
we'll only need to adjust the type when porting.
Existing C++ code didn't use a function for this but simply added
ENCODE_DIRECT_BASE. In Rust that's more verbose because char won't do
arithmetics, hence the function.
We'll add a dual function for decoding, so let's rename this.
BTW we should get rid of the "wchar" naming, it's just "char" in Rust.