Commit 09685c3682 tried making the apt
completions faster by doing two things:
1. Introduce a limiting "head"
2. Re-replace our "string" usage with tr
Unfortunately, in doing so it introduced a few issues:
1. The "tr" had a dangling "+" so it cut apart package
descriptions that contained a "+".
This caused e.g. "a C++ library" to generate another completion
candidate, "library".
2. In reusing "tr" it probably reintroduced #8575,
as tr is not 8-bit-clean.
3. It filtered too early, on the raw apt-cache output,
which caused it to fill up with long descriptions.
So e.g. for "texlive" it would only generate 10 completions,
where it should have matched 54 packages.
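To illustrate the first issue, here is a minimal POSIX shell sketch. The tr invocation is a made-up stand-in, not the one from 09685c3682, and both the "libfoo" record and the split_records name are hypothetical. It only shows what happens when a "+" ends up in tr's set.

```shell
# Hypothetical stand-in, NOT the actual command from the commit:
# a "+" that ends up in tr's set is translated like any other
# character, so every "+" in the input becomes a line break.
split_records() {
    printf '%s\n' "$1" | tr '+' '\n'
}

# Hypothetical record: package name, a tab, then the description.
split_records "$(printf 'libfoo\ta C++ library')"
```

The text after the last "+" lands on a line of its own, which is how a fragment like "library" can end up being offered as a separate candidate.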
Because most of the speedup is in the "head" stopping early, we
instead go back to the old string way, but introduce a limiting "head"
after the "sed" (which will have removed everything but the package
name line and the first line of the description).
In my tests this is about 10% slower than doing head early and using
tr, but it's more correct.
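The effect of where "head" sits can be sketched in POSIX shell. Everything here is an assumption for illustration: fake_apt_cache is a stand-in that only loosely imitates apt-cache output, the sed expression is not the one from the completion script, and the limit of 10 is arbitrary.

```shell
# Stand-in for apt-cache output: each package has a name line, a
# one-line summary and three lines of long description.
fake_apt_cache() {
    for i in 1 2 3 4 5; do
        printf 'Package: pkg%s\n' "$i"
        printf 'Description: summary %s\n' "$i"
        printf ' long description line\n'
        printf ' long description line\n'
        printf ' long description line\n'
    done
}

# Keep only the package name line and the first description line.
keep_relevant() {
    sed -n -e '/^Package: /p' -e '/^Description: /p'
}

# head before sed: the limit is spent on long description lines too.
head_early() {
    fake_apt_cache | head -n 10 | keep_relevant | grep -c '^Package: '
}

# head after sed: only lines that survive the filter count.
head_late() {
    fake_apt_cache | keep_relevant | head -n 10 | grep -c '^Package: '
}

echo "packages with head early: $(head_early)"
echo "packages with head late:  $(head_late)"
```

With the same limit, the late variant keeps more packages because the long description lines no longer count against it, while "head" can still stop the pipeline early.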
Admittedly I haven't been able to reproduce the 35s scenario that
09685 talks about, but the most likely cause of that is *apt-cache*
being slow - I don't see how string can be that much slower on another
system - and so it will most likely also be fixed by doing head here.
Future possibilities here include:
1. Using "apt-cache search --names-only", which gives a much nicer
format (but only for non-installed packages - the search strings are
apparently ANDed?)
2. Switching to `string split`, possibly using NUL and using `string
split0`?
3. Introducing a `string --null-in` switch so we can get by with one
`string`
4. (multi-threaded execution so the `string`s run in parallel)
`apt-cache` is just so incredibly slow that filtering against the final results
just doesn't cut it. Attempting to match against 'ac.*' (already taking
advantage of changing short search terms into prefix-only matches) would take
35 seconds, all of it bottlenecked before the filtering step. This change uses more
of a heuristic to filter `apt-cache` results directly (before additional
filtering) to speed things up.
A variety of different limits from 100 to 5000 were timed and their result sets
compared to see what ended up artificially limiting valid completions vs what
took too long to be considered functional/usable and this is where we ended up.
GNU tr is not Unicode-aware, and was corrupting descriptions that had
non-ASCII characters.
Additionally, rather than using the Unicode private use characters, use
the ASCII/UTF-8 record separator character as it was intended.
The sed command could probably be rewritten to do all the heavy lifting
here, but would be even less readable.
Closes #8575.
Of note: The rpm/yum thing seems to be coupled, so I put it into one
function that tries the yum helper and uses the rpm path otherwise.
Zypper is already its own thing, so this should only be used for yum
and probably dnf (does that still have the helper?).
Zypper can be dropped, as that already used a separate function in the file.
Apk can just be inlined - it's literally one line for installed and another for all packages.
This function doesn't make any sense.
Most things that expect package names expect package names for *one
specific package manager*.
It only happens to work, most of the time, because most people only
have one package manager installed.