ThreadId is way slower than it should be for the sense that we use it in; it
doesn't cache the id and allocates an Arc internally.
We don't care about the thread id used in crate::threads correlating with any
other thread id the code uses anywhere (not that it does) because it's only used
for our own bookkeeping. Change to something much simpler instead.
Verified that std::sync::OnceLock<T> compiles to the same assembly at the
*access* site as the Option<T> we were using. The additional overhead upon init
is fine. No need for extra Box<T> indirection for IO_THREAD_POOL.
Don't force the internal use of `RefCell<T>`, let the caller place that into
`MainThread<>` manually. This lets us remove the reference to `MainThread<>`
from the definition of `Screen` again and reduces the number of
`assert_is_main_thread()` calls.
This allows us to get the terminfo information without linking against curses.
That means we can get by without a bunch of awkward C-API trickery.
There is no global "cur_term" kept by a library for us that we need to invalidate.
Note that it still requires a "unhashed terminfo database", and I don't know how well it handles termcap.
I am not actually sure if there are systems that *can't* have terminfo, everything I looked at
has the ncurses terminfo available to install at least.