We had pid_status defined as a pid_t instance, which was fine since on
most platforms pid_t is an alias for int. However, that is not
universally the case and waitpid takes an int *, not a pid_t *.
This eliminates the "missing" notion of env_var_t. Instead
env_get returns a maybe_t<env_var_t>, which forces callers to
handle the possibility that the variable is missing.
Internally fish should store vars as a vector of elements. The current
flat string representation is a holdover from when the code was written
in C.
Fixes#4200
It's bugged me forever that the scope is the second arg to `env_get()`
but not `env_set()`. And since I'll be introducing some helper functions
that wrap `env_set()` now is a good time to change the order of its
arguments.
There is no more race condition between parent and child with
regards to setting the process groups. Each child sets it for themselves
and then blocks indefinitely until the parent does what it needs to for
them (having waited for them to set their process groups). They are not
SIGCONT'd until the next process in the chain (if any) starts so that
that process can join their process group and open the pipes.
In the last commit, we introduced an indiscriminate if !EXTERNAL check
that unblocks a previously SIGSTOP'd command (if any) to allow the main
loop in exec_job to read from it without deadlocking (since builtins and
functions read directly from input as an optimization, sometimes).
Now only unblocking where a fork will not happen to ensure that if a
builtin ends up forking, that fork'd process is guaranteed to be able to
join the previous process' process group and access its output pipes.
We were having child processes SIGSTOP themselves immediately after
setting their process group and before launching their intended targets,
but they were not necessarily stopped by the time the next command was
being executed (so the opposite of the original race condition where
they might have finished executing by the time the next command came
around), and as a result when we sent them SIGCONT, that could never
reach. Now using waitpid to synchronize the SIGSTOP/SIGCONT between the
two.
If we had a good, unnamed inter-process event/semaphore, we could use
that to have a child process conditionally stop itself if the next
command in the job chain hadn't yet been started / setup, but this is
probably a lot more straightforward and less-confusing, which isn't a
bad thing.
Additionally, there was a bug caused by the fact that the main exec_job
loop actually blocks to read from previous commands in the job if the
current command is a built-in that doesn't need to fork.
With this waitpid code, I was able to finally add the SIGSTOP code to
all the fork'd processes in the main exec_job loop without introducing
deadlocks; it turns out that they should be treated just like the main
EXTERNAL fork, but they tend to execute faster causing the same deadlock
described above to occur more readily.
The only thing I'm not sure about is whether we should execute
unblock_pid undconditionally for all !EXTERNAL commands. It makes more
sense to *only* do that if a blocking read were about to be done in the
main loop, otherwise the original race condition could still appear
(though it is probably mitigated by whatever duration the SIGSTOP lasted
for, even if it is SIGCONT'd before the next command tries to join the
process group).
I hadn't realized that the for loop is called multiple times for a given
"single input" (anything that doesn't include semicolons, etc) to fish,
and so processes were being blocked but blocked_pid was lost by the time
that the next job (which was reading from the last process in the
previous job) came around.
Now using a static variable to store the last blocked PID. AFAICT, this
main job control loop is always executed from the same process and
thread, so this shouldn't need to be wrapped in atomics/mutexes, etc.
This code should be more portable, and certainly cleaner. We are
currently always sending SIGCONT to the last process (if it was part of
a job chain) regardless of whether it called SIGSTOP on itself or not,
which should be fine.
Need to explore whether or not the other forks in src/exec.cpp need to
be SIGSTOP'd on run or only the one that we included in this patch.
I'm not sure if this happens on all platforms, but under WSL with the
existing codebase, processes in the job chain that pipe their
stdout/stderr to the next process in the job could terminate before the
next job started (on fast enough machines for quick enough jobs).
This caused issues like #4235 and possibly #3952, at least for external
commands. What was happening is that the first process was finishing
before the second process was fully set up. fish would then try to
assign (in both the child and the parent) the process group id belonging
to the process group leader to the new process; and if the first process
had already terminated, it would have ended its process group with it as
well before that happened.
I'm not sure if there was already a mechanism in place for ensuring that
a process remains running at least as long as it takes for the next
process in the chain to join its group, etc., but if that code was
there, it wasn't working in my test setup (WSL).
This patch definitely needs some review; I'm not sure how I should
handle non-external commands (and external commands executed via
posix_spawn). I don't know if they are affected by the race condition in
the first place, but when I tried to add the same "wait for next command
in chain to run before unblocking" that would cause black screens
requiring ctrl+c to bypass.
The "unblock previous command" code was originally run by the next child
to be forked, but was then moved to the shell code instead, making it
more-centrally located and less error-prone.
Note that additional headers may be required for the mmap system call on
other platforms.
This is the first step to implementing issue #4200 is to stop subclassing
env_var_t from wcstring. Not too surprisingly doing this identified
several places that were incorrectly treating env_var_t and wcstring as
interchangeable types. I'm not talking about those places that passed
an env_var_t instance to a function that takes a wcstring. I'm talking
about doing things like assigning the former to the latter type, relying
on the implicit conversion, and thus losing information.
We also rename `env_get_string()` to `env_get()` for symmetry with
`env_set()` and to make it clear the function does not return a string.
This makes command substitutions impose the same limit on the amount
of data they accept as the `read` builtin. It does not limit output of
external commands or builtins in other contexts.
Fixes#3822
PR #3691 made most calls to `signal_block()` and `signal_unblock()`
no-ops unless a magic env var is set when fish starts running. It's
been seven months since that change was made and no problems have been
reported. This finishes that work by removing those no-op function calls
and support for the magic env var in our next major release (which won't
happen till at least six months from now).
This implements `status is-breakpoint` that returns true if the current
shell prompt is displayed in the context of a `breakpoint` command.
This also fixes several bugs. Most notably making `breakpoint` a no-op if
the shell isn't interactive. Also, typing `breakpoint` at an interactive
prompt should be an error rather than creating a new nested debugging
context.
Partial fix for #1310
This is the next step in determining whether we can disable blocking
signals without a good reason to do so. This makes not blocking signals
the default behavior. If someone finds a problem they can add this to
their ~/config/fish/config.fish file:
set FISH_NO_SIGNAL_BLOCK 0
Alternatively set that env var before starting fish. I won't be surprised
if people report problems. Till now we have relied on people opting in
to this behavior to tell us whether it causes problems. This makes the
experimental behavior the default that has to be opted out of. This will
give us a lot more confidence this change doesn't cause problems before
the next minor release.
Note that there are still a few places where we force blocking of
signals. Primarily to keep SIGTSTP from interfering with the shell in
response to manipulating the controlling tty. Bash is more selective
in the signals it blocks around the problematic syscalls (c.f., its
`git_terminal_to()` function). However, I don't see any value in that
refinement.
I recently upgraded the software on my macOS server and was dismayed to
see that cppcheck reported a huge number of format string errors due to
mismatches between the format string and its arguments from calls to
`assert()`. It turns out they are due to the macOS header using `%lu`
for the line number which is obviously wrong since it is using the C
preprocessor `__LINE__` symbol which evaluates to a signed int.
I also noticed that the macOS implementation writes to stdout, rather
than stderr. It also uses `printf()` which can be a problem on some
platforms if the stream is already in wide mode which is the normal case
for fish.
So implement our own `assert()` implementation. This also eliminates
double-negative warnings that we get from some of our calls to
`assert()` on some platforms by oclint.
Also reimplement the `DIE()` macro in terms of our internal
implementation.
Rewrite `assert(0 && msg)` statements to `DIE(msg)` for clarity and to
eliminate oclint warnings about constant expressions.
Fixes#3276, albeit not in the fashion I originally envisioned.