Commit Graph

206 Commits

Author SHA1 Message Date
David Adam
662708e72d src/exec: drop unused parameter in can_use_posix_spawn_for_job
Process object is not checked since 084ff64f4f.
2019-02-10 15:57:06 +08:00
ridiculousfish
d3fa58d621 Cleanup common.h
Remove a bunch of headers, simplify lots of code, migrate it into .cpp files.

Debug build time improves by ~3 seconds on my Mac.
2019-02-03 18:22:38 -08:00
ridiculousfish
6f682c8405 Fill io_buffer via background thread
This is a large change to how io_buffers are filled. The essential problem
comes about with code like (example):

    echo ( /bin/pwd )

The output of /bin/pwd must go to fish, not the tty. To arrange for this,
fish does the following:

1. Invoke pipe() to create a pipe.
2. Add an io_bufferfill_t redirection that owns the write end of the pipe.
3. After fork (or equiv), call dup2() to replace pwd's stdout with this  pipe.

Now when /bin/pwd writes, it will send output to the read end of the pipe.
But who reads it?

Prior to this fix, fish would do the following in a loop:

1. select() on the pipe with a 10 msec timeout
2. waitpid(WNOHANG) on the pwd proc

This polling is ugly and confusing and is what is replaced here.

With this new change, fish now reads from the pipe via a background thread:

1. Spawn a background pthread, which select()s on the pipe's read end with
a long (100 msec) timeout.
2. In the foreground, waitpid() (allowing hanging) on the pwd proc.

The big win here is a major simplification of job_t::continue_job() since
it no longer has to worry about filling buffers. This will make things
easier for concurrent execution.

It may not be obvious why the background thread still needs a poll (100 msec).
The answer is for cases where the write end of the fd escapes, in particular
background processes invoked inside command substitutions. psub is perhaps
the only important case of this (other shells typically just hang here).
2019-02-03 01:58:49 -08:00
ridiculousfish
178b72b2fd io_buffer_t becomes io_bufferfill_t
This makes some significant architectual improvements to io_pipe_t and
io_buffer_t.

Prior to this fix, io_buffer_t subclassed io_pipe_t. io_buffer_t is now
replaced with a class io_bufferfill_t, which does not subclass pipe.

io_pipe_t no longer remembers both fds. Instead it has an autoclose_fd_t,
so that the file descriptor ownership is clear.
2019-02-03 01:58:49 -08:00
ridiculousfish
084ff64f4f Allow posix_spawn more often
Now that we no longer open files after fork, we can correctly report errors
for failed file opens. So allow posix_spawn even if there's redirections.
2019-02-03 01:58:49 -08:00
ridiculousfish
2742267b9e Use dup2_list_t in posix_spawn
This simplifies the posix_spawn path and unifies it with the fork execution
path.
2019-02-03 01:58:49 -08:00
ridiculousfish
d62576ce22 Adopt dup2_list_t in fork execution path
This switches IO redirections after fork() to use the dup2_list_t,
instead of io_chain_t. This results in simpler code with much simpler
error handling.
2019-02-03 01:58:49 -08:00
ridiculousfish
6ba0d4c88a Revert io_bufferfill_t stack
This reverts commit 88dc484858 onwards.
2019-02-02 17:53:40 -08:00
ridiculousfish
9a4153f5e2 Fill io_buffer via background thread
This is a large change to how io_buffers are filled. The essential problem
comes about with code like (example):

    echo ( /bin/pwd )

The output of /bin/pwd must go to fish, not the tty. To arrange for this,
fish does the following:

1. Invoke pipe() to create a pipe.
2. Add an io_bufferfill_t redirection that owns the write end of the pipe.
3. After fork (or equiv), call dup2() to replace pwd's stdout with this  pipe.

Now when /bin/pwd writes, it will send output to the read end of the pipe.
But who reads it?

Prior to this fix, fish would do the following in a loop:

1. select() on the pipe with a 10 msec timeout
2. waitpid(WNOHANG) on the pwd proc

This polling is ugly and confusing and is what is replaced here.

With this new change, fish now reads from the pipe via a background thread:

1. Spawn a background pthread, which select()s on the pipe's read end with
a long (100 msec) timeout.
2. In the foreground, waitpid() (allowing hanging) on the pwd proc.

The big win here is a major simplification of job_t::continue_job() since
it no longer has to worry about filling buffers. This will make things
easier for concurrent execution.

It may not be obvious why the background thread still needs a poll (100 msec).
The answer is for cases where the write end of the fd escapes, in particular
background processes invoked inside command substitutions. psub is perhaps
the only important case of this (other shells typically just hang here).
2019-02-02 14:21:46 -08:00
ridiculousfish
78bbcef356 io_buffer_t becomes io_bufferfill_t
This makes some significant architectual improvements to io_pipe_t and
io_buffer_t.

Prior to this fix, io_buffer_t subclassed io_pipe_t. io_buffer_t is now
replaced with a class io_bufferfill_t, which does not subclass pipe.

io_pipe_t no longer remembers both fds. Instead it has an autoclose_fd_t,
so that the file descriptor ownership is clear.
2019-02-02 14:21:46 -08:00
ridiculousfish
7c256e7e51 Allow posix_spawn more often
Now that we no longer open files after fork, we can correctly report errors
for failed file opens. So allow posix_spawn even if there's redirections.
2019-02-02 14:21:46 -08:00
ridiculousfish
4c0b6a6add Use dup2_list_t in posix_spawn
This simplifies the posix_spawn path and unifies it with the fork execution
path.
2019-02-02 14:21:46 -08:00
ridiculousfish
d895075d9b Adopt dup2_list_t in fork execution path
This switches IO redirections after fork() to use the dup2_list_t,
instead of io_chain_t. This results in simpler code with much simpler
error handling.
2019-02-02 14:21:46 -08:00
ridiculousfish
b00f039489 Clean up the io_chain_t interface 2019-01-31 18:49:52 -08:00
ridiculousfish
371f67f1b5 Remove pipe_read_fd
In practice it was always STDIN_FILENO.
2019-01-31 17:58:59 -08:00
ridiculousfish
a2aab24db7 Switch io_mode to an enum class 2019-01-31 12:12:46 -08:00
ridiculousfish
6f52e6bb1c Instantize contents of exec.cpp and others 2019-01-10 20:07:47 -08:00
ridiculousfish
c1dd284b3e Instantize env_set
Switch env_set to an instance method on environmnet_t.
2019-01-10 20:05:45 -08:00
ridiculousfish
ede66ccaac Instance env_set_argv and env_set_pwd 2019-01-10 20:29:10 -08:00
ridiculousfish
e6872b83b0 Eliminate global env_export_arr()
This assumes the set of exported variables is a global property; but we
want it to be a local property.
2019-01-10 20:29:10 -08:00
Mahmoud Al-Qudsi
3d557518d5 Replace 0/1 with true/false in calls to job_reap 2018-11-18 17:40:18 -06:00
Mahmoud Al-Qudsi
d0085cae3c Fix zombie job on failed redirection in exec_job
Closes #5346.
2018-11-18 17:40:18 -06:00
ridiculousfish
73537fc7c3 Remove NESTED and WAIT_BY_PROCESS
Now jobs are aware of their parent jobs, and can interrogate those jobs,
to determine if every job in the chain is fully constructed.
Remove flags and the static stacks that manipulated them.
2018-11-04 01:52:17 -08:00
ridiculousfish
3770d9fb7a Teach each job about its parent
The parent of a job is the parent pipeline that executed the function or
block corresponding to this job. This will help simplify
process_mark_finished_children().
2018-11-04 01:40:07 -08:00
ridiculousfish
93aa95d8c4 Remove proc_last_bg_pid
It wasn't used.
2018-11-03 19:28:16 -07:00
Mahmoud Al-Qudsi
4d3b56c151 Associate external commands in functions with extant pgrps
When a function is encountered by exec_job, a new context is created for
its execution from the ground up, with a new job and all, ultimately
resulting in a recursive call to exec_job from the same (main) thread.

Since each time exec_job encounters a new job with external commands
that needs terminal control it creates a new pgrp and gives it control
of the terminal (tcsetpgrp & co), this effectively takes control away
from the previously spawned external commands which may be (and likely
are) expecting to still have terminal access.

This commit attempts to detect when such a situation arises by handling
recursive calls to exec_job (which can only happen if the pipeline
included a function) by borrowing the pgrp from the (necessarily still
active) parent job and spawning new external commands into it.

When a parent job spawns new jobs due to the evaluation of a new
function (which shouldn't be the case in the first place), we end up
with two distinct jobs sharing one pgrp (to fix #3952). This can lead to
early termination of a pgrp if finished parent job children are reaped
before future processes in either the parent or future child jobs can
join it.

While the parent job is under construction, require that waitpid(2)
calls for the child job be done by process id and not job pgrp.

Closes #3952.
2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
419d7a5138 Don't decompose shared_ptr to raw pointer for exec_job 2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
3afcca3114 Drop keepalive process even for WSL
Windows 10 17763 Redstone 5 (October 2018 Update) officially brings
zombie support (first introduced in 17713) to the general public.

See https://docs.microsoft.com/en-us/windows/wsl/release-notes#build-17763-1809
2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
f9118d964e Clean up job flags, status helpers, and instance helper methods
* Convert JOB_* enums to scoped enums
* Convert standalone job_is_* functions to member functions
* Convert standalone job_{promote, signal, continue} to member functions
* Convert standolen job_get{,_from_pid} to `job_t` static functions
* Reduce usage of JOB_* enums outside of proc.cpp by using new
  `job_t::is_foo()` const helper methods instead.

This patch is only a refactor and should not change any functionality or
behavior (both observed and unobserved).
2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
e753581df7 Bring some consistency and rationale to debug log levels
* Debug level 3: describe all commands being executed (this is, after all,
a shell and one can argue that this is the most important debug
information avaliable)
* Debug level 4: details of execution, mainly fork vs no-fork and io
handling

Also introduced j->preview() to print a short descriptor of the job
based on the head of the first process so we don't overwhelm with
needless repitition, but also so that we don't have to rely on
distinguishing between repeated, non-unique/non-monotonic job ids that
are often recycled within a single "execution cycle" (pressing enter
once).
2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
d467bb58d9 Replace pid/pgid -2 with INVALID_PID 2018-10-27 18:01:38 -05:00
Mahmoud Al-Qudsi
af0c8d51e0 Overhaul job and terminal control
* Instead of reaping all child processes when we receive a SIGCHLD, try
reaping only processes belonging to process groups from fully-
constructed jobs, which should eliminate the need for the keepalive
process entirely (WSL's lack of zombies not withstanding) as now
completed processes are not reaped until the job has been fully
constructed (i.e.  all processes launched), which means their process
group should still be around for new processes to join.

* When `tcgetpgrp()` calls return 0, attempt to `tcsetpgrp()` before
invoking failure handling code.

* When forking a builtin and not running interactively, do not bail if
unable to set/restore terminal attributes.

Fixes #4178. Fixes #3805. Fixes #5210.
2018-10-27 18:01:38 -05:00
ridiculousfish
a17a815c87 Revert "Add vector of cleanup/termination events to be executed before quit"
This reverts commit 8c14f0f30f.

This list is not reliable - there are many ways for fish to quit that does not
invoke these functions. It's also not necessary since the history is correctly
saved on exec.
2018-09-28 20:21:23 -04:00
Mahmoud Al-Qudsi
8c14f0f30f Add vector of cleanup/termination events to be executed before quit 2018-09-28 11:34:07 -05:00
ridiculousfish
ca61fc1bf8 Stop retrying close() on EINTR
https://lwn.net/Articles/576478/
http://austingroupbugs.net/view.php?id=529
https://sourceware.org/bugzilla/show_bug.cgi?id=14627
2018-09-05 21:49:31 -07:00
ridiculousfish
8b277e711e Large refactor of exec.cpp
Break up that monster function.
2018-09-03 15:57:11 -07:00
ridiculousfish
eca4d113c6 Factor do_fork into a real function 2018-09-03 14:33:53 -07:00
ridiculousfish
2a62e18635 Remove child_forked and child_spawned
These variables weren't used for anything.
2018-09-03 13:31:03 -07:00
ridiculousfish
f7a020ad33 Rename launch process to exec_process_in_job
This avoids a name collision with another launch_process
2018-09-03 11:18:39 -07:00
ridiculousfish
48c510572b Factor out launch_process from exec.cpp
Makes the monster function slightly more tractable.
2018-09-01 14:54:23 -07:00
ridiculousfish
753639aa9c Reduce the scope of pid in exec_job 2018-09-01 14:39:32 -07:00
ridiculousfish
ec9c592edc Adopt autoclose_fd_t in exec_job 2018-09-01 14:27:58 -07:00
ridiculousfish
d9f34147c3 builtins to only acquire terminal if owned by their pgroup
Fix #5133 changed builtins to acquire the terminal, but this regressed
caused fish to be stopped when running in background via `sudo fish`.
Fix this by only acquiring the terminal if the terminal was owned by the
builtin's pgroup.

Fixes #5147
2018-08-18 16:56:01 -07:00
ridiculousfish
d34a300818 Add string split0
This adds a new string command split0, which splits on zero bytes.
split0 has superpowers because its output is not further split on
newlines when used in command substitutions.
2018-07-01 15:56:33 -07:00
ridiculousfish
f998afaa23 Adopt separated_buffer_t in io_buffer_t 2018-07-01 15:56:33 -07:00
ridiculousfish
90a4af5112 Add separated_buffer_t and adopt it in output_stream_t
separated_buffer_t encapsulates the logic around discarding (which
was previously duplicated between output_stream_t and io_buffer_t),
and will also encapsulate the logic around explicitly separated
output.
2018-07-01 15:56:33 -07:00
ridiculousfish
5b9331ade0 Teach io_buffer_t to append from output_stream_t directly
This will simplify logic when we teach output_stream_t about explicitly
split outputs, i.e. for 'string split0'
2018-07-01 15:56:33 -07:00
ridiculousfish
2443ea92c3 Eliminate a common subexpression 2018-06-16 11:43:52 -07:00
Mahmoud Al-Qudsi
0dd2607cac Iron out situation with setpgid() calls after posix_spawn()
Closes #4715. Ticks off a box in #4154.
2018-05-16 19:34:56 -05:00
Aaron Miller
517b77ca74 Fix handling of signals (#4851) 2018-03-24 12:37:15 -07:00
Mahmoud Al-Qudsi
26cc112096 Implement $last_pid, taking the place of %last
Set as a global variable upon the execution of a background job.
2018-03-09 08:56:13 -06:00
Mahmoud Al-Qudsi
2f2a221c56 Don't spawn keepalive for WSL when only one command
This should speed things up on slower PCs given that the vast majority
of shell commands are simple jobs consisting of a single command without
any pipelines, in which case there's no need for a keepalive process at
all. Applies to WSL only.
2018-03-04 21:54:12 -06:00
Mahmoud Al-Qudsi
cf8850a33f Add temporary fix for #4778 (background processes on WSL)
As a temporary workaround for the behavior described in
Microsoft/WSL#2997 wherein WSL does not correctly assign the spawned
child its own PID as its PGID, explicitly set the PGID for the newly
spawned process.
2018-03-04 20:18:38 -06:00
Mahmoud Al-Qudsi
50541544f2 Distinguish between function and block IO for fork debug log messages 2018-02-18 16:49:27 -06:00
Mahmoud Al-Qudsi
be13ac353b Refactor job control to make functions act like their names imply
The job control functions were a bit messy, in particular
`set_child_group`'s name would imply that all it does is set the child
group, but in reality it used to set the child group (via `setpgid`),
set the job's pgrp if it hasn't been set, and possibly assign control of
the terminal to the newly-created job.

These have been split into separate functions. Now `set_child_group`
does just (and only) that, `maybe_assign_terminal` might assign the
terminal to the new pgrp, and `on_process_created` is used to set the
job properties the first time an external process is created. This might
also speed things up (but probably not noticeably) as there are no more
repeated calls to `getpgrp()` if JOB_CONTROL is not set.

Additionally, this closes #4715 by no longer unconditionally calling
`setpgid` on all new processes, including those created by `posix_spawn`
which does not need this since the child's pgrep is set at in the
arguments to that API call.
2018-02-14 19:08:12 -06:00
ridiculousfish
c3f1961e36 Stop copying out function definition when executing a function
This switches function execution from the function's source code to
its stored node and pstree. This means we no longer have to re-parse
the function every time we execute it.
2018-02-12 10:55:00 -08:00
ridiculousfish
976514597d Migrate function getters to use function_get_properties
This replaces some of the teensy function getters with the function
that just returns a shared_ptr to the properties struct.
2018-02-12 10:53:22 -08:00
ridiculousfish
41ba0dfadb Evaluate tnode_t instead of parse_node_t
This concerns block nodes with redirections, like
begin ... end | grep ...
Prior to this fix, we passed in a pointer to the node. Switch to passing
in the tnode and parsed source ref. This improves type safety and better
aligns with the function-node plans.
2018-02-12 10:51:39 -08:00
ridiculousfish
cb03be9fe6 Remove unused 'pgrp_set' variable 2018-02-07 12:54:26 -08:00
ridiculousfish
8de266afb4 Improve commenting regarding process groups and builtins. 2018-02-07 12:49:12 -08:00
ridiculousfish
0c18f68cc2 Remove support for blocking children
This removes support for blocking children via signals, which was used
to orchestrate processes on WSL. Now we use the keepalive mechanism
instead.
2018-02-07 12:49:12 -08:00
ridiculousfish
080521071f Teach keepalives to exit when their parent dies
keepalive processes are typically killed by the main shell process.
However if the main shell exits the keepalive may linger. In WSL
keepalives are used more often, and the lingering keepalives are both
leaks and prevent the tests from finishing.

Have keepalives poll for their parent process ID and exit when it
changes, so they can clean themselves up. The polling frequency can be
low.
2018-02-07 12:49:12 -08:00
ridiculousfish
e9f676a7f4 Provide a way to stop blocking children via s_block_children
This is to investigate alternatives to the existing kill(SIGSTOP)
WSL compatibility thing.
2018-02-07 12:49:11 -08:00
ridiculousfish
1b1fd5ab9b Mark needs_keepalive more often for WSL
Have WSL use a keepalive whenever the first process is external.
This works around the fact that WSL prohibits setting an exited
process as the group leader.
2018-02-07 12:49:11 -08:00
ridiculousfish
d17b298a48 Factor out the code that executes a builtin from exec_job()
Very early work on untangling the exec_job spaghetti.
2017-12-22 13:41:29 -08:00
Aaron Gyes
a9283803d4 Revert "Non-exported vars: rename SHLVL to shlvl"
Duh, of course it is exported.

This reverts commit 5fc17dcc82.
2017-10-15 04:37:34 -07:00
Aaron Gyes
5fc17dcc82 Non-exported vars: rename SHLVL to shlvl
Fixes #4414
2017-10-15 04:33:27 -07:00
Mahmoud Al-Qudsi
9fdfe44236 Fix type of pid_status variable
We had pid_status defined as a pid_t instance, which was fine since on
most platforms pid_t is an alias for int. However, that is not
universally the case and waitpid takes an int *, not a pid_t *.
2017-09-26 08:16:36 -05:00
ridiculousfish
3d40292c00 Switch env_var to using maybe_t
This eliminates the "missing" notion of env_var_t. Instead
env_get returns a maybe_t<env_var_t>, which forces callers to
handle the possibility that the variable is missing.
2017-09-01 00:14:42 -07:00
Kurtis Rader
f872f25f5b change env_var_t to a vector of strings
Internally fish should store vars as a vector of elements. The current
flat string representation is a holdover from when the code was written
in C.

Fixes #4200
2017-08-18 16:24:30 -07:00
Kurtis Rader
58b604c5ba change order of env_set() args
It's bugged me forever that the scope is the second arg to `env_get()`
but not `env_set()`. And since I'll be introducing some helper functions
that wrap `env_set()` now is a good time to change the order of its
arguments.
2017-08-14 18:18:09 -07:00
Kurtis Rader
975a5bfbde make style-all time again
Recent changes have introduced some style deviations so clean them up.
2017-08-06 16:05:51 -07:00
Kurtis Rader
acdb81bbca lint and style cleanups 2017-08-06 15:47:01 -07:00
Kurtis Rader
083224d1c0 fixes to job control changes
The job control changes need a couple of fixes for compatibility with
changes I merged while @mqudsi was workin on his change.
2017-08-06 15:25:42 -07:00
Kurtis Rader
52d739c746 Revert "Revert "finish cleanup of signal blocking code""
This reverts commit 35ee28ff24.

Reapply the signal blocking cleanup change on top of the job control
changes made by @mqudsi.
2017-08-06 14:46:12 -07:00
Mahmoud Al-Qudsi
0594735714 Deduplication between INTERNAL_FUNCTION and INTERNAL_BLOCK_NODE 2017-08-06 14:41:27 -07:00
Mahmoud Al-Qudsi
4dfb334db8 Corrected job_type for external command in debug log 2017-08-06 14:41:27 -07:00
Mahmoud Al-Qudsi
384879704a Unified all child/parent forking code in exec_job 2017-08-06 14:41:27 -07:00
Mahmoud Al-Qudsi
4a1de248bc Split internal_exec to its own function 2017-08-06 14:41:27 -07:00
Mahmoud Al-Qudsi
87db424e45 Removed unused <mutex> header include 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
dabe718c52 Removed unused job_t * parameter from setup_child_process 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
628db65504 OS X EINVAL compatibility for waitpid
The return value on OS X is more along the lines of the documented
waitpid behavior; EINVAL is returned if the group no longer exists.
2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
8e63386203 Removed old/unneeded variants of block_child 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
16d2f4faff Added important comment about blocked_pid 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
15da6f0203 Minor refactoring 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
5db8065f15 unblock_previous on exec_job finish 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
c3d756b5df blocking only if pipes_to_next_command breaks things like read.expect test 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
f7b051905e Split child_set_group from setup_child_process
setup_child_process blocks in the case of IO_FILE, meaning it can't
be called before child processes SIGSTOP.
2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
d6c4e66484 Retry setpgid in setup_child_process on EPERM 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
1ae0272c4e Improved comments 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
30aa8b3663 No need to unblock last process since it will no longer be SIGSTOP'd 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
abf6874a2d Be more judicious about when SIGSTOP is performed 2017-08-06 14:40:18 -07:00
Mahmoud Al-Qudsi
99c6f65fee Better set_child_group logic for multi-process jobs 2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
9f2addcf27 Set child process group in case of posix_spawn 2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
25afc9b377 Changed how process groups are assigned to child processes
There is no more race condition between parent and child with
regards to setting the process groups. Each child sets it for themselves
and then blocks indefinitely until the parent does what it needs to for
them (having waited for them to set their process groups). They are not
SIGCONT'd until the next process in the chain (if any) starts so that
that process can join their process group and open the pipes.
2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
c81cf56c0b Don't indiscriminately unblock previous cmd for internal builtin/functions
In the last commit, we introduced an indiscriminate if !EXTERNAL check
that unblocks a previously SIGSTOP'd command (if any) to allow the main
loop in exec_job to read from it without deadlocking (since builtins and
functions read directly from input as an optimization, sometimes).

Now only unblocking where a fork will not happen to ensure that if a
builtin ends up forking, that fork'd process is guaranteed to be able to
join the previous process' process group and access its output pipes.
2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
87394a9e0b Fixed race condition in new job control synchronization
We were having child processes SIGSTOP themselves immediately after
setting their process group and before launching their intended targets,
but they were not necessarily stopped by the time the next command was
being executed (so the opposite of the original race condition where
they might have finished executing by the time the next command came
around), and as a result when we sent them SIGCONT, that could never
reach. Now using waitpid to synchronize the SIGSTOP/SIGCONT between the
two.

If we had a good, unnamed inter-process event/semaphore, we could use
that to have a child process conditionally stop itself if the next
command in the job chain hadn't yet been started / setup, but this is
probably a lot more straightforward and less-confusing, which isn't a
bad thing.

Additionally, there was a bug caused by the fact that the main exec_job
loop actually blocks to read from previous commands in the job if the
current command is a built-in that doesn't need to fork.

With this waitpid code, I was able to finally add the SIGSTOP code to
all the fork'd processes in the main exec_job loop without introducing
deadlocks; it turns out that they should be treated just like the main
EXTERNAL fork, but they tend to execute faster causing the same deadlock
described above to occur more readily.

The only thing I'm not sure about is whether we should execute
unblock_pid undconditionally for all !EXTERNAL commands. It makes more
sense to *only* do that if a blocking read were about to be done in the
main loop, otherwise the original race condition could still appear
(though it is probably mitigated by whatever duration the SIGSTOP lasted
for, even if it is SIGCONT'd before the next command tries to join the
process group).
2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
dfac81803b Improved blocked prcoess comments, clarified job vs command chain 2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
f653fbfaf4 fixup! Using SIGSTOP/SIGCONT instead of mmap & sem_t to synchronize jobs 2017-08-06 14:40:17 -07:00
Mahmoud Al-Qudsi
fb13b370e2 Fixed cases where first command in chain would stay blocked
I hadn't realized that the for loop is called multiple times for a given
"single input" (anything that doesn't include semicolons, etc) to fish,
and so processes were being blocked but blocked_pid was lost by the time
that the next job (which was reading from the last process in the
previous job) came around.

Now using a static variable to store the last blocked PID. AFAICT, this
main job control loop is always executed from the same process and
thread, so this shouldn't need to be wrapped in atomics/mutexes, etc.
2017-08-06 14:40:17 -07:00