Process Groups and Terminal Signaling
Pipelines and Process Groups
In the last lesson and lab, we’ve been discussing job control and the mechanisms that enable it. Generally, job control is a feature of the shell and supported by the terminal device driver. The shell manages which jobs are stopped or running and notifies the terminal driver which job is currently in the foreground. The terminal device driver listens for special keys, like Ctrl-c or Ctrl-z, and delivers the appropriate signal to the foreground process, like terminate or stop.
That narrative is fairly straightforward as long as there is only one process running within a job, but a job may contain more than one process, which could complicate the actions of the terminal device driver. Additionally, jobs can be further grouped together into sessions, and the mechanisms that enable all this interaction require further discussion. In this lesson, we will explore process grouping and how these operating system services support job control and the shell features we’ve grown to rely on (and love?).
Pipeline of processes
Consider the following pipeline:
sleep 10 | sleep 20 | sleep 30 | sleep 50 &
Here we have four different sleep commands running in a pipeline. The sleep command doesn’t read or write to the terminal; it just sleeps for that many seconds and then exits. None of the sleep commands are blocking or waiting on input from another sleep command, so they can all run independently. We just happened to put them in a pipeline, but what is the impact of that? How long will this job take to complete?
One possibility is that each sleep command will run in sequence. First sleep 10 runs, then sleep 20, then sleep 30, and finally sleep 50, and thus it would take 10+20+30+50 = 110 seconds for the pipeline to finish. Another possibility is that they all run at the same time, concurrently or in parallel, in which case the job would complete when the longest sleep finishes, after 50 seconds.
These two possibilities, in sequence and in parallel, also describe two ways a pipeline could be executed. In sequence would imply that the shell forks the first item in the pipeline and lets it run to completion, then forks the second item and lets it run, and so on. In parallel would imply that the shell forks all the items in the pipeline at once and lets them run concurrently. The major difference between these two choices is that a pipeline executing in sequence has a single running process per job at any time, while a pipeline executing in parallel has multiple concurrently running processes per job.
By now, hopefully, you’ve already plugged that pipeline into the shell and found out that, yes, the pipeline executes in parallel, not in sequence. We can see this as well using the ps command.
> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4128
> ps -o pid,args
PID COMMAND
3981 -bash
4125 sleep 10
4126 sleep 20
4127 sleep 30
4128 sleep 50
4129 ps -o pid,args
Process Grouping for Jobs
The implication of this discovery, that all processes in the pipeline run concurrently, is that the shell must use a procedure for forking each of the processes individually. But, then, how are these processes linked? They are supposed to be a single job after all, and we also know that the terminal device driver is responsible for delivering signals to the foreground job. There must be some underlying mechanism to enable this behavior, and, of course, there is.
The operating system provides a number of ways to group processes together. Processes can be grouped into both process groups and sessions. A process group is a way to group processes into distinct jobs that are linked, and a session is a way to link process groups under a single interruptive unit, like the terminal.
The key to understanding how the pipeline functions is that all of these processes are placed in the same process group, and we can see that by running the pipeline again. This time, however, we can also request that ps output the parent pid (ppid) and the process group (pgid) in addition to the process id (pid) and the command arguments (args).
> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4134
> ps -o pid,pgid,ppid,args
PID PGID PPID COMMAND
3981 3981 3980 -bash
4131 4131 3981 sleep 10
4132 4131 3981 sleep 20
4133 4131 3981 sleep 30
4134 4131 3981 sleep 50
4135 4135 3981 ps -o pid,pgid,ppid,args
Notice first that the shell, bash, has a pid of 3981 and a process group id (pgid) that is the same. The shell is in its own process group. Similarly, the ps command itself also has a pid that is the same as its process group. However, the sleep commands are all in the process group with id 4131, which is also the pid of the first process in the pipeline. We can visualize this relationship like so:
As you can see, the rule of thumb for process grouping is that processes executing as part of the same job, e.g., a single input to the shell such as a pipeline, are placed in the same group. Also, the process group id chosen is the pid of the first process.
Programming with Process Groups
Below, we will look at how we program with process groups using system calls, and we will investigate this from the perspective of the programmer as well as how the shell automatically groups processes. We will use a series of fairly straightforward system calls, and to bootstrap that discussion, we outline them below with brief descriptions.
Retrieving pids or pgids:
- pid_t getpid(): get the process id of the calling process
- pid_t getppid(): get the process id of the parent of the calling process
- pid_t getpgrp(): get the process group id of the calling process
- pid_t getpgid(pid_t pid): get the process group id of the process identified by pid

Setting pgids:
- pid_t setpgrp(): set the process group of the calling process to itself, i.e., after a call to setpgrp(), the condition getpid() == getpgrp() holds.
- int setpgid(pid_t pid, pid_t pgid): set the process group id of the process identified by pid to pgid. If pid is 0, set the process group id of the calling process. If pgid is 0, the process group id is set to the pid of the process identified by pid. Thus, setpgid(0, 0) is equivalent to calling setpgrp().
Retrieving the Process Group
Each process group has a unique process group identifier, or pgid, which is typically the pid of a process that is a member of the group. Upon a fork(), the child process inherits the parent’s process group. We can see how this works with a small program that forks a child and prints the process group identifiers of both parent and child.
/* inherit_pgid.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    pid_t c_pid, pgid, pid;

    c_pid = fork();
    if (c_pid == 0) { /* CHILD */
        pgid = getpgrp();
        pid = getpid();
        printf("Child: pid: %d pgid: *%d*\n", pid, pgid);
    } else if (c_pid > 0) { /* PARENT */
        pgid = getpgrp();
        pid = getpid();
        printf("Parent: pid: %d pgid: *%d*\n", pid, pgid);
    } else { /* ERROR */
        perror(argv[0]);
        _exit(1);
    }
    return 0;
}
Here is the output of running this program.
> ./inherit_pgid
Parent: pid: 3630 pgid: *3630*
Child: pid: 3631 pgid: *3630*
The process groups are the same, and that’s because a child inherits the process group of its parent.
Now let’s look at a similar program that doesn’t fork, and instead just prints the process group identifier of itself and its parent, which is the shell.
/* getpgrp.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[]){
    pid_t pid, pgid;   // process id and process group for this program
    pid_t ppid, ppgid; // process id and process group for the _parent_

    // current
    pid = getpid();
    pgid = getpgrp();

    // parent
    ppid = getppid();
    ppgid = getpgid(ppid);

    // print the pid and pgid of this process and of its parent
    printf("%s: (current) pid: %d pgid: %d\n", argv[0], pid, pgid);
    printf("%s: (parent) ppid: %d pgid: %d\n", argv[0], ppid, ppgid);
    return 0;
}
If we run this program from the shell, you might expect both the child and the parent to print the same process group. Why shouldn’t that be the case? The program is the result of a fork from the shell, so the parent is the shell and the child is the program, and in the previous example the parent and child had the same process group. But, looking at the output, that is not what occurs here.
> ./getpgrp
./getpgrp: (current) pid: 3760 pgid: 3760
./getpgrp: (parent) ppid: 369 pgid: 369
Instead, we find that the parent, which is the shell, is not in the same process group as the child, the getpgrp program. Why is that? The new process is also a job in the shell, and each job needs to run in its own process group for the purpose of terminal signaling. What we can now recognize from these examples, starting with the pipeline of sleep commands, is that a shell forks each process in a job separately and assigns the process group id based on the first child forked, as is clear upon further inspection of the output of these two examples:
> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 4134
> ps -o pid,pgid,ppid,args
PID PGID PPID COMMAND
3981 3981 3980 -bash
4131 4131 3981 sleep 10
4132 4131 3981 sleep 20
4133 4131 3981 sleep 30
4134 4131 3981 sleep 50
4135 4135 3981 ps -o pid,pgid,ppid,args
> ./inherit_pgid
Parent: pid: 3630 pgid: *3630*
Child: pid: 3631 pgid: *3630*
Setting the Process Group
Finally, now that we have learned to identify the process group, the next thing to do is to assign new process groups. There are two functions that do this: setpgrp() and setpgid().
- setpgrp(): sets the process group of the calling process to itself. That is, the calling process joins a process group of one, containing only itself, where its pid is the same as its pgid.
- setpgid(pid_t pid, pid_t pgid): sets the process group of the process identified by pid to pgid. If pid is 0, it sets the process group of the calling process to pgid. If pgid is 0, it sets the process group of the process identified by pid to that process’s pid. Thus, setpgid(0, 0) is the same as setpgrp().
Let’s consider a small program that sets the process group of the child after a fork using a setpgrp() call from the child. The program below will print the process ids and process groups from the child’s and the parent’s perspective.
/* setpgrp.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    pid_t cpid, pid, pgid, cpgid; // process id's and process groups

    cpid = fork();
    if (cpid == 0) { /* CHILD */
        // set process group to itself
        setpgrp();
        // print the pid, and pgid of child from child
        pid = getpid();
        pgid = getpgrp();
        printf("Child: pid: %d pgid: *%d*\n", pid, pgid);
    } else if (cpid > 0) { /* PARENT */
        // print the pid, and pgid of parent
        pid = getpid();
        pgid = getpgrp();
        printf("Parent: pid: %d pgid: %d \n", pid, pgid);
        // print the pid, and pgid of child from parent
        cpgid = getpgid(cpid);
        printf("Parent: Child's pid: %d pgid: *%d*\n", cpid, cpgid);
    } else { /* ERROR */
        perror("fork");
        _exit(1);
    }
    return 0;
}
And, here’s the output:
> ./setpgrp
Parent: pid: 20178 pgid: 20178
Parent: Child's pid: 20179 pgid: *20178*
Child: pid: 20179 pgid: *20179*
Clearly, something is not right. The child reports its pgid as 20179, but the parent sees the child’s pgid as 20178. What we have here is a race condition: when two processes run concurrently, you do not know which one will reach a given point first.
Consider that there are two possibilities for how the above program will execute following the fork. In one possibility, the child runs before the parent and the process group is set properly before the parent looks at it. In the other, the parent runs first and reads the child’s process group before the child gets a chance to set it. It is the latter that we see above: the parent ran before the child, and thus read the old pgid.
To avoid these issues, when setting the process group of a child, you should call setpgid()/setpgrp() in both the parent and the child before anything depends on those values. That way it does not matter which runs first, the parent or the child: the result is always the same, and the child is placed in the appropriate process group. Below is an example of that and the output.
/* setpgid.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    pid_t cpid, pid, pgid, cpgid; // process id's and process groups

    cpid = fork();
    if (cpid == 0) { /* CHILD */
        // set process group to itself
        setpgrp(); //<------------------!
        // print the pid, and pgid of child from child
        pid = getpid();
        pgid = getpgrp();
        printf("Child: pid: %d pgid: *%d*\n", pid, pgid);
    } else if (cpid > 0) { /* PARENT */
        // set the process group of child
        setpgid(cpid, cpid); //<------------------!
        // print the pid, and pgid of parent
        pid = getpid();
        pgid = getpgrp();
        printf("Parent: pid: %d pgid: %d \n", pid, pgid);
        // print the pid, and pgid of child from parent
        cpgid = getpgid(cpid);
        printf("Parent: Child's pid: %d pgid: *%d*\n", cpid, cpgid);
    } else { /* ERROR */
        perror("fork");
        _exit(1);
    }
    return 0;
}
> ./setpgid
Parent: pid: 20335 pgid: 20335
Parent: Child's pid: 20336 pgid: *20336*
Child: pid: 20336 pgid: *20336*
Terminal Signaling Process Groups
Process groups matter most in connection with the terminal settings. Let’s return to the terminal control function, tcsetpgrp(). Before, we described this function as setting the foreground process, but as its name suggests, tcsetpgrp() actually sets the foreground process group.
Foreground Process Group
This distinction is important because of terminal signaling. We know now that when we execute a pipeline, the shell will fork all the processes in the job and place them in the same process group. We also know that when we use special control keys, like Ctrl-c or Ctrl-z, the terminal will deliver special signals to the foreground job, such as those indicating to terminate or stop. For example, this sequence of shell interaction makes sense:
> sleep 10 | sleep 20 | sleep 30 | sleep 50 &
[1] 24253
> ps
PID TTY TIME CMD
4038 pts/3 00:00:00 bash
24250 pts/3 00:00:00 sleep
24251 pts/3 00:00:00 sleep
24252 pts/3 00:00:00 sleep
24253 pts/3 00:00:00 sleep
24254 pts/3 00:00:00 ps
> fg
sleep 10 | sleep 20 | sleep 30 | sleep 50
^C
> ps
PID TTY TIME CMD
4038 pts/3 00:00:00 bash
24255 pts/3 00:00:00 ps
We started the sleep commands in the background, we see that there are four instances of sleep running, and we can move them from the background to the foreground, where they are signaled with Ctrl-C to terminate via the terminal. All good, right? There is something missing: given that there are multiple processes running in the foreground, how does the terminal know which of them to send the stop or terminate signal? How does it determine which processes are in the foreground?
The answer is that the terminal does not identify foreground processes individually. Instead, it identifies a foreground process group. All processes associated with the foreground job are in the foreground process group, and instead of signaling processes individually, both the shell and the terminal think of execution in terms of process groups.
Orphaned Stopped Process Groups
Process group interaction has other side effects when you consider programs that fork children. For example, consider the program (orphan) below, which simply forks a child, and then both child and parent loop forever:
/* orphan.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    pid_t cpid;

    cpid = fork();
    if (cpid == 0) { /* CHILD */
        // child loops forever!
        while(1)
            ;
    } else if (cpid > 0) { /* PARENT */
        // Parent loops forever as well
        while(1)
            ;
    } else { /* ERROR */
        perror("fork");
        _exit(1);
    }
    return 0;
}
If we were to run this program, we can see that, yes, indeed, it forks and now we have two versions of orphan running in the same process group.
> ./orphan &
[1] 24468
> ps -o pid,pgid,ppid,comm
PID PGID PPID COMMAND
4038 4038 4037 bash
24468 24468 4038 orphan
24469 24468 24468 orphan
24470 24470 4038 ps
Moving the orphan program to the foreground, it can then be terminated by the terminal using Ctrl-C.
> fg
./orphan
^C
> ps -o pid,pgid,ppid,comm
PID PGID PPID COMMAND
4038 4038 4037 bash
24471 24471 4038 ps
The resulting termination is for both parent and child, which is as expected since they are both in the foreground process group. While we might expect an orphan to be created, this does not occur. However, let’s consider the same program, but this time, the child is placed in a different process group than the parent:
/* orphan_group.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    pid_t cpid;

    cpid = fork();
    if (cpid == 0) { /* CHILD */
        // set process group to itself
        setpgrp();
        // child loops forever!
        while(1)
            ;
    } else if (cpid > 0) { /* PARENT */
        // set the process group of child
        setpgid(cpid, cpid);
        // Parent loops forever as well
        while(1)
            ;
    } else { /* ERROR */
        perror("fork");
        _exit(1);
    }
    return 0;
}
Let’s do the same experiment as before:
> ./orphan_group &
[1] 24487
> ps -o pid,pgid,ppid,comm
PID PGID PPID COMMAND
4038 4038 4037 bash
24487 24487 4038 orphan_group
24488 24488 24487 orphan_group
24489 24489 4038 ps
> fg
./orphan_group
^C
> ps -o pid,pgid,ppid,comm
PID PGID PPID COMMAND
4038 4038 4037 bash
24488 24488 1 orphan_group
24490 24490 4038 ps
This time, yes, we see that we have created an orphan process. This is clear from the PPID field, which shows that the parent of the remaining orphan_group process is now init (pid 1), which inherits all orphaned processes. This happens because the terminal signal from Ctrl-C is delivered to the foreground process group only, but the child is not in that group. The child is in its own process group and never receives the signal, and, thus, never terminates. It just continues on its merry way, never realizing that it just lost its parent. In this example lies the danger of using process groups: it’s very easy to create a bunch of orphans that will just carry on if not killed. To rid yourself of them, you must explicitly kill them with a command like killall:
> killall orphan_group
> ps -o pid,pgid,ppid,comm
PID PGID PPID COMMAND
4038 4038 4037 bash
24494 24494 4038 ps