I finally took a little time to get my head around POSIX process groups and sessions.
Fair warning: if you don’t know what a PID is in the context of a POSIX process (or, indeed, you think “POSIX” sounds like some type of screw head) then you probably don’t need to bother reading this post.
Right, assuming there’s anybody left… For a while now I’ve had a sort of
peripheral awareness of some additional attributes of processes about which
I’ve never really bothered too much. The most important attributes, with which
most people reading this will likely be familiar, are the PID [1],
PPID [2], UID [3] and GID [4]. There are a few wrinkles like real and
effective IDs, but they’re beyond the scope of this post. If you run your
favourite process listing command with enough detail (such as ps -eF, for
example) then you’ll see most of these shown.
However, there are a couple of extra attributes that I’ve never looked at in much detail, which are process group ID (PGID) and session ID (SID). Today I decided that ignorance wasn’t bliss at all, in fact it was a blasted pain, so I’ve looked up what they mean. It turns out that they’re quite simple and, potentially, quite useful. So, here goes.
A process group is more or less exactly what it sounds like: a way
to group processes together. This is useful because it’s possible to direct a
signal to a process group instead of a specific process. This can be done with
the killpg() system call, which takes a PGID as a parameter and has
the effect of sending the specified signal to every process within that group.
You can specify a PGID of 0 to specify the group in which the calling process
is found, and actually a standard kill() call with a PID of 0 does the
same thing.
The group in which a process is located defaults to the group of the process
which created it, but it can be changed with the setpgid() call.
Indeed, this is what the shell does when it executes pipelines of commands:
each pipeline is put into its own process group, separate from the shell’s group.
If any of those commands fork their own children then they’ll also be added to
the same group, unless they actively change it. Note that a “pipeline” in this
context also applies to the degenerate case of a single command (a pipeline of one!).
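As a rough illustration of those two calls, the Python sketch below forks a child, puts it into a new process group of its own with os.setpgid(), and then signals the whole group with os.killpg(). The function name signal_own_group is made up for this example, and calling setpgid() from both parent and child is simply a common way to avoid a race after fork(); I’m not claiming any particular shell does exactly this.

```python
import os
import signal
import time


def signal_own_group():
    """Fork a child into its own process group, then SIGTERM that group.

    Returns (child_pid, child_pgid, terminating_signal).
    """
    pid = os.fork()
    if pid == 0:
        try:
            # Child: restore default SIGTERM handling, become leader of a
            # new group (PGID == own PID), then idle until signalled.
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            os.setpgid(0, 0)
            while True:
                time.sleep(1)
        finally:
            os._exit(1)

    # Parent: make the same setpgid() request, so the group exists
    # whichever process happens to run first after the fork.
    os.setpgid(pid, pid)
    pgid = os.getpgid(pid)

    # Signal every process in the group (here, just the one child).
    os.killpg(pgid, signal.SIGTERM)
    _, status = os.waitpid(pid, 0)
    sig = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
    return pid, pgid, sig


if __name__ == "__main__":
    print(signal_own_group())
```

If the child had forked further children of its own before being signalled, they would have inherited the same PGID and received the same SIGTERM.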
Conventionally the PGID of a group is the same as the PID of the first process placed in that group, which is referred to as the process group leader. This is an important concept if you want to change your session, but to explain that I’ll have to explain what a session is.
The session is another level of grouping, i.e. a session contains one or more process groups. Sessions are generally tied to a controlling terminal [5]. For example, all process groups created by a particular shell will have the same session ID, which will generally be the PID of the shell process (as an aside, this is a quick way to locate all the commands created by a particular shell process). One important aspect of a session is that when moving a process between process groups, both groups must be members of the same session or the operation fails.
A process can be moved to a new session using the setsid()
call. This will create a new process group and place the calling process into
it, and then create a new session and place the new process group within that.
There are restrictions on which processes may do this, however (see below).
Note that the new session will have no controlling terminal, so this system
call offers a helpful way for processes to detach from their controlling
terminal when they daemonise.
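To see what setsid() does to the IDs, here is a small Python sketch that forks a child, has it call os.setsid(), and reports the resulting PGID and SID back over a pipe. The helper name child_in_new_session and the use of a pipe for reporting are my own choices for the illustration.

```python
import os


def child_in_new_session():
    """Fork a child that calls setsid() and reports its IDs via a pipe.

    Returns (child_pid, pgid, sid), the last two as seen by the child
    after its setsid() call.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            # A freshly forked child is not a process group leader, so
            # setsid() is permitted here.
            os.setsid()
            report = "%d %d" % (os.getpgrp(), os.getsid(0))
            os.write(write_fd, report.encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        pgid, sid = (int(x) for x in pipe.read().split())
    os.waitpid(pid, 0)
    return pid, pgid, sid


if __name__ == "__main__":
    print(child_in_new_session())
```

The child should come back as leader of both a new group and a new session, so all three numbers are expected to be equal.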
Each session has a foreground process group, which is effectively the
currently executing command. This is the group to which a signal will be sent
if generated by the terminal (e.g. SIGINT in response to CTRL-C or SIGTSTP
in response to CTRL-Z). Also, only processes within the foreground group can
read from the terminal.
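A process can ask its terminal which group is currently in the foreground. The sketch below does so from Python with os.tcgetpgrp(). Opening /dev/tty to reach the controlling terminal and returning None when there isn’t one are assumptions I’ve made for the example, and foreground_pgid is a made-up name.

```python
import os


def foreground_pgid():
    """Return the foreground PGID of our controlling terminal, or None."""
    try:
        fd = os.open("/dev/tty", os.O_RDONLY)
    except OSError:
        return None  # no controlling terminal (e.g. a daemon or cron job)
    try:
        return os.tcgetpgrp(fd)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("foreground group:", foreground_pgid())
    print("our group:       ", os.getpgrp())
```

Run as an ordinary foreground command from a shell, the two numbers should match; run with a trailing &, they generally won’t.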
Just as a process group has a leader, so does a session have a session
leader process, which is often a process group leader as well. Both process
group and session leaders have various restrictions on them: session leaders
can’t be moved between process groups and process group leaders can’t be moved
to a new session with setsid(). The session leader is also the
process to receive a SIGHUP if the controlling terminal for the session is
closed [6].
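These restrictions are easy to poke at from Python. In the sketch below a throwaway child makes itself a session (and group) leader, then retries setsid() and also attempts setpgid() on itself; the helper name leader_restrictions is hypothetical. I’d expect both attempts to fail with EPERM on Linux, but treat the setpgid() result in particular as something to check on your own platform.

```python
import os


def leader_restrictions():
    """Check, in a throwaway child, what a session leader may not do.

    Returns a dict mapping each call name to True if it failed with EPERM.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.setsid()  # now both a session leader and a group leader
            results = []
            for name, call in (("setsid", os.setsid),
                               ("setpgid", lambda: os.setpgid(0, 0))):
                try:
                    call()
                    results.append("%s=ok" % name)
                except PermissionError:
                    results.append("%s=EPERM" % name)
            os.write(write_fd, " ".join(results).encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        fields = dict(item.split("=") for item in pipe.read().split())
    os.waitpid(pid, 0)
    return {name: outcome == "EPERM" for name, outcome in fields.items()}


if __name__ == "__main__":
    print(leader_restrictions())
```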
Given all this, we can see how it fits into the “standard” process for daemonising:
1. fork() and terminate the parent: this ensures the new process is an orphan (adopted by init) and also returns control to the calling shell.
2. setsid() to create a new process group and session: we can only do this after the fork() above because otherwise we’d be a process group leader. This detaches us from the controlling terminal, which is exactly what daemons should do.
3. fork() a second time: I believe this is simply so we’re no longer a session leader and can never re-acquire a controlling terminal. There may be additional, more subtle, reasons of which I’m unaware.
4. chdir("/") or some other directory on which the daemon relies: this is to avoid the daemon keeping a directory active, which would prevent its filesystem being unmounted. If there’s some directory the daemon requires then it actually may be preferable for it to stay active to prevent accidental unmounting.
5. umask(0) just to clear any permissions mask we may have inherited.
6. close() standard file descriptors 0, 1 and 2, which are standard input, output and error respectively. Since we’re detached from our terminal it’s not clear where they’ve been directed to anyway. Note that some daemons determine the highest possible file descriptor and call close() on them all (ignoring errors) just in case the parent had any other open files. This may be overkill if you’re confident in the behaviour of your calling process, but if you’re at all uncertain it’s the safest course, to avoid wasting file descriptors (of which there’s a finite number available).
7. open() three times, once for each of the file descriptors, redirecting them to somewhere sensible. This could be /dev/console, or perhaps a log file you’ve already opened. Some code assumes file descriptors will be allocated sequentially, so it just assumes that the next three open() calls will get descriptors 0, 1 and 2, but to be doubly sure you can use dup2(). In that case, however, you should have opened the replacement descriptor before the previous step, otherwise you could have a clash.
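Put together, the procedure might look something like the following Python sketch. It’s a minimal version under a few assumptions of mine: the function name daemonise and its parameters are invented, the standard descriptors are pointed at /dev/null by default, and I rely on dup2() to close and replace each descriptor in one go rather than doing separate close() and open() calls. It doesn’t attempt to close any other descriptors inherited from the parent.

```python
import os
import sys


def daemonise(workdir="/", stdio_path=os.devnull):
    """Detach the calling process using the double-fork procedure.

    Returns only in the final daemon process; the original process and
    the intermediate one both leave via os._exit().
    """
    # Flush buffered output so it isn't written twice after forking.
    sys.stdout.flush()
    sys.stderr.flush()

    # First fork: the original process exits, returning control to the
    # shell and guaranteeing the child is not a process group leader.
    if os.fork() > 0:
        os._exit(0)

    # New session and process group, with no controlling terminal.
    os.setsid()

    # Second fork: the survivor is no longer a session leader.
    if os.fork() > 0:
        os._exit(0)

    # Don't pin a directory, and clear the inherited permissions mask.
    os.chdir(workdir)
    os.umask(0)

    # Open the replacement first, then point descriptors 0, 1 and 2 at
    # it; dup2() closes whatever each one referred to before.
    fd = os.open(stdio_path, os.O_RDWR)
    for target in (0, 1, 2):
        os.dup2(fd, target)
    if fd > 2:
        os.close(fd)


if __name__ == "__main__":
    daemonise()
    # The daemon's real work would start here.
```

A real daemon would probably also want a PID file, logging and signal handling, all of which are left out here.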
A detailed description of all these steps is outside the scope of this post, but I wanted to reproduce the full procedure here for context — you can find more details all over the web.
Let’s see some illustrations of process groups and sessions. Note that the
invocations I used below are quite Linux-specific, but you should be able to
tailor them to your particular Unix variant with a bit of squinting at the man page.
First, we run a simple
ps to show the relevant IDs:
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 59829  1684 /bin/bash
    59829  1684 59829 59829  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
Here we can see the bash shell has PID 1684 and this matches the SID of both
itself and the ps command which was executing. The PPID of ps is
the PID of bash, as one would expect, and the ps process has been assigned
a new PGID which matches its own PID, so it is the process group leader. The
TPGID field indicates the foreground process group within the session, in this
case the PGID of ps since that’s the currently executing command in the session.
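The same IDs are available to a process about itself. As a small sketch (the function name process_ids is just something I’ve picked), this Python snippet collects them and works out whether the process is a group or session leader:

```python
import os


def process_ids():
    """Return this process's PID, PPID, PGID and SID as a dict."""
    return {
        "pid": os.getpid(),
        "ppid": os.getppid(),
        "pgid": os.getpgrp(),
        "sid": os.getsid(0),
    }


if __name__ == "__main__":
    ids = process_ids()
    print("PID %(pid)d  PPID %(ppid)d  PGID %(pgid)d  SID %(sid)d" % ids)
    print("process group leader:", ids["pid"] == ids["pgid"])
    print("session leader:      ", ids["pid"] == ids["sid"])
```

Run directly from an interactive shell, I’d expect it to report being a process group leader but not a session leader, just like ps in the listing above.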
Second, we’ll add an additional pipeline of commands into the mix:
    $ cat | sed 's/hello/goodbye/' &
    [1] 17391
    [1]+  Stopped                 cat | sed 's/hello/goodbye/'
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 17401  1684 /bin/bash
    17390  1684 17390 17401  1684 cat
    17391  1684 17390 17401  1684 sed s/hello/goodbye/
    17401  1684 17401 17401  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
Note: you can ignore the “Stopped” message; this is a result of cat trying
to read from its standard input and failing because it’s in the background.
Only the foreground process group can read from the terminal; a process in any
other group which tries will be sent SIGTTIN and hence be suspended.
So, we can see that both cat and sed have been placed into the same process
group by the shell here, which is different to the group of ps. The TPGID of all
the entries is still the same as the PGID of ps, since ps is again the
currently executing command for all groups within the session. Since I’ve
used the same shell process as in the previous example, the SID is the same.
Now we can see an example of signals being sent to the foreground process group (and not just a single process) by executing the following Python script [7]:
    import signal
    import os
    import time

    # Initialise do_exit to False. On CTRL-C (SIGINT), set it to True.
    do_exit = False

    def handle_signal(signum, stack):
        global do_exit
        do_exit = True

    # Install signal handler.
    signal.signal(signal.SIGINT, handle_signal)

    # Fork into two processes to illustrate both receiving a signal.
    child_pid = os.fork()
    if child_pid == 0:
        print("Child is waiting...")
    else:
        print("Parent is waiting...")

    # Loop until the SIGINT handler sets do_exit to True.
    while not do_exit:
        time.sleep(0.1)

    # Print appropriate message and exit.
    if child_pid == 0:
        print("Child has caught signal.")
    else:
        print("Parent has caught signal.")
Execute this script and then, once parent and child are waiting, hit CTRL-C. You should see the following output, potentially with parent and child messages swapped over in either or both cases:
    $ python signal-catcher.py
    Child is waiting...
    Parent is waiting...
    Parent has caught signal.
    Child has caught signal.
This clearly shows both processes receiving the SIGINT as a result of CTRL-C. For comparison, if we only send the signal to the child process:
    $ python signal-catcher.py &
    [1] 33635
    Child is waiting...
    Parent is waiting...
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 33680  1684 /bin/bash
    33635  1684 33635 33680  1684 python signal-catcher.py
    33640 33635 33635 33680  1684 python signal-catcher.py
    33680  1684 33680 33680  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
    $ kill -INT 33640
    Child has caught signal.
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 33744  1684 /bin/bash
    33635  1684 33635 33744  1684 python signal-catcher.py
    33640 33635 33635 33744  1684 [python] <defunct>
    33744  1684 33744 33744  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
    $ kill -INT 33635
    Parent has caught signal.
    [1]+  Done                    python signal-catcher.py
Since the command was executed in the background the output gets interleaved with
the shell prompt, so I’ve tidied that up for clarity in the output above. The
pertinent details are shown unchanged, however — in particular, you can see the
child process (only) receives the signal and terminates, remaining only as a
defunct zombie process until its parent reaps its return code with something like
wait(). Since our little Python script never reaps this return code, the
child process’ descriptor will linger as long as the parent remains alive.
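For contrast, here is roughly what reaping looks like when the parent does bother. In this sketch (fork_and_reap is a name I’ve invented for it) the child exits straight away with a chosen code and the parent collects it with os.waitpid():

```python
import os


def fork_and_reap(exit_code=3):
    """Fork a child that exits at once, then reap it with waitpid().

    Returns (reaped_expected_child, exit_code_seen_by_parent).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: terminate immediately

    # If the child has already exited it sits as a zombie until this
    # call collects its status and lets the kernel discard the entry.
    reaped_pid, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return reaped_pid == pid, code


if __name__ == "__main__":
    print(fork_and_reap())
```

Adding a call like this to the end of the parent branch of the script above would stop the defunct entry from lingering.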
We can see that the PGID of the child python process is the same as the parent,
as expected. This example also shows clearly the difference between signalling
the process group, as in the first example, and signalling a single process, as
in this one.
    $ python signal-catcher.py &
    [1] 49149
    Parent is waiting...
    Child is waiting...
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 50394  1684 /bin/bash
    49149  1684 49149 50394  1684 python signal-catcher.py
    49154 49149 49149 50394  1684 python signal-catcher.py
    50394  1684 50394 50394  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
    $ kill -INT 49149
    Parent has caught signal.
    [1]+  Done                    python signal-catcher.py
    $ ps -Ho pid,ppid,pgid,tpgid,sess,args
      PID  PPID  PGID TPGID  SESS COMMAND
     1684  3057  1684 51192  1684 /bin/bash
    51192  1684 51192 51192  1684 ps -Ho pid,ppid,pgid,tpgid,sess,args
    49154     1 49149 51192  1684 python signal-catcher.py
    $ kill -INT 49154
    Child has caught signal.
This example shows broadly the same principles, but there are a couple of interesting points to note. Firstly, once the parent is dead the shell indicates that the job is “done” — it doesn’t monitor the children of commands that it executes, just when the command itself is completed.
Secondly, after the parent has terminated note how the PPID of the child
is set to 1. This is because orphaned processes are automatically adopted
by the init process (the root of all processes on the system). If this
didn’t happen then they would always remain around as defunct zombies after
terminating since there’d be no parent process to reap their return code. The
init process is implemented such that it calls wait() on all of its
children to reap their return codes. Note how even though it’s been adopted,
it still shares the same session and is still attached to the same terminal,
so ps still displays it without needing any extra selection options.
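The adoption is easy to observe from code as well. The sketch below (with the invented name orphan_parent_ids) forks an intermediate process which forks again and exits immediately; the resulting orphan polls getppid() until it changes and reports both values over a pipe. One caveat of my own: on some Linux setups the new parent is a “subreaper” process rather than PID 1, so the code only assumes that the PPID changes, not what it changes to.

```python
import os
import time


def orphan_parent_ids(timeout=5.0):
    """Create an orphan and report its PPID before and after adoption.

    Returns (original_parent_pid, new_parent_pid) as seen by the orphan.
    """
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        try:
            os.close(read_fd)
            middle_pid = os.getpid()
            if os.fork() > 0:
                os._exit(0)  # intermediate parent dies, orphaning its child
            # Orphan: wait until the kernel has given us a new parent.
            deadline = time.time() + timeout
            while os.getppid() == middle_pid and time.time() < deadline:
                time.sleep(0.01)
            report = "%d %d" % (middle_pid, os.getppid())
            os.write(write_fd, report.encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    os.waitpid(middle, 0)  # reap the intermediate process ourselves
    with os.fdopen(read_fd) as pipe:
        old, new = (int(x) for x in pipe.read().split())
    return old, new


if __name__ == "__main__":
    print(orphan_parent_ids())
```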
Hopefully that’s cleared things up for someone. Well, it’s definitely cleared things up for me — I should try explaining things to myself more often.
[1] Process ID, a unique identifier for a process.
[2] Parent process ID, the PID of the process which created this one.
[3] User ID, the user as which the process is executing.
[4] Group ID, the group as which the process is executing.
[5] Although it’s quite possible for a session to have no controlling terminal: this is typically the case with daemon processes, for example.
[6] In reality, of course, the situation is a little more complicated
and there are circumstances in which SIGHUP is not sent, such as the terminal
having the CLOCAL flag set. You can find the gory details in the man pages.
[7] It’s pretty grotty as far as code quality is concerned, but it’s purely for illustrative purposes.