
This is an old revision of the document!


Python Process Poller

This is a convenience class which allows multiple subprocesses to be forked and collects output from them as they run. The run_command() method should be passed a command line in list form, as well as a unique context value which will be used to identify the command later (this can be any hashable value).

The base class implementation simply buffers all output from each process in a two-element list of [stdout, stderr] in the output dictionary of the class, whose keys are the context values provided to run_command(). The handle_output() method can be overridden by derived classes to modify this behaviour, but bear in mind that output won't be populated unless the base class version is called. Returning True from handle_output() sends SIGTERM to the process concerned.

The return codes of the processes are stored in the results attribute, keyed by context, and are None until the processes terminate. When a process terminates, the handle_terminate() method is called with its context; derived classes can override this to implement specific behaviour if required.
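As a rough illustration of where those values come from: the listing below obtains return codes from subprocess.Popen.poll(), which returns None while a child is still running and its exit status once it has finished. The following standalone sketch is not part of the class; the helper name wait_for_exit and the 50ms interval are made up for this example.

```python
import subprocess
import sys
import time


def wait_for_exit(proc, interval=0.05):
    """Poll a child until it exits and return its exit status.

    Hypothetical helper for illustration only; ProcPoller performs an
    equivalent check inside its poll() loop.
    """
    while True:
        ret = proc.poll()      # None while the child is still running
        if ret is not None:
            return ret
        time.sleep(interval)


# Start a short-lived child that exits with status 3.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
status = wait_for_exit(child)
print(status)
```

In ProcPoller the same check only runs for processes whose stdout and stderr have both been closed, which is why results can briefly stay None after the last output has arrived.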

It should be safe to raise exceptions in either of the handle_X() methods, and these will be propagated straight out of the poll() method, interrupting any other processing, but I haven't tested this much.

This code requires at least Python 2.6 (to the best of my knowledge) and also assumes a POSIX system - on Windows I believe one must use threads to collect output instead of select.poll().
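For platforms without select.poll(), a thread-based approach might look something like the sketch below. This is an assumption-laden illustration rather than a port of the class: the function name collect_with_threads is invented here, it handles a single command only, and it has not been tested on Windows.

```python
import os
import subprocess
import sys
import threading


def collect_with_threads(cmdline):
    """Run cmdline and return (returncode, stdout_bytes, stderr_bytes).

    Illustrative helper, not part of ProcPoller: one reader thread per
    pipe stands in for select.poll().
    """
    proc = subprocess.Popen(cmdline, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    chunks = {"stdout": [], "stderr": []}

    def reader(pipe, name):
        # os.read() returns an empty result at end-of-file, which ends
        # the loop; until then it hands back data as soon as it arrives.
        while True:
            data = os.read(pipe.fileno(), 4096)
            if not data:
                break
            chunks[name].append(data)
        pipe.close()

    threads = [
        threading.Thread(target=reader, args=(proc.stdout, "stdout")),
        threading.Thread(target=reader, args=(proc.stderr, "stderr")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    ret = proc.wait()
    return ret, b"".join(chunks["stdout"]), b"".join(chunks["stderr"])


# Child writes to both streams and exits with status 2.
code = ("import sys; sys.stdout.write('out'); "
        "sys.stderr.write('err'); sys.exit(2)")
ret, out, err = collect_with_threads([sys.executable, "-c", code])
```

Using one thread per pipe avoids the deadlock that can occur when a child fills one pipe while the parent is blocked reading the other.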

procpoller.py
import os
import select
import signal
import subprocess
import time
 
 
class ProcPoller(object):
    """Watches multiple processes for output on stdout and stderr."""
 
    def __init__(self):
        self.fd_map = {}
        self.results = {}
        self.output = {}
        self.closed_procs = set()
        self.poller = select.poll()
 
 
    def run_command(self, cmdline, context):
        """Executes the specified command-line."""
 
        if context in self.results:
            raise ValueError("duplicate context value supplied")
        proc = subprocess.Popen(cmdline, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, close_fds=True)
        proc.__context = context
        self.results[context] = None
        # Byte strings, since os.read() returns bytes on Python 3.
        self.output[context] = [b"", b""]
 
        for fd in (i.fileno() for i in (proc.stdout, proc.stderr)):
            self.fd_map[fd] = proc
            self.poller.register(fd, select.POLLIN | select.POLLHUP)
 
 
    def poll(self, timeout=None):
        """Collect output from processes until stopped or timeout (in secs)."""
 
        poll_timeout = timeout * 1000 if timeout is not None else None
        timeout = time.time() + timeout if timeout is not None else None
 
        while self.fd_map or self.closed_procs:
 
            # While there are processes who have closed file descriptors but
            # not yet terminated, check their status at least every 500ms.
            if self.closed_procs:
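                # A timeout of None means "block forever", so it must be
                # handled separately rather than passed to min().
                if poll_timeout is None:
                    poll_timeout = 500
                else:
                    poll_timeout = min(poll_timeout, 500)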
 
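            for fd, events in self.poller.poll(poll_timeout):
                proc = self.fd_map[fd]
                # stderr may already be closed, in which case fd is stdout.
                is_stderr = (not proc.stderr.closed and
                             fd == proc.stderr.fileno())
                hangup = events & select.POLLHUP
                if events & select.POLLIN:
                    data = os.read(fd, 4096)
                    if data:
                        # More data may still be buffered in the pipe, so
                        # defer any hang-up until a later pass has read it.
                        hangup = False
                        if self.handle_output(proc.__context, is_stderr, data):
                            proc.send_signal(signal.SIGTERM)
                    else:
                        # An empty read means end-of-file.
                        hangup = True
                if hangup:
                    self.poller.unregister(fd)
                    del self.fd_map[fd]
                    if is_stderr:
                        proc.stderr.close()
                    else:
                        proc.stdout.close()
                    if proc.stdout.closed and proc.stderr.closed:
                        self.closed_procs.add(proc)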
 
            dead_procs = set()
            try:
                for proc in self.closed_procs:
                    ret = proc.poll()
                    if ret is not None:
                        self.results[proc.__context] = ret
                        dead_procs.add(proc)
                        self.handle_terminate(proc.__context)
            finally:
                self.closed_procs -= dead_procs
 
            if timeout is not None:
                now = time.time()
                if now >= timeout:
                    break
                poll_timeout = (timeout - now) * 1000
 
 
    def handle_output(self, context, stderr, data):
        """Derived classes can override to intercept output.
 
        Returning True from this function will send SIGTERM to the process.
        """
 
        index = 1 if stderr else 0
        self.output[context][index] += data
        return False
 
 
    def handle_terminate(self, context):
        """Derived classes can override to detect terminate."""
 
        pass
notes/python_process_poller.1374491199.txt.gz · Last modified: 2013/07/22 11:06 by andy