
Python Web Site Process Bus v1.0

Abstract

This document specifies a proposed standard interface between operating system events and web components (including web servers, web applications, and frameworks), to promote web component interoperability and portability.

Rationale and Goals

The Python community has produced many useful web application frameworks, including Django, Pylons, TurboGears, Zope, CherryPy, Paste, and many more [1]. Recently, many of these frameworks have attempted to decentralize their architectures and become more component-based, specifically using the WSGI specification [2] to decouple servers from applications, and even framework components from each other.

In general, however, each of these frameworks' and servers' natural assumption is that it alone must be in control of the OS process, generating and responding to process-wide events: startup, shutdown, and restart. As various people have tried to combine components from multiple frameworks, they generally select one framework or server as the primary process controller, and then write ad-hoc adapters to translate the startup/shutdown style of foreign components into the style of the primary controller. In many cases, this adaptation has simply not been possible when frameworks or servers are too tightly coupled to process-wide events; for example, when two frameworks register signal handlers for the same process.

This is a classic M x N adaptation problem that could be resolved by having a common interface for process-wide events. The availability and widespread use of such an API in Python web frameworks would free users and component authors from writing ad-hoc adapters, and promote components which are reusable regardless of framework. This specification, therefore, proposes a simple and universal interface between web components and process-wide events: the Python Web Site Process Bus (WSPB).

Although a Bus must be realized in software (and sample code is included in this spec), this specification does not attempt to introduce a new dependency into any existing framework. It is expected that each framework or server which chooses to follow this proposal will provide its own implementation of a Bus, and adapt its own existing process event handlers into Bus listeners in a fashion that minimally impacts its existing users. The implementation of a Bus should therefore be as simple as possible, as should the implementation or adaptation of Bus "plugins" (event listeners).

In addition, nothing in this specification shall depend on a specific version of Python. Rather, as Python and its available libraries continue to improve, Bus implementations and plugins are encouraged to take advantage of such improvements.

Specification Overview

The WSPB interface defines two roles: a single "bus" object (one per main process), and various "listener" callables. The deployer of a web site selects a single Bus object offered by one of the frameworks or servers and instantiates it. Each component is then attached to the Bus by subscribing listener callbacks to one or more message channels. The deployer then calls Bus.start() in the main thread, whereupon any callbacks which are subscribed to the 'start' channel are run, often creating new threads or child processes. If no errors occur, the deployer then calls Bus.block(), which suspends the main thread/process until shutdown occurs. Any thread (or subprocess, via IPC) is free to call any Bus method at any time (although some methods are OPTIONAL). A common example is for a SIGTERM handler to call Bus.exit().

The means by which each framework offers a Bus candidate is not defined, since this interface is often adapted from preexisting framework APIs and should remain framework-specific in order to foster adoption of this specification. Frameworks and servers are free to provide additional attributes and methods on the Bus object they provide while they migrate to a component-based architecture.

Specification Details

The Bus object

The Bus object works as a finite state machine which models the current state of the process. Bus methods move it from one state to another; those methods then publish to subscribed listeners on the channel for the new state.

          Process start-O
                        |
                        V
       STOPPING --> STOPPED --> EXITING --> Process end
          A   A         |
          |    \___     |
          |        \    |
          |         V   V
        STARTED <-- STARTING

start

In general, a deployment script will be invoked at process start, create a Bus object (in the STOPPED state), add listeners to various channels, and then call Bus.start(). The start method moves the Bus from the initial STOPPED state to the STARTING state, and then calls all listeners which have subscribed to the 'start' channel. If no listener raises an error, the start method moves the Bus to the STARTED state and returns. If a listener raises an error, the start method calls self.exit() (see below).

If a listener raises an error, start must raise an error. However, it must call Bus.exit() before doing so, which may itself raise errors. The error from the original listener should be raised. That is:

    import sys
    import traceback as _traceback

    def start(self):
        """Start all services."""
        self.state = states.STARTING
        self.log('Bus STARTING')
        try:
            self.publish('start')
            self.state = states.STARTED
            self.log('Bus STARTED')
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.log("Shutting down due to error in start listener:\n%s" %
                     _traceback.format_exc())
            # Save the original error before exit() can replace it.
            exc_info = sys.exc_info()
            try:
                self.exit()
            except:
                # Any stop/exit errors will be logged inside publish().
                pass
            # Re-raise the original listener error (Python 2 syntax; in
            # Python 3: raise exc_info[1].with_traceback(exc_info[2])).
            raise exc_info[0], exc_info[1], exc_info[2]

This will re-raise the first exception, and in principle should abort the process.

stop

The Bus must provide a stop() method, which moves the state to STOPPING, calls all listeners on the 'stop' channel, and then moves the state to STOPPED.

exit

The Bus must provide an exit() method, which calls stop() (see above), moves the state to EXITING, and then calls all listeners on the 'exit' channel.
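
The stop() and exit() sequences described above might be sketched as follows. This is a minimal illustration, not this specification's sample code: the SketchBus class, the string state constants, the messages list standing in for the 'log' channel, and the simplified publish() (no priorities, no error collection) are all assumptions.

```python
class states(object):
    # Hypothetical constants; the spec names the states, not their representation.
    STOPPED = 'STOPPED'
    STARTING = 'STARTING'
    STARTED = 'STARTED'
    STOPPING = 'STOPPING'
    EXITING = 'EXITING'


class SketchBus(object):
    """Illustrative bus covering only stop() and exit()."""

    def __init__(self):
        self.state = states.STOPPED
        self.listeners = {}   # channel name -> list of callables
        self.messages = []    # stand-in for the 'log' channel

    def log(self, msg):
        self.messages.append(msg)

    def publish(self, channel):
        # Simplified: no priorities and no error collection.
        return [callback() for callback in self.listeners.get(channel, [])]

    def stop(self):
        """Move to STOPPING, notify 'stop' listeners, then move to STOPPED."""
        self.state = states.STOPPING
        self.log('Bus STOPPING')
        self.publish('stop')
        self.state = states.STOPPED
        self.log('Bus STOPPED')

    def exit(self):
        """Stop all services, move to EXITING, then notify 'exit' listeners."""
        self.stop()
        self.state = states.EXITING
        self.log('Bus EXITING')
        self.publish('exit')


bus = SketchBus()
calls = []
bus.listeners['stop'] = [lambda: calls.append('stop')]
bus.listeners['exit'] = [lambda: calls.append('exit')]
bus.exit()
```

Because exit() delegates to stop(), 'stop' listeners always run before 'exit' listeners, and each state change emits one log message.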

restart

The Bus MAY provide a restart() method, which sets bus.execv to True and calls exit() (see above). This method does not restart the process from the calling thread; instead, it stops the bus and asks the main thread to call os.execv. Bus implementations (e.g., for mod_python) that wish to restrict the ability of frameworks and applications to restart the process MAY omit this method (TODO: or raise NotImplemented?? or shut down? they shouldn't just pass, since the caller may not be prepared to continue processing if the process doesn't actually end).

graceful

The Bus must provide a graceful() method, which calls all listeners on the 'graceful' channel.

block

The Bus must provide a block(interval=0.1) method, which waits for the EXITING state. Where polling is needed, the interval argument should be used as the delay between checks of the state; however, platform-specific signaling techniques that do not require polling cycles should be preferred (such as win32 Events).

Once the EXITING state has been reached, the block method should join() all non-daemon threads. This allows services the necessary time to shut down before allowing the main thread to proceed to termination (and possible atexit calls out of sequence). It also avoids unpleasant interactions between execv and threads on various platforms (see next).

Once all threads have terminated, the method should test the bus.execv attribute; if it evaluates to True, it should call os.execv to restart the process. Bus implementations (e.g., for mod_python) that wish to restrict the ability of frameworks and applications to restart the process MAY omit this step (TODO: or error? or shut down? see "restart" above).
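
The three steps of block() (wait for EXITING, join non-daemon threads, then optionally re-execute) could look roughly like the sketch below. It assumes a polling implementation; the BlockingBus class, the string states, the stand-in exit(), and the _do_execv helper with its choice of arguments to os.execv are illustrative assumptions, not part of this specification.

```python
import os
import sys
import threading
import time


class BlockingBus(object):
    """Illustrative bus showing only the block() algorithm."""

    def __init__(self):
        self.state = 'STOPPED'
        self.execv = False

    def exit(self):
        # Stand-in for the real exit(); only the state change matters here.
        self.state = 'EXITING'

    def block(self, interval=0.1):
        # Poll until some thread moves the bus to EXITING.
        while self.state != 'EXITING':
            time.sleep(interval)
        # Let non-daemon threads finish before the caller proceeds.
        # Skip the calling thread and the main thread: joining either
        # from here could never return.
        skip = (threading.current_thread(), threading.main_thread())
        for thread in threading.enumerate():
            if thread not in skip and not thread.daemon:
                thread.join()
        if self.execv:
            self._do_execv()

    def _do_execv(self):
        # Hypothetical helper: re-run the interpreter with the original
        # command-line arguments, replacing the current process.
        os.execv(sys.executable, [sys.executable] + sys.argv)


bus = BlockingBus()
# Simulate another thread requesting shutdown shortly after block() begins.
threading.Timer(0.05, bus.exit).start()
bus.block(interval=0.01)
```

Here the timer thread plays the role of a signal handler or request thread calling bus.exit() while the main thread is blocked.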

log

The Bus must provide a log(msg="", traceback=False) method which publishes the given msg to all listeners on the 'log' channel. If the optional traceback argument is True, the current traceback must be appended to the message. The following algorithm is recommended:

    def log(self, msg="", traceback=False):
        """Log the given message. Append the last traceback if requested."""
        if traceback:
            exc = sys.exc_info()
            msg += "\n" + "".join(_traceback.format_exception(*exc))
        self.publish('log', msg)

All Bus methods which modify the state attribute must emit a log message for each state change.

subscribe

The Bus must provide a subscribe(channel, callback, priority=None) method which registers the given callback as a listener on the given channel. This method must be idempotent for the same arguments; that is, calling it twice with the same channel and callback must not result in the callback being called twice on the same channel. However, providing a different priority must change the registered priority of an existing listener.

unsubscribe

The Bus must provide an unsubscribe(channel, callback) method which unregisters the given callback from the given channel. This method must be idempotent; that is, it must not error if the listener is not registered.
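
One way to satisfy the idempotency rules for subscribe and unsubscribe is to key each channel's listeners by callback, as in this sketch. The Registry class and its DEFAULT_PRIORITY value are assumptions made for illustration; the specification does not say what priority=None should resolve to.

```python
class Registry(object):
    """Illustrative listener registry; a real Bus would hold this state itself."""

    DEFAULT_PRIORITY = 50  # assumed default, not defined by the spec

    def __init__(self):
        self.listeners = {}  # channel name -> {callback: priority}

    def subscribe(self, channel, callback, priority=None):
        registered = self.listeners.setdefault(channel, {})
        if priority is None:
            # Re-subscribing without a priority keeps any existing one.
            priority = registered.get(callback, self.DEFAULT_PRIORITY)
        # Keying by callback means a repeat subscription only updates
        # the priority; the callback is never registered twice.
        registered[callback] = priority

    def unsubscribe(self, channel, callback):
        # pop() with a default never raises, so unknown listeners
        # and unknown channels are silently ignored.
        self.listeners.get(channel, {}).pop(callback, None)


def on_start():
    return 'started'


registry = Registry()
registry.subscribe('start', on_start)
registry.subscribe('start', on_start)               # no duplicate entry
registry.subscribe('start', on_start, priority=10)  # priority is updated
count_after_subscribe = len(registry.listeners['start'])
priority_after_update = registry.listeners['start'][on_start]
registry.unsubscribe('start', on_start)
registry.unsubscribe('start', on_start)             # second call is a no-op
registry.unsubscribe('missing', on_start)           # unknown channel is fine
```

This approach requires callbacks to be hashable, which holds for ordinary functions and bound methods.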

publish

The Bus must provide a publish(channel, *args, **kwargs) method which calls all listeners for the given channel, in order from lowest priority to highest, and passes the given *args and **kwargs through to each listener. It must collect and return a list of the return values from each listener (although many listeners return None and the caller is not required to consume these values).

This method must call all listeners, regardless of any uncaught exceptions thrown by individual listeners. The only exceptions to this are KeyboardInterrupt and SystemExit, which must be re-raised immediately. All other listener errors must be logged and include the traceback. After all listeners have been called, the last exception must be re-raised (instead of returning the list of return values).
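
The error-handling rules for publish can be sketched as a standalone function. The listeners mapping and the log callable passed in here are stand-ins for Bus attributes and are assumptions of this example; the re-raise uses Python 3 syntax.

```python
import sys
import traceback


def publish(listeners, log, channel, *args, **kwargs):
    """Call every listener on a channel, lowest priority first.

    `listeners` maps channel name -> {callback: priority}; `log` is a
    callable taking one message. Both stand in for Bus attributes.
    """
    exc_info = None
    output = []
    items = sorted(listeners.get(channel, {}).items(),
                   key=lambda item: item[1])
    for callback, priority in items:
        try:
            output.append(callback(*args, **kwargs))
        except (KeyboardInterrupt, SystemExit):
            # These two must propagate immediately.
            raise
        except Exception:
            # Remember the error, log it with its traceback, and keep going.
            exc_info = sys.exc_info()
            log("Error in %r listener %r\n%s"
                % (channel, callback, traceback.format_exc()))
    if exc_info is not None:
        # Re-raise the last listener error with its original traceback.
        raise exc_info[1].with_traceback(exc_info[2])
    return output


calls, logged = [], []


def broken():
    calls.append('broken')
    raise ValueError('boom')


def healthy():
    calls.append('healthy')
    return 'ok'


error = None
try:
    publish({'start': {broken: 10, healthy: 20}}, logged.append, 'start')
except ValueError as exc:
    error = exc

results = publish({'start': {healthy: 20}}, logged.append, 'start')
```

Note that the healthy listener still runs even though the lower-priority listener before it failed; the failure surfaces only after every listener has been called.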

Listeners

A Bus listener is simply a callable object that accepts zero or more arguments. The term "object" should not be misconstrued as requiring an actual object instance: a function, method, class, or instance with a __call__ method are all acceptable for use as a listener.

Listeners must be callable according to the argument specification of their channel. The channels which this specification mandates all take zero arguments, except for the log channel, which takes a single msg argument. Frameworks and other components are free to define additional channels, which may take arbitrary positional or keyword arguments.

Listeners must be callable from any thread, and must return control to the caller in a timely fashion; i.e., they must not block indefinitely. For listeners which enable long-running behaviors, this means they must start new threads or subprocesses, or register additional callbacks for asynchronous event loops.

At the same time, listeners should not return until their own state is stable. For example, a listener that starts an HTTP server should not return until the HTTP server is known to be ready to accept requests. This requirement helps deployers debug process events more easily since they occur synchronously. This also minimizes both the frequency and damage of overlapping events, such as one thread calling bus.stop() while another is calling bus.start().
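
A listener can meet both requirements (return promptly, but only once its state is stable) by starting a thread and waiting on a readiness flag. The Worker class, its method names, and the five-second timeouts below are hypothetical choices for illustration only.

```python
import threading


class Worker(object):
    """Hypothetical background service driven by 'start' and 'stop' listeners."""

    def __init__(self):
        self.ready = threading.Event()
        self.stopping = threading.Event()
        self.thread = None

    def _run(self):
        # A real service would acquire sockets, connections, etc. here.
        self.ready.set()       # signal that startup has finished
        self.stopping.wait()   # stand-in for the real work loop

    def start(self):
        """'start' listener: spawn the thread, return once it is ready."""
        self.stopping.clear()
        self.ready.clear()
        self.thread = threading.Thread(target=self._run)
        self.thread.start()
        # Bounded wait, so the bus is never blocked indefinitely.
        if not self.ready.wait(timeout=5.0):
            raise RuntimeError('worker failed to start in time')

    def stop(self):
        """'stop' listener: ask the thread to finish and wait for it."""
        self.stopping.set()
        if self.thread is not None:
            self.thread.join(timeout=5.0)
            self.thread = None


worker = Worker()
worker.start()
was_ready = worker.ready.is_set()
worker.stop()
```

With a real bus, worker.start and worker.stop would be subscribed to the 'start' and 'stop' channels rather than called directly.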

Listeners should trap all errors except KeyboardInterrupt and SystemExit.

Listeners should make every effort to be idempotent. Since code may call bus.stop() and bus.start() repeatedly, listeners which are designed to be run only once per process should take care to avoid executing repeatedly (e.g. by setting an internal "finalized" variable).
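
As a sketch of the "finalized" approach, a once-per-process listener might guard its work with a flag. The PIDFile class, its writes counter, and the temporary file path are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile


class PIDFile(object):
    """Hypothetical 'start' listener that must act only once per process."""

    def __init__(self, path):
        self.path = path
        self.finalized = False
        self.writes = 0  # counts real writes, for demonstration only

    def start(self):
        if self.finalized:
            # Already ran in this process; a later bus.start() is a no-op.
            return
        with open(self.path, 'w') as f:
            f.write(str(os.getpid()))
        self.writes += 1
        self.finalized = True


pidfile = PIDFile(os.path.join(tempfile.mkdtemp(), 'site.pid'))
pidfile.start()
pidfile.start()  # e.g. after bus.stop() followed by bus.start()
```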

Channels

All implementations must provide the required channels specified below, and must allow arbitrary code to define additional channels. All channels must allow subscribed listeners without any known publishers, and vice versa. All channel names must be 8-bit str, not unicode.

Required channels

  • start: Called by the bus when it is in the STARTING state. Example listeners: daemonizer, PID file writer, privilege dropper, HTTP server listen(), database connectors, and startup for any site-wide service that needs to run at intervals, asynchronous with HTTP requests. Note that the first three examples should not run twice in the same process, and therefore must take steps to prevent this on their own.
  • stop: Called by the bus when it is in the STOPPING state. Example listeners: HTTP server interrupt (stop listening), database disconnector, and shutdown for any site-wide service that runs at intervals, asynchronous with HTTP requests.
  • graceful: Called by the bus' graceful method. A "graceful restart" is generally used to close and re-open resources, such as log file handles, worker thread or subprocess pools, and child sockets. In general, the main listener socket should not be closed by this message (although it may be handed off to a new process). Note that workers may be in the middle of an arbitrarily-long HTTP conversation, and therefore may take a long time to close down. Listeners which attempt to close workers should provide a configurable timeout, after which time control is returned to the bus' graceful method regardless of whether the worker has successfully exited.
  • exit: Called by the bus' exit method. This channel should only be published once in the lifetime of the main process (although the asynchronous nature of the bus does not guarantee this), at process exit.
  • log(msg): Called by the bus' log method. This channel takes a required msg argument, which must be an 8-bit str, not unicode. Commonly, listeners will write the passed message to a stream such as stdout or a disk file.

Optional signal channels

Implementations which provide a signal handler should catch signals and publish to a channel with the same name; for example, "SIGTERM". A list of signal names can be obtained from the signal module in the Python standard library (note: not all names will be defined for all platforms). It is recommended that listeners for SIGTERM, SIGHUP, and SIGUSR1 should be automatically subscribed (if the operating system supports them), and should call bus.exit, bus.restart, and bus.graceful, respectively.

Arbitrary components MUST NOT publish to SIG* channels; only handlers registered with signal.signal() should do so. This allows web frameworks and components to subscribe listeners for these events, yet rest assured that they will not be called if HTTP servers like Apache or lighttpd are used (which register their own signal handlers).
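
A signal handler following these recommendations might be sketched as below. The SignalPublisher and RecordingBus classes and their method names are assumptions for illustration; only signal.signal() and the signal name constants come from the Python standard library. The demonstration calls the handler directly instead of installing it, so that it has no effect on the running process.

```python
import signal


class SignalPublisher(object):
    """Illustrative handler that republishes OS signals on same-named channels."""

    # Recommended defaults: signal name -> bus method name.
    defaults = {'SIGTERM': 'exit', 'SIGHUP': 'restart', 'SIGUSR1': 'graceful'}

    def __init__(self, bus):
        self.bus = bus
        self.channels = {}  # signal number -> channel name

    def register(self):
        """Subscribe default listeners for signals this platform defines."""
        for name, method in self.defaults.items():
            number = getattr(signal, name, None)
            action = getattr(self.bus, method, None)  # restart() is optional
            if number is None or action is None:
                continue
            self.channels[int(number)] = name
            self.bus.subscribe(name, action)

    def install(self):
        """Attach the OS-level handlers; must be called in the main thread."""
        for number in self.channels:
            signal.signal(number, self.handle)

    def handle(self, signum, frame=None):
        # Publish on the channel named after the signal, e.g. 'SIGTERM'.
        self.bus.publish(self.channels[int(signum)])


class RecordingBus(object):
    """Stand-in bus that records calls instead of changing process state."""

    def __init__(self):
        self.listeners = {}
        self.calls = []

    def subscribe(self, channel, callback, priority=None):
        self.listeners.setdefault(channel, []).append(callback)

    def publish(self, channel, *args, **kwargs):
        return [cb(*args, **kwargs) for cb in self.listeners.get(channel, [])]

    def exit(self):
        self.calls.append('exit')

    def graceful(self):
        self.calls.append('graceful')


bus = RecordingBus()
publisher = SignalPublisher(bus)
publisher.register()
# Simulate delivery of SIGTERM without installing real handlers.
publisher.handle(signal.SIGTERM)
```

Because RecordingBus omits the optional restart() method, no listener is subscribed to the SIGHUP channel in this run.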

References

  1. The Python Wiki "Web Programming" topic (http://www.python.org/cgi-bin/moinmoin/WebProgramming)
  2. The Python Web Server Gateway Interface spec (http://www.python.org/dev/peps/pep-0333/)
