Warning: this is an advocacy piece, written for the sole purpose of convincing TurboGears to use CherryPy. If you don't respond well to assertiveness, try reading CherryPyAndPasteCanPlayNice instead.

CherryPy and Paste

Although CherryPy and Paste provide similar functionality, CherryPy 3 does it more beautifully, more Pythonically. Here's how.

Object-oriented development

CherryPy objects are robust and intuitive, following established Pythonic conventions. CherryPy prefers data stored in attributes over data recalled from functions, so CherryPy provides global request and response objects, with structured, Pythonic attributes. Paste, on the other hand, provides a single, flat environ dict (from WSGI).

Here are some common data points, and their means of access compared:

CherryPy | Paste
request.cookie | request.get_cookies(environ) or .get_cookie_dict(environ)
request.query_string | request.parse_querystring(environ)
request.params | request.parse_formvars(environ, include_get_vars=True)
cherrypy.url(path="", qs="", script_name=None, base=None, relative=False) | request.construct_url(environ, with_query_string=True, with_path_info=True, script_name=None, path_info=None, querystring=None)
cherrypy.url(..., relative=True) | request.resolve_relative_url(url, environ)

Paste seems to prefer long, hard-to-remember names (like "composite_factory" and "make_gzip_middleware"). CherryPy emphasizes normal Python you already know, and short, meaningful, easy-to-remember names.

As you can see, each Paste function also requires you to pass in an environ argument. This means the environ must be available to all possible consumers, and, given the highly functional nature of Paste, encourages all consumer code to pass the environ in every function call. This repeats one of the most arduous headaches of mod_python: passing the req object everywhere. CherryPy avoids all of that by making cherrypy.request and cherrypy.response into thread-local objects, so that they are always accessible as globals (yet without the risk of concurrency issues).
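The thread-local idea can be sketched with nothing but the standard library. This is only an illustration of the technique, not CherryPy's actual implementation; the names `request`, `current_path`, `handle`, and `serve` are hypothetical.

```python
import threading

# Hypothetical stand-in for cherrypy.request: a module-level "global"
# whose attributes are stored separately for each thread.
request = threading.local()


def current_path():
    # No environ argument is passed in; the lookup resolves per thread.
    return request.path_info


def handle(path, results):
    # Each worker thread sets its own path_info and reads it back.
    request.path_info = path
    results[path] = current_path()


def serve(paths):
    """Handle each path in its own thread and collect what each thread saw."""
    results = {}
    threads = [threading.Thread(target=handle, args=(p, results)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `serve(["/a", "/b"])` gives every thread its own `request.path_info`, and the calling thread never sees the attribute at all, which is the property that lets consumer code skip passing a request object around.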

Paste has a wrapper for some of these in its wsgiwrappers.WSGIRequest object. But that is currently underdeveloped and suffers from several flaws:

  • It requires you, the developer, to create it (and pass it the WSGI environ).
  • With only a dozen or so attributes, it already suffers from scope creep, dumping highly custom attributes like is_xhr into the request object namespace.
  • Its test suite is only 25 lines long.
  • Its params property, for example, is recalculated each time you request it.
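The last point is cheap to avoid. As a minimal sketch (a hypothetical `CachingRequest` class, neither Paste's WSGIRequest nor CherryPy's Request), a wrapper can parse the query string on first access and reuse the result afterwards:

```python
from urllib.parse import parse_qs


class CachingRequest:
    """Hypothetical request wrapper that parses its params only once."""

    def __init__(self, environ):
        self.environ = environ
        self._params = None
        self.parse_count = 0  # kept only to make the caching observable

    @property
    def params(self):
        # Parse on first access, then return the cached dict.
        if self._params is None:
            self.parse_count += 1
            self._params = parse_qs(self.environ.get("QUERY_STRING", ""))
        return self._params
```

Reading `params` twice on the same instance leaves `parse_count` at 1.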

In short, Paste exposes too much of WSGI to the everyday developer. This is the perfect approach for writing WSGI middleware, but application developers (of whom there should be far more) end up having to handle the WSGI environ dict too often to get anything meaningful done.

CherryPy's Request object has the additional attributes: local, remote, scheme, server_protocol, base, request_line, method, protocol, header_list, headers, rfile, body, dispatch, script_name, path_info, app, handler, config, and is_index (among others). All are manipulable by consumer code (and all are overridable in config if necessary).

In addition to Request and Response, CherryPy provides instances of Server, Engine, Config, Dispatcher, LogManager, Hook, HookMap, Tool, Toolbox, Application, and Tree classes. Most of these objects are accessible as top-level attributes. The server, engine, log, hooks, tools, request and response objects are all completely open to configuration. And all of the default objects (like cherrypy.engine, for example) are completely circumventable; you are encouraged to reconfigure them and even replace them if the default objects do not meet your needs. CherryPy's powerful config system and emphasis on normal, imperative Python code make this easy.

CherryPy also prefers customization-by-attribute over customization-by-argument. For example, paste.webkit contains the truly insane construction:

def make_webkit_app(
    global_conf,
    servlet_directory=None,
    package_name=None,
    complete_stack=True,
    debug=None,
    # session middleware:
    cookie_name='_SID_',
    session_file_path='/tmp',
    session_chmod=None,
    # error middleware:
    error_email=None,
    error_log=None,
    show_exceptions_in_wsgi_errors=False,
    from_address=None,
    smtp_server=None,
    error_subject_prefix=None,
    error_message=None,
    # enabling:
    profile=False,
    profile_limit=40,
    **app_conf):

CherryPy's equivalent is:

class Application(object):
        
    def __init__(self, root, script_name=""):
        ...

...and you are encouraged to configure the app and other objects by setting their attributes, either directly (in normal, readable Python code) or in config. CherryPy's stance is that many, many applications (and all frameworks built on top of CherryPy) are too complicated to be expressed either in purely declarative (e.g. config files) or functional (e.g. "make_app") styles. Normal, object-oriented, imperative Python provides both ease-of-use and maximum expression without switching modes to achieve one or the other.

Object-oriented configuration

CherryPy config entries are unsurprising. They follow established conventions for the creation, understanding, and use of configuration files. Where these conventions limit expression, CherryPy extends them with Python's own syntax (basically, all keys look like object.attr1.attr2 and all values can be arbitrary Python objects). Paste invents its own syntax, which is nothing like Python:

[composite:main]
use = egg:Paste#urlmap
/ = home
/blog = blog
/cms = config:cms.ini

[app:home]
use = egg:Paste#static
document_root = %(here)s/htdocs

[filter-app:blog]
use = egg:Authentication#auth
next = blogapp
roles = admin
htpasswd = /home/me/users.htpasswd

[app:blogapp]
use = egg:BlogApp
database = sqlite:/home/me/blog.db

CherryPy's equivalent might read:

cherrypy.tree.mount(None, "/", "home.ini")
cherrypy.tree.mount(Blog(), "/blog", "blog.ini")
cherrypy.tree.mount(CMS(), "/cms", "cms.ini")

(... but see this email for an example of how easy it would be to do the above entirely in config)

home.ini:

[/]
tools.staticdir.on = True
tools.staticdir.dir = "/htdocs"

blog.ini:

[database] # This section would be app-specific; CP allows arbitrary sections
connect = "sqlite:/home/me/blog.db"

[/]
wsgi.pipeline = [('auth', Authentication.auth_app)]
wsgi.auth.roles = "admin"
wsgi.auth.htpasswd = "/home/me/users.htpasswd"

CherryPy config keys are also namespaced. As the Zen of Python says, "Namespaces are one honking great idea -- let's do more of those!" This aids readability and understanding. Nobody needs to be told that tools.staticdir.on is categorically separate from error_page.404 = 'template404.html'. Paste.deploy uses arbitrary keys with no categories (like 'use' and 'roles'). The CherryPy config may result in more keystrokes, but, as with Python itself, "readability counts". And, if you edit your config files in a Python-aware IDE, you get free code-completion support, since config syntax exactly matches imperative Python syntax.

In most cases, CherryPy config keys describe attributes on objects. So when your config file contains the entry engine.autoreload_frequency = 1, this sets the 'autoreload_frequency' attribute of the 'engine' object to 1, exactly as you would expect the equivalent Python code to do. Paste's config hosts a bewildering array of effects, often mixing behaviors from different architectural layers into the same syntax, and never in a way which is intuitively obvious to Python developers (or anyone else).
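The mapping from a dotted key to an attribute assignment is simple enough to sketch in a few lines. This is an illustration under stated assumptions, not CherryPy's config code: `Engine` and `apply_entry` are hypothetical names, and values are assumed to be written as Python literals.

```python
import ast


class Engine:
    """Hypothetical stand-in for a configurable object such as cherrypy.engine."""

    autoreload_frequency = 60


def apply_entry(roots, key, raw_value):
    """Set the attribute named by a dotted key such as 'engine.autoreload_frequency'."""
    root_name, *attrs = key.split(".")
    if not attrs:
        raise ValueError("expected a dotted key, got %r" % key)
    target = roots[root_name]
    # Walk any intermediate attributes (object.attr1.attr2 ...).
    for name in attrs[:-1]:
        target = getattr(target, name)
    # The value is text from a config file; parse it as a Python literal.
    setattr(target, attrs[-1], ast.literal_eval(raw_value))


engine = Engine()
apply_entry({"engine": engine}, "engine.autoreload_frequency", "1")
```

After the call, `engine.autoreload_frequency` is the integer 1, just as if the equivalent assignment had been written in Python.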

Finally, the set of namespaces is completely extendable; frameworks built on top of CherryPy can easily add new namespaces that behave in whatever fashion desired. And, because all such extensions must be in a namespace, users have a much easier time noticing, comprehending, and finding help on how to use your extension.
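One way to picture an extendable namespace set is a registry that maps each namespace to a handler. The sketch below is an assumption about the general shape only, with hypothetical names (`dispatch_config`, `db_namespace`, `db_settings`); it is not CherryPy's own code.

```python
def dispatch_config(namespaces, entries):
    """Route each 'namespace.key' entry to the handler registered for its namespace.

    Returns the entries whose namespace has no registered handler.
    """
    unhandled = {}
    for full_key, value in entries.items():
        namespace, _, key = full_key.partition(".")
        handler = namespaces.get(namespace)
        if handler is None:
            unhandled[full_key] = value
        else:
            handler(key, value)
    return unhandled


# A framework-defined namespace (hypothetical): collect database settings.
db_settings = {}


def db_namespace(key, value):
    db_settings[key] = value
```

A framework would register `{"db": db_namespace}` and every `db.*` entry would reach its handler, while unknown namespaces are reported back rather than silently swallowed.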

Object-oriented extensions

WSGI middleware is often seen as a dumping ground for various functionalities; developers have grand, collectivistic dreams that all the "tough bits" of web development can be made into middleware once and shared freely among all projects. This simply isn't possible. Although some such features should be and have been developed and shared in this manner, there are many more which cannot. In fact, existing middleware (provided by Paste, flup or selector) already covers nearly the full range of functionality that middleware can comfortably handle.

CherryPy provides a set of Tools and Toolboxes, in addition to middleware, for several reasons:

First, extensions are easier to write as a Tool than as WSGI middleware, because Tools have complete access to the full, beautiful CherryPy API.

Second, Tools are easier to declare than WSGI middleware. Tools can be declared in config files, in startup scripts, at the controller level, as attributes on a page handler method, as decorators on a handler method, and finally can be called directly inside handlers.

Third, Tools are easier to apply than WSGI middleware. Because of its nested nature, WSGI middleware is extremely limited in the types of tasks it can undertake. Forcing more complex needs into middleware results in "too much magic". In addition, the WSGI protocol is itself complicated, and it is especially difficult to get all of the corner cases right.

For just one example, PEP 333 says, "...middleware must return an iterable as soon as its underlying application returns an iterable. It is also forbidden for middleware to use the write() callable to transmit data that is yielded by an underlying application. Middleware may only use their parent server's write() callable to transmit data that the underlying application sent using a middleware-provided write() callable." This limitation alone denies any sort of middleware which:

  1. composes content from multiple sources,
  2. adds or removes content blocks from the response based on upstream content, or
  3. performs any flow-control during the response.

...and this is just one item from the spec. Nearly every paragraph in the spec contains similar "gotchas" and limitations. WSGI servers must understand and obey one set of these. WSGI applications (or hopefully, frameworks) must understand and obey another. WSGI middleware must understand and obey them all.
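To show what the quoted rule still permits, here is a minimal sketch of middleware that stays within it: it returns an iterable as soon as the wrapped application does, transforms one chunk at a time, and never buffers or reorders content. The names `UppercaseMiddleware`, `_UpperIter`, and `demo_app` are hypothetical, and the transformation is chosen because it leaves the body length unchanged.

```python
class _UpperIter:
    """Iterator that uppercases each chunk as it is pulled by the server."""

    def __init__(self, app_iter):
        self._app_iter = app_iter
        self._iterator = iter(app_iter)

    def __iter__(self):
        return self

    def __next__(self):
        # One chunk in, one chunk out: no waiting for later chunks.
        return next(self._iterator).upper()

    def close(self):
        # Pass close() through to the wrapped iterable if it has one.
        close = getattr(self._app_iter, "close", None)
        if close is not None:
            close()


class UppercaseMiddleware:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Return a wrapping iterable immediately; do not consume app_iter here.
        return _UpperIter(self.app(environ, start_response))


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]
```

Anything beyond this per-chunk shape, such as composing content from several sources, is where the restrictions listed above start to bite.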

Fourth, Tools are easier to manage than WSGI middleware. CherryPy provides a rich set of hooks and a robust, prioritized process for you to run your extensions when you want, where you want, and how you want. Tool authors have complete control over these aspects; WSGI middleware authors do not. It is not uncommon at all for HTTP features to require a precise order of application; for example, gzip MUST run after response encoding, and caching of the response should occur after both. WSGI middleware, precisely because of its decentralized nature, gives no help to either authors or deployers to solve these kinds of conflicts. CherryPy, on the other hand, understands these issues, and provides a priority-declaration system for Tool authors. Builtin tools (like encode, gzip, and caching) already have these priorities in the correct order. And, in the face of inevitable variance and change, deployers have the right and ability to override these settings.
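The ordering problem can be sketched with a small priority-sorted hook point. This is an illustration of the idea rather than CherryPy's hook machinery; `HookPoint` and the callbacks are hypothetical, and the priority numbers are assumptions chosen only to show the ordering.

```python
import gzip


class HookPoint:
    """Run attached callbacks in ascending priority order."""

    def __init__(self):
        self._callbacks = []

    def attach(self, callback, priority=50):
        # The attachment index breaks ties so equal priorities keep their order.
        self._callbacks.append((priority, len(self._callbacks), callback))

    def run(self, value):
        for _, _, callback in sorted(self._callbacks, key=lambda entry: entry[:2]):
            value = callback(value)
        return value


cache = {}


def encode(body):
    return body.encode("utf-8")


def compress(body):
    return gzip.compress(body)


def store(body):
    cache["latest"] = body
    return body


# Attached in the "wrong" order on purpose; priorities put them right.
before_finalize = HookPoint()
before_finalize.attach(store, priority=90)
before_finalize.attach(compress, priority=80)
before_finalize.attach(encode, priority=70)
```

Running `before_finalize.run("hello")` encodes first, compresses second, and caches last, regardless of the order in which the callbacks were attached. A deployer who needs a different order only has to change a number.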

Finally, Tools are easier to coordinate than WSGI middleware. Recently, we have seen a demand appear for more specification of common tasks that different middleware components would undertake. This shows that the WSGI interface, as simple as it is, cannot fulfill the promise of pluggability without a minimum of shared understanding across implementations. Such specification is completely fair and sane and will indeed increase pluggability. It will also eventually reduce the number of WSGI components offered, because not following the specification will put you out of business. Interestingly, PJE (Phillip J. Eby, the author of PEP 333) was not keen on this idea.

Paste often gets more credit in the WSGI world because it was the first package written expressly to meet the needs and potential of WSGI. CherryPy, in contrast, was written to meet the needs of users, and WSGI has been more carefully integrated into the code. Unfortunately, some onlookers have interpreted this difference between the two projects to mean that CherryPy is "against" WSGI. This is not CherryPy's stance at all; in fact, we could say that, along with Pylons and Paste, CherryPy is one of the most coherent WSGI stacks. Just look at all the different ways to manage WSGI components with CherryPy 3. But wherever CherryPy provides WSGI integration, the user API looks like normal, object-oriented Python. Wherever Paste exposes WSGI, it looks like WSGI.

But there's a more insidious problem with moving features to middleware. Despite the rhetoric, frameworks are not evil per se; they provide a common interface to all components and are crucial for performing expensive operations once. WSGI does provide a lowest-common-denominator interface, but few people want to program to the WSGI interface all day long. PEP 333 says this itself: "Note, however, that simplicity of implementation for a framework author is not the same thing as ease of use for a web application author. WSGI presents an absolutely 'no frills' interface to the framework author, because bells and whistles like response objects and cookie handling would just get in the way of existing frameworks' handling of these issues. Again, the goal of WSGI is to facilitate easy interconnection of existing servers and applications or frameworks, not to create a new web framework."

So alternatives (like CherryPy, or paste.wsgiwrappers) are recommended to make working with WSGI more pleasant. But there's a vicious circle which accompanies the desire to "not have a framework":

  1. You build your initial offering using middleware from project A.
  2. The developers of project A find that parsing headers, params, and various calculated data is slow, so they cache behind the scenes in library Z.
  3. The holy grail of WSGI middleware is that we should be able to steal code from lots of different projects, so you add another bit of WSGI middleware from project B.
  4. The developers of project B find that parsing headers, params, and various calculated data is slow, so they cache behind the scenes in library Y.
  5. As the number of libraries Z, Y, X... increases, the number of times the same data is parsed rises (increasing CPU time) and ALSO the number of places it is cached (increasing memory footprint).
  6. Consumers find this state of affairs unacceptable, and decree, "we will only support middleware built with library Z (or no library)".
  7. Library Z therefore plays the role of a framework, no matter how much "proof by repeated assertion" that it is "not a framework" floats around in its marketing.

Why go through that entire loop when CherryPy is ready now to play the part of library Z?

In addition, the pain of referencing, importing and calling such libraries inside middleware encourages sloppy coding. Take a look at the gzip middleware from Paste:

    def __call__(self, environ, start_response):
        if 'gzip' not in environ.get('HTTP_ACCEPT_ENCODING', ''):
            # nothing for us to do, so this middleware will
            # be a no-op:
            return self.application(environ, start_response)
        response = GzipResponse(start_response, self.compress_level)
        app_iter = self.application(environ,
                                    response.gzip_start_response)
        if app_iter:
            response.finish_response(app_iter)

        return response.write()

Rather than use even their own library for header parsing, the developer here chose to inspect the header directly. Unfortunately, he forgot to handle several issues, which are easy enough to quote straight from the HTTP/1.1 spec (with my emphasis):

  1. If the content-coding is one of the content-codings listed in the Accept-Encoding field, then it is acceptable, unless it is accompanied by a qvalue of 0. (As defined in section 3.9, a qvalue of 0 means "not acceptable.")
  2. If multiple content-codings are acceptable, then the acceptable content-coding with the highest non-zero qvalue is preferred.
  3. If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.

And finally, the list of content-codings is open to extension. None of these are handled by the gzip code above. CherryPy handles them all; I would argue that it does so because CherryPy's gzip logic is written for use as a Tool, and therefore has easy access to the always-on, full-featured CherryPy API. Paste's gzip code could be fixed, but it would either be fixed with a backend library that contributes to middleware library bloat, or with pure WSGI in all its ugly glory.
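For illustration, here is a minimal sketch of the qvalue handling that a substring check misses. It covers only the first rule quoted above plus the "*" wildcard, not the 406 response or preference ordering, and it is not CherryPy's or Paste's code; `parse_accept_encoding` and `gzip_acceptable` are hypothetical names.

```python
def parse_accept_encoding(header):
    """Return {coding: qvalue} parsed from an Accept-Encoding header value."""
    codings = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # a coding listed without a qvalue defaults to 1
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed qvalue as "not acceptable"
        codings[name.lower()] = q
    return codings


def gzip_acceptable(header):
    """True if the header allows gzip, honouring q=0 and the '*' wildcard."""
    codings = parse_accept_encoding(header)
    if "gzip" in codings:
        return codings["gzip"] > 0
    if "*" in codings:
        return codings["*"] > 0
    return False
```

With this, a header of `gzip;q=0` is correctly treated as a refusal, whereas the `'gzip' not in ...` test in the middleware above would compress the response anyway.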

Object-oriented execution

CherryPy treats your "Python app" (Python objects in your script or package) as the prime object, and understands that your deployment schemes (not just config entries) may vary widely between development, testing, staging, and production environments. In development, you may run the builtin HTTP server, but choose to have Apache host your app (via mod_python) in production. This complexity often requires flexible, imperative Python to get right. Paste, on the other hand, treats your config file as the object which it "runs", and expects you to use command-line arguments to control it:

usage: /usr/local/bin/paster serve [options] CONFIG_FILE [start|stop|restart|status]
Serve the described application

If start/stop/restart is given, then it will start (normal
operation), stop (--stop-daemon), or do both.  You probably want
``--daemon`` as well for stopping.

Options:
  -h, --help            show this help message and exit
  -v, --verbose
  -q, --quiet
  -nNAME, --app-name=NAME
                        Load the named application (default main)
  -sSERVER_TYPE, --server=SERVER_TYPE
                        Use the named server.
  --server-name=SECTION_NAME
                        Use the named server as defined in the configuration
                        file (default: main)
  --daemon              Run in daemon (background) mode
  --pid-file=FILENAME   Save PID to file (default to paster.pid if running in
                        daemon mode)
  --log-file=LOG_FILE   Save output to the given log file (redirects stdout)
  --reload              Use auto-restart file monitor
  --reload-interval=RELOAD_INTERVAL
                        Seconds between checking files (low number can cause
                        significant CPU usage)
  --status              Show the status of the (presumably daemonized) server
  --user=USERNAME       Set the user (usually only possible when run as root)
  --group=GROUP         Set the group (usually only possible when run as root)
  --stop-daemon         Stop a daemonized server (given a PID file, or default
                        paster.pid file)

This results in Yet Another Configuration Mode, and adds to developer confusion. CherryPy prefers the mode of normal, imperative, object-oriented Python (and makes a concession for conventional config files as a secondary, declarative mode). Paste already adds its own config syntax, prefers functional over object-oriented style, and over-exposes WSGI internals (a protocol which, while complete and stable, includes so many concessions to fringe requirements and the lowest common denominator that many web developers find it difficult to grasp). Adding a set of command-line args (to server applications, no less) is needlessly complicated. There should be one, and preferably only one, obvious way to do it.

Conclusion

CherryPy is more natural and more usable because it adheres to Pythonic ideals:

  • Beautiful is better than ugly. Good objects are inherently more beautiful than good functions.
  • Explicit is better than implicit. Imperative Python is better than declarative config.
  • Simple is better than complex. Config files should not invent syntax.
  • Complex is better than complicated. Normal Python is better than DSLs.
  • Flat is better than nested. Config file sections should be one level deep.
  • Sparse is better than dense. http://www.cherrypy.org/browser/tags/cherrypy-3.0.0/cherrypy/_cpdispatch.py is better than http://pythonpaste.org/paste/urlparser.py.html.
  • Readability counts. New syntax should be avoided.
  • Special cases aren't special enough to break the rules. Not even WSGI is that special.
  • Although practicality beats purity. But WSGI is a worthwhile standard and needs to be wrapped (not exposed).
  • There should be one-- and preferably only one --obvious way to do it. Modes should be minimized.
  • If the implementation is hard to explain, it's a bad idea. http://www.google.com/search?q=python+paste+understand
  • If the implementation is easy to explain, it may be a good idea. http://www.google.com/search?q=cherrypy+understand
  • Namespaces are one honking great idea -- let's do more of those! CherryPy does.

See Also

Architectures

CherryPy

http://www.cherrypy.org/attachment/wiki/CherryPyAndPaste/cparch.gif?format=raw

Paste

http://www.cherrypy.org/attachment/wiki/CherryPyAndPaste/pastearch.gif?format=raw
