[[PageOutline]]

[This document is complete to rev 1460.]

= What's new in CherryPy 3.0 =

This document only describes new features in CherryPy 3.0. A detailed "How To Upgrade" document is at [wiki:UpgradeTo30 UpgradeTo30].

== Speed ==

CherryPy 3 is much faster than CherryPy 2 (as much as three times faster in benchmarks).

== Config ==

=== _cp_config: attaching config to handlers ===

In CP 2, you could only specify "config" in a config file or dict, where it was always keyed by URL. For example:

{{{
[/path/to/page]
methods_with_bodies = ("POST", "PUT", "PROPPATCH")
}}}

It's obvious that the extra method is the norm for that path; in fact, the code could be considered broken without it. In CherryPy 3, you can attach that bit of config directly on the page handler:

{{{
def page(self):
    return "Hello, world!"
page.exposed = True
page._cp_config = {"request.methods_with_bodies": ("POST", "PUT", "PROPPATCH")}
}}}

This can be done at any point in the cherrypy tree; for example, we could have attached that config to a class which contains the page method:

{{{
class SetOPages:

    _cp_config = {"request.methods_with_bodies": ("POST", "PUT", "PROPPATCH")}

    def page(self):
        return "Hullo, Werld!"
    page.exposed = True
}}}

This technique allows you to:

 * Put config near where it's used for improved readability and maintainability.
 * Attach config to objects instead of URLs. This allows multiple URLs to point to the same object, yet you only need to define the config once.
 * Provide defaults which are still overridable in a config file.

=== Separate configuration scopes ===

CherryPy 2 used a single config dict for global, per-application, and per-path config. CherryPy 3 separates these scopes in a couple of ways:

First, and most '''important''', {{{cherrypy.config}}} now only holds global config data; that is, config entries which affect all mounted applications. Each Application object keeps its own config in {{{app.config}}}.
You must pass global config to {{{cherrypy.config.update}}}, and per-application config to {{{cherrypy.tree.mount}}}. You ''may'' use a single config file and hand the same file (or filename) to both methods; put your global config in a [global] section to signal {{{cherrypy.config.update}}} which entries to grab.

Second, when a request is processed, these two config sources (global and per-application) are merged and collapsed to form a single config dict stored inside {{{cherrypy.request.config}}}. This dict contains only those config entries which apply to the given request; that is, per-path config. Note that when you do an InternalRedirect, this config is recalculated for the new path.

=== Configuration namespaces ===

In CherryPy 2, config entries were somewhat haphazard about their naming and scope. They were always inspected as late as possible, often multiple times, and their default values were locked away inside the source code.

In CherryPy 3, all config entries (except "environment") are now prefixed with a namespace. When you provide a config entry, it is now bound as early as possible to the actual object referenced by the namespace; for example, CP 2's "stream_response" is now "response.stream", and actually sets the "stream" attribute of cherrypy.response. In this way, you can easily determine the default value by firing up a python interpreter and typing:

{{{
>>> import cherrypy
>>> cherrypy.response.stream
False
}}}

This also means that some objects (the Request class in particular) have grown a number of new attributes, to avoid the need for config.get().
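The early binding of namespaced entries to object attributes can be pictured with a minimal sketch in plain Python. It is not CherryPy's implementation: the {{{Response}}} class, the {{{targets}}} dict, and the {{{apply_config}}} helper are hypothetical names invented for illustration. Only the "namespace.attribute" key format comes from the text above.

{{{
class Response(object):
    """Hypothetical stand-in for an object that owns its config defaults."""
    stream = False  # the default lives on the class, so it is easy to inspect


def apply_config(config, targets):
    """Bind each "namespace.attr" entry to the object registered for it."""
    for key, value in config.items():
        namespace, _, attr = key.partition(".")
        target = targets.get(namespace)
        if target is not None and attr:
            setattr(target, attr, value)


response = Response()
targets = {"response": response}

default_value = Response.stream      # visible before any config is applied
apply_config({"response.stream": True}, targets)
configured_value = response.stream   # the entry has become a plain attribute
}}}

Because the entry ends up as an ordinary attribute, later code can read {{{response.stream}}} directly instead of looking the key up in a config dict each time.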
Entries from each namespace may be allowed in the global, application root ("/") or per-path config, or a combination:

||Namespace||Global||Application Root||App Path||
||engine||X|| || ||
||hooks||X||X||X||
||log||X||X|| ||
||request||X||X||X||
||response||X||X||X||
||server||X|| || ||
||tools||X||X||X||

==== Custom config namespaces ====

You can define your own namespaces if you like, and they can do far more than simply set attributes. The {{{test/test_config}}} module, for example, contains a custom namespace that coerces incoming params and outgoing body content. The {{{_cpwsgi}}} module includes an additional, builtin namespace for invoking WSGI middleware.

In essence, a config namespace handler is just a function that gets passed any config entries in its namespace. You add it to a namespaces registry (a dict), where keys are namespace names and values are handler functions. When a config entry for your namespace is encountered, the corresponding handler function will be called, passing the config key and value; that is, {{{namespaces[namespace](k, v)}}}. For example, if you write:

{{{
def db_namespace(k, v):
    if k == 'connstring':
        orm.connect(v)
cherrypy.config.namespaces['db'] = db_namespace
}}}

...then {{{cherrypy.config.update({"db.connstring": "Oracle:host=1.10.100.200;sid=TEST"})}}} will call {{{db_namespace('connstring', 'Oracle:host=1.10.100.200;sid=TEST')}}}.

The point at which your namespace handler is called depends on where you add it:

||Namespace ||Handler is called in ||
||config.namespaces ||cherrypy.config.update||
||Application.namespaces ||Application.merge (which is called by cherrypy.tree.mount)||
||engine.request_class.namespaces||Request.configure (called for each request, after the handler is looked up)||

If you need additional code to run when all your namespace keys are collected, you can supply a callable context manager in place of a normal function for the handler.
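The registry mechanism can be sketched in plain Python as follows. This is an illustration under stated assumptions, not CherryPy's code: {{{NamespaceRegistry}}}, its {{{dispatch}}} method, {{{BatchHandler}}}, and the "cache" namespace are hypothetical. In particular, it is an assumption that a context-manager handler returns the per-entry callable from {{{__enter__}}} and has {{{__exit__}}} invoked once all keys in its namespace have been delivered.

{{{
class NamespaceRegistry(dict):
    """Hypothetical registry mapping namespace names to handlers."""

    def dispatch(self, config):
        # Group "namespace.key" entries by their namespace prefix.
        grouped = {}
        for full_key, value in config.items():
            namespace, sep, key = full_key.partition(".")
            if sep:
                grouped.setdefault(namespace, {})[key] = value

        for namespace, entries in grouped.items():
            handler = self.get(namespace)
            if handler is None:
                continue
            if hasattr(handler, "__enter__") and hasattr(handler, "__exit__"):
                # Assumed behaviour: __enter__ hands back the callable that
                # receives each (key, value); __exit__ runs when all are in.
                with handler as per_entry:
                    for key, value in entries.items():
                        per_entry(key, value)
            else:
                # Plain function handler: namespaces[namespace](k, v).
                for key, value in entries.items():
                    handler(key, value)


seen = []

def db_namespace(k, v):
    seen.append((k, v))


class BatchHandler(object):
    """Collects entries, then notes that the namespace is complete."""

    def __init__(self):
        self.entries = {}
        self.finished = False

    def __enter__(self):
        return self.entries.__setitem__

    def __exit__(self, exc_type, exc_value, traceback):
        self.finished = exc_type is None
        return False


namespaces = NamespaceRegistry()
namespaces["db"] = db_namespace
batch = BatchHandler()
namespaces["cache"] = batch
namespaces.dispatch({
    "db.connstring": "Oracle:host=1.10.100.200;sid=TEST",
    "cache.size": 100,
    "cache.ttl": 30,
})
}}}

After the call, {{{db_namespace}}} has received its single entry, while {{{batch}}} has seen both "cache" entries before its {{{__exit__}}} ran, which is the place for code that needs the full set of keys.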
Context managers are defined in [http://www.python.org/dev/peps/pep-0343/ PEP 343].

== Tools ==

=== Using builtin tools ===

Filters are gone! In their place are Tools, which allow for much more flexibility. If your favorite builtin filter has changed to a tool, it's easy to convert your code. See [wiki:UpgradeTo30 UpgradeTo30] for a complete list of name changes. Instead of this:

{{{
[/docroot]
static_filter.on: True
static_filter.root: "/path/to/app"
static_filter.dir: 'static'
}}}

...use the "tools" namespace like this:

{{{
[/docroot]
tools.staticdir.on: True
tools.staticdir.root: "/path/to/app"
tools.staticdir.dir: 'static'
}}}

We can also use our new friend {{{_cp_config}}} (see above):

{{{
class docroot(object):

    _cp_config = {'tools.staticdir.on': True,
                  'tools.staticdir.root': "/path/to/app",
                  'tools.staticdir.dir': 'static'}
}}}

But we can do even better by using the '''builtin decorator support''' that all Tools have:

{{{
class docroot(object):

    @tools.staticdir(root="/path/to/app", dir='static')
    def page(self):
        ...
}}}

...and in this case, we can do even '''better''' because tools.staticdir is a 'HandlerTool', and therefore can be used directly as a page handler:

{{{
class docroot(object):

    static = tools.staticdir.handler(section='static', root="/path/to/app", dir='static')
}}}

Finally, you can use (most) Tools directly, by calling the function they wrap. They expose this via the 'callable' attribute:

{{{
def page(self):
    tools.response_headers.callable([('Content-Language', 'fr')])
    return "Bonjour, le Monde!"
page.exposed = True
}}}

Because the underlying function is wrapped in a tool, you need to call help(tools.whatevertool.callable) if you want the docstring for it. Using help(tools.whatevertool) will give you help on how to use it as a Tool (for example, as a decorator).

Tools are also '''inspectable''' automatically.
They expose their own arguments as attributes:

{{{
>>> dir(cherrypy.tools.session_auth)
[..., 'anonymous', 'callable', 'check_username_and_password', 'do_check',
 'do_login', 'do_logout', 'handler', 'login_screen', 'on_check', 'on_login',
 'on_logout', 'run', 'session_key']
}}}

This makes IDE calltips especially useful, even when writing config files!

=== New and improved builtin tools ===

==== tools.proxy ====

This replaces and enhances the old baseurl_filter. The old way:

{{{
baseurl_filter.base_url = "http://myhost"
baseurl_filter.use_x_forwarded_host = False
}}}

The new way:

{{{
tools.proxy(base=None, local='X-Forwarded-Host', remote='X-Forwarded-For', scheme='X-Forwarded-Proto')
}}}

This changes the base URL (scheme://host[:port][/path]), and is most useful when running a CP server behind Apache or some other webserver.

{{{tools.proxy.local}}} defines the request header which will be used to auto-fill the new request.base. If you want the new request.base to include path info (not just the host), you must explicitly set base to the full base path, and ALSO set {{{tools.proxy.local}}} to "" (empty string), so that the X-Forwarded-Host request header (which never includes path info) does not override it.

New in CP 3: cherrypy.request.remote.ip (the IP address of the client) will be rewritten if the header specified by {{{tools.proxy.remote}}} is valid. By default, 'remote' is set to 'X-Forwarded-For'. If you do not want to rewrite remote.ip, set the 'remote' arg to an empty string.

==== tools.log_tracebacks ====

This replaces the CP 2 feature: "server.log_tracebacks".

==== tools.log_headers ====

This replaces the CP 2 feature: "server.log_request_headers".

==== tools.err_redirect ====

Turn this tool on to redirect all unhandled errors to a different page. Supply the new URL via {{{tools.err_redirect.url}}}. By default, this raises InternalRedirect. To use HTTPRedirect, set {{{tools.err_redirect.internal}}} to False.
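As an illustration, the err_redirect settings could be attached with {{{_cp_config}}}. This is only a sketch: the {{{Root}}} class and the "/error" URL are placeholders, and it assumes the tool is switched on with the same ".on" entry shown earlier for tools.staticdir.

{{{
class Root(object):

    # Send unhandled errors to a placeholder "/error" page.
    _cp_config = {
        "tools.err_redirect.on": True,
        "tools.err_redirect.url": "/error",
        # False selects HTTPRedirect instead of the default InternalRedirect.
        "tools.err_redirect.internal": False,
    }

    def index(self):
        return "Hello, world!"
    index.exposed = True
}}}

The same three entries could equally be written in a config file section for the path in question.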
==== tools.etags ====

This new tool validates the current ETag response header against If-Match and If-None-Match headers, and raises "304 Not Modified" or "412 Precondition Failed" as needed. If {{{tools.etags.autotags}}} is True, an ETag response-header value will be provided from an MD5 hash of the response body (unless some other code has already provided an ETag header). If False (the default), the ETag will not be automatic.

==== tools.expires ====

A tool for influencing cache mechanisms using the 'Expires' header. {{{tools.expires.secs}}} must be either an int or a datetime.timedelta, and indicates the number of seconds between response.time and when the response should expire. The 'Expires' header will be set to (response.time + secs). If secs is zero (the default), the following "cache prevention" headers are also set:

{{{
'Pragma': 'no-cache'
'Cache-Control': 'no-cache'
}}}

If {{{tools.expires.force}}} is False (the default), the following headers are checked: 'Etag', 'Last-Modified', 'Age', 'Expires'. If any are already present, none of the above response headers are set.

==== tools.basic_auth ====

A tool for doing basic authentication. It takes a "realm" setting (a string) and a "users" dict of {username: password} pairs (or a callable which returns that dict). If authentication fails, 401 Unauthorized is raised.

==== tools.digest_auth ====

A tool for doing Digest authentication (RFC 2617). It takes a "realm" setting (a string) and a "users" dict of {username: password} pairs (or a callable which returns that dict). If authentication fails, 401 Unauthorized is raised.

==== tools.trailing_slash ====

A tool that lets you control whether URLs with a missing or extra trailing slash should raise HTTPRedirect. It's on by default, with these settings:

{{{
tools.trailing_slash.on = True
tools.trailing_slash.missing = True
tools.trailing_slash.extra = False
}}}

That is, if a trailing slash is missing for an index handler, HTTPRedirect is raised.
But if a non-index handler has an extra slash, it's not redirected by default.

==== tools.accept ====

A tool for verifying that the client is willing to accept the Content-Type of the response.

{{{tools.accept.media}}}, if provided, should be the Content-Type value (as a string) or values (as a list or tuple of strings) which the current request can emit. The client's acceptable media ranges (as declared in the Accept request header) will be matched in order to these Content-Type values; the first such string is returned. That is, the return value will always be one of the strings provided in the 'media' arg (or None if 'media' is None).

The return value doesn't mean anything when used as a Tool, but you can call {{{tools.accept.callable(media)}}} directly to dispatch based on the client's preferred Content-Type:

{{{
def select(self):
    mtype = tools.accept.callable(['text/html', 'text/plain'])
    if mtype == 'text/html':
        return "<h2>Page Title</h2>"
    else:
        return "PAGE TITLE"
select.exposed = True
}}}

Regardless of whether you call it directly or just turn on the Tool, if no match is found, then HTTPError 406 (Not Acceptable) is raised. Note that most web browsers send */* as a (low-quality) acceptable media range, which should match any Content-Type. In addition, as RFC 2616 (section 14.1) puts it, "...if no Accept header field is present, then it is assumed that the client accepts all media types."

=== Custom tools ===

You can make your own tools and register them to gain all the benefits the builtin Tools enjoy. Usually, this is as simple as:

{{{
cherrypy.tools.my_tool = cherrypy.Tool('before_request_body', my_callback)
}}}

{{{cherrypy.tools}}} is an instance of {{{_cptools.Toolbox}}}. When you add your Tool to it, then config entries in the "tools.my_tool.*" namespace automatically get passed to your callback as keyword arguments. See {{{cherrypy._cptools}}} for more examples.

=== Custom toolboxes ===

If you're building a framework on top of !CherryPy, you might want to use your own toolbox to avoid conflicting with builtin tools. It's just a single line: {{{mytools = cherrypy._cptools.Toolbox("mytools")}}}. This one line creates a new Toolbox and automatically registers the "mytools" config namespace.

== Hooks ==

Tools use hooks under the covers. Each Hook has a "callback" attribute, and is registered at a "hook point" in a HookMap called {{{cherrypy.request.hooks}}}. As a request is processed, hooks are called at the following hook points: 'on_start_resource', 'before_request_body', 'before_handler', 'before_finalize', 'on_end_resource', 'on_end_request', 'before_error_response', and 'after_error_response'.

If you can't make a Tool, you can provide custom hooks in config by writing {{{hooks.<hook_point> = function}}}, and the function you provide will be called at that hook point.
If you want to do it in code (especially for a custom Tool, see above), use {{{cherrypy.request.hooks.attach(point, callback, failsafe=None, priority=None, **kwargs)}}}.

Some Hook objects are "failsafe", which means that they are guaranteed to run even if other Hooks in the same hook point raise exceptions (if more than one fails, they are all logged, but only the last exception is raised). You can either set {{{Hook.failsafe = True}}}, or provide it as {{{Hook(callback, failsafe=True)}}}. Additionally, you may be able to set {{{callback.failsafe = True}}}, in which case the Hook will automatically copy that value to itself.

Hook objects also have a "priority", in the closed interval [0, 100]. By default, Hook.priority is 50, but you can change it (as with failsafe, above). This is a necessary evil to make sure that, for example, the encoding Tool's hooks run before the gzip Tool's hooks (if they were reversed, the request would almost certainly fail, because the encoding Tool was designed to operate on text output, not binary).

== Dispatch ==

"Dispatch" refers to the way the framework looks up and calls application code. By default, CherryPy traverses a tree of objects to find a page handler that you've written. Then it calls that function, passing any virtual path segments as positional arguments and any request parameters (form or querystring values) as keyword arguments.

In CherryPy 2, this process was hard-coded into the core; to change it, you had to subclass the Request object. CherryPy 3 separates dispatch into a new "request.dispatch" object, which you can specify in config per-path. It must refer to a callable that 1) takes a {{{path_info}}} argument, and 2) sets {{{cherrypy.request.handler}}} (a callable that takes no arguments) and {{{cherrypy.request.config}}} (a flat dict containing all config entries that apply to the current request).

There are also a new !MethodDispatcher and a new !RoutesDispatcher in {{{cherrypy.dispatch}}}.
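The default behavior described at the start of this section (walk the object tree, then pass leftover path segments positionally and request parameters by keyword) can be sketched in plain Python. This is a deliberately simplified illustration, not CherryPy's dispatcher: {{{find_handler}}}, {{{dispatch}}}, {{{Blog}}}, and {{{Root}}} are hypothetical names, and the sketch ignores config collection and any special handler names.

{{{
def find_handler(root, path_info):
    """Walk attributes of root along path_info; return (handler, vpath)."""
    segments = [seg for seg in path_info.strip("/").split("/") if seg]
    node = root
    consumed = 0
    for seg in segments:
        child = getattr(node, seg, None)
        if child is None:
            break  # the remaining segments form the virtual path
        node = child
        consumed += 1
    if not callable(node) or not getattr(node, "exposed", False):
        raise LookupError("no exposed handler for %r" % path_info)
    return node, segments[consumed:]


def dispatch(root, path_info, params):
    handler, vpath = find_handler(root, path_info)
    # Virtual path segments become positional args; params become keywords.
    return handler(*vpath, **params)


class Blog(object):
    def entry(self, year, month, fmt="html"):
        return "%s-%s as %s" % (year, month, fmt)
    entry.exposed = True


class Root(object):
    blog = Blog()


result = dispatch(Root(), "/blog/entry/2006/12", {"fmt": "text"})
}}}

Here "/blog/entry" resolves to the {{{entry}}} method, "2006" and "12" arrive as positional arguments, and the {{{fmt}}} parameter arrives as a keyword argument.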
Feel free to try both dispatchers out.

== URL construction ==

There's a new {{{cherrypy.url(path)}}} function which can be used to construct portable URLs for your application. It calculates new paths relative to the current SCRIPT_NAME (if you pass a path which starts with "/") or relative to the current PATH_INFO (if you pass a path which ''doesn't'' start with "/").

== Autoreload ==

The autoreload feature has been completely reworked. In CherryPy 2.x, it would immediately start a second process (using {{{os.spawnve(os.P_WAIT, ...)}}}). This caused repeated confusion and complaints when applications would "mysteriously" run startup code twice. In CherryPy 3, the autoreload mechanism does nothing to the initial process; it simply replaces its own process when needed (using {{{os.execv}}}).

You can also now trigger this behavior yourself, outside of the autoreload file-checking logic, by calling {{{cherrypy.engine.reexec}}}. Finally, if your platform supports the HUP signal, then a SIGHUP will automatically call cherrypy.engine.reexec (whereas SIGTERM now shuts down CherryPy).

We've also borrowed an idea from Turbogears: {{{engine.autoreload_match}}} is a regular expression pattern (default .*) that you can change to filter which files are monitored.

== WSGI improvements ==

=== WSGI server ===

The builtin WSGI server is now HTTP/1.1 compliant! It correctly handles persistent connections, pipelining, Expect/100-continue, and the "chunked" transfer-coding (receive only). It also now emits a custom WSGI environ entry: ACTUAL_SERVER_PROTOCOL. Clients can calculate min(SERVER_PROTOCOL, ACTUAL_SERVER_PROTOCOL) in order to determine which level of HTTP features to support. CherryPy applications can see this min() value in {{{cherrypy.request.protocol}}}.

It also supports HTTPS/SSL! Just set server.ssl_certificate and server.ssl_private_key to the names of each file in your config.
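Returning to the ACTUAL_SERVER_PROTOCOL entry: the min() calculation can be sketched as below. The helper names {{{parse_protocol}}} and {{{effective_protocol}}} are hypothetical, and the sketch parses the versions into number pairs as one reasonable way to compare them; it does not claim to mirror how CherryPy computes {{{cherrypy.request.protocol}}}.

{{{
def parse_protocol(value):
    """Turn a string such as 'HTTP/1.1' into the comparable tuple (1, 1)."""
    _name, _, version = value.partition("/")
    major, _, minor = version.partition(".")
    return int(major), int(minor)


def effective_protocol(environ):
    """Return the lower of the request's and the server's HTTP versions."""
    requested = parse_protocol(environ["SERVER_PROTOCOL"])
    actual = parse_protocol(environ["ACTUAL_SERVER_PROTOCOL"])
    return "HTTP/%d.%d" % min(requested, actual)


old_client = effective_protocol(
    {"SERVER_PROTOCOL": "HTTP/1.0", "ACTUAL_SERVER_PROTOCOL": "HTTP/1.1"})
new_client = effective_protocol(
    {"SERVER_PROTOCOL": "HTTP/1.1", "ACTUAL_SERVER_PROTOCOL": "HTTP/1.1"})
}}}

An HTTP/1.0 client talking to an HTTP/1.1 server is therefore treated at the HTTP/1.0 feature level, while an HTTP/1.1 client gets the full HTTP/1.1 level.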
As always, the code in {{{wsgiserver.py}}} is usable anywhere, as it doesn't depend on CherryPy in any way. Feel free to use it with other WSGI stacks.

=== WSGI applications ===

cherrypy.Application objects are now WSGI applications, automatically. Whenever you call {{{cherrypy.tree.mount(Root())}}}, the "Root" object you pass is wrapped up in an Application object, and added to cherrypy.tree.apps.

One big difference between CherryPy Application objects and a lot of other WSGI applications is that CherryPy apps usually know their own SCRIPT_NAME before being called. If you cannot or don't want to set this in stone, set app.script_name to None, and the Application will provide it from the WSGI environ['SCRIPT_NAME'] on each request.

In addition, cherrypy.tree is also usable as a "WSGI application"; it acts as dispatching middleware to all mounted apps.

=== WSGI middleware ===

In addition to mounting cherrypy.Application objects onto cherrypy.tree, you can also mount plain ol' WSGI callables, using {{{cherrypy.tree.graft(wsgi_callable, script_name="")}}}. Then hand cherrypy.tree to your WSGI server, and it will happily dispatch to both CherryPy apps and foreign WSGI apps.

The profile module is now implemented as WSGI middleware, too. Use {{{cherrypy.lib.profiler.make_app(nextapp, path, aggregate=False)}}} to use it. If 'aggregate' is False, a separate profile dump will be made for each request. If True, all requests (for the same 'nextapp') will be aggregated together into a single results file.

Finally, there's a new "pipeline" helper in cherrypy.wsgi. The config entry {{{wsgi.pipeline = [(name, wsgiapp_factory), ...]}}} will pipe the request through the supplied wsgiapps before handing it off to the CherryPy application. See {{{help(cherrypy.wsgi.CPWSGIApp)}}} for details.
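The idea of piping a request through a list of (name, wsgiapp_factory) pairs can be pictured with a small, framework-free sketch. Everything here is hypothetical ({{{build_pipeline}}}, {{{tagging_middleware}}}, {{{hello_app}}}, and the "demo.trail" environ key), and it assumes each factory takes the next WSGI application and returns a wrapped one; check the CPWSGIApp help for the real calling convention.

{{{
def build_pipeline(app, pipeline):
    """Wrap app so that the first pipeline entry sees the request first."""
    for _name, factory in reversed(pipeline):
        app = factory(app)
    return app


def tagging_middleware(tag):
    """Return a factory whose middleware records that it saw the request."""
    def factory(nextapp):
        def middleware(environ, start_response):
            environ.setdefault("demo.trail", []).append(tag)
            return nextapp(environ, start_response)
        return middleware
    return factory


def hello_app(environ, start_response):
    """Innermost WSGI app: reports which middleware ran, in order."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    trail = ",".join(environ.get("demo.trail", []))
    return [trail.encode("ascii")]


wrapped = build_pipeline(hello_app, [
    ("first", tagging_middleware("first")),
    ("second", tagging_middleware("second")),
])

statuses = []
body = wrapped({}, lambda status, headers: statuses.append(status))
}}}

Wrapping in reverse order means the first entry in the list ends up outermost, so the request passes through "first", then "second", and finally reaches the application.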
If you want to do it in code instead of config, write:

{{{
app = cherrypy.Application(Root())
app.wsgiapp.pipeline.append((name, wsgiapp_factory))
cherrypy.tree.mount(app, config={'/': root_conf})
}}}

== Logging ==

CherryPy 3 now uses the standard library's {{{logging}}} module, which means you have access to its RotatingFileHandler(s), SocketHandler, SysLogHandler, NTEventLogHandler, SMTPHandler, and HTTPHandler (and other goodies).

In CherryPy 2, log config was specifiable per-path (since it used very simple handlers). Now, there are separate error and access logs for each mounted Application (named "cherrypy.error.%s" % id(app)), as well as global error and access logs (named "cherrypy.error" and "cherrypy.access"). This naming scheme means that messages sent to "cherrypy.access.723863" will automatically also be sent to the global "cherrypy.access" log.

== Code inspection ==

A lot of work has been done to make CherryPy 3 play nice with the interactive interpreter. If you don't know or can't recall how something works or even what features are available, start with help(cherrypy), and work your way through the available attributes. Two items in particular need mentioning:

 * The {{{cherrypy.request}}} and {{{response}}} objects are dummy objects, and exist only for your benefit when you write an application. The values of their attributes should be considered read-only, and are only intended to let you see default values easily.
 * Tools have two faces. They have their own answers to help() that tell you how to use them as tools; if you want to see the docstrings for the functions they wrap, try {{{help(tools.<toolname>.callable)}}} instead. They ''do'', however, copy the argument names of the wrapped function to themselves as attributes (all None), so you should be able to use {{{dir(tools.<toolname>)}}} with no problems.

== Redirection and Deadlock ==

The {{{cherrypy.request}}} object now has improved support for !InternalRedirect situations.
First, on redirect, it creates an entirely new Request object, and sets {{{Request.prev}}} to point to the previous Request object. It also inspects the list of seen URLs at each redirect, and, if the new path + querystring has already been visited during this request, raises an error. This stops infinite redirect loops. If for some reason you ''want'' to visit the same path twice in a single request, set {{{wsgi.iredir.recursive = True}}} in config.

You may also now raise !InternalRedirect at any time during the run of a Request. In the past, you could only do so during the "before_main" hook and inside page handlers.

Each response object also has a {{{time}}} attribute (set to time.time() when created), a {{{timeout}}} attribute (default 300 seconds), and a {{{timed_out}}} attribute, a bool. Assuming {{{cherrypy.engine.deadlock_poll_freq}}} is greater than 0, a monitor thread will check if {{{now > response.time + response.timeout}}}; if so, it sets response.timed_out. This is checked at various places in the core, and cherrypy.TimeoutError is raised if response.timed_out is True. Feel free to check it and raise TimeoutError in your own code's critical sections.

== Drop privileges ==

There is a new {{{engine.drop_privileges}}} function which may setuid/gid and/or set a new umask, or raise NotImplementedError, depending on your platform. If you're on UNIX, you'll probably see engine.uid, engine.gid (names or numbers), and engine.umask attributes which you can set (from config, if you want). If you're on Windows, you'll only see the umask attribute. Other platforms may see none of these. Whatever happens, it'll get logged so you know when it works, and it'll raise errors when it doesn't.

== Native support for mod_python ==

The popular "mpcp" module has been ... uh ... "embraced and extended" and is now included in the standard CherryPy 3 distribution as {{{cherrypy._cpmodpy}}}. Thanks to Jamie Turner for his ingenuity and generosity!
== Multiple HTTP server support ==

The new {{{cherrypy.server}}} object can now control more than one HTTP server. Add additional ones via {{{server.httpservers[myserver] = (host, port)}}}. This can be used to listen on multiple ports or protocols.