Download Install Tutorial Docs FAQ Tools WikiLicense Team IRC Planet Involvement Shop Book

Ticket #691 (defect)

Opened 1 year ago

Last modified 1 year ago

Repeated Ctrl-C hangs wsgiserver

Status: closed (fixed)

Reported by: jlowery Assigned to: fumanchu
Priority: normal Milestone: 3.0
Component: CherryPy code Keywords:
Cc:

Ever since I have gotten cherrypy3, I've had problems with the cherrypy process hanging in the shell after a CTRL-C is sent to process. My only resort would be to send a SIGTERM or SIGKILL.

I was looking around in the code and came up with this:

When CTRL-C is sent to to the cherrypy process, each connection thread with an active connection waits until until a request does not return a body or close_connection on the request is set to True (wsgiserver/__init__.py:611). The forever loop here implements HTTP/1.1 and Keep-Alive. with the close_connection attribute on the HTTPRequest.

Therefore, there is a noticable time lag between when a KeyboardInterrupt? is sent and all of the worker threads are able to process their SHUTDOWN request (because conn.communicate is blocking) wsgiserver/__init__.py:693.

While the loop is spinning on the Connection's communicate method, sending another CTRL-C to the process causes the thread to abort and the callstack bubble up out of the thread join. At this point, the only recourse is to manually SIGTERM.

It would appear that since the shutdown requests are put in a worker queue, there currently isn't any way to give the HTTPConnection a flag to say stop processing requests.

I suppose that the best thing to do would allow the KeyBoardInterrupts? to not lock the process during the shutdown process.

A fix I came up with is adding KeyboardInterrupt? to the thread join loop in wsgiserver/__init__.py near line 986

Go from

   while self._workerThreads:
            worker = self._workerThreads.pop()
            if worker is not current and worker.isAlive:
                try:
                    worker.join()
                except AssertionError:
                    pass

to

   while self._workerThreads:
            worker = self._workerThreads.pop()
            if worker is not current and worker.isAlive:
                try:
                    worker.join()
                except (AssertionError, KeyboardInterrupt):
                    pass

This at least allows the process to shutdown gracefully after multiple CTRL-C's.

To reproduce this situation (test system Ubuntu Feisty):

test file:

#broke.py
import cherrypy

class Root(object):
    @cherrypy.expose
    def index(self):
        return 'hi'

cherrypy.quickstart(Root())

$python broke.py

<open localhost:8080 in web browser to activate a connection> tap CTRL-C twice.

However, I wish that I could figure out how to interrupt the connection loop when a shutdown is occuring, since Keep-Alive could keep the server in the shutdown state for an arbitrarily long time while multiple requests from the same client.

Attachments

multictrlc.patch (2.3 kB) - added by fumanchu on 06/06/07 15:47:16.

Change History

06/06/07 13:47:49: Modified by fumanchu

  • status changed from new to assigned.

Excellent work! I've been aware of this for some time and just haven't had the time to really nail it down. Thanks for the patch!

Re: "interrupt the connection loop"...I think you're talking about the "while" loop inside HTTPConnection.communicate? That usually waits on rfile.readline() inside parse_request, and *that* is waiting on socket._fileobject._sock.recv(). I wonder what happens if we call rfile._sock.close() in the main thread while another thread is waiting on recv()?

06/06/07 13:52:17: Modified by jlowery

Well, I tested this out. .close() doesn't work, but shutdown() does. I'm still learning the threading internals of cherrypy, so my solution here may have thread-related side-effects. I also think this fix is kind of kludgy. Maybe you know a better way to fit it into the design.

All of these changes are in wsgiserver/__init__.py

Modify the WorkerThread? class thusly (minus my debug print lines of course):

class WorkerThread(threading.Thread):
    """Thread which continuously polls a Queue for Connection objects.

    server: the HTTP Server which spawned this thread, and which owns the
        Queue and is placing active connections into it.
    ready: a simple flag for the calling server to know when this thread
        has begun polling the Queue.

    Due to the timing issues of polling a Queue, a WorkerThread does not
    check its own 'ready' flag after it has started. To stop the thread,
    it is necessary to stick a _SHUTDOWNREQUEST object onto the Queue
    (one for each running WorkerThread).
    """
    conn = False

    def __init__(self, server):
        self.ready = False
        self.server = server
        threading.Thread.__init__(self)

    def run(self):
        try:
            self.ready = True
            while True:
                conn = self.conn = self.server.requests.get()
                #print 'handling request on thread %s with conn %s' % (self, conn)
                if conn is _SHUTDOWNREQUEST:
                    return

                try:
                    #print 'thread %s communicating' % self
                    conn.communicate()
                    #print 'thread %s end communicate' % self
                finally:
                    conn.close()
        except (KeyboardInterrupt, SystemExit), exc:
            self.server.interrupt = exc

In the CherrypyWSGIServer:stop method, change:

    # Must shut down threads here so the code that calls
    # this method can know when all threads are stopped.
    for worker in self._workerThreads:
        self.requests.put(_SHUTDOWNREQUEST)

to

    # Must shut down threads here so the code that calls
    # this method can know when all threads are stopped.
    for worker in self._workerThreads:
        self.requests.put(_SHUTDOWNREQUEST)
        if worker.conn and not worker.conn.rfile.closed:
            worker.conn.rfile._sock.shutdown(_socket.SHUT_RDWR)

also need to import _socket in the top imports.

One side effect of this is that it is possible for the socket calls under the communicate() method to throw a bad file descriptor exception. I was able to generate one, but it did not hang the process and everything shut down properly. (Just got an exception occured in WorkerThread? message).

In effect, this kills the worker thread dead in its tracks.

Hitting CTRL-C stops the process instantly, although this should probably be tested on win32 as winsock can be a PITA sometimes.

06/06/07 13:52:57: Modified by fumanchu

I can test on win32...I'll try to fold this into test_states so we can test on multiple platforms more easily, and get the regression benefits.

We should give the deployer a "server.shutdown_timeout" config entry--setting it to zero makes SIGTERM or Ctrl-C stop the process instantly, setting it to a positive float gives existing connections N seconds to complete before being forcibly shut down. That should allow a reasonable way to shutdown gracefully--if you need immediate forced shutdown, there's always SIGKILL.

06/06/07 13:53:31: Modified by jlowery

Makes sense. For the timeout mechanism. Do you think the best way would be to write a Shutdown thread that is launched in stop() that time.sleep() before killing all of the sockets?

06/06/07 13:56:09: Modified by fumanchu

It looks like we can just call worker.join(timeout=self.shutdown_timeout):

"When the timeout argument is present and not None, it should be a floating point number specifying a timeout for the operation in seconds (or fractions thereof). As join() always returns None, you must call isAlive() to decide whether a timeout happened."

    current = threading.currentThread()
    while self._workerThreads:
        worker = self._workerThreads.pop()
        if worker is not current and worker.isAlive:
            try:
                worker.join(self.shutdown_timeout)
                if worker.isAlive:
                    # We exhausted the timeout. Forcibly shut down the socket.
                    c = worker.conn
                    if c and not c.rfile.closed:
                        c.rfile._sock.shutdown(_socket.SHUT_RDWR)
            # except (AssertionError, KeyboardInterrupt):
            except AssertionError:
                pass

06/06/07 14:27:38: Modified by Jeremy Lowery

Ok. I gave your timeout to join solution a try and it works well. Suppose can just disregard the shutdown thread solution.

06/06/07 14:33:36: Modified by Jeremy Lowery

Just tested and on Ubuntu if you don't have the KeyboardInterrupt? except, the process still hangs on a non-zero shutdown_time with multiple CTRL-C's.

06/06/07 14:58:59: Modified by lawouach

Would that work?

current = threading.currentThread()
while self._workerThreads:
     self.requests.put(_SHUTDOWNREQUEST)
     worker = self._workerThreads.pop()
     if worker is not current and worker.isAlive:
        try:
            worker.join()
        except AssertionError:
            pass

Basically the idea would be to force the normal processing of shutting down of each worker thread.

As in: http://www.cherrypy.org/browser/trunk/cherrypy/wsgiserver/__init__.py#L687

06/06/07 15:11:56: Modified by fumanchu

The _SHUTDOWNREQUEST's are done first so that idle threads don't pick up new requests while other threads are being shut down. Moving those queue.put calls would make shutdown take a lot longer under heavy load.

06/06/07 15:16:57: Modified by fumanchu

> if you don't have the !KeyboardInterrupt except,
> the process still hangs on a non-zero shutdown_time
> with multiple CTRL-C's.

Right; we need both. Can you produce a unified diff of all the changes so far, and attach it to this ticket? I'm still working on the test case.

06/06/07 15:20:59: Modified by Jeremy Lowery

I believe that the _sock.shutdown call would be needed to force an abort on a workerThread that is currently handling a client.

A softer technique that could be used would be to modify the HTTPRequest and change close_connection to a property that always returns true when the server is in shutdown state (probably by setting a global flag or threading the state through the worker).

This way, the communicate loop will finish up the last HTTP request that it was served before it closes the connection.

The only exception here would be for long lived HTTP requests. I believe streaming is an example of where this wouldn't would.

06/06/07 15:33:08: Modified by Jeremy Lowery

Here is a svn diff from the head. I don't see anywhere in the interface for me to attach a file:

Index: cherrypy/wsgiserver/init.py =================================================================== --- cherrypy/wsgiserver/init.py (revision 1658) +++ cherrypy/wsgiserver/init.py (working copy) @@ -47,6 +47,7 @@

quoted_slash = re.compile("(?i)%2F") import rfc822 import socket +import _socket try:

import cStringIO as StringIO

except ImportError?: @@ -673,7 +674,8 @@

it is necessary to stick a _SHUTDOWNREQUEST object onto the Queue (one for each running WorkerThread?). """ - + conn = None + def init(self, server):

self.ready = False self.server = server @@ -683,7 +685,7 @@ try:

self.ready = True while True: - conn = self.server.requests.get() + conn = self.conn = self.server.requests.get()

if conn is _SHUTDOWNREQUEST:

return @@ -766,7 +768,7 @@

ssl_private_key = None def init(self, bind_addr, wsgi_app, numthreads=10, server_name=None, - max=-1, request_queue_size=5, timeout=10): + max=-1, request_queue_size=5, timeout=10, shutdown_timeout=3):

self.requests = Queue.Queue(max) if callable(wsgi_app): @@ -790,6 +792,7 @@ self._workerThreads = [] self.timeout = timeout + self.shutdown_timeout = shutdown_timeout

def start(self):

"""Run the server forever.""" @@ -968,12 +971,20 @@ # Don't join currentThread (when stop is called inside a request). current = threading.currentThread() + + while self._workerThreads:

worker = self._workerThreads.pop() if worker is not current and worker.isAlive:

try: - worker.join() - except AssertionError?: + worker.join(self.shutdown_timeout) + if worker.isAlive: + # We exhausted the timeout. + # Forcibly shut down the socket. + c = worker.conn + if c and not c.rfile.closed: + c.rfile._sock.shutdown(_socket.SHUT_RDWR) + except (AssertionError?, KeyboardInterrupt?):

pass

def populate_ssl_environ(self):

06/06/07 15:34:41: Modified by Jeremy Lowery

Ok, that didn't work well. get it here.

http://koarcg.com/~jlowery/wsgiserver_head.diff

06/06/07 15:46:21: Modified by Jeremy Lowery

Here is a diff to allow configuration of the parameter:

http://koarcg.com/~jlowery/config.diff

06/06/07 15:47:16: Modified by fumanchu

  • attachment multictrlc.patch added.

06/06/07 17:33:16: Modified by fumanchu

  • status changed from assigned to closed.
  • resolution set to fixed.

Fixed in [1660]. I had to add some special logic for SSL. I also added a join() even after the join(timeout), because the spec says stop MUST block until all threads are shut down. I also allowed a shutdown_timeout of None to mean "wait forever; don't shutdown sockets".

Hosted by WebFaction

Log in as guest/cpguest to create tickets