We at engageSPARK are using uWSGI to run our Django app in production. uWSGI’s awesomeness includes extensive docs and excellent community tutorials such as this one by DigitalOcean, making it easy to get things running quickly.

Now have a look at the configuration options. (Hint: Look at the size of the scroll bar.) If you just heard a thud: yes, that was your chin hitting the floor. uWSGI is truly a cornucopia of options. So while it’s possible to get things running quickly, it’s not obvious how to make things run well, if only because you can’t find the right option.

In this article we share some pointers so your Python app runs robustly and so you can more easily monitor and debug it. The GitHub repository for this blog article provides a simple web app and some scripts to easily test the features presented here.

Monitoring workers with the stats server

Using the stats server, you can view statistics about the master process, such as the size of the listen queue, its PID, and its locks, as well as statistics about the individual workers, such as harakiri count or average response time.

To enable the stats server, just provide the address and port to listen on:

    --stats=":8001"

The stats server exposes its data using HTTP and JSON, so a simple “curl” will do to query the data.
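
If you would rather consume the stats from Python than from curl, the following is a minimal sketch. It assumes the stats server listens on 127.0.0.1:8001 as configured above. The helper names (`parse_stats_payload`, `fetch_stats`, `summarize`) are made up for this example, and the JSON keys it reads (`listen_queue`, `workers`, `avg_rt`, `harakiri_count`) are the ones we would expect in the output, so it uses `.get()` in case your version differs. Depending on how the stats server is configured, the payload may be bare JSON or JSON preceded by HTTP headers, so the parser accepts both.

```python
import json
import socket


def parse_stats_payload(raw: bytes) -> dict:
    """Decode a stats payload, with or without leading HTTP headers."""
    text = raw.decode("utf-8", errors="replace")
    if text.startswith("HTTP/"):
        # Drop the status line and headers; the JSON body follows the blank line.
        _, _, text = text.partition("\r\n\r\n")
    return json.loads(text)


def fetch_stats(host: str = "127.0.0.1", port: int = 8001, timeout: float = 2.0) -> dict:
    """Connect to the stats server and read until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_stats_payload(b"".join(chunks))


def summarize(stats: dict) -> list:
    """Return one line for the listen queue plus one line per worker."""
    lines = ["listen queue: {}".format(stats.get("listen_queue"))]
    for worker in stats.get("workers", []):
        lines.append(
            "worker {id}: status={status} requests={requests} "
            "avg_rt={avg_rt} harakiri_count={harakiri}".format(
                id=worker.get("id"),
                status=worker.get("status"),
                requests=worker.get("requests"),
                avg_rt=worker.get("avg_rt"),
                harakiri=worker.get("harakiri_count"),
            )
        )
    return lines
```

With uWSGI running, `print("\n".join(summarize(fetch_stats())))` would print one line per worker.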

To get an overview quickly, you can use uwsgitop, a simple “top”-like command-line interface.

What your workers are doing: Python tracebacker

Ever wondered what your uWSGI workers are doing when you’re not watching? Well, me neither. But when you are watching, the tracebacker module allows you to look at their current stack trace!
Enable it using the following option, and give it a file path prefix:

    --py-tracebacker /tmp/pytracebacker.socket

For each worker, a socket file will be created. (Don’t worry, they’ll be gone when uWSGI quits.) If you have three workers, the following three files will be created: /tmp/pytracebacker.socket{1,2,3}.

Introspect the Python traceback using a helper in uWSGI:

    uwsgi --connect-and-read /tmp/pytracebacker.socket1
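
To collect the tracebacks of all workers in one go, the same read can be done from Python. This is a minimal sketch, assuming the socket prefix from above and three workers. The function names are made up for this example, and it simply connects to each Unix socket and reads until the other side closes it.

```python
import socket


def tracebacker_paths(prefix: str, workers: int) -> list:
    """Socket paths for workers 1..N: the prefix followed by the worker number."""
    return ["{}{}".format(prefix, n) for n in range(1, workers + 1)]


def read_all(sock: socket.socket) -> str:
    """Read from a connected socket until the peer closes it."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def read_traceback(path: str, timeout: float = 5.0) -> str:
    """Connect to one tracebacker socket and return whatever it sends."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        return read_all(sock)
```

Looping over `tracebacker_paths("/tmp/pytracebacker.socket", 3)` and printing `read_traceback(path)` for each entry would then dump one traceback per worker.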

Remote debugger

Sometimes you know where your workers are stuck, but you don’t know why, so you need to debug more. Littering your code with logging statements is not always the best option; a “pdb” shell can be worth its weight in gold. (Or ipdb for that matter.)

    import ipdb; ipdb.set_trace()

Unfortunately, adding such a debug statement to a code path means that all workers executing that code will enter the interactive debugger and freeze. Also, it’s generally not advisable to put that kind of statement into production code.

Wouldn’t it be great if you could inspect just the worker process that is currently blocking? And maybe even only on demand?
Well, that’s what remote_pdb is there for!

remote_pdb is a debugger implementation that exposes the PDB shell via a TCP socket. The server is part of your uWSGI worker process and listens on a given port. You connect to that port, for example using telnet, and then you have a normal PDB shell that lets you inspect the running Python process.

The last question is: How do you trigger the remote debugger on demand, so that it starts the server, allowing you to inspect the running worker? There are a couple of options:

First, you can use the global exception handler. Another option is to trigger the debugger using uWSGI signals. Be aware that signal handlers are subject to the same harakiri timeout as HTTP handlers. The GitHub repository mentioned above demonstrates this mechanism; see its README for details.

Both these approaches will launch the debugger in a handler thread, meaning it will not be part of the problematic stack.
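
To make the signal-based variant more concrete, here is a rough sketch. It is not the code from the repository, just one way the pieces could fit together. The signal number, the base port, and all function names are assumptions chosen for illustration; `uwsgi.register_signal`, `uwsgi.worker_id`, and `RemotePdb(host, port).set_trace()` are the only library calls used. The `uwsgi` module can only be imported inside a uWSGI process, so the import is guarded.

```python
try:
    import uwsgi  # only importable inside a uWSGI process
except ImportError:
    uwsgi = None

DEBUG_SIGNAL = 17   # assumption: any uWSGI signal number you are not using yet
BASE_PORT = 4444    # assumption: worker N listens on BASE_PORT + N


def debugger_port(worker_id: int, base: int = BASE_PORT) -> int:
    """Give each worker its own port so two debuggers never share one."""
    return base + worker_id


def open_remote_pdb(signum):
    """uWSGI signal handler: open a PDB shell on a TCP port and wait."""
    from remote_pdb import RemotePdb  # third-party package "remote-pdb"

    port = debugger_port(uwsgi.worker_id())
    # Blocks until a client connects, e.g. with telnet to 127.0.0.1 and this port.
    RemotePdb("127.0.0.1", port).set_trace()


def install_debugger_signal(target: str = "worker") -> bool:
    """Register the handler; return False when not running under uWSGI.

    "worker" delivers the signal to the first available worker; a specific
    worker can be addressed as "worker1", "worker2", and so on.
    """
    if uwsgi is None:
        return False
    uwsgi.register_signal(DEBUG_SIGNAL, target, open_remote_pdb)
    return True
```

You would call `install_debugger_signal()` once when your WSGI module is loaded, and later raise the signal with `uwsgi.signal(DEBUG_SIGNAL)` from code running inside the same uWSGI instance. Keep the harakiri timeout in mind while the debugger is waiting for you.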

Know when to give up, with Harakiri

Know the feeling when you’re stuck, and you just can’t find the solution to your problem? Great! Then you know how your uWSGI workers feel when they’re blocked and just don’t return.

If all your workers end up being stuck, your web app will not answer requests anymore—it’s down. To mitigate this, you can order your workers to give up and restart after a defined number of seconds.

For example, you may think that 65 seconds ought to be more than enough to answer any request in your web app. Here’s how you configure uWSGI to restart a worker if it spends more than that on a single request:

    --harakiri=65

Note that the request will then be terminated by resetting the connection. This is not the most user-friendly behavior, so it really should be a measure of last resort. 🙂
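
To see harakiri in action without waiting for a real bug, a deliberately slow endpoint is enough. The following is a minimal sketch of such a WSGI app; the `seconds` query parameter is an assumption of this example and not part of uWSGI.

```python
import time
from urllib.parse import parse_qs


def application(environ, start_response):
    """Sleep for ?seconds=N (default 0), then answer with a short text body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        seconds = max(0.0, float(query.get("seconds", ["0"])[0]))
    except ValueError:
        seconds = 0.0
    time.sleep(seconds)  # stands in for a request that blocks
    body = "slept {:.1f}s\n".format(seconds).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Served by uWSGI with `--harakiri=65`, a request with `?seconds=70` should never get to send its “slept” response, because the worker is restarted first.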

The uwsgi module

Finally, there is the uwsgi module. At the cost of adding an explicit dependency on uwsgi to your production code, it allows you to read statistics about uWSGI, reload its workers, register signal handlers and cron jobs, and much more. One nice feature is the ability to access memory shared by all workers.

To access this language-agnostic API in a more Pythonic way, see the uwsgidecorators module.
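
As a small taste of both modules, here is a sketch that logs the per-worker request counts once a minute. The function names and the 60-second interval are assumptions for illustration, and we read the `id` and `requests` keys with `.get()` in case the dictionaries returned by `uwsgi.workers()` look different in your version. The imports are guarded because both modules are only available inside a uWSGI process.

```python
try:
    import uwsgi  # only importable inside a uWSGI process
    from uwsgidecorators import timer
except ImportError:
    uwsgi = None
    timer = None


def worker_summary() -> list:
    """Return (worker id, request count) pairs, or [] outside uWSGI."""
    if uwsgi is None:
        return []
    return [(w.get("id"), w.get("requests")) for w in uwsgi.workers()]


def log_summary(signum):
    """Timer handler: write the current per-worker numbers to the uWSGI log."""
    uwsgi.log("worker summary: {}".format(worker_summary()))


if timer is not None:
    # Equivalent to decorating log_summary with @timer(60): run it every 60 seconds.
    log_summary = timer(60)(log_summary)
```

Outside of uWSGI the module still imports cleanly and `worker_summary()` just returns an empty list, which keeps unit tests and local scripts working.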

And by the way, this is an amazing method name. If a method name starts with “sorry”, you know things are desperate. 😀

Hope this helps. You can quickly test the features mentioned here using the GitHub repository for this blog article.