signalfd

This article covers signalfd, a system call only available on Linux. If anyone knows of an equivalent for OS X or the BSDs, please let me know. It’d be great to create a compatibility layer.

Writing asynchronous IO code is fun; handling signals is not. signalfd allows you to move your signal handling code into your main event loop instead of hooking up global handlers and using the featureless set_wakeup_fd function to break the main loop.
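For comparison, here is a rough sketch of the set_wakeup_fd approach using only the Python 3 standard library (the sample program further down is Python 2 and uses the third-party wrapper). The function name wait_for_signal_via_wakeup_fd is made up for this illustration. Note how little comes out the other end: one byte per signal, holding just the signal number.

```python
import os
import select
import signal


def wait_for_signal_via_wakeup_fd(signum=signal.SIGUSR1, timeout=5.0):
    """Send ourselves `signum` and return the signal numbers read from the wakeup fd."""
    read_fd, write_fd = os.pipe()
    # set_wakeup_fd() insists on a non-blocking descriptor.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)

    # A Python-level handler has to be installed, otherwise the default
    # action for the signal (usually terminating the process) runs instead.
    old_handler = signal.signal(signum, lambda signo, frame: None)
    old_fd = signal.set_wakeup_fd(write_fd)
    try:
        os.kill(os.getpid(), signum)
        # The "event loop": wake up when the signal byte arrives on the pipe.
        readable, _, _ = select.select([read_fd], [], [], timeout)
        data = os.read(read_fd, 64) if readable else b""
    finally:
        signal.set_wakeup_fd(old_fd)
        signal.signal(signum, old_handler)
        os.close(read_fd)
        os.close(write_fd)
    # Each byte is a signal number; no sender pid, uid, or anything else.
    return list(data)
```

Both signal.signal and signal.set_wakeup_fd have to be called from the main thread, which is part of what makes this approach feel like a global hook rather than just another file descriptor in the loop.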

Luckily Jean-Paul Calderone had already created a great Python wrapper for the signalfd and sigprocmask system calls. Unfortunately it doesn’t include a way to parse the siginfo_t structure which contains all the useful information about the signal you’re handling.

I’ve added a helper to do just that in a branch:

https://code.launchpad.net/~schmichael/python-signalfd/helpers

A sample program would look like:

 
import os
import select
import signal
 
import signalfd
 
 
def sigfdtest():
    # Catch them all!
    sigs = []
    for attr in dir(signal):
        if attr.startswith('SIG') and not attr.startswith('SIG_'):
            sigs.append(getattr(signal, attr))
 
    sfd = signalfd.create_signalfd(sigs)
    print 'Capturing: %r' % sorted(sigs)
 
    while 1:
        print 'selecting - pid: %d' % os.getpid()
        r = select.select([sfd], [], [])[0]
        for s in r:
            assert s is sfd, 'Python nicely re-uses the fd instance'
            sig = signalfd.read_signalfd(sfd)
            print sig
 
 
if __name__ == '__main__':
    sigfdtest()

When run you can throw some signals at it:

Capturing: [6, 14, 7, 17, 17, 18, 8, 1, 4, 2, 29, 6, 9, 13, 29, 27, 30, 3, 64, 34, 11, 19, 31, 15, 5, 20, 21, 22, 23, 10, 12, 26, 28, 24, 25]
selecting - pid: 6523
^CSIGINT
selecting - pid: 6523
^CSIGINT
selecting - pid: 6523
SIGHUP
selecting - pid: 6523
Killed

Of course you’ll need to use an uncatchable signal like KILL to exit.

Jean-Paul attempted to get signalfd included in Python 2.7’s signal module, and it was slated for inclusion in 3.2. However, given that 3.2 was just released without it, I’m guessing the attempt to get this functionality into Python’s stdlib has been forgotten.

Up next: eventfd perhaps?


schmongodb slides from Update Portland

A few months ago someone in #pdxwebdev on Freenode asked an innocent MongoDB question. In response I ranted seemingly endlessly about our experience with MongoDB at Urban Airship. After a few moments somebody (perhaps sarcastically? who can know on IRC) suggested I give a talk on my experiences with MongoDB. That led me to realize despite Portland’s amazing meetup culture there were no tech-meetups that focused on either:

  1. Narrative talks based on experiences in production (not how-tos)
  2. Database-agnostic groups focused on backend systems (not just a NoSQL meetup)

So I started one: Update Portland.

And I gave my promised MongoDB talk: schmongodb.

And 10gen sent swag! (Thanks to Meghan! It was a big hit.)

And my brilliant coworker Erik Onnen gave a short talk on how he’s beginning to use Kafka at Urban Airship. (Expect a long form talk on that in the future!)

Thanks to everyone who showed up. I had a great time and have high hopes for the upcoming meetings. (Sign up for the mailing list!)

The slides may come across as overly negative. After all Urban Airship is actively moving away from MongoDB for our largest and busiest pieces of data. So I want to make 2 things very clear:

  1. I like MongoDB and would like to use it again in the future. There’s a lot I don’t like about it, but I can’t think of any “perfect” piece of software.
  2. The IO situation in EC2, particularly EBS’s poor performance (RAIDing really doesn’t help) made life with MongoDB miserable. This story may have been very different if we were running MongoDB on bare metal with fast disks.

Mike Herrick, the VP of Engineering at Urban Airship, put me on the spot at the end of my talk by asking me: “Knowing what you know now, what would you have done differently?”

I didn’t have a good answer, and I still don’t. Despite all of the misadventures, MongoDB wasn’t the wrong choice. Scaling systems is just hard, and if you want something to work under load, you’re going to have to learn all of its ins and outs. We initially started moving to Cassandra, and while it has tons of wonderful attributes, we’re running into plenty of problems with it as well.

So I think the answer is knowing then what I know now. In other words: Do your homework. That way we could have avoided these shortcomings and perhaps still be happy with MongoDB today. Hopefully these slides will help others in how they plan to use MongoDB so they can use it properly and happily.

Note: I added lots of comments to the speaker notes, so you’ll probably want to view those while looking at the slides.


Deploying Python behind Nginx Talk Slides

I gave a talk on deploying Python WSGI apps behind nginx at the Portland Python User Group meeting on January 11th and finally got around to publishing the slides: schmingx.

I should mention Jason Kirtland informed me after the meeting that FastCGI supports persistent connections (and a host of other features) between a load balancer and backend app servers.


A Complete Noob’s Guide to Hacking Nginx

At Urban Airship our RESTful HTTP API uses PUT requests for, among other things, registering a device. Since the application registering the device is the HTTP Basic Auth username, there’s often no body (entity body in HTTP parlance). Unfortunately nginx (as of 0.8.54, and I believe 0.9.3) doesn’t support PUT requests without a Content-Length header and responds with a 411 Length Required response. While the chunkin module adds Transfer-Encoding: chunked support, it doesn’t fix the empty PUT problem, since HTTP requests without bodies require neither a Content-Length nor a Transfer-Encoding header.
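To make the failing request concrete, here is a small Python 3 sketch that builds a body-less PUT by hand. The helper names build_empty_put and send_raw are made up for this example, and the host and path you pass in are whatever your own server uses. Since nginx only objects when the Content-Length header is missing, a client you control can presumably side-step the 411 by sending an explicit Content-Length: 0, though that is no help when you don't control the clients.

```python
import socket


def build_empty_put(host, path, explicit_length=False):
    """Return the raw bytes of an HTTP/1.1 PUT request with no entity body."""
    lines = ["PUT %s HTTP/1.1" % path, "Host: %s" % host]
    if explicit_length:
        # Workaround: state the (zero) length explicitly.
        lines.append("Content-Length: 0")
    # A blank line ends the header block; nothing follows it.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(host, port, request, timeout=5.0):
    """Send raw request bytes and return the response's status line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        return sock.recv(4096).split(b"\r\n", 1)[0]
```

Calling send_raw with the output of build_empty_put(host, path) against an unpatched nginx should reproduce the 411; I have not run it against every version mentioned here.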

So let’s hack nginx, shall we?

I know a bit of C but am primarily a Python developer, so hacking an established C project doesn’t come easily to me. To make matters worse, as far as I can tell there’s no official public source repository (but there’s a mirror) and it seems to be mainly developed by the creator, Igor Sysoev. At least the code looks clean.

First Pass

I had nginx-0.8.54.tar.gz handy from compiling the source and nginx was nice enough to log an error for PUTs without Content-Length:

client sent PUT method without "Content-Length" header while reading client request headers, client: …, server: , request: "PUT / HTTP/1.1" …

So let’s use ack to find it:

ack-grep 'client sent .* method without'

A quick vim +1532 src/http/ngx_http_request.c later and we’re looking at the problem:

    if (r->method & NGX_HTTP_PUT && r->headers_in.content_length_n == -1) {
        ngx_log_error(NGX_LOG_INFO, r->connection->log, 0,
                  "client sent %V method without \"Content-Length\" header",
                  &r->method_name);
        ngx_http_finalize_request(r, NGX_HTTP_LENGTH_REQUIRED);
        return NGX_ERROR;
    }

This code returns the 411 Length Required response for PUTs lacking a Content-Length header. Remove it, recompile (make -j2 && sudo make install && sudo service nginx restart), and test:

$ curl -vk -X PUT -u '...:...' http://web-0/api/device_tokens/FE66489F304DC75B8D6E8200DFF8A456E8DAEACEC428B427E9518741C92C6660
* About to connect() to web-0 port 80 (#0)
* Trying 10.... connected
* Connected to web-0 (10....) port 80 (#0)
* Server auth using Basic with user '...'
> PUT /api/device_tokens/FE66489F304DC75B8D6E8200DFF8A456E8DAEACEC428B427E9518741C92C6660 HTTP/1.1
> Authorization: Basic ...
> User-Agent: curl/7.19.5 (i486-pc-linux-gnu) libcurl/7.19.5 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.15
> Host: web-0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/0.8.54
< Date: Tue, 28 Dec 2010 23:06:54 GMT
< Content-Type: text/plain
< Transfer-Encoding: chunked
< Connection: keep-alive
< Vary: Authorization,Cookie,Accept-Encoding
<
* Connection #0 to host web-0 left intact
* Closing connection #0

Success! Now create a patch (diff -ru nginx-0.8.54 nginx > fix-empty-put.patch) and post it to the nginx-devel mailing list.

Now to play Minecraft for 12 hours as you wait for the Russian developers to wake up and take notice of your patch. Possibly sleep.

Second Pass: Fixing WebDAV

A positive reply from Maxim Dounin to my patch! I don't use WebDAV, but if I want this patch accepted I'd better make sure it doesn't break official modules.

This time around I wanted to work locally, so I installed nginx with the following configuration:


./configure --prefix=/home/schmichael/local/nginx --with-debug --user=schmichael --group=schmichael --with-http_ssl_module --with-http_stub_status_module --with-http_gzip_static_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-http_dav_module

Note that I set the prefix to a path in my home directory, turned on debugging and the dav module, and set nginx to run as my user and group. A quick symlink from /home/schmichael/local/nginx/sbin/nginx to ~/bin/nginx, and I can start and restart nginx quickly and easily. More importantly I can attach a debugger to it.

The importance of being able to attach a debugger became clear as soon as I tested dav support (with their standard config):

$ curl -v -X PUT http://localhost:8888/foo
* About to connect() to localhost port 8888 (#0)
* Trying ::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8888 (#0)
> PUT /foo HTTP/1.1
> User-Agent: curl/7.21.0 (x86_64-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18
> Host: localhost:8888
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
* Closing connection #0

My patch was causing a segfault in the dav module that killed nginx's worker process. Bumping up my error logging to debug level didn't give me many clues:

2010/12/28 10:32:55 [debug] 15548#0: *1 http put filename: "/home/schmichael/local/nginx/dav/foo"
2010/12/28 10:32:55 [notice] 15547#0: signal 17 (SIGCHLD) received
2010/12/28 10:32:55 [alert] 15547#0: worker process 15548 exited on signal 11

Time to break out the debugger! While I've used gdb --pid to attach to running processes before, I'd just installed Eclipse to work on some Java code and wondered if it might make debugging a bit easier.

After installing plugins for C/C++, Autotools, and GDB, I could easily import nginx by creating a "New Makefile project with existing code":
[Screenshot: Import existing code]

Now create a new Debug Configuration:
[Screenshot: Debug Configuration]

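Note on Linux systems (at least Ubuntu): by default the kernel only lets an unprivileged process ptrace its own children, so attaching a debugger to an already running process fails. Just flip the 1 to 0 in /etc/sysctl.d/10-ptrace.conf (the kernel.yama.ptrace_scope setting) and run sudo sysctl -p /etc/sysctl.d/10-ptrace.conf to allow attaching.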
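If you want to check the current setting before editing anything, a quick Python 3 sketch along these lines should do. It assumes the restriction comes from the Yama security module, which exposes its value at /proc/sys/kernel/yama/ptrace_scope; the function names are invented for this example.

```python
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the Yama ptrace_scope value, or None if the file is absent."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        # No Yama on this kernel, so no extra restriction from it.
        return None


def attach_to_running_process_allowed(scope):
    """True if an unprivileged user may attach to their own non-child processes."""
    # 0 is the classic behaviour; 1 and above restrict attaching.
    return scope is None or scope == 0
```

A value of 1 from read_ptrace_scope() means the debugger attach described here will be refused until the sysctl is changed (or the debugger runs as root).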

Finally click "Debug" and select the nginx worker process from your process list:
[Screenshot: Select Process]

By default GDB will pause the process it attaches to, so make sure to click the Resume button (or press F8) to allow nginx to continue serving requests.

Crashing nginx
Now cause the segfault by running our curl command curl -v -X PUT http://localhost:8888/foo. This time curl won't return because gdb/Eclipse caught the segfault in the nginx child process, leaving the socket to curl open. A quick peek in Eclipse shows us exactly where the segfault occurs:
[Screenshot: Debugging in Eclipse]

Eclipse makes it quick and easy to interactively inspect the variables. Doing that I discovered the culprit was the src variable being uninitialized. Bouncing up the stack once, you can see that dav's put handler expects to be given a temporary file (&r->request_body->temp_file->file.name) full of PUT data (of which we sent none), which it copies to the destination file (path).

Bounce up the stack again to ngx_http_read_client_request_body and you can see this relevant code:

if (r->headers_in.content_length_n < 0) {

nginx's core HTTP module short circuits a bit when there's no Content-Length specified. It skips the temp file creation because there's no data to put into the temp file!

So we have our problem:

  1. The dav module put handler expects a temp file containing the data to be saved.
  2. The http module doesn't create a temp file when there's no body data.

The 2 solutions I can think of are:

  1. Always create a temp file, even if it's empty.
  2. Add a special case to the dav module's put handler for when the temp file doesn't exist.

I really don't want to hack the core http module just to make a sub-module happy. It makes sense that no temporary file exists when there's no body data. Sub-modules shouldn't be lazy and expect it to exist. So I decided to try #2.

The Fix

You can see my implementation of solution #2 on GitHub. Simply put, if the temp file exists, follow the existing logic. If the temp file does not exist we have a PUT with an empty body: use nginx's open wrapper to do a create or truncate (O_CREAT|O_TRUNC) on the destination file (since an empty PUT should create an empty file).
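The logic is easier to see outside of C. Below is a Python 3 model of the same decision, not the actual nginx patch: store_put_body is a hypothetical function, temp_path being None stands in for "no temp file was created", and the 0o644 mode is an arbitrary choice for the sketch.

```python
import os
import shutil


def store_put_body(temp_path, dest_path):
    """Model of the dav PUT handler's choice between the two cases."""
    if temp_path is not None:
        # Existing logic: the body was spooled to a temp file,
        # so move that file into place.
        shutil.move(temp_path, dest_path)
        return
    # Empty PUT: there is no temp file, so create the destination
    # (or truncate it if it already exists) to end up with an empty file.
    fd = os.open(dest_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    os.close(fd)
```

The point of O_CREAT|O_TRUNC is that an empty PUT to an existing resource replaces its contents with nothing, which matches what a PUT with a zero-length body would do.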

I don't know if this is the best solution or even a correct one, but it appears to work and was a fun journey arriving at it. You can follow the discussion on the mailing list.

Updated to switch from bitbucket to github.


Less Pagination, More More

We live in a brave new (to some) world of databases other than a relational database with a SQL interface. Normally end users never notice a difference, but the astute viewer may notice the slow demise of an old friend: pagination.

Traditionally with SQL databases pagination has looked something like this:

There are previous and next links as well as links for jumping right to the beginning and end. Pretty boring stuff.

What’s interesting is that this standard interface is disappearing in favor of something like this:

Twitter

Facebook

And soon beta testers of Urban Airship’s push service for Android will see a More link on the page that lists devices associated with their app:

The simplest possible explanation for this dumbing down of pagination is that count (for total pages) and skip/offset are expensive operations.

Not only are those operations expensive, but in eventually consistent databases, which many modern non-relational databases are, they’re extremely expensive, if not impossible, to perform.

Cassandra

At Urban Airship we, like Facebook, use Cassandra: a distributed column-based database. This deals two deadly blows to traditional pagination:

  1. No way to count columns in a row (without reading every column).
  2. No way to skip by numeric offset (so you can’t say, skip to page 5).

In Cassandra columns are ordered, so you start reading from the beginning and read N+1 columns where N is the number of items you’d like to display. The last column’s key is then used to determine whether the More link is enabled, and if so, what key to start the next “page” at.
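Here is a minimal Python sketch of that N+1 trick. A sorted in-memory list stands in for an ordered Cassandra row, so this shows the paging logic only, not any Cassandra client API; read_page and its parameters are names invented for the example.

```python
import bisect


def read_page(sorted_keys, start_key=None, page_size=10):
    """Return (items, next_start_key) for one "page" of an ordered key list.

    next_start_key is None when there is nothing more to show, i.e. the
    More link should be disabled.
    """
    # Start at the beginning, or at the key the previous page handed back.
    begin = 0 if start_key is None else bisect.bisect_left(sorted_keys, start_key)
    # Read N+1 entries: N to display plus one to learn whether more exist.
    chunk = sorted_keys[begin:begin + page_size + 1]
    items = chunk[:page_size]
    next_start_key = chunk[page_size] if len(chunk) > page_size else None
    return items, next_start_key
```

The More link simply carries next_start_key back to the server. No count and no numeric offset are needed, which is exactly why this fits a store that can only scan forward from a key.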

Both of those are solvable problems if you really need them, but I would suspect you would end up creating a column count cache as well as some sort of table of contents for the various page offsets. Not what I want to spend my time implementing.

The fact of the matter is that for many use cases, a simple More button works just as well as (if not better than) traditional pagination. It’s also far cheaper to implement, which means more developer time free to work on features and more hardware resources available to push your 140 character insights around the web.

MongoDB

I should note that MongoDB is fairly unique in the non-relational database world as its dynamic querying features include count and skip operations. However, as with any database, you’ll want to make sure these queries hit indexes.

Sadly MongoDB currently doesn’t have the distributed features necessary to automatically handle data too big for a single server.


New Job, New Blog

The title is a bit misleading, but I haven’t updated my blog in far too long.

In April I started working for Urban Airship, and I’ve been meaning to upgrade my blog and move it to a shorter URL for some time. You should be reading this on schmichael.com instead of on the old site at michael.susens-schurter.com. I think I set up the right .htaccess magic to make all of the old links properly redirect to the new domain. Sorry if I broke anything.

For anyone confused by the personal rebranding from “Michael Susens-Schurter” to “schmichael”, it’s all because I’m lazy and hate typing. Also, there’s already a Michael at Urban Airship, so I’m pretty much “schmichael” to everyone these days.
