schmongodb slides from Update Portland

A few months ago someone in #pdxwebdev on Freenode asked an innocent MongoDB question. In response I ranted seemingly endlessly about our experience with MongoDB at Urban Airship. After a few moments somebody (perhaps sarcastically? who can know on IRC) suggested I give a talk on my experiences with MongoDB. That led me to realize that, despite Portland’s amazing meetup culture, there were no tech meetups that focused on either:

  1. Narrative talks based on experiences in production (not how-tos)
  2. Database-agnostic groups focused on backend systems (not just a NoSQL meetup)

So I started one: Update Portland.

And I gave my promised MongoDB talk: schmongodb.

And 10gen sent swag! (Thanks to Meghan! It was a big hit.)

And my brilliant coworker Erik Onnen gave a short talk on how he’s beginning to use Kafka at Urban Airship. (Expect a long form talk on that in the future!)

Thanks to everyone who showed up. I had a great time and have high hopes for the upcoming meetings. (Sign up for the mailing list!)

The slides may come across as overly negative. After all Urban Airship is actively moving away from MongoDB for our largest and busiest pieces of data. So I want to make 2 things very clear:

  1. I like MongoDB and would like to use it again in the future. There’s a lot I don’t like about it, but I can’t think of any “perfect” piece of software.
  2. The IO situation in EC2, particularly EBS’s poor performance (RAIDing really doesn’t help) made life with MongoDB miserable. This story may have been very different if we were running MongoDB on bare metal with fast disks.

Mike Herrick, the VP of Engineering at Urban Airship, put me on the spot at the end of my talk by asking me: “Knowing what you know now, what would you have done differently?”

I didn’t have a good answer, and I still don’t. Despite all of the misadventures, MongoDB wasn’t the wrong choice. Scaling systems is just hard, and if you want something to work under load, you’re going to have to learn all of its ins and outs. We initially started moving to Cassandra, and while it has tons of wonderful attributes, we’re running into plenty of problems with it as well.

So I think the answer is knowing then what I know now. In other words: Do your homework. That way we could have avoided these shortcomings and perhaps still be happy with MongoDB today. Hopefully these slides will help others plan how they use MongoDB so they can use it properly and happily.

Note: I added lots of comments to the speaker notes, so you’ll probably want to view those while looking at the slides.

  • jrimmer

    Given the excellent analysis of your current situation and slide 23’s bullet point ‘Moving bulk of data off of MongoDB’ what’s the next step and why?

  • http://michael.susens-schurter.com/blog/ Michael Schurter

    @jrimmer

    Some data has moved into Cassandra and we’re actually working on moving others into a sharded PostgreSQL setup. It will be interesting to see how we feel about each path in a month or two.

  • jrimmer

    Your experience with MongoDB must have been interesting indeed if a viable alternative is to replace the one system with two as a fix, especially considering their varying data representations. You’re effectively doubling your operational and development costs simultaneously.

  • http://michael.susens-schurter.com/blog/ Michael Schurter

    Ah, sorry, I should have been more clear:

    We’ve used Cassandra mostly for new development — only a small amount of data was moved out of MongoDB. The bulk of data that we’re migrating out of MongoDB is going into Postgres which is actually where it started out until about a year ago. The main development effort will be in sharding Postgres which we’d have to develop with MongoDB anyway (unless we gained confidence in MongoDB’s auto-sharding).

    So we’re not really doubling our operational or development costs as we’re rolling the migration into our larger roadmap which already necessitates evolving our data systems.

  • http://blog.zawodny.com/ Jeremy Zawodny

    Thanks for making the slides available. Some of the issues you’ve seen (syncdelay) had me stumped for a bit too. I think MongoDB is a good fit for our use case, but it’s good to see examples where it’s not right too.

  • Mike Stoddart

    You mentioned the replication in PostgreSQL 9. I would love to read a similar type of post on that after a while.
