Steve Huffman on scaling Web Apps at Reddit – FOWA 2010 Miami
Presentation slides can be found here. Huffman discusses 7 lessons he learned while at reddit.com. The most interesting lesson for me, as alluded to above, is lesson #3: Open Schema. Here’s what Steve had to say:
Huffman: An open schema. In the early days, our database was a very traditional kind of relational database. We had a table for links, a table for users, and a table for comments. The columns in the database were what you’d expect. Everything was normalized.
This would be an example of the link table. There were IDs and the number of up votes and the number of down votes, and then title, URL, and tons of foreign keys and these complex mini relationships. We spent a lot of time thinking about the database and working on it.
It seemed okay at the time, but there were a lot of problems looking back.
One of the things was we spent too much time thinking about the database. We really shouldn’t have to do that. Every time we added a feature, when you can save links, or hide links, or when we added commenting – we didn’t have that at first – we had to update our schema.
Schema updates are really painful, as you grow. The database gets bigger and if your database is under a lot of load, you can’t just add another column or add another table and expect it to be totally fine. Adding a column to a table that has 10 million rows in it takes a long time and will totally kill your database.
Replication – we use database replication for backup and for scaling. Trying to do schema updates and maintain replication is a total pain. We would often have to restart our replication so then we had these periods where we would have a day with no backups and we were really kind of playing with fire there.
Deployments were complex because updating the database took so much time that we would have to walk this line of when do we deploy our code versus when do we update our database. It wasn’t a fun time.
The way we’ve changed is we use an “open schema”. Sometimes it’s called “entity attribute value”. It’s basically a large key value store. We have two types of tables for every data type. There is a “thing” table, and then a “data” table. Everything in Reddit is comprised of what we call things: users, links, comments, subreddits, awards.
Everything on Reddit is a thing. The schema for those elements looks the same. It looks like this top table here: ups, downs, a type, a creation date, some properties that are fundamental across all of the objects in Reddit.
Then we have what’s called the “data” table, which is basically this huge table with three columns: the thing ID we’re talking about is the left-most column, then a key, and a value. For example, these two links would be represented by two rows in the thing table, and then one row in the data table for every value on that link. There would be a key for title, and a value for that title for that link; and a key for URL and a key for the author, and then a key for how many spam votes are on it.
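To make the thing/data split concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not Reddit's actual schema: the table names (thing_link, data_link), the column names, and the helper functions are assumptions chosen for illustration, and SQLite stands in for the Postgres setup described in the talk.

```python
import sqlite3


def create_link_tables(conn):
    # "thing" table: only the few properties shared by every object.
    conn.execute(
        """CREATE TABLE thing_link (
               thing_id INTEGER PRIMARY KEY,
               ups      INTEGER NOT NULL DEFAULT 0,
               downs    INTEGER NOT NULL DEFAULT 0,
               created  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    # "data" table: three columns, one row per attribute of a thing.
    conn.execute(
        """CREATE TABLE data_link (
               thing_id INTEGER NOT NULL,
               key      TEXT NOT NULL,
               value    TEXT,
               PRIMARY KEY (thing_id, key)
           )"""
    )


def add_link(conn, **attrs):
    # One row in the thing table, then one data row per attribute.
    cur = conn.execute("INSERT INTO thing_link DEFAULT VALUES")
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data_link (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, k, str(v)) for k, v in attrs.items()],
    )
    return thing_id


def load_link(conn, thing_id):
    # Reassemble a link from its thing row plus its data rows.
    ups, downs = conn.execute(
        "SELECT ups, downs FROM thing_link WHERE thing_id = ?", (thing_id,)
    ).fetchone()
    link = {"ups": ups, "downs": downs}
    link.update(
        conn.execute(
            "SELECT key, value FROM data_link WHERE thing_id = ?", (thing_id,)
        ).fetchall()
    )
    return link
```

Storing two links with a title, URL, author and spam-vote count each gives two rows in thing_link and eight rows in data_link. Values are stored as text in this sketch, so the caller converts them back to numbers where needed.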
What this allowed us to do is whenever we add new features, we don’t have to deal with the database anymore. We just store new data; we just add more properties to whatever things we’re storing. If we add a new thing, we don’t need to add new tables anymore.
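As a rough illustration of why new features stop requiring schema changes, the sketch below adds a "hidden" flag to a link purely by writing a new key. The data_link table and the set_attr/get_attr helpers are hypothetical names, and SQLite is used only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE data_link (
           thing_id INTEGER NOT NULL,
           key      TEXT NOT NULL,
           value    TEXT,
           PRIMARY KEY (thing_id, key)
       )"""
)


def set_attr(conn, thing_id, key, value):
    # Insert the attribute, or overwrite it if the key already exists.
    conn.execute(
        "INSERT OR REPLACE INTO data_link (thing_id, key, value) VALUES (?, ?, ?)",
        (thing_id, key, str(value)),
    )


def get_attr(conn, thing_id, key, default=None):
    # Things written before a feature existed simply have no row for its key.
    row = conn.execute(
        "SELECT value FROM data_link WHERE thing_id = ? AND key = ?",
        (thing_id, key),
    ).fetchone()
    return row[0] if row else default


set_attr(conn, 1, "title", "Hello")
# New feature ships: no ALTER TABLE, just a new key on the things that need it.
set_attr(conn, 1, "hidden", "true")
```

Older things that never received the new key fall back to the default passed to get_attr, so the code has to decide what a missing attribute means.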
It kind of frees us from all those headaches of how are we going to update our database to fit this new feature; how are we going to maintain this replication; how are we going to distribute this? All of those problems basically went away.
One of the things we don’t do anymore is we don’t do any joins in the database anymore, at least not in those databases so it becomes really easy to distribute. We can put different chunks of data on different machines, and it scales really nicely.
We don’t have to worry about foreign keys and doing joins, and how are we going to split this piece of data up. It just all splits up very nicely and it made life a lot simpler for us.
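One way to picture the "different chunks of data on different machines" idea: because every lookup is by thing ID and nothing is joined, the ID alone can decide where a thing lives. The talk does not say how Reddit partitions its data, so the modulo rule and the ShardedStore class below are assumptions, with in-memory dictionaries standing in for separate database machines.

```python
class ShardedStore:
    """Toy key-value store that spreads things across several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for one database machine.
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, thing_id):
        # Assumed placement rule: thing ID modulo the number of shards.
        return thing_id % len(self.shards)

    def put(self, thing_id, key, value):
        shard = self.shards[self.shard_for(thing_id)]
        shard.setdefault(thing_id, {})[key] = value

    def get(self, thing_id):
        # All attributes of a thing live on one shard, so no join is needed.
        shard = self.shards[self.shard_for(thing_id)]
        return shard.get(thing_id, {})
```

A fixed modulo rule makes adding shards later awkward, since most IDs would map to a different shard. The sketch only shows why join-free lookups by ID are easy to split up.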
The only downside is you’re not using – we’re using Postgres, and if you’re using Postgres or MySQL, they’re designed to store data relationally and we’re not using it in that way, so we don’t get to take advantage of all the cool relational things that those databases do. We have to maintain consistency ourselves because we’re just storing chunks of data that aren’t related to anything. But in the long run, it’s worked out really well for us, and I’m really happy we made that switch.
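"Maintaining consistency ourselves" roughly means the application performs checks that a foreign key constraint would otherwise enforce. The sketch below is one hypothetical form of that check, not a description of Reddit's code: the add_comment function, the MissingThingError exception and the two dictionaries are all invented for illustration.

```python
class MissingThingError(Exception):
    """Raised when a comment points at a link that does not exist."""


def add_comment(things, data, comment_id, link_id, body):
    # With no foreign key in the database, the application must verify
    # that the referenced link exists before storing the comment.
    if things.get(link_id) != "link":
        raise MissingThingError(f"no link with id {link_id}")
    things[comment_id] = "comment"
    data[comment_id] = {"link_id": link_id, "body": body}


things = {1: "link"}                   # thing ID -> type
data = {1: {"title": "Hello"}}         # thing ID -> attributes
add_comment(things, data, 2, 1, "First!")
```

Every write path that creates a reference needs a check like this, which is the cost the talk describes for giving up the relational features.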
If you’re using Google App Engine, or you’re on Amazon and using SimpleDB, this is basically where the trend is going. This is the type of storage that you get with them, this document-based storage. If you’re using CouchDB, it’s up and coming. I don’t know if it’s ready for production yet, but they’re getting there. Hopefully, the worries of using a relational database are kind of a thing of the past.