July 27, 2010
Updated 3/21/2013 – The MongoDB Overview plugin discussed below has been split into 2 separate plugins: MongoDB Server Status and MongoDB Database Stats. The server status plugin reports global MongoDB metrics and the database stats plugin reports metrics specific to a particular database.
John Nunemaker of Ordered List knows his way around MongoDB, the high-performance, open-source, document-oriented database. We were obviously excited to see that John created a MongoDB Monitoring plugin.
Like all Scout plugins, installing it is a button click away.
John’s MongoDB Monitoring plugin gives both an overview of key Mongo performance metrics and a detailed display when you need to dig deeper.
At Scout HQ, I brought out some hot cocoa, lit a fire, and demanded that John reflect on MongoDB. You wouldn’t know it from his appearance, but John had some rather enlightening things to say.
John: I was introduced to MongoDB at breakfast during RailsConf 2009 in Vegas. I happened to sit down at a table with a few guys (Wynn Netherland, Jim Mulholland) who were using the Twitter gem I created and MongoDB to build Twitter apps. They raved about MongoDB and I listened like a polite person would. I thought they were crazy, as AltDBs did not have the traction they do now and I was still (mostly) happily using MySQL.
When I got home from RailsConf, I visited the MongoDB.org website and quickly became enamored. I spent every evening for a week reading through all the documentation. I quickly realized there were not any good object mappers for it, so I started MongoMapper. For the first 4 or 5 months after discovering Mongo, most of my interaction with it was through developing MongoMapper.
John: Most of my web developing life has been spent building content management systems of one sort or another. Steve Smith, my partner at Ordered List, and I previously worked together for the University of Notre Dame. While there, we built a multi-site CMS in Rails on MySQL. It was a huge success and cut down our delivery time for new sites drastically.
When I joined Steve at Ordered List in 2008, we both knew we wanted to build something similar that was our own and thus Harmony was born. Over the years of content management, I learned that content is not a title and a huge content box, but rather, it is made up of lots of little pieces of different types of information.
Modeling this type of data in MySQL was quickly becoming a pain, so we pulled the trigger in November 2009 and switched Harmony from MySQL to Mongo (despite being about 90% done in MySQL). With the change to Mongo, all those little pieces of information that make up a web page can be stored in one document and natively as whatever type of data they are (number, text, etc.).
In addition to Harmony, I have several toy apps that I use throughout the day that are powered by Mongo. I have also helped a few clients power pieces of their applications that needed more flexible schemas as well.
John: As of now, Mongo is pretty much my default database. This could be because of the type of work I often end up doing, or because Mongo just fits a lot of different projects. I think it is especially suited for situations where you need a dynamic schema, such as content management, human resource information, analytics, etc.
John: Thankfully, I have not really hit any performance issues yet with Mongo. That said, the thing I watch the closest is the slow query log. In Mongo, you can turn profiling on and set a slow query threshold. These slow queries then get stored in a collection in your database, which you can query against to inspect performance. I monitor this quite closely to help me tune indexes and keep queries speedy, as quick reads are more important than quick writes in Harmony. The MongoDB Slow Query plugin in the official Scout plugins repo is great for this and has been quite an education in indexing. Another important thing is to make sure that your indexes fit in memory.
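As an editorial illustration of the profiling workflow John describes, here is a minimal sketch in Python. It assumes `db` is a PyMongo-style database handle; the function names and the 100 ms default threshold are hypothetical choices for this example, not anything taken from Harmony or the Scout plugin.

```python
def enable_profiling(db, slow_ms=100):
    """Ask the server to record operations slower than slow_ms milliseconds."""
    # Profiling level 1 records only operations that exceed `slowms`.
    return db.command("profile", 1, slowms=slow_ms)


def slowest_operations(db, slow_ms=100, limit=10):
    """Return recorded operations slower than slow_ms, slowest first."""
    # Profiled operations land in the `system.profile` collection of the
    # database being profiled; `millis` is the operation's duration.
    cursor = (
        db["system.profile"]
        .find({"millis": {"$gt": slow_ms}})
        .sort("millis", -1)
        .limit(limit)
    )
    return list(cursor)
```

With PyMongo installed, `db` could be obtained with something like `MongoClient()["harmony"]`, where the database name is only a placeholder.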
John: Harmony is a hosted, multi-site website management application. The goal is not to eliminate the need for web developers, but to empower them to build sites that are easier for clients and content producers to update. It is 100% MongoDB at the moment. We will probably be introducing Redis soon for a few things, but we are not using any type of relational database.
Each user has one account across the entire system and we use single sign on to allow you to switch between sites/domains without having to sign in again each time. Every “page” in Harmony is stored in an items collection. We use single collection inheritance (from MongoMapper) to give different types of content (page, blog, link, etc.) different behaviors and data.
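To make the single collection inheritance idea concrete, here is a rough editorial sketch in Python rather than MongoMapper's Ruby. It assumes each document carries a `_type` key naming its class; the class names, the `render` method, and the fields are hypothetical and do not reflect Harmony's actual models.

```python
class Item:
    """Base behavior shared by every document in the items collection."""

    def __init__(self, doc):
        self.doc = doc

    def render(self):
        return self.doc["title"]


class Page(Item):
    """A plain page: inherits the default rendering."""


class Link(Item):
    """A link: same collection, but different data and behavior."""

    def render(self):
        return f'{self.doc["title"]} -> {self.doc["url"]}'


# Maps the stored type marker to the class that should wrap the document.
ITEM_TYPES = {"Page": Page, "Link": Link}


def load_item(doc):
    """Wrap a raw document in the class named by its `_type` key."""
    cls = ITEM_TYPES.get(doc.get("_type"), Item)
    return cls(doc)
```

All of the documents live side by side in one collection, and the type marker alone decides which behavior applies when a document is loaded.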
Another interesting tidbit, technology-wise, is that we are even storing all files in Mongo. Yes, you heard right, every asset (picture, document, CSS, JS, templates) is stored in Mongo. Currently, we are just storing them in documents in different collections (and limiting size to ~3MB per file), but soon we will be rolling out the storing of them in Mongo’s GridFS thanks to Joint, another project of mine that makes GridFS and MongoMapper insanely easy. The sweet thing about storing all your data and files in one place is that you just have one policy for backups. We can also restore the entire system by just restoring the data and the code.
John: I am by no means an expert in performance, but I would say they are similar in more ways than you would expect. Mongo and MySQL share a lot of the same features: dynamic queries, secondary indexes, etc. If your queries are getting slow, start profiling and watching your slow query log. If you see something come through that has an nscanned way higher than nreturned, you need an index. If something comes through that is slow and the result length is huge, you probably need to have a smaller limit or only select certain fields.
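John's two rules of thumb can be written down as a small check over a slow query log entry. This is an editorial sketch in Python: the field names `nscanned`, `nreturned`, and `responseLength` are assumed (they vary between MongoDB versions), and the ratio and size thresholds are arbitrary values chosen for illustration.

```python
def diagnose(entry, scan_ratio=10, max_response_bytes=1_000_000):
    """Return tuning hints for one slow query log entry (a plain dict)."""
    nscanned = entry.get("nscanned", 0)
    nreturned = entry.get("nreturned", 0)
    response_length = entry.get("responseLength", 0)

    hints = []
    # Scanning far more documents than are returned suggests a missing index.
    if nscanned > scan_ratio * max(nreturned, 1):
        hints.append("add an index")
    # A very large response suggests fetching too many documents or fields.
    if response_length > max_response_bytes:
        hints.append("lower the limit or select fewer fields")
    return hints
```

For example, an entry that scanned 50,000 documents to return 20 would be flagged for an index, while an entry that scanned and returned 100 documents totalling 5 MB would be flagged for its response size.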
How much tuning have you done with MongoDB? Any key settings to adjust from factory defaults?
John: We have not done any tuning at this point. Websites get updated a few times a day at most, but continually get visited, which means Harmony is heavy on the reads and light on the writes. We cache everything to disk when it is requested (and soon will be changing this to varnish) so subsequent requests get served from Apache and result in 0 Mongo queries. Our tuning will probably always be at the HTTP level. I doubt we will ever even bruise Mongo.
John: Mongo is probably one of the easiest databases around to get up and running, as they have pre-built binaries for OSX and several Linux distros. They follow a very public schedule for releases and are extremely responsive on the mailing list (commercial support is also available). They have plentiful docs and command line programs for keeping track of things (mongo, mongodump, mongorestore, mongostat, etc.).
Personally, I have found it pretty easy to set up and maintain Mongo installs. For those that do not want to hassle with the administration part, there are a few hosted Mongo solutions: mongohq.com and mongomachine.com. I have tried both with no issues. For Harmony, we trust the fine folks at RailsMachine. They set up and maintain our Mongo install, including backups and replication. Oh, and of course we monitor it with Scout.