December 02, 2010
Your cluster of servers pings Scout every minute, but you may only log in to our website once a day. There’s a large gap between the amount of work on the data-processing side and the number of times it results in a visit to Scout. It’s a typical profile of a data processing application.
For a write-heavy application like Scout, hardware costs can increase linearly with user growth. It doesn’t scale like a read-heavy web application that can leverage a fast, in-memory caching layer for frequent reads. You need to make writing data more efficient.
For us, teaser checks have dramatically decreased the time spent processing data.
Unless you live in Seattle, you don’t grab your rain boots, umbrella, and waterproof jacket every time you walk out the door. You check the weather forecast for rain first. Checking the forecast is far less work than wearing your funny-looking rain gear every day.
A teaser check works just like a weather report. It’s a super-efficient piece of code that runs instead of the full-scale analysis. The more intensive work only occurs if the teaser says it’s needed.
Example, please? When you receive an alert from Scout, it’s generated by a trigger (e.g., memory usage exceeds 90%). When we started, we ran these triggers every minute, whether or not (1) the server was reporting data or (2) the metric actually exceeded the threshold.
This gets expensive when you have thousands of servers reporting data, each with 20 or so metrics. Now, when a server reports, we quickly check if a metric exceeds the trigger threshold. If it does, we mark the trigger, telling it to run in the background.
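To make the idea concrete, here is a rough sketch of the pattern in Python. It is not Scout's actual code: the names (Trigger, teaser_check, handle_report, run_marked_triggers), the metric names, and the in-memory queue standing in for a background job system are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Trigger:
    """Hypothetical trigger: fires when `metric` exceeds `threshold`."""
    metric: str
    threshold: float
    marked: bool = False


def teaser_check(trigger, value):
    """The cheap part: one in-memory comparison, no database access."""
    return value > trigger.threshold


def handle_report(report, triggers, queue):
    """Runs when a server reports. Marks only triggers whose metric is over threshold."""
    for trigger in triggers:
        value = report.get(trigger.metric)
        if value is None or trigger.marked:
            continue  # metric not reported, or trigger is already queued
        if teaser_check(trigger, value):
            trigger.marked = True
            queue.append(trigger)


def run_marked_triggers(queue):
    """Stand-in for the background worker that does the expensive work."""
    fired = []
    while queue:
        trigger = queue.popleft()
        trigger.marked = False
        fired.append(trigger.metric)  # full analysis and alerting would go here
    return fired


triggers = [
    Trigger("memory_pct", 90.0),
    Trigger("disk_pct", 85.0),
    Trigger("load_avg", 4.0),
]
queue = deque()

# Only memory is over its threshold, so only one trigger is marked.
handle_report({"memory_pct": 93.5, "disk_pct": 41.0, "load_avg": 0.7}, triggers, queue)
fired = run_marked_triggers(queue)
print(fired)  # ['memory_pct']
```

In this sketch, three triggers are checked with a simple comparison, but only one reaches the background worker. The other two never cost more than that comparison.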
Our teaser check is very efficient: it adds no extra work on the database. Since just a handful of triggers actually fire, it’s far more efficient to check the data with a teaser first.