June 13, 2017
The last 3,650 days of my professional life have been focused on making Rails apps faster. Below are five lessons I've learned the hard way.
The typical pitch to prioritize making an app faster & more reliable goes something like: "every additional second of load time increases bounce rates by X percent, which triggers an X percent decrease in sales".
Yes, that's probably true. What's also true? It's a boring, cold, unemotional pitch.
The big problem with slow, unreliable apps? A stream of unpredictable stability problems spreads distrust throughout a team and slowly breaks a company down. A toxic work environment forms when on-call developers are worn out and sales and support teams can't trust the technical team.
You need a reliable web app for the same reason you don't want to stress about how your car will hold up over a week-long road trip. A reliable app is a core emotional need of a healthy company.
After 10 years, it's still hard not to think of the response time distribution of requests to a web app as a bell curve. That shape feels natural, but it isn't what actually happens in real life.
This means that if you are optimizing your app based on aggregate data from an average request, you are frequently optimizing a scenario that doesn't exist. To better understand what's going on for the slowest requests, I make extensive use of context within Scout and our exception monitoring tools. More often than not, I'll see that the slowest requests are only occurring for specific users. This makes it easier to reproduce and optimize.
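To make the gap between the average and the slowest requests concrete, here is a minimal sketch in plain Ruby. The response times are invented for illustration, and `mean` and `percentile` are hypothetical helpers (a simple nearest-rank calculation), not part of Scout or Rails.

```ruby
# Hypothetical response times in milliseconds; the values are invented for illustration.
response_times_ms = [80, 85, 90, 95, 100, 105, 110, 120, 130, 4000]

# Arithmetic mean of the samples.
def mean(samples)
  samples.sum.to_f / samples.size
end

# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

puts "mean:   #{mean(response_times_ms)} ms"
puts "median: #{percentile(response_times_ms, 50)} ms"
puts "p95:    #{percentile(response_times_ms, 95)} ms"
```

With these made-up numbers the mean is 491.5 ms, yet no single request took anywhere near that long: nine finished in 130 ms or less and one took 4,000 ms. Looking at the median and a high percentile side by side is one way to avoid optimizing for a request that doesn't exist.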
Outside of practicing for major changes that can't be simulated in development, I've found staging environments to be a wasteland. It's typically cost-prohibitive to replicate a production environment exactly in staging, and the load on staging is far less than on production. Staging ends up becoming an odd limbo phase halfway between development and production.
There's a lot involved in scaling a database. Some of the best money we've ever spent at Scout was when we brought in outside help to assist with configuring our databases as we grew. Be okay knowing you have limits.
When a customer reports an issue and all I can do is throw up my hands and say "can't reproduce", I feel pretty inept. I'm much more aggressive about logging sensitive areas of code today than I was ten years ago. A robust test suite isn't a replacement for logging all of the odd edge cases that can occur in production.
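As a rough sketch of what logging a sensitive area might look like, the example below wraps a risky block and records who hit the error and what they were doing. It uses only Ruby's standard `Logger` and `JSON` libraries. The `with_logged_context` helper, the `user_id` and `action` keys, and the failing call are all assumptions made up for illustration.

```ruby
require "logger"
require "json"
require "stringio"

# Hypothetical helper: runs the block and, if it raises, logs the
# supplied context alongside the error before re-raising it.
def with_logged_context(logger, context)
  yield
rescue StandardError => e
  logger.error(JSON.generate(context.merge(error: e.class.name, message: e.message)))
  raise
end

log_output = StringIO.new # a real app would log to a file or $stdout instead
logger = Logger.new(log_output)

begin
  with_logged_context(logger, { user_id: 42, action: "import_csv" }) do
    Integer("not a number") # stands in for an edge case seen only in production
  end
rescue ArgumentError
  # The error still propagates to the caller; the log now records the context.
end

puts log_output.string
```

The point of the design is that the context travels with the error, so a "can't reproduce" report can be traced back to a specific user and action.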