How we're turbo-charging our traces with ScoutProf

The transaction trace is application monitoring. If you can't track a performance problem down to a line of code, you aren't app monitoring.

However, there's a gaping hole in the quality of transaction traces today.

We've been measuring the quality of our traces since our launch, and we've noticed that 46% of the time spent in an average Rails request goes to custom application code. That time falls outside the instrumentation that we, and every other app monitoring service (like New Relic), provide.

Let's fix that.

It started with StackProf

Ruby 2.1 introduced a C API for fetching Ruby backtraces. What's great about this API? Here's Aman Gupta of GitHub:

The rb_profile_frames() API performs no allocations and adds minimal CPU overhead, making it ideal for profiling, even in production environments.

Aman wrote StackProf, a terrific call-stack profiler that leverages rb_profile_frames(). We've used it extensively to improve the performance of our agent.
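If you haven't used it, here's a minimal sketch of profiling a block with StackProf (the workload below is a stand-in for whatever code you actually want to measure):

    require 'digest'
    require 'stackprof'

    # Sample the call stack in wall-clock mode, roughly every 1,000 microseconds,
    # and write the raw samples to a dump file.
    StackProf.run(mode: :wall, interval: 1000, out: 'stackprof.dump') do
      # Stand-in workload; in a real app this would wrap the request or job
      # you care about.
      50_000.times { |i| Digest::SHA256.hexdigest(i.to_s) }
    end

The dump file can then be summarized with the stackprof command-line reporter.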

However, with all of that visibility comes a problem: StackProf's output is difficult to decipher.

Enter ScoutProf

ScoutProf builds on StackProf: we're aiming to make all of that data easy to decipher, and to make it work in threaded environments (like Puma). The key to making the data decipherable and actionable? Aggregation. From our docs:

ScoutProf attempts to sample your application every millisecond, capturing a snapshot backtrace of what is running in each thread. Each successfully captured backtrace is a sample. Later, when we process the raw backtraces, identical traces are combined, and the sample count is how many times each unique backtrace was seen.
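To make the aggregation concrete, here's a toy sketch of the idea in plain Ruby. It is not ScoutProf's implementation (the real sampler is built on the C API above rather than a Ruby thread), but the grouping of identical backtraces works the same way:

    require 'digest'

    # Stand-in workload in another thread, so the sampler has something to watch.
    worker = Thread.new do
      loop { Digest::SHA256.hexdigest(rand.to_s) }
    end

    # Roughly once per millisecond, snapshot every thread's backtrace and
    # count how many times each unique backtrace is seen.
    samples = Hash.new(0)
    sampler = Thread.new do
      loop do
        Thread.list.each do |thread|
          next if thread == Thread.current
          trace = thread.backtrace_locations
          samples[trace.map(&:to_s)] += 1 if trace
        end
        sleep 0.001
      end
    end

    sleep 1 # collect samples for a second
    sampler.kill
    worker.kill

    # The unique backtraces with the most samples are where the threads spent
    # most of their time.
    samples.sort_by { |_trace, count| -count }.first(3).each do |trace, count|
      puts "#{count} samples:"
      trace.first(5).each { |line| puts "  #{line}" }
    end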

Screenshot please?

Let's take a look at a trace from New Relic. Notice that two-thirds of the time is spent in "Controller":

[Screenshot: New Relic transaction trace summary, with roughly two-thirds of the time in "Controller"]

If you inspect the detailed trace, you'll see the following:

[Screenshot: New Relic detailed trace, with the "Controller" time shown as a single unexplained block]

Sad face :). That's a big black box of time.

Now, here's a trace of that same controller-action with ScoutProf:

[Screenshot: ScoutProf trace of the same controller-action, showing a breakdown of time spent in custom application code]

The mystery of "Controller" time revealed: we're seeing a breakdown of time spent in our custom code with zero configuration.

The big consumer above is ChartPartData#descriptors. What's going on there?

[Screenshot: ScoutProf detail for ChartPartData#descriptors, dominated by ActiveRecord calls]

We're seeing the overhead of ActiveRecord: all of that ORM convenience doesn't come for free. The sketch below shows the general shape of the problem.
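As a hypothetical, self-contained illustration (the model and column names are made up; this is not Scout's actual code), reading a single attribute through full ActiveRecord objects pays for object allocation and type casting on every row, while fetching only the column you need avoids most of that work:

    require 'active_record' # also needs the sqlite3 gem

    # Hypothetical schema, standing in for whatever ChartPartData#descriptors queries.
    ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')
    ActiveRecord::Schema.define do
      create_table :metric_types do |t|
        t.integer :chart_id
        t.string  :descriptor
      end
    end

    class MetricType < ActiveRecord::Base; end

    1_000.times { |i| MetricType.create!(chart_id: 1, descriptor: "metric-#{i}") }

    # Heavier path: a full ActiveRecord object is allocated and type-cast for
    # every row, just to read one attribute from it.
    descriptors = MetricType.where(chart_id: 1).map(&:descriptor)

    # Lighter path: pluck fetches only the column and skips the per-row objects.
    descriptors = MetricType.where(chart_id: 1).pluck(:descriptor)

Now, who knows about this code? Let's find out with Scout's GitHub integration: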

[Screenshot: Scout's GitHub integration, showing the recent committers to this code]

Looks like I should talk to Andre :).

Improving the performance of complex Ruby code is time-consuming: it frequently requires coordinating several people on your team. We're hoping to make it easier to identify these bottlenecks and the humans who can solve them.

Using ScoutProf

Our instructions for enabling ScoutProf are on our help site. Note that ScoutProf is in BETA.
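The help docs are the source of truth for setup, but as a rough sketch (the beta requirements may differ from what's shown here), ScoutProf rides along with our Ruby agent, so the starting point is the scout_apm gem in your Gemfile:

    # Gemfile
    # ScoutProf ships with the scout_apm agent; see the help docs for the exact
    # (beta) version constraint to use.
    gem 'scout_apm'

From there, the agent's usual configuration applies, and the help site covers the ScoutProf-specific steps.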