A short and fast journey through some of the profiling options available in the Ruby 2.x world, including a look at flamegraphs and new ways of tracking memory usage in MRI.
2. Why Profiling?
Program analysis (often in space or time)
What is my code doing on this path/request? (and why so slow??)
What is the code doing in production?
And while we're here, where did all my memory go?
3. The World of MRI
Jealous of all the JVM goodness (e.g. VisualVM)
Bits and pieces (memprof, etc.)
2.x brings a host of improvements
11. Stackprof
Call-stack sampling profiler (using the new rb_profile_frames() in 2.1)
Very low-overhead operation
Samples on wall time, cpu time, object allocation counts or YOUR_CUSTOM_PHASE_OF_THE_MOON
Standalone & Rack middleware
Off and on-able (accumulates between start/stop)
Defaults: cpu, 1000 microsecond intervals
17. Flamegraphs
What are they?
Visualization technique for sample stack traces
Turning thousands of dense traces into a single image
Invented by Brendan Gregg (Joyent / Netflix)
22. Rails Flamegraph
Default Stackprof flamegraphs show repeated calls to the same methods
Can hide patterns
Gregg's flamegraph includes a 'collapse' preprocessing phase to combine repeated calls
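The collapse step can be illustrated in a few lines of Ruby. This is a minimal sketch, not Gregg's actual script: the `collapse` helper and the sample stacks are invented for illustration. It merges identical stacks into one `frame;frame;frame count` line, which is the folded format that flamegraph.pl reads.

```ruby
# Hypothetical helper: fold identical sampled stacks into "a;b;c count" lines.
# Each stack is an array of frame names, outermost frame first.
def collapse(stacks)
  counts = Hash.new(0)
  stacks.each { |frames| counts[frames.join(';')] += 1 }
  counts.map { |stack, count| "#{stack} #{count}" }.sort
end

# Made-up samples: the same stack shows up three times.
samples = [
  %w[main render query],
  %w[main render query],
  %w[main render],
  %w[main render query]
]

puts collapse(samples)
```

Four samples collapse to two lines, so a pattern that was spread across many near-identical traces becomes one wide bar in the graph.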
23. Another example
Working on a pure Ruby application
'Why is it running so slow?'
'Can we see any quick way of shaving off some execution time?'
26. Interpretation
Most of the execution time is spent in Excon and Fog methods
These are talking to network (OpenStack / Puppet)
Caching some results provided a quick win that shaved ~30s
Most of execution time still network-based
Medium / long-term solution: move to pre-baked images and thus eliminate the need for a Puppet run
Result: Runtime of 8 minutes (!) down to 20s.
29. dump & dump_all
JSON representation of an object (more info provided if allocation tracing is on)
GIVE ME THE ENTIRE HEAP! ObjectSpace.dump_all
Dump is multiple lines of JSON
(Obviously, can be large!)
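A small sketch of dumping a single object with allocation tracing switched on. The helper name `dump_with_allocation_info` and the sample string are made up for illustration; the exact set of keys in the JSON varies by Ruby version, so only `type`, `file` and `line` are read here.

```ruby
require 'objspace'
require 'json'

# Hypothetical helper: allocate one object while tracing is on and
# return its ObjectSpace.dump output parsed into a Hash.
def dump_with_allocation_info
  info = nil
  ObjectSpace.trace_object_allocations do
    obj = String.new('profile me')
    # With no output: option, ObjectSpace.dump returns a JSON string.
    info = JSON.parse(ObjectSpace.dump(obj))
  end
  info
end

info = dump_with_allocation_info
puts "type: #{info['type']}"
puts "allocated at: #{info['file']}:#{info['line']}"  # only present with tracing on
```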
30. Example - pry
Q. How many STRINGS are there in my pry session?
require 'objspace'
ObjectSpace.dump_all(output: File.open('heap.dump','w'))
$> grep '"type":"STRING"' heap.dump | wc -l
A. ???
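The same question can be answered from Ruby instead of grep. A minimal sketch, assuming the dump format stays one JSON object per line with a `"type"` field; `count_types` and `heap_type_counts` are hypothetical helper names, and the dump goes to a temporary file rather than `heap.dump`.

```ruby
require 'objspace'
require 'tempfile'

# Count objects per type in a dump_all file, matching the "type" field
# the same way the grep above does.
def count_types(path)
  counts = Hash.new(0)
  File.foreach(path) do |line|
    type = line[/"type":"([A-Z_]+)"/, 1]
    counts[type] += 1 if type
  end
  counts
end

# Dump the current heap to a temp file and count what is in it.
def heap_type_counts
  Tempfile.create('heap') do |file|
    ObjectSpace.dump_all(output: file)
    file.flush
    count_types(file.path)
  end
end

heap_type_counts.sort_by { |_, n| -n }.first(5).each do |type, n|
  puts "#{type}: #{n}"
end
```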
31. Hunting for leaks with rbtrace (wabbit season)
Idea - GC, dump, repeat, and compare
Remove objects from dump 2 that are in dump 1
(Remove missing objects in dump 3 from dump 2)
Not necessarily leaks but a great place to start looking
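The GC, dump, repeat, compare idea can be sketched as set arithmetic on object addresses. This is an illustration under assumptions, not a finished tool: the helper names are made up, the `retained` array merely stands in for a leak, and an address can be reused by a different object after a GC, so the result is a list of candidates only.

```ruby
require 'objspace'
require 'set'
require 'tmpdir'

# Force a GC, then write the whole heap to path.
def gc_and_dump(path)
  GC.start
  File.open(path, 'w') { |f| ObjectSpace.dump_all(output: f) }
end

# Addresses of every object in a dump file (lines without an address are skipped).
def addresses(path)
  File.foreach(path).map { |line| line[/"address":"(0x[0-9a-f]+)"/, 1] }.compact.to_set
end

# New in dump 2 (not in dump 1) and still alive in dump 3.
def leak_candidates(dump1, dump2, dump3)
  (dump2 - dump1) & dump3
end

Dir.mktmpdir do |dir|
  paths = (1..3).map { |i| File.join(dir, "heap#{i}.dump") }
  retained = []                                 # stands in for a leak
  gc_and_dump(paths[0])
  100.times { |i| retained << "leaked #{i}" }
  gc_and_dump(paths[1])
  gc_and_dump(paths[2])
  sets = paths.map { |path| addresses(path) }   # analyse only after all dumps are written
  puts "candidates: #{leak_candidates(*sets).size}"
end
```

The analysis runs after all three dumps have been taken so that its own sets and strings do not show up as candidates.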
32. Rbtrace & Leaks
How to get the dumps from a live server?
rbtrace -e
e.g. rbtrace -p $PID -e 'Rails.root.to_s'
watch out for eval timeouts
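One way to stay clear of the eval timeout, offered here as an assumption rather than something from the slides, is to make the evaluated expression do nothing but start a thread, and let that thread do the slow GC and dump. A sketch of such an expression as plain Ruby; `dump_heap_async` is a made-up name, and the dump path is only an example.

```ruby
require 'objspace'
require 'tmpdir'

# Hypothetical helper: start the dump in a background thread and return at once,
# so whatever evaluated this call is not blocked while the heap is written.
def dump_heap_async(path)
  Thread.new do
    GC.start
    File.open(path, 'w') { |io| ObjectSpace.dump_all(output: io) }
  end
end

path = File.join(Dir.tmpdir, "heap-#{Process.pid}.dump")
dump_heap_async(path).join   # join only in this standalone demo
puts "wrote #{File.size(path)} bytes to #{path}"
```

On a live server the body of the thread would be what gets passed to `rbtrace -p $PID -e`, without the `join`.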
36. Demo
Let's look at Sinatra
MemoryProfiler.report { require 'sinatra' }.pretty_print
Freeze your strings!
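Why freezing helps can be seen by counting allocations directly. A rough sketch using the built-in `GC.stat` counter rather than memory_profiler; the `allocations` helper and the `'content-type'` literal are made up, and the key is spelled `:total_allocated_objects` on Ruby 2.2 and later.

```ruby
# Count objects allocated while the block runs.
# GC.stat(:total_allocated_objects) is the Ruby 2.2+ name for this counter.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# A plain literal is expected to build a new String on every evaluation;
# a literal followed by .freeze can be reused by the interpreter.
plain  = allocations { 10_000.times { 'content-type' } }
frozen = allocations { 10_000.times { 'content-type'.freeze } }

puts "plain literal:  #{plain} allocations"
puts "frozen literal: #{frozen} allocations"
```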
37. GC
GC is in a state of flux
1.9.x, 2.0, 2.1, 2.2 all have different GC strategies.
Mostly worked with 2.1 (2.2 is improvement on 2.1 strategy)
Tuning? Here be dragons…
38. gc_tracer
Uses new 2.1 hooks for GC profiling
Outputs TSV (GC.stat, minor/major GC runs, etc.)
Useful for ideas on GC tuning
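The gem itself is not shown here. As a rough stand-in for the kind of output it produces, the built-in `GC.stat` can be sampled and printed as TSV; the three columns chosen and the `gc_stat_row` helper are assumptions for illustration, and gc_tracer records far more than this.

```ruby
# Columns to sample; these GC.stat keys are assumed present on Ruby 2.1+.
GC_KEYS = %i[count minor_gc_count major_gc_count].freeze

# One row of current GC counters, in GC_KEYS order.
def gc_stat_row
  stat = GC.stat
  GC_KEYS.map { |key| stat[key] }
end

puts GC_KEYS.join("\t")
3.times do
  50_000.times { Object.new }   # churn some garbage between samples
  puts gc_stat_row.join("\t")
end
```

Rows like these are easy to ship to logging or graphite and to compare before and after a tuning change.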
41. Summing Up
Things are getting better!
Still a bunch of separate tools (with some overlap)
(more things abound - ruby-prof, rack-mini-profiler, etc)
Good idea to send some of this to logging / graphite / etc.
Lower level - SystemTap, DTrace, perf