Dynamic Method Definitions

2013-03-03 @ 16:52

TL;DR: depending on your app, using define_method is faster on boot, consumes less memory, and probably doesn’t significantly impact performance.

Throughout the Rails code base, I typically see dynamic methods defined using class_eval. What I mean by “dynamic methods” is methods with names or bodies that are calculated at runtime, then defined.

For example, something like this:

class Foo
  class_eval <<-EORUBY, __FILE__, __LINE__ + 1
    def wow_#{Time.now.to_i}
      # ...
    end
  EORUBY
end

I’m not sure why they are defined this way rather than with define_method. Why don’t we compare and contrast defining methods using class_eval and define_method?
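For reference, here is a minimal sketch of how the example above might look with define_method. The method name is still calculated at runtime, but the body is an ordinary block rather than a string:

```ruby
class Foo
  # The name is computed when this line runs; the block body was already
  # parsed together with the rest of the file.
  define_method("wow_#{Time.now.to_i}") do
    # ...
  end
end

p Foo.instance_methods(false)
```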

The tests I’ll do here use MRI, Ruby 2.0.0.

Definition Performance

When defining a method, is it faster to use class_eval or define_method? Here is a trivial benchmark where we simulate defining 100,000 methods:

require 'benchmark'

GC.disable

N = 100000
Benchmark.bm(13) do |x|
  x.report("define_method") {
    class Foo
      N.times { |i| define_method("foo_#{i}") { } }
    end
  }

  x.report("class_eval") {
    class Bar
      N.times { |i|
        class_eval <<-eoruby, __FILE__, __LINE__ + 1
          def bar_#{i}
          end
        eoruby
      }
    end
  }
end

Results on my machine:

$ ruby test.rb
                    user     system      total        real
define_method   0.290000   0.030000   0.320000 (  0.318222)
class_eval      1.300000   0.120000   1.420000 (  1.518075)

In this run, the class_eval version is roughly four to five times slower than the define_method version.

Why is definition performance different?

The reason performance is different is that on each call to class_eval, MRI creates a new parser and parses the string. In the define_method case, the parser is only run once, when the file itself is parsed.

We can see when the parser executes using DTrace. We will compare two programs, one with class_eval:

class Foo
  5.times do |i|
    class_eval "def f_#{i}; end", __FILE__, __LINE__
  end
end

and one with define_method:

class Foo
  5.times do |i|
    define_method("f_#{i}") { }
  end
end

Using DTrace, we can monitor the parse-begin probe, which fires before MRI runs its parser and compiles instruction sequences:

ruby$target:::parse-begin
/copyinstr(arg0) == "test.rb"/
{
  printf("%s:%d\n", copyinstr(arg0), arg1);
}

Run DTrace using the define_method program:

$ sudo dtrace -q -s x.d -c"$(rbenv which ruby) test.rb"
test.rb:1

Now run again with the class_eval version:

$ sudo dtrace -q -s x.d -c"$(rbenv which ruby) test.rb"
test.rb:1
test.rb:3
test.rb:3
test.rb:3
test.rb:3
test.rb:3

In the class_eval version, the parser runs and compiles instruction sequences 6 times (once for the file and once per class_eval call), whereas in the define_method case it only runs once.
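If DTrace isn’t available, a rough way to see the same thing is to count compilations with TracePoint. This is only a sketch, and it assumes a much newer MRI than the one used for the tests in this post: the :script_compiled event was added in Ruby 2.6. The class names EvalDemo and DefineDemo are made up for the example.

```ruby
# Count how many times MRI compiles new source while the methods are defined.
# Assumes Ruby 2.6+ (TracePoint's :script_compiled event).
compiled = 0
trace = TracePoint.new(:script_compiled) { compiled += 1 }

class EvalDemo; end    # hypothetical class for the class_eval case
class DefineDemo; end  # hypothetical class for the define_method case

trace.enable

# Each string handed to class_eval has to be parsed and compiled.
5.times { |i| EvalDemo.class_eval "def f_#{i}; end", __FILE__, __LINE__ }
eval_compiles = compiled

# The block below was compiled along with this file, before tracing started.
compiled = 0
5.times { |i| DefineDemo.define_method("f_#{i}") { } }
define_compiles = compiled

trace.disable

puts "class_eval compiles:    #{eval_compiles}"
puts "define_method compiles: #{define_compiles}"
```

I would expect this to report one compile per class_eval call and none for define_method, which lines up with the DTrace output above (minus the initial parse of the file itself, which happens before tracing is enabled).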

Call speed

It seems it’s faster to define methods via define_method, but which method is faster to call? Let’s try with a trivial example:

require 'benchmark/ips'

GC.disable

class Foo
  define_method("foo") { }
  class_eval 'def bar; end'
end

Benchmark.ips do |x|
  foo = Foo.new
  x.report("class_eval") { foo.bar }
  x.report("define_method") { foo.foo }
end

Here are the results on my machine:

$ ruby test.rb
Calculating -------------------------------------
          class_eval    115154 i/100ms
       define_method    106872 i/100ms
-------------------------------------------------
          class_eval  7454955.2 (±5.0%) i/s -   37194742 in   5.004418s
       define_method  5061216.4 (±5.2%) i/s -   25221792 in   5.000041s

Clearly methods defined with class_eval are faster to call, by about 1.5 times in this run. But does it matter? Let’s try a test where we add a little work to each method:

require 'benchmark/ips'

GC.disable

class Foo
  define_method("foo") { 10.times.map { "foo".length } }
  class_eval 'def bar; 10.times.map { "foo".length }; end'
end

Benchmark.ips do |x|
  foo = Foo.new
  x.report("define_method") { foo.foo }
  x.report("class_eval") { foo.bar }
end

Running these on my machine, I get:

$ ruby test.rb
Calculating -------------------------------------
       define_method     23949 i/100ms
          class_eval     23015 i/100ms
-------------------------------------------------
       define_method   261039.7 (±6.3%) i/s -    1317195 in   5.066215s
          class_eval   228819.7 (±12.2%) i/s -    1150750 in   5.286635s

A small amount of work is enough to overcome the performance difference between them: the two results now fall within each other’s margin of error.

How about memory consumption?

Let’s compare class_eval and define_method on memory. We’ll use this program to compare maximum RSS for N methods:

N = (ENV['N'] || 100_000).to_i

class Foo
  N.times do |i|
    if ENV['EVAL']
      class_eval "def bar_#{i}; end"
    else
      define_method("bar_#{i}") { }
    end
  end
end

Here are the results (I’ve trimmed them a little for clarity). Note that the program only checks EVAL, so DEFN=1 below is just a label for the define_method run:

$ EVAL=1 time -l ruby test.rb
        3.77 real         3.68 user         0.08 sys
 127389696  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     38716  page reclaims
$ DEFN=1 time -l ruby test.rb
        0.69 real         0.63 user         0.05 sys
  69103616  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     24487  page reclaims
$

The maximum RSS for the class_eval version is nearly twice that of the define_method version. Why?

I mentioned earlier that the class_eval version instantiates a new parser and compiles the source. Each method definition in the class_eval version gets its own instruction sequences, whereas the methods in the define_method version share them.

Let’s verify this claim by using ObjectSpace.memsize_of_all!

Measuring Instructions

MRI will let us measure the total memory usage of the instruction sequences. Here we’ll modify the previous program to measure the instruction sequence size (in bytes) after defining many methods:

require 'objspace'

N = (ENV['N'] || 100_000).to_i

class Foo
  N.times do |i|
    if ENV['EVAL']
      class_eval "def bar_#{i}; end"
    else
      define_method("bar_#{i}") { }
    end
  end
end

GC.start

p ObjectSpace.memsize_of_all(RubyVM::InstructionSequence)

Let’s see the difference:

$ EVAL=1 ruby test.rb
44718112
$ DEFN=1 ruby test.rb
718112

Growth Rate

Now let’s see the growth rate between the two. Here is the growth rate for the class_eval case:

$ N=100 EVAL=1 ruby test.rb
762112
$ N=1000 EVAL=1 ruby test.rb
1158112
$ N=10000 EVAL=1 ruby test.rb
5118112
$ N=100000 EVAL=1 ruby test.rb
44718112

Now let’s compare to the define_method case:

$ N=100 DEFN=1 ruby test.rb
718112
$ N=1000 DEFN=1 ruby test.rb
718112
$ N=10000 DEFN=1 ruby test.rb
718112
$ N=100000 DEFN=1 ruby test.rb
718112

The memory consumed by instruction sequences in the class_eval case continually grows, whereas in the define_method case it does not. MRI reuses the instruction sequences in the case of define_method, so we see no growth.

Caution

Defining methods with define_method is faster at definition time, consumes less memory, and depending on your application isn’t significantly slower to call than a class_eval defined method. So what is the downside?

Closures

The main downside is that define_method creates a closure. The closure could hold references to large objects, and those large objects cannot be garbage collected for as long as the method exists. For example:

class Foo
  x = "X" * 1024000 # Not GC'd
  define_method("foo") { }
end

class Bar
  x = "X" * 1024000 # Is GC'd
  class_eval("def foo; end")
end

The closure could access the local variable x in the Foo class, so that variable cannot be garbage collected.

When using define_method be careful not to hold references to objects you don’t care about.
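To make that concrete, here is a small sketch. The first class shows that the block really can reach the class-body local. The second shows one possible way to keep the local out of the closure: call define_method from inside a helper method, so the block only closes over that helper’s scope. The names big, big_size and define_noop are made up for illustration.

```ruby
class Foo
  big = "X" * 1_024_000
  # The block closes over the class body's locals, so `big` stays reachable
  # through the method for as long as the method exists.
  define_method(:big_size) { big.size }
end

Foo.new.big_size # => 1024000

class Bar
  # Hypothetical helper: the block below closes over this method's scope
  # (just `name` and self), not over the class body's locals.
  def self.define_noop(name)
    define_method(name) { }
  end

  big = "X" * 1_024_000 # not captured by the closure created in define_noop
  define_noop(:foo)
end

Bar.new.foo # => nil
```

This doesn’t change how the method behaves when called; it only narrows what the closure can see.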

THE END

I hope you enjoyed this! <3<3<3<3
