Skip to content

Commit

Permalink
Don't parse YAML twice
Browse files Browse the repository at this point in the history
Once for the duplication check, and once again to convert it to a ruby hash

We can instead use the yaml ast from the duplication check and convert
that into a hash directly.

```
# Rubocop
# Base:
$ hyperfine -w 5 -r 20 "bundle exec rubocop Rakefile"
Benchmark 1: bundle exec rubocop Rakefile
  Time (mean ± σ):      1.041 s ±  0.014 s    [User: 0.898 s, System: 0.140 s]
  Range (min … max):    1.013 s …  1.061 s    20 runs

# Patch:
$ hyperfine -w 5 -r 20 "bundle exec rubocop Rakefile"
Benchmark 1: bundle exec rubocop Rakefile
  Time (mean ± σ):      1.013 s ±  0.008 s    [User: 0.865 s, System: 0.146 s]
  Range (min … max):    0.998 s …  1.027 s    20 runs
```

~2.7%  Faster

```
# Gitlab
# Base:
$ hyperfine -w 5 -r 20 "bundle exec rubocop Rakefile"
Benchmark 1: bundle exec rubocop Rakefile
  Time (mean ± σ):      3.485 s ±  0.027 s    [User: 2.565 s, System: 0.909 s]
  Range (min … max):    3.445 s …  3.529 s    20 runs

# Applied:
$ hyperfine -w 5 -r 20 "bundle exec rubocop Rakefile"
Benchmark 1: bundle exec rubocop Rakefile
  Time (mean ± σ):      3.401 s ±  0.029 s    [User: 2.483 s, System: 0.906 s]
  Range (min … max):    3.356 s …  3.466 s    20 runs
```

~2.5% faster. I expected more since they have so many config files but hey, still something.
  • Loading branch information
Earlopain authored and bbatsov committed Sep 4, 2024
1 parent 5f3481f commit c2ae362
Show file tree
Hide file tree
Showing 3 changed files with 41 additions and 8 deletions.
22 changes: 14 additions & 8 deletions lib/rubocop/config_loader.rb
Original file line number Diff line number Diff line change
Expand Up @@ -67,8 +67,8 @@ def load_file(file, check: true)
def load_yaml_configuration(absolute_path)
file_contents = read_file(absolute_path)
yaml_code = Dir.chdir(File.dirname(absolute_path)) { ERB.new(file_contents).result }
check_duplication(yaml_code, absolute_path)
hash = yaml_safe_load(yaml_code, absolute_path) || {}
yaml_tree = check_duplication(yaml_code, absolute_path)
hash = yaml_tree_to_hash(yaml_tree) || {}

puts "configuration from #{absolute_path}" if debug?

Expand Down Expand Up @@ -235,8 +235,8 @@ def read_file(absolute_path)
raise ConfigNotFoundError, "Configuration file not found: #{absolute_path}"
end

def yaml_safe_load(yaml_code, filename)
yaml_safe_load!(yaml_code, filename)
def yaml_tree_to_hash(yaml_tree)
yaml_tree_to_hash!(yaml_tree)
rescue ::StandardError
if defined?(::SafeYAML)
raise 'SafeYAML is unmaintained, no longer needed and should be removed'
Expand All @@ -245,10 +245,16 @@ def yaml_safe_load(yaml_code, filename)
raise
end

def yaml_safe_load!(yaml_code, filename)
YAML.safe_load(
yaml_code, permitted_classes: [Regexp, Symbol], aliases: true, filename: filename
)
def yaml_tree_to_hash!(yaml_tree)
return nil unless yaml_tree

# Optimization: Because we checked for duplicate keys, we already have the
# yaml tree and don't need to parse it again.
# Also see https://github.com/ruby/psych/blob/v5.1.2/lib/psych.rb#L322-L336
class_loader = YAML::ClassLoader::Restricted.new(%w[Regexp Symbol], [])
scanner = YAML::ScalarScanner.new(class_loader)
visitor = YAML::Visitors::ToRuby.new(scanner, class_loader)
visitor.accept(yaml_tree)
end
end

Expand Down
1 change: 1 addition & 0 deletions lib/rubocop/yaml_duplication_checker.rb
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ def self.check(yaml_string, filename, &on_duplicated)
return unless tree

traverse(tree, &on_duplicated)
tree
end

def self.traverse(tree, &on_duplicated)
Expand Down
26 changes: 26 additions & 0 deletions spec/rubocop/config_loader_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -1768,6 +1768,32 @@ def self.inject!
expect(configuration.to_h).to eq({})
end

it 'rejects non-allowed yaml types' do
create_file(configuration_path, <<~YAML)
foo: !ruby/object:Rational
numerator: 1
denominator: 2
YAML

expect { load_file }.to raise_error(Psych::DisallowedClass, /Rational/)
end

it 'allows yaml anchors' do
create_file(configuration_path, <<~YAML)
Style/Alias: &anchor
Enabled: false
Style/Encoding:
<<: *anchor
YAML

expect(load_file.to_h).to eq(
{
'Style/Alias' => { 'Enabled' => false },
'Style/Encoding' => { 'Enabled' => false }
}
)
end

context 'set neither true nor false to value to Enabled' do
before do
create_file(configuration_path, <<~YAML)
Expand Down

0 comments on commit c2ae362

Please sign in to comment.