@regeciovad
Contributor

The goal was to create a deterministic way to detect potentially slow scanning caused by low-quality rules.
The first version measured the actual scanning speed. However, other factors, such as CPU usage, could influence the result.
In this version, I focused on indicators derived from the rules themselves.

The first indicator is the use of 0-length atoms, in which case YARA essentially tests the input byte by byte. This problem is partially addressed by the existing warnings about low-quality atoms (the well-known "slowing down scanning" warning). Still, because the heuristics behind these calculations keep changing, it is sometimes hard to conclude that this is the case.
However, I did not want to generate a callback when the scanned input is relatively small, because the slowdown is then not significant. I tested how slow rules behave on inputs of different sizes. The slowdown became notable for files larger than 0.2 MB. For that reason, I generate a callback only for files larger than that.
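
The size gate described above can be sketched roughly as follows. This is only an illustration in Python, not the code from this PR: the function name, the constant name, and the reading of "0.2 MB" as 200 KiB are all assumptions made for the example.

```python
# Hypothetical sketch of the input-size gate; not YARA's actual implementation.

# Assumption: "0.2 MB" is read here as 200 KiB. The PR may use a different exact value.
SLOW_SCAN_MIN_INPUT_BYTES = 200 * 1024


def should_warn_slow_scanning(uses_zero_length_atoms: bool, input_size: int) -> bool:
    """Return True when a slow-scanning warning would be worth raising."""
    # Only strings scanned with 0-length atoms are candidates, and only
    # inputs strictly larger than the threshold trigger the warning.
    return uses_zero_length_atoms and input_size > SLOW_SCAN_MIN_INPUT_BYTES
```

With these assumed values, a 1 MiB input scanned with a 0-length atom would trigger the warning, while a 1 KiB input would not, regardless of the atoms used.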

The second indicator is the number of potential matches. If the count is higher than one million, ERROR_TOO_MANY_MATCHES is returned. However, even a count below that limit can indicate that something is wrong.
I tested some additional factors, but these two turned out to be the simplest and the most effective so far.
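
The two-level treatment of match counts can be pictured with a small Python sketch. The one-million hard limit comes from the description above; the lower warning threshold, the enum, and the function name are hypothetical and chosen only for illustration, since the actual lower bound is not stated here.

```python
from enum import Enum


class ScanSignal(Enum):
    OK = "ok"
    WARN_SLOW = "warn_slow"
    TOO_MANY_MATCHES = "too_many_matches"


# From the description above: more than one million matches is an error.
HARD_MATCH_LIMIT = 1_000_000
# Hypothetical value for illustration only; the PR's actual lower bound may differ.
WARN_MATCH_THRESHOLD = 10_000


def classify_match_count(match_count: int) -> ScanSignal:
    """Map a string's match count to an error, a warning, or nothing."""
    if match_count > HARD_MATCH_LIMIT:
        return ScanSignal.TOO_MANY_MATCHES
    if match_count > WARN_MATCH_THRESHOLD:
        return ScanSignal.WARN_SLOW
    return ScanSignal.OK
```

The point of the lower threshold is that a string can be suspiciously common long before it reaches the hard limit, so a warning can be raised while the scan still completes.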

Example:

$ cat rule.yar
rule rule_com {
  strings:
    $com = /.{1,2}\.com/
  condition:
    $com
}
$ ./yara rule.yar top-1m.csv
warning: rule "rule_com": scanning with string $com is taking a very long time, it is either too general or very common.
rule_com top-1m.csv
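
To see why a string like `$com` is so common on a domain list, here is a rough Python sketch. Python's `re` module stands in for YARA's regex engine (the two are not identical), and the sample data is made up, assuming a "rank,domain" layout for `top-1m.csv`.

```python
import re

# Python's `re` is only a stand-in for YARA's regex engine.
PATTERN = re.compile(r".{1,2}\.com")

# Made-up sample; the "rank,domain" layout is an assumption about top-1m.csv.
SAMPLE = "\n".join([
    "1,google.com",
    "2,youtube.com",
    "3,wikipedia.org",
    "4,amazon.com",
    "5,example.net",
])


def count_matches(text: str) -> int:
    # finditer yields non-overlapping matches only, so this is a lower
    # bound for a scanner that also reports overlapping matches.
    return sum(1 for _ in PATTERN.finditer(text))


def count_com_lines(text: str) -> int:
    # Number of lines whose domain ends in ".com".
    return sum(1 for line in text.splitlines() if line.endswith(".com"))
```

On this sample, every line ending in `.com` produces a match, so on a list of a million domains the match count grows roughly with the number of `.com` entries.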

@plusvic
Member

plusvic commented May 10, 2023

It looks like the test cases are failing due to some heap overflow detected with --enable-address-sanitizer.

https://github.com/VirusTotal/yara/actions/runs/4927239541/jobs/8803939475?pr=1921

@regeciovad
Contributor Author

I am sorry for the late reply. The PR should be fixed now.

@plusvic merged commit 7f46c88 into VirusTotal:master on May 25, 2023