Overview of the YARA-L 2.0 language

Supported in:

Google SecOps SIEM

YARA-L 2.0 is a computer language used to create rules for searching through your enterprise log data as it is ingested into your Google Security Operations instance. The YARA-L syntax is derived from the YARA language developed by VirusTotal. The language works in conjunction with the Google Security Operations Detection Engine and enables you to hunt for threats and other events across large volumes of data.

For more information, see the following:

YARA-L 2.0 example rules

The following examples show rules written in YARA-L 2.0. Each demonstrates how to correlate events within the rule language.

Rules and tuning

The following rule checks for specific patterns in event data and creates a detection if it finds the patterns. This rule includes a variable $e1 for tracking event type and metadata.event_type UDM field. The rule checks for specific occurrences of regular expression matches with e1. When the event $e1 takes place, a detection is created. A not condition is included in the rule to exclude certain non-malicious paths. You can add not conditions to prevent false positives.

rule suspicious_unusual_location_svchost_execution
{

 meta:
   author = "Google Cloud Security"
   description = "Windows 'svchost' executed from an unusual location"
   yara_version = "YL2.0"
   rule_version = "1.0"

 events:

   $e1.metadata.event_type = "PROCESS_LAUNCH"
   re.regex($e1.principal.process.command_line, `\bsvchost(\.exe)?\b`) nocase
   not re.regex($e1.principal.process.command_line, `\\Windows\\System32\\`) nocase

condition:

   $e1
}

Logins from different cities

The following rule searches for users that have logged in to your enterprise from two or more cities in less than 5 minutes:

rule DifferentCityLogin {
  meta:

  events:
    $udm.metadata.event_type = "USER_LOGIN"
    $udm.principal.user.userid = $user
    $udm.principal.location.city = $city

  match:
    $user over 5m

  condition:
    $udm and #city > 1
}

Match variable: $user

Event variable:$udm

Placeholder variable: $city and $user

The following describes how this rule works:

Groups events with username ($user) and returns it ($user) when a match is found.
Timespan is 5 minutes, meaning only events that are less than 5 minutes apart are correlated.
Searching for an event group ($udm) whose event type is USER_LOGIN.
For that event group, the rule calls the user ID as $user and the login city as $city.
Returns a match if the distinct number of city values (denoted by #city) is greater than 1 in the event group ($udm) within the 5 minute time range.

Rapid user creation and deletion

The following rule searches for users that have been created and then deleted within 4 hours:

rule UserCreationThenDeletion {
  meta:

  events:
    $create.target.user.userid = $user
    $create.metadata.event_type = "USER_CREATION"

    $delete.target.user.userid = $user
    $delete.metadata.event_type = "USER_DELETION"

    $create.metadata.event_timestamp.seconds <=
       $delete.metadata.event_timestamp.seconds

  match:
    $user over 4h

  condition:
    $create and $delete
}

Event variables:$create and $delete

Match variable: $user

Placeholder variable: N/A

The following describes how this rule works:

Groups events with username ($user) and returns it ($user) when a match is found.
Time window is 4 hours, meaning only events separated by less than 4 hours are correlated.
Searches for two event groups ($create and $delete, where $create is equivalent to #create >= 1).
$create corresponds to USER_CREATION events and calls the user ID as $user.
$user is used to join the two groups of events together.
$delete corresponds to USER_DELETION events and calls the user ID as $user. This rule looks for a match where the user identifier in the two event groups is the same.
This rule looks for cases where the event from $delete happens later than the event from $create, returning a match when discovered.

Single event rule

Single event rules are rules that correlate over a single event. A single event rule can be:

Any rule without a match section.
Rule with a match section and a condition section only checking for the existence of 1 event (for example, "$e", "#e > 0", "#e >= 1", "1 <= #e", "0 < #e").

For example, the following rule searches for a user login event and would return the first one it encounters within the enterprise data stored within your Google Security Operations account:

rule SingleEventRule {
  meta:
    author = "[email protected]"

  events:
    $e.metadata.event_type = "USER_LOGIN"

  condition:
    $e
}

Here is another example of a single event rule with a match section. This rule searches for a user who has logged in at least once in less than 5 minutes. It checks for the simple existence of a user login event.

rule SingleEventRule {
  meta:
    author = "[email protected]"
    description = "windowed single event example rule"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 5m

  condition:
    #e > 0
}

rule MultiEventRule{
  meta:
    author = "[email protected]"
    description = "Rule with outcome condition and simple existence condition on one event variable"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 10m

  outcome:
    $num_events_in_match_window = count($e.metadata.id)

  condition:
    #e > 0 and $num_events_in_match_window >= 10 // Could be rewritten as #e >= 10
}

Multiple event rule

Use multiple event rules to group many events over a specified time window and try to find correlations between events. A typical multiple event rule will have the following:

A match section which specifies the time range over which events need to be grouped.
A condition section specifying what condition should trigger the detection and checking for the existence of multiple events.

For example, the following rule searches for a user who has logged in at least 10 times in less than 10 minutes:

rule MultiEventRule {
  meta:
    author = "[email protected]"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 10m

  condition:
    #e >= 10
}

Single event within range of IP addresses

The following example shows a single event rule searching for a match between two specific users and a specific range of IP addresses:

rule OrsAndNetworkRange {
  meta:
    author = "[email protected]"

  events:
    // Checks CIDR ranges.
    net.ip_in_range_cidr($e.principal.ip, "203.0.113.0/24")

    // Detection when the hostname field matches either value using or.
    $e.principal.hostname = /pbateman/ or $e.principal.hostname = /sspade/

  condition:
    $e
}

any and all rule example

The following rule searches for login events where all source IP addresses do not match an IP address known to be secure within a timespan of 5 minutes.

rule SuspiciousIPLogins {
  meta:
    author = "[email protected]"

  events:
    $e.metadata.event_type = "USER_LOGIN"

    // Detects if all source IP addresses in an event do not match "100.97.16.0"
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // it will be detected since "100.97.16.1", "100.97.16.2",
    // and "100.97.16.3" all do not match "100.97.16.0".

    all $e.principal.ip != "100.97.16.0"

    // Assigns placeholder variable $ip to the $e.principal.ip repeated field.
    // There will be one detection per source IP address.
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // there will be one detection per address.

    $e.principal.ip = $ip

  match:
    $ip over 5m

  condition:
    $e
}

Regular expressions in a rule

The following YARA-L 2.0 regular expression example searches for events with emails received from the altostrat.com domain. Since nocase has been added to the $host variable regex comparison and the regex function, both these comparisons are case insensitive.

rule RegexRuleExample {
  meta:
    author = "[email protected]"

  events:
    $e.principal.hostname = $host
    $host = /.*HoSt.*/ nocase
    re.regex($e.network.email.from, `.*altostrat\.com`) nocase

  match:
    $host over 10m

  condition:
    #e > 10
}

Sliding window rule example

The following YARA-L 2.0 sliding window example searches for the absence of firewall_2 events after firewall_1 events. The after keyword is used with the pivot event variable $e1 to specify that only 10 minute windows after each firewall_1 event should be checked when correlating events.

rule SlidingWindowRuleExample {
  meta:
    author = "[email protected]"

  events:
    $e1.metadata.product_name = "firewall_1"
    $e1.principal.hostname = $host

    $e2.metadata.product_name = "firewall_2"
    $e2.principal.hostname = $host

  match:
    $host over 10m after $e1

  condition:
    $e1 and !$e2
}

Zero value exclusion example

Rules Engine implicitly filters out the zero values for all placeholders that are used in the match section. For more information, see zero value handling in the match section. This can be disabled by using the allow_zero_values option as described in allow_zero_values.

However, for other referenced event fields, zero values are not excluded unless you explicitly specify such conditions.

rule ExcludeZeroValues {
  meta:
    author = "[email protected]"

  events:
    $e1.metadata.event_type = "NETWORK_DNS"
    $e1.principal.hostname = $hostname

    // $e1.principal.user.userid may be empty string.
    $e1.principal.user.userid != "Guest"

    $e2.metadata.event_type = "NETWORK_HTTP"
    $e2.principal.hostname = $hostname

    // $e2.target.asset_id cannot be empty string as explicitly specified.
    $e2.target.asset_id != ""

  match:
    // $hostname cannot be empty string. The rule behaves as if the
    // predicate, `$hostname != ""` was added to the events section, because
    // `$hostname` is used in the match section.
    $hostname over 1h

  condition:
    $e1 and $e2
}

Rule with `outcome` section example

You can add the optional outcome section in YARA-L 2.0 rule to extract additional information of each detection. In the condition section, you can also specify conditionals on outcome variables. You can use the outcome section of a detection rule to set variables for downstream consumption. For example, you can set a severity score based on data from the events being analyzed.

For more information, see the following:

Multi-event rule with outcome section:

The following rule looks at two events to get the value of $hostname. If the value of $hostname matches over a 5-minute period, then a severity score is applied. When include a time period in the match section, the rule checks within the specified time period.

rule OutcomeRuleMultiEvent {
    meta:
      author = "Google Cloud Security"
    events:
      $u.udm.principal.hostname = $hostname
      $asset_context.graph.entity.hostname = $hostname

      $severity = $asset_context.graph.entity.asset.vulnerabilities.severity

    match:
      $hostname over 5m

    outcome:
      $risk_score =
        max(
            100
          + if($hostname = "my-hostname", 100, 50)
          + if($severity = "HIGH", 10)
          + if($severity = "MEDIUM", 5)
          + if($severity = "LOW", 1)
        )

      $asset_id_list =
        array(
          if($u.principal.asset_id = "",
             "Empty asset id",
             $u.principal.asset_id
          )
        )

      $asset_id_distinct_list = array_distinct($u.principal.asset_id)

      $asset_id_count = count($u.principal.asset_id)

      $asset_id_distinct_count = count_distinct($u.principal.asset_id)

    condition:
      $u and $asset_context and $risk_score > 50 and not arrays.contains($asset_id_list, "id_1234")
}

rule OutcomeRuleMultiEvent {
    meta:
      author = "[email protected]"
    events:
      $u.udm.principal.hostname = $hostname
      $asset_context.graph.entity.hostname = $hostname

      $severity = $asset_context.graph.entity.asset.vulnerabilities.severity

    match:
      $hostname over 5m

    outcome:
      $total_network_bytes = sum($u.network.sent_bytes) + sum($u.network.received_bytes)

      $risk_score = if(total_network_bytes > 1024, 100, 50) + 
        max(
          if($severity = "HIGH", 10)
          + if($severity = "MEDIUM", 5)
          + if($severity = "LOW", 1)
        )

      $asset_id_list =
        array(
          if($u.principal.asset_id = "",
             "Empty asset id",
             $u.principal.asset_id
          )
        )

      $asset_id_distinct_list = array_distinct($u.principal.asset_id)

      $asset_id_count = count($u.principal.asset_id)

      $asset_id_distinct_count = count_distinct($u.principal.asset_id)

    condition:
      $u and $asset_context and $risk_score > 50 and not arrays.contains($asset_id_list, "id_1234")
}

Single-event rule with outcome section:

rule OutcomeRuleSingleEvent {
    meta:
        author = "[email protected]"
    events:
        $u.metadata.event_type = "FILE_COPY"
        $u.principal.file.size = $file_size
        $u.principal.hostname = $hostname

    outcome:
        $suspicious_host = $hostname
        $admin_severity = if($u.principal.userid in %admin_users, "SEVERE", "MODERATE")
        $severity_tag = if($file_size > 1024, $admin_severity, "LOW")

    condition:
        $u
}

Refactoring a multi-event outcome rule into a single-event outcome rule.

You can use the outcome section for both single-event rules (rules without a match section), and multi-event rules (rules with a match section). If you previously designed a rule to be multi-event just so you could use the outcome section, you can optionally refactor those rules by deleting the match section to improve performance. Be aware that because your rule no longer has a match section that applies grouping, you might receive more detections. This refactor is only possible for rules that use one event variable as shown in the following example.

Multi-event outcome rule which uses only one event variable (a good candidate for a refactor):

rule OutcomeMultiEventPreRefactor {
    meta:
      author = "[email protected]"
      description = "Outcome refactor rule, before the refactor"

    events:
      $u.udm.principal.hostname = $hostname

    match:
      $hostname over 5m

    outcome:
      $risk_score = max(if($hostname = "my-hostname", 100, 50))

    condition:
      $u
}

You can refactor the rule by deleting the match section. Note that you must also remove the aggregate in the outcome section since the rule will now be single-event. For more information on aggregations, see outcome aggregations.

rule OutcomeSingleEventPostRefactor {
    meta:
      author = "[email protected]"
      description = "Outcome refactor rule, after the refactor"

    events:
      $u.udm.principal.hostname = $hostname

    // We deleted the match section.

    outcome:
      // We removed the max() aggregate.
      $risk_score = if($hostname = "my-hostname", 100, 50)

    condition:
      $u
}

Function to placeholder rule example

You can assign a placeholder variable to the result of a function call and can use the placeholder variable in other sections of the rule, such as the match section, outcome section, or condition section. See the following example:

rule FunctionToPlaceholderRule {
    meta:
      author = "[email protected]"
      description = "Rule that uses function to placeholder assignments"

    events:
        $u.metadata.event_type = "EMAIL_TRANSACTION"

        // Use function-placeholder assignment to extract the
        // address from an email.
        // [email protected] -> address
        $email_to_address_only = re.capture($u.network.email.from , "(.*)@")

        // Use function-placeholder assignment to normalize an email:
        // uid@??? -> [email protected]
        $email_from_normalized = strings.concat(
            re.capture($u.network.email.from , "(.*)@"),
            "@company.com"
        )

        // Use function-placeholder assignment to get the day of the week of the event.
        // 1 = Sunday, 7 = Saturday.
        $dayofweek = timestamp.get_day_of_week($u.metadata.event_timestamp.seconds)

    match:
        // Use placeholder (from function-placeholder assignment) in match section.
        // Group by the normalized from email, and expose it in the detection.
        $email_from_normalized over 5m

    outcome:
        // Use placeholder (from function-placeholder assignment) in outcome section.
        // Assign more risk if the event happened on weekend.
        $risk_score = max(
            if($dayofweek = 1, 10, 0) +
            if($dayofweek = 7, 10, 0)
        )

    condition:
        // Use placeholder (from function-placeholder assignment) in condition section.
        // Match if an email was sent to multiple addresses.
        #email_to_address_only > 1
}

Outcome conditionals example rule

In the condition section, you can use outcome variables that were defined in the outcome section. The following example demonstrates how to filter on risk scores to reduce noise in detections by using outcome conditionals.

rule OutcomeConditionalRule {
    meta:
        author = "[email protected]"
        description = "Rule that uses outcome conditionals"

    events:
        $u.metadata.event_type = "FILE_COPY"
        $u.principal.file.size = $file_size
        $u.principal.hostname = $hostname

        // 1 = Sunday, 7 = Saturday.
        $dayofweek = timestamp.get_day_of_week($u.metadata.collected_timestamp.seconds)

    outcome:
        $risk_score =
            if($file_size > 500*1024*1024, 2) + // Files 500MB are moderately risky
            if($file_size > 1024*1024*1024, 3) + // Files over 1G get assigned extra risk
            if($dayofweek=1 or $dayofweek=7, 4) + // Events from the weekend are suspicious
            if($hostname = /highly-privileged/, 5) // Check for files from highly privileged devices

    condition:
        $u and $risk_score >= 10
}