Security Content: Exabeam How Content Works Guide

Model and Rule Attributes Definitions

Refer to this appendix for definitions and examples of the model and rule attributes for session events.

Note

Expressions and logic are slightly different for Entity Analytics rules and models. Also, expressions are slightly different for user based sequences, such as web and endpoint.

Model Attributes

Refer to this table to learn more about model attributes.

Attribute

Definition

Example

Model Id

The HOCON object name, which must be unique. It is used in rules to reference the model and can also be used to search in Data Insights.

PR-UP

ModelTemplate

Any string describing the model.

Printers for user

Description

Any string providing more detail about the model.

Models the printers used by this user

Category

Any string that groups this model with similar models. It appears in the UI in Data Insights.

Print Activity

IconName

Icon to display next to the model in the UI. Can be left empty.

user

ScopeType

Type of scope. Options are ORG, USER, PEERS, or DEVICE.

USER

Scope

The entity for which the model is created. In this case, one model instance is created for every value of the user field. For models with scope type ORG, this field should be "org".

user

Feature

The value that should be modeled for the entity. In this case, it's the value of the printer_name field.

printer_name

FeatureName

Any string displayed as the header of the feature table when viewing the histogram in the UI.

printer

TrainIf

When the model should be trained. Common expressions are count() and TRUE. count(myField, 'event1','event2',...)=1 means the model is trained once for each value of the myField field observed in the specified event types. Counts are per user and are reset on new sessions/sequences. The count() expression is independent of the events considered by the model, so you do not have to include the event types in the model for the count() expression to use them.

Note

You cannot use other expressions within count(). For example, count(concat(f1,f2)...) is not supported. Instead, create an enriched field with the expression you want to count on and use that field.

In this example, the model is trained once for each printer_name value that appears in print-activity events, per user per session. This can be combined with other expressions. Setting the field to TRUE would train the model on every event.

count(printer_name,'print-activity')=1

ModelType

Whether the model holds categorical or numerical data. Options are: CATEGORICAL, NUMERICAL_CLUSTERED, or NUMERICAL_TIME_OF_WEEK.

CATEGORICAL

AgingWindow

Starting in Advanced Analytics I48, represents the number of weeks data will be held in the model before purging. The default value is 16.

24

CutOff

Number of events below which the confidence_factor or ConfidenceFactorAboveOrEqual() will return 0 regardless of actual calculation.

5

MaxNumberOfBins

Starting in Advanced Analytics I48, represents the max number of bins the model is allowed before the instance is disabled.

1000

Alpha

Used in the confidence_factor calculation: ((N-C)/N)^a, in which N=total data points in the model, C=number of bins, and a=alpha. The higher the alpha, the more data is required for the model to converge.

1

ConvergenceFilter

Used only for internal stats. This parameter signifies the amount of data needed for the model to train, expressed as a condition on the confidence factor. In this example, the model does not train on the data unless the condition confidence_factor>=0.8 is satisfied.

The confidence factor (cf) goes from 0 when we have no confidence to 1 when we have full confidence. The formula is

cf = [(N-C) / N]^α
N: number of observed events.
C: number of unique observed events.
α: a factor determining how quickly the confidence grows. The higher the number the slower confidence grows.

confidence_factor>=0.8

HistogramEventTypes

Array of events to be considered by the model.

[ "print-activity" ]

Disabled

Determines whether the model is disabled. If TRUE, no data is collected.

FALSE
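
The Alpha, CutOff, and ConvergenceFilter rows can be tied together with a small worked calculation. The following Python sketch is illustrative only and is not Exabeam's implementation: the function name confidence_factor and the way CutOff is applied (returning 0 when the model holds fewer events than the cutoff) are assumptions based on the definitions in the table.

```python
def confidence_factor(total_events: int, num_bins: int,
                      alpha: float = 1.0, cutoff: int = 5) -> float:
    """Return ((N - C) / N) ** alpha, or 0 when fewer than `cutoff` events exist.

    N = total_events (data points in the model), C = num_bins.
    The cutoff handling is an assumption drawn from the CutOff definition.
    """
    if total_events <= 0 or total_events < cutoff:
        return 0.0
    return ((total_events - num_bins) / total_events) ** alpha


# 50 print events spread across 5 distinct printers, Alpha = 1: (45/50)^1.
converged = confidence_factor(50, 5) >= 0.8

# The same data with Alpha = 3 gives (45/50)^3, which is below 0.8.
slower = confidence_factor(50, 5, alpha=3) >= 0.8

# Only 4 events: below CutOff = 5, so the factor is forced to 0.
too_few = confidence_factor(4, 1)
```

With Alpha set to 1, the example model satisfies confidence_factor>=0.8 after 50 events across 5 printers. Raising Alpha to 3 means the same data no longer satisfies the condition, which matches the statement that a higher alpha requires more data to converge.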

Model Example

Refer to this example to learn more about model attributes.

Here is an example model containing many of the attributes described in the model attributes table.

PR-UP {
    ModelTemplate = "Printers for user"
    Description = "Models the printers used by this user."
    Category = "Print Activity"
    IconName = ""
    ScopeType = "USER"
    Scope = """user"""
    Feature = """printer_name"""
    FeatureName = "printer"
    FeatureType = "printer"
    TrainIf = """count(printer_name,'print-activity')=1"""
    ModelType = "CATEGORICAL"
    AgingWindow = ""
    CutOff = "5"
    MaxNumberOfBins = "20000"
    Alpha = "1"
    ConvergenceFilter = "confidence_factor>=0.8"
    HistogramEventTypes = [
      "print-activity"
    ]
    Disabled = "FALSE"
}
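
To illustrate how the TrainIf expression in this example limits training, the following Python sketch builds per-user printer histograms and adds a data point only the first time a printer appears in a session. It is a simplified model of the behavior described in the table, not product code: the function name train, the event dictionary layout, and the assumption that each session belongs to a single user are all hypothetical.

```python
from collections import Counter, defaultdict

# Event types named in count() and in HistogramEventTypes (identical in this example).
EVENT_TYPES = {"print-activity"}


def train(sessions):
    """Build per-user printer histograms, training once per printer per session."""
    histograms = defaultdict(Counter)    # Scope value (user) -> Feature histogram
    for session in sessions:
        seen = Counter()                 # count() state, reset for every new session
        for event in session:
            if event["event_type"] not in EVENT_TYPES:
                continue
            printer = event["printer_name"]
            seen[printer] += 1
            if seen[printer] == 1:       # count(printer_name,'print-activity')=1
                histograms[event["user"]][printer] += 1
    return histograms


sessions = [
    [  # session 1 for alice: p1 is printed to twice, p2 once
        {"user": "alice", "event_type": "print-activity", "printer_name": "p1"},
        {"user": "alice", "event_type": "print-activity", "printer_name": "p1"},
        {"user": "alice", "event_type": "print-activity", "printer_name": "p2"},
    ],
    [  # session 2 for alice: p1 again
        {"user": "alice", "event_type": "print-activity", "printer_name": "p1"},
    ],
]
histograms = train(sessions)
```

In this sketch, p1 contributes one data point per session (two in total) even though it was used three times, and p2 contributes one. With TrainIf set to TRUE instead, every event would add a data point.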

Rule Attributes

Refer to this table to learn more about rule attributes.

Attribute

Definition

Example

Rule Id

ID of the rule, which must be unique. This value can be used when searching for a rule in Threat Hunter.

In the configuration, the rule ID is the name of the object that contains all of the fields below: rule_id_name { ... }

PR-UP-F

RuleName

Free text describing the rule. This text appears in the UI when a rule triggers several times in a session and is aggregated. It can also be used to identify a rule in Threat Hunter.

First print activity from printer for user

RuleDescription

Text providing more details about the rule. This appears in the UI when the rule details are expanded.

This is the first time for this user to print from this printer. This can be significant because printing can be a way to exfiltrate data from the organization

ReasonTemplate

Text that appears in the UI that explains the rule. {default|featureValue|histogram} is a placeholder that is replaced with event specific values when rendered in the UI.

The first part, default, indicates there is no special treatment of the value. Other options are asset, location.country, location.zone, time.day_of_week, time.time_of_week, user, or user.group.

featureValue is replaced with the feature value, printer_name in this case. scopeValue or event.<field_name> can also be used. The field has to be persisted in Mongo for the value to show up correctly.

|histogram is an optional part which makes the value clickable and will link to the model instance.

First print activity from printer for user: {default|featureValue|histogram}

AggregateReasonTemplate

Rules that trigger multiple times in a session will be aggregated when reviewing the session in the user page. The header of the aggregated rules is the RuleName field. When expanding the aggregated rule this is the text that will show up. The event specific placeholder is the same as in ReasonTemplate.

First print activity from printer for user: {default|featureValue|histogram}

RuleType

Session or sequence the rule should trigger in. Possible values are account-lockout, asset, database, endpoint, file, session, or web.

session

RuleCategory

Free text field describing use case/classification. Rules are grouped under this value in the rule editor UI.

Data Loss Prevention

ClassifyIf

Expression indicating how often the rule should trigger. The expression in this example means once per printer name. The syntax and logic are the same as for the model TrainIf attribute.

Note

For fact based rules, this value and all conditions placed in the RuleExpression attribute have to be TRUE.

count(printer_name, 'print-activity')=1

RuleEventTypes

Array indicating which event types can trigger the rule.

[ "print-activity" ]

Disabled

Whether the rule is disabled. Set to FALSE to keep the rule enabled.

FALSE

Model

The model used by the rule. For fact based rules this should be FACT.

PR-UP

FactFeatureName

For fact based rules only, this value will be shown when using featureValue in the ReasonTemplate and AggregateReasonTemplate fields.

printer_name

Score

The score the rule adds when it triggers. Starting in Advanced Analytics I46, this can be an expression, for example multiply(field1,field2). Negative values are also allowed, to reduce session risk. The score is adjusted based on the data if histogram shaping and Bayesian scoring are enabled.

10.0

RuleLabels

Rule tagging. Currently used for the MITRE ATT&CK feature.

mitre = ["T1052"]

PercentileThreshold

Percentile below which values are considered anomalous. Will affect the value of the percentile_threshold_count (p-value) expression, such as in the expression num_observations<percentile_threshold_count.

0.1

RuleExpression

Expression that defines when the rule should trigger.

In model based rules num_observations indicates how many times the current value (feature) was observed. 0 means never observed before (first). Other model based expressions are:

  • num_observations – the number of times this feature appears in the model.

  • probability – the number of times the current value exists in the model divided by the total data points in the model.

  • total_events – the number of data points in the model.

  • num_bins – the number of bins in the model.

  • confidence_factor – the result of the calculation ((N-C)/N)^a in which N=total data points in the model, C=number of bins, and a=alpha.

  • ConfidenceFactorAboveOrEqual() – returns true if confidence_factor is greater than or equal to 0.8. The 0.8 threshold is defined in the GlobalConfidenceFactor parameter in the configuration file. ConfidenceFactorAboveOrEqual(n) can also be used to specify a different threshold.

num_observations=0 && ConfidenceFactorAboveOrEqual()

DependencyExpression

Whether the rule should trigger only if a different rule has triggered (or not) on the same event. Other rules are referenced by Id and can be used in boolean operations, for example (R1 || R2) && !R3.

NA

ScoreTarget (optional)

For asset based rules triggered by events that have both a dest and a src, sends the points only to the one specified in scoreTarget.

scoreTarget = src_host
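
The model based values listed under RuleExpression can all be derived from a model's histogram. The following Python sketch shows one way to compute them and to evaluate the example expression num_observations=0 && ConfidenceFactorAboveOrEqual(). It is an approximation for illustration: the function names model_values and first_time_rule are hypothetical, the histogram is a plain Counter, and CutOff handling is omitted.

```python
from collections import Counter


def model_values(histogram: Counter, value: str, alpha: float = 1.0) -> dict:
    """Derive the model based expression values for one feature value."""
    total_events = sum(histogram.values())       # data points in the model
    num_bins = len(histogram)                    # distinct values in the model
    num_observations = histogram.get(value, 0)   # 0 means never observed before
    if total_events:
        probability = num_observations / total_events
        confidence = ((total_events - num_bins) / total_events) ** alpha
    else:
        probability = confidence = 0.0
    return {
        "num_observations": num_observations,
        "probability": probability,
        "total_events": total_events,
        "num_bins": num_bins,
        "confidence_factor": confidence,
    }


def first_time_rule(histogram: Counter, value: str, threshold: float = 0.8) -> bool:
    """Evaluate num_observations=0 && ConfidenceFactorAboveOrEqual()."""
    v = model_values(histogram, value)
    return v["num_observations"] == 0 and v["confidence_factor"] >= threshold


alice_printers = Counter({"p1": 30, "p2": 15, "p3": 5})
new_printer = first_time_rule(alice_printers, "p9")     # unseen value, converged model
known_printer = first_time_rule(alice_printers, "p1")   # value already in the model
sparse_model = first_time_rule(Counter({"p1": 1, "p2": 1}), "p9")  # model not converged
```

The rule triggers only for the unseen printer in the converged model. In the sparse model every event is a new value, so the confidence factor is 0 and the first-time observation is not treated as anomalous.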

Rule Example

The following example contains many of the attributes described in the rule attributes table.

PR-UP-F {
    RuleName = "First print activity from printer for user"
    RuleDescription = "This is the first time for this user to print from this printer"
    ReasonTemplate = "First print activity from printer {default|featureValue|histogram} for user"
    AggregateReasonTemplate = "First print activity from printer for user: {default|featureValue|histogram}"
    RuleType = "session"
    RuleCategory = "Data Loss Prevention"
    ClassifyIf = """count(printer_name, 'print-activity')=1"""
    RuleEventTypes = [
      "print-activity"
    ]
    Disabled = "FALSE"
    Model = "PR-UP"
    FactFeatureName = "printer_name"
    Score = "10.0"
    RuleLabels {
      mitre = ["T1052"]
    }
    PercentileThreshold = "0.1"
    RuleExpression = """num_observations=0 && ConfidenceFactorAboveOrEqual()"""
    DependencyExpression = "NA"
}
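
The ReasonTemplate placeholder in this example is replaced with an event specific value when the rule is rendered. The following Python sketch imitates that substitution for the featureValue, scopeValue, and event.<field_name> sources described in the rule attributes table. It is only a guess at the mechanics: the function render_reason and the regular expression are hypothetical, and the first part of the placeholder (default, user, and so on) and the optional |histogram link are parsed but otherwise ignored.

```python
import re

# Matches {<treatment>|<source>} with an optional trailing |histogram part.
PLACEHOLDER = re.compile(
    r"\{(?P<fmt>[\w.]+)\|(?P<source>[\w.]+)(?P<link>\|histogram)?\}"
)


def render_reason(template: str, feature_value: str,
                  scope_value: str, event: dict) -> str:
    """Replace each placeholder in the template with its event specific value."""
    def substitute(match):
        source = match.group("source")
        if source == "featureValue":
            return feature_value
        if source == "scopeValue":
            return scope_value
        if source.startswith("event."):
            return str(event.get(source[len("event."):], ""))
        return match.group(0)            # leave unknown placeholders untouched
    return PLACEHOLDER.sub(substitute, template)


reason = render_reason(
    "First print activity from printer {default|featureValue|histogram} for user",
    feature_value="printer-42",
    scope_value="alice",
    event={"src_host": "ws-7"},
)
```

Here reason becomes "First print activity from printer printer-42 for user". The values printer-42, alice, and ws-7 are made-up sample data.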