Model Attributes

Attributes are the expressions that comprise a model. They define everything from the scope of a model to the type of data a model should train on and under what conditions the model training should occur. The table below provides definitions and examples of possible model attributes.

Attribute

Definition

Example

Model ID

A unique string identifier for a model. It represents the name of the HOCON block in the model configuration file. If a subsequent model in the configuration file has the same key, the first model will be overwritten.

PR-UP

ModelTemplate

Any string describing the model.

"Printers for user"

Description

Any string providing more detail about the model.

"Models the printers used by this user"

Category

Any string that groups this model with others that are similar.

For a list of Exabeam Category values, see Model Categories.

"Print Activity"

IconName

Icon to display next to the model in the UI. Can be left empty.

""

ScopeType

The type of data for which the model is created. Options are ORG, USER, PEERS, and DEVICE.

"USER"

Scope

Specifies the field for which the model is collecting data. In the case of a user scope, each model instance represents a value of the user field. For models with scope type ORG, this field value should be org.

"user"

Feature

The data object for which values are being collected.

"printer_name"

FeatureName

Any string displayed as the header of the feature table when viewing the histogram in the UI.

"printer"

TrainIf

An expression that defines when the model should train on the data. Common expressions include the following:

  • For user-based models – Count, sum, sequenceCount, sequenceSum, DistinctCount, sequenceDistinctCount

  • For asset-based models – CountBy, sumBy, CountByIf, sumByIf, DistinctCountBy, DistinctCountByIf

  • TRUE – The model trains on the specified feature data whenever the activity included in the HistogramEventTypes array occurs.

The example expression, count(printer_name,'print-activity')=1, indicates that the model should train once per value of printer_name observed during print-activity events.

Counts are reset on new sessions or sequences. The count() expression is independent of the events considered in the model, which means an event type does not need to be included in the model in order to be used by the count() expression.

Note

Other expressions cannot be nested within the count() expression. For example, count(concat(f1,f2)...) is not supported. Instead, create an enriched field that holds the value you need and use that field in the count() expression.

"count(printer_name,'print-activity')=1"

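The once-per-value behavior of the example TrainIf expression can be illustrated with a small Python sketch. It assumes that count(field, event_type) evaluates to the number of times the current field value has been seen in events of that type within the current session. That is an interpretation of the description above, not documented semantics, and the function name, event tuples, and printer names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Event = Tuple[str, str]  # (event_type, printer_name); hypothetical shape


def values_to_train_on(session_events: Iterable[Event]) -> List[str]:
    """Return the printer_name values a model would train on in one session
    under TrainIf = "count(printer_name,'print-activity')=1" (assumed semantics).
    """
    seen: Counter = Counter()  # per-session counts; a new session starts empty
    trained: List[str] = []
    for event_type, printer_name in session_events:
        if event_type != "print-activity":
            continue  # other event types do not affect this count
        seen[printer_name] += 1
        if seen[printer_name] == 1:  # the TrainIf condition holds
            trained.append(printer_name)
    return trained


session = [
    ("print-activity", "printer-a"),
    ("print-activity", "printer-a"),  # repeated value: count is 2, no training
    ("other-event", "printer-b"),     # different event type: ignored
    ("print-activity", "printer-b"),
]
print(values_to_train_on(session))  # ['printer-a', 'printer-b']
```

Because the counter is created inside the function, calling it again for another session starts from zero, which mirrors the statement that counts are reset on new sessions.
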
ModelType

Indicates whether the model holds categorical or numerical data. Options are CATEGORICAL, NUMERICAL_CLUSTERED, and NUMERICAL_TIME_OF_WEEK.

"CATEGORICAL"

AgingWindow

Available starting in Advanced Analytics I48. The number of weeks that data is held in the model before it is purged. The default value is 16.

"24"

CutOff

Number of events below which the confidence_factor or ConfidenceFactorAboveOrEqual() will not be calculated. If the number of data points does not reach this minimum number, the confidence factor cannot be calculated accurately.

"5"

MaxNumberOfBins

Available starting in Advanced Analytics I48. The maximum number of bins the model is allowed to hold before the model instance is disabled.

"1000"

Alpha

A factor that adjusts the calculation of the confidence_factor. It is useful for high-volume data sources when the model seems to converge too quickly, without enough data. Increasing Alpha increases the amount of data required to calculate confidence.

Confidence (cf) is calculated as follows:

cf = [(N - C) / N]^α
N: number of observed events.
C: number of unique observed events.
α: a factor determining how quickly the confidence grows. The higher the value, the more slowly confidence grows.

The higher the value of alpha, the greater the amount of data required for the model to converge.

"1"

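To make the effect of Alpha concrete, the following Python sketch evaluates the confidence formula above for a fixed number of events. The function name confidence_factor and the sample counts are illustrative assumptions; this is not Exabeam's implementation.

```python
def confidence_factor(n_events: int, n_unique: int, alpha: float = 1.0) -> float:
    """Evaluate cf = ((N - C) / N) ** alpha as given in the Alpha definition.

    n_events is N and n_unique is C. Returning 0.0 for an empty model is an
    assumption of this sketch; the source does not define that case.
    """
    if n_events <= 0:
        return 0.0
    return ((n_events - n_unique) / n_events) ** alpha


if __name__ == "__main__":
    # Hypothetical model instance: 100 print events across 10 distinct printers.
    for alpha in (1, 2, 4):
        print(alpha, round(confidence_factor(100, 10, alpha), 4))
```

With these sample counts, raising alpha from 1 to 2 lowers the computed confidence from 0.9 to about 0.81, so a model with the same data sits closer to a threshold such as 0.8 and needs more events to move clear of it.
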
ConvergenceFilter

Signifies when the model is suitable to be used as a baseline to trigger rules. Until the model achieves this confidence level, it cannot be used as a baseline. The calculation can be adjusted using the Alpha attribute described above.

Values for the confidence factor (cf) range from 0, indicating no confidence, to 1, indicating full confidence.

Confidence (cf) is calculated as follows:

cf = [(N - C) / N]^α
N: number of observed events.
C: number of unique observed events.
α: a factor determining how quickly the confidence grows. The higher the value, the more slowly confidence grows.

"confidence_factor>=0.8"

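A minimal Python sketch of how CutOff and the example ConvergenceFilter expression might combine. The helper names, the order of the checks, and the sample numbers are assumptions for illustration. The source states only the formula, that confidence is not calculated below CutOff, and that the model can serve as a baseline once the filter is satisfied.

```python
from typing import Optional


def confidence_factor(n_events: int, n_unique: int, alpha: float,
                      cutoff: int) -> Optional[float]:
    """Return cf = ((N - C) / N) ** alpha, or None below the CutOff."""
    if n_events <= 0 or n_events < cutoff:
        return None  # too few data points; confidence is not calculated
    return ((n_events - n_unique) / n_events) ** alpha


def is_converged(n_events: int, n_unique: int, alpha: float = 1.0,
                 cutoff: int = 5, threshold: float = 0.8) -> bool:
    """Mimic the filter "confidence_factor>=0.8"; the threshold is a parameter here."""
    cf = confidence_factor(n_events, n_unique, alpha, cutoff)
    return cf is not None and cf >= threshold
```

Under these assumptions, is_converged(10, 1) is true because (10 - 1) / 10 = 0.9, while is_converged(10, 1, alpha=3.0) is false because 0.9 cubed is 0.729. A model with only 4 events is never converged with a CutOff of 5, whatever its ratio of unique values.
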
HistogramEventTypes

Array of events to be considered by the model.

[ "print-activity" ]

Disabled

Indicates whether the model is disabled. If TRUE, no data is collected. A disabled model can be enabled.

"FALSE"