- Advanced Analytics
- Understand the Basics of Advanced Analytics
- Deploy Exabeam Products
- Considerations for Installing and Deploying Exabeam Products
- Things You Need to Know About Deploying Advanced Analytics
- Pre-Check Scripts for an On-Premises or Cloud Deployment
- Install Exabeam Software
- Upgrade an Exabeam Product
- Add Ingestion (LIME) Nodes to an Existing Advanced Analytics Cluster
- Apply Pre-approved CentOS Updates
- Configure Advanced Analytics
- Set Up Admin Operations
- Access Exabeam Advanced Analytics
- A. Supported Browsers
- Set Up Log Management
- Set Up Training & Scoring
- Set Up Log Feeds
- Draft/Published Modes for Log Feeds
- Advanced Analytics Transaction Log and Configuration Backup and Restore
- Configure Advanced Analytics System Activity Notifications
- Exabeam Licenses
- Exabeam Cluster Authentication Token
- Set Up Authentication and Access Control
- What Are Accounts & Groups?
- What Are Assets & Networks?
- Common Access Card (CAC) Authentication
- Role-Based Access Control
- Out-of-the-Box Roles
- Set Up User Management
- Manage Users
- Set Up LDAP Server
- Set Up LDAP Authentication
- Third-Party Identity Provider Configuration
- Azure AD Context Enrichment
- Set Up Context Management
- Custom Context Tables
- How Audit Logging Works
- Starting the Analytics Engine
- Additional Configurations
- Configure Static Mappings of Hosts to/from IP Addresses
- Associate Machine Oriented Log Events to User Sessions
- Display a Custom Login Message
- Configure Threat Hunter Maximum Search Result Limit
- Change Date and Time Formats
- Set Up Machine Learning Algorithms (Beta)
- Detect Phishing
- Restart the Analytics Engine
- Restart Log Ingestion and Messaging Engine (LIME)
- Custom Configuration Validation
- Advanced Analytics Transaction Log and Configuration Backup and Restore
- Reprocess Jobs
- Re-Assign to a New IP (Appliance Only)
- Hadoop Distributed File System (HDFS) Namenode Storage Redundancy
- User Engagement Analytics Policy
- Configure Settings to Search for Data Lake Logs in Advanced Analytics
- Enable Settings to Detect Email Sent to Personal Accounts
- Configure Smart Timeline™ to Display More Accurate Times for When Rules Triggered
- Configure Rules
- Exabeam Threat Intelligence Service
- Threat Intelligence Service Prerequisites
- Connect to Threat Intelligence Service through a Proxy
- View Threat Intelligence Feeds
- Threat Intelligence Context Tables
- View Threat Intelligence Context Tables
- Assign a Threat Intelligence Feed to a New Context Table
- Create a New Context Table from a Threat Intelligence Feed
- Check ExaCloud Connector Service Health Status
- Disaster Recovery
- Manage Security Content in Advanced Analytics
- Exabeam Hardening
- Set Up Admin Operations
- Health Status Page
- Troubleshoot Advanced Analytics Data Ingestion Issues
- Generate a Support File
- View Version Information
- Syslog Notifications Key-Value Pair Definitions
Troubleshoot Advanced Analytics Data Ingestion Issues
Hardware and Virtual Deployments Only
If events appear incorrectly in Advanced Analytics, track a log as it's ingested into Advanced Analytics to identify where and when it encounters any issues.
An event may appear incorrectly or not at all because there was an issue with ingesting data into Advanced Analytics:
Log Ingestion and Messaging Engine (LIME) or the Analytics Engine didn't parse the log, or parsed it incorrectly.
The Analytics Engine enriched the event with incorrect information.
To identify which of these issues caused your problem, track the logs as they're processed into the Advanced Analytics system. You can track logs based on criteria you specify, or specific logs.
If you track logs based on criteria you specify, you can only track the corresponding events as they're processed through the Analytics Engine. You create a JSON file that specifies which event fields the corresponding events must contain and the issue you're troubleshooting. Then, you run a Python script. The Python script adds a flag, EXA_ENABLE_TRACE_FILTERING, to all events with the event fields you specified. The flag marks these events so the Analytics Engine knows to record additional information about them. The script also sends the JSON file to the Analytics Engine for every node in your cluster.
If you track a specific set of logs, you can track them as they're processed through both LIME and the Analytics Engine. You add a flag, EXA_ENABLE_TRACE, to the end of the log. This field marks the log so LIME and the Analytics Engine record additional information about it. Then, you ingest the log using the Syslog protocol.
As logs are ingested into Advanced Analytics, the Analytics Engine returns messages about what's happening to the logs; for example, if LIME parsed your log, which parser it used, and how it parsed the log. These messages are saved to a file called exabeam.log.
Use the returned messages to identify the problem and take steps to resolve it.
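To make the workflow concrete, here is a minimal sketch in Python (the exabeam.log format and the EXA_ENABLE_TRACE marker come from this section; the helper function itself is hypothetical, not an Exabeam tool) of pulling out only the messages the tracking mechanism wrote:

```python
def traced_messages(lines, marker="EXA_ENABLE_TRACE"):
    """Keep only the exabeam.log lines produced for flagged logs."""
    return [line for line in lines if marker in line]

# In practice you'd read /opt/exabeam/data/logs/exabeam.log; two sample
# lines stand in for it here.
sample = [
    '| - | 2020-04-29 16:38:43.731 | INFO | Created SyslogLine, obj=SyslogLine(... EXA_ENABLE_TRACE ...)',
    '| - | 2020-04-29 16:38:44.120 | INFO | unrelated message',
]
print(traced_messages(sample))
```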
Returned Messages for Troubleshooting Data Ingestion Issues
Before you investigate your data ingestion issues, understand how the returned messages are structured and what they mean.
When troubleshooting potential data ingestion issues, you track logs as they're ingested through Log Ingestion Messaging Engine (LIME) or the Analytics Engine. After the log is ingested, the tracking mechanism returns messages to exabeam.log describing what happened as the log was ingested; for example:
| - | 2020-04-29 16:38:43.731 | 6.620 | INFO | c.exabeam.lime.connector.SyslogImporter | Created SyslogLine, objType=com.exabeam.lime.connector.SyslogImporter$SyslogLine, obj=SyslogLine(1585766883718,2020-04-01 18:48:03.718000+00:00,2020-04-01T14:48:03.718694-04:00,eamagnt01,"filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE",None) | - | 2020-04-29 16:38:45.530 | 8.419 | INFO | c.exabeam.lime.event.LogToEventConverter | Created message info., objType=com.exabeam.lime.util.MessageInfo, obj=MessageInfo(exabeam_time=2020-04-01 18:48:03.718000+00:00 exabeam_reported_time=2020-04-01T14:48:03.718694-04:00 exabeam_host=eamagnt01 exabeam_raw=filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg=" EXA_ENABLE_TRACE ,MessageType(proofpoint-m2,dlp-email-alert,yyyy-MM-dd HH:mm:ss,false,Vector(mod=session cmd=data rcpt=),ArrayBuffer(FieldExtractor(\sx=({xid}.+?)\s+(\w+=|$),ListBuffer('xid)), FieldExtractor(\srcpt=({recipient}[^@=]+?@({external_domain_recipient}.+?))\s+(\w+=|$),ListBuffer('recipient, 'external_domain_recipient))),None,None,Some(Proofpoint),Some(Proofpoint DLP)),msgType: "proofpoint-m2", dataType: "dlp-email-alert", msgOrder: "0", time: "1585766883718", rawLogTime: "1585766883718", timeFormat: "yyyy-MM-dd HH:mm:ss", fields: ('EXA_ENABLE_TRACE,"f._2"), ('recipient,"f._2"), ('external_domain_recipient,"f._2"), ('xid,"f._2"),true) | - | 2020-04-29 16:38:45.559 | 8.449 | INFO | com.exabeam.event.EventBuilderHelper | Found handlers: Vector(com.exabeam.event.EventHandler@4388f8d3, com.exabeam.event.EventHandler@596c9550, com.exabeam.event.EventHandler@358d9f22, com.exabeam.event.EventHandler@6f4bf4f8), objType=com.exabeam.bar.message.Message, obj=msgType: "proofpoint-m2", dataType: "dlp-email-alert", msgOrder: "0", time: "1585766883718", rawLogTime: "1585766883718", timeFormat: "yyyy-MM-dd HH:mm:ss", fields: ('EXA_ENABLE_TRACE,"f._2"), 
('recipient,"f._2"), ('external_domain_recipient,"f._2"), ('xid,"f._2")
Each output message has four parts:
<summary>, objType=<objType>, filteredBy=<filteredBy>, obj=<obj>
<summary> – A short summary of what happened as the log was ingested.
objType – Describes what object was traced: SyslogLine, MessageInfo, Message, or Event.
SyslogLine – A raw log as it's ingested into LIME.
MessageInfo – A raw log that's been identified to an event type.
Message – A message that's been parsed from a raw log. Next, an Event Builder turns it into an event.
Event – An event an Event Builder created. Information from the message has been mapped to event type fields.
If you're tracking multiple logs, the objType in your output messages is always Event because logs are tracked only as they're ingested through the Analytics Engine.
filteredBy – Appears only if you're tracking multiple logs; describes which models you're troubleshooting, as indicated in the JSON file under modelingFilter > models.
obj – Describes what the objType looks like at a given moment.
For example, let's take this message:
| - | 2020-04-29 16:38:43.731 | 6.620 | INFO | c.exabeam.lime.connector.SyslogImporter | Created SyslogLine, objType=com.exabeam.lime.connector.SyslogImporter$SyslogLine, obj=SyslogLine(1585766883718,2020-04-01 18:48:03.718000+00:00,2020-04-01T14:48:03.718694-04:00,eamagnt01,"filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE",None)
Created SyslogLine indicates that you're looking at the raw log.
The objType, objType=com.exabeam.lime.connector.SyslogImporter$SyslogLine, indicates that the tracking mechanism tracked the raw log.
The obj, obj=SyslogLine(1585766883718,2020-04-01 18:48:03.718000+00:00,2020-04-01T14:48:03.718694-04:00,eamagnt01,"filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE",None), is what the raw log looked like.
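As a rough illustration of the layout just described (the parsing helper is hypothetical, not an Exabeam tool), the parts can be pulled out of a message programmatically:

```python
import re

def split_trace_message(message):
    """Split a traced exabeam.log message into summary, objType, and obj."""
    obj_type = re.search(r"objType=([^,]+)", message)
    obj = re.search(r"obj=(.*)$", message)
    # The summary is the text after the last pipe, before the first comma.
    summary = message.split(",", 1)[0].split("|")[-1].strip()
    return {
        "summary": summary,
        "objType": obj_type.group(1) if obj_type else None,
        "obj": obj.group(1) if obj else None,
    }

msg = ("| - | 2020-04-29 16:38:43.731 | 6.620 | INFO | "
       "c.exabeam.lime.connector.SyslogImporter | Created SyslogLine, "
       "objType=com.exabeam.lime.connector.SyslogImporter$SyslogLine, "
       "obj=SyslogLine(1585766883718,...)")
parts = split_trace_message(msg)
print(parts["summary"], parts["objType"])
```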
Identify Which Event Details Were Enriched
Hardware and Virtual Deployments Only
Verify if an event is appearing incorrectly because the Analytics Engine enriched the event with incorrect information.
1. Identify logs to troubleshoot
To troubleshoot an issue, you track logs as they're ingested into Advanced Analytics. If you think that one specific log may be the issue, track just that log. If you're unsure about which log is causing the issue, track multiple logs based on criteria you specify, like event type, source, and user.
Track specific logs
If you think that a specific set of logs may be the issue, track those logs to verify this.
Add EXA_ENABLE_TRACE to the end of a raw log. EXA_ENABLE_TRACE flags the log and lets the Analytics Engine know that it should log additional data. For example:
filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE
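The flagging step itself can be sketched in a few lines of Python (a hypothetical helper; the raw log is a shortened version of the sample above):

```python
def flag_for_tracing(raw_log, flag="EXA_ENABLE_TRACE"):
    """Append the tracing flag to the end of a raw log line."""
    return raw_log.rstrip() + " " + flag

raw = 'filter_instance1[16174]: rprt s=3021wwe1wv m=1 mod=session cmd=data suborg=""'
print(flag_for_tracing(raw))
```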
Track logs based on event fields
If you don't know which log may be the issue, track a subset of logs that contain certain event fields. Create a JSON file that specifies these fields, then run a Python script to send the JSON file to Log Ingestion Messaging Engine (LIME) and the Analytics Engine for every node in your cluster.
In a new JSON file, define the type of logs you're troubleshooting by the event fields they contain. Create a key, eventFields. To track a specific subset of logs, under eventFields, list the event fields these logs must contain as key/value pairs:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
You can list any existing event field, including:
event_type
source
vendor
time
user
GetValue ('zone_info', src)
To ensure that you cover an appropriate number of logs, specify at least the event_type and user fields. For example, to track vpn-login and vpn-logout events for users Blake Crouch and Ted Chiang, write:
"eventFields": {
  "user": ["Blake Crouch", "Ted Chiang"],
  "event_type": ["vpn-login", "vpn-logout"]
},
You can't use the event field aliases that are used in MongoDB. When events are saved to MongoDB, the event fields are replaced with aliases; for example, the user event field is replaced with the g alias. In this JSON file, you must use user, not g.
To track all events, leave eventFields empty:
{
  "eventFields": {},
To enable the tracking mechanism, add a key, pipelineFilter, with "enabled" set to true:
"pipelineFilter": {
  "enabled": true
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  }
}
The keys modelingFilter, ruleTriggeringFilter, and eventDroppingFilter are used to troubleshoot other issues. You must add these keys and set "enabled" to false:
"modelingFilter": {
  "enabled": false,
  "models": []
},
"eventDroppingFilter": {
  "enabled": false
},
"ruleTriggeringFilter": {
  "enabled": false,
  "names": []
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  },
  "modelingFilter": {
    "enabled": false,
    "models": []
  },
  "eventDroppingFilter": {
    "enabled": false
  },
  "ruleTriggeringFilter": {
    "enabled": false,
    "names": []
  }
}
Add three fields: maxEventsToMark, maxMarkingTimeMs, and maxWorkingTimeMs. To keep your system running correctly, these fields limit how many logs you can track and how long it takes to process them:
"maxEventsToMark": 50,
"maxMarkingTimeMs": 3600000,
"maxWorkingTimeMs": 3600000
maxEventsToMark – The maximum number of logs to track.
maxMarkingTimeMs – How long, in milliseconds, ingested logs are flagged for tracking. After this period, the tracking mechanism may continue flagging logs for a few seconds and may flag more logs than maxEventsToMark allows. If you reach maxEventsToMark before maxMarkingTimeMs, the tracking mechanism stops flagging logs.
maxWorkingTimeMs – How long, in milliseconds, the tracking mechanism is enabled. After this period, the tracking mechanism may continue for an additional 30 seconds.
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  },
  "modelingFilter": {
    "enabled": false,
    "models": []
  },
  "eventDroppingFilter": {
    "enabled": false
  },
  "ruleTriggeringFilter": {
    "enabled": false,
    "names": []
  },
  "maxEventsToMark": 50,
  "maxMarkingTimeMs": 3600000,
  "maxWorkingTimeMs": 3600000
}
Save the file.
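The steps above can also be sketched in Python, which guards against the JSON syntax slips that would otherwise make the filter fail silently (a hedged example; the field values are the placeholders used in this section, and the filenames are illustrative):

```python
import json

config = {
    "eventFields": {
        "event_type": ["vpn-login", "vpn-logout"],
        "user": ["Blake Crouch", "Ted Chiang"],
    },
    "pipelineFilter": {"enabled": True},
    "modelingFilter": {"enabled": False, "models": []},
    "eventDroppingFilter": {"enabled": False},
    "ruleTriggeringFilter": {"enabled": False, "names": []},
    "maxEventsToMark": 50,
    "maxMarkingTimeMs": 3600000,  # 1 hour, in milliseconds
    "maxWorkingTimeMs": 3600000,
}

# Serialize, then round-trip to confirm the file would be valid JSON.
text = json.dumps(config, indent=2)
assert json.loads(text) == config
print(text)
```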
2. Ingest the logs
After you identify which logs to troubleshoot, ingest the log so the tracking mechanism returns data about how the log is being processed.
Ingest specific logs
If you're tracking a specific set of logs, ingest the log using the Syslog protocol. Ensure that you ingest one log at a time. If you ingest multiple logs at once, your system may run slower.
If you have command line access to your hardware appliance, ingest the log by appending to the Syslog file:
cat <log file> >> /opt/exabeam/data/input/rsyslog/$(date '+%Y-%m-%dT%k').Internal.syslog.log
If you don't have command line access to your hardware appliance, ingest the log using a host that has a Syslog listener:
cat <log file> | nc <appliance IP> 514
If Advanced Analytics fetches the log from another SIEM, send the raw log, with the added EXA_ENABLE_TRACE string, to that SIEM.
The log is tracked as it's ingested through LIME and the Analytics Engine.
Ingest logs based on criteria
If you're tracking logs based on criteria you specified, run a python script to ingest the log. This script adds a field, EXA_ENABLE TRACE_FILTERING
, to all logs that match the event fields you specified in the JSON file. This field flags the logs and lets the Analytics Engine know that it should record additional data. It also sends the JSON file to LIME and the Analytics Engine for every node in your cluster.
To start tracking the logs, run:
./martini_tracing_filter_control.py -c <yourfile.json>
If you have a multi-node deployment and you want to aggregate and track logs from all nodes, run:
run_shell_cmd_against_nodes.py -t
To see what other commands you can run with run_shell_cmd_against_nodes.py, run:
run_shell_cmd_against_nodes.py -h
The log is tracked as it's ingested through the Analytics Engine only. It is not tracked as it's ingested through LIME.
3. Interpret Returned Messages
After the log is ingested, the tracking mechanism returns messages to exabeam.log describing what happened as the log was ingested. Use these messages to identify whether LIME or the Analytics Engine encountered any errors when ingesting your log.
Ensure that you understand what the returned messages mean.
View the returned messages for the tracked logs:
To view all returned messages for all logs that were tracked, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "objType=<objType>"
If no messages appear, it may be because:
One of the requirements in your JSON file is incorrect. Return to your JSON file and ensure that all fields in your JSON file are correct.
The event was discarded. Verify if this is the case.
To understand what details the event was enriched with, compare the event before and after it enters the Analytics Engine:
diff -u <(cat evt1 | sed 's/),(/),~(/g' | tr ',~' ',\n') <(cat evt2 | sed 's/),(/),~(/g' | tr ',~' ',\n') | grep -E "^\+"
The command compares the event before it enters the Analytics Engine with the event after enrichment, and displays only the lines that were added. For example, the output may look like:
(.env) [exabeam@cct210219-140012 ~]$ diff -u <(cat evt1 | sed 's/),(/),~(/g' | tr ',~' ',\n') <(cat evt2 | sed 's/),(/),~(/g' | tr ',~' ',\n') | grep -E "^\+" +++ /dev/fd/62 2021-02-19 14:57:39.581127012 +0000 +| - | 2021-02-19 14:27:56.138 | 805.761 | INFO | com.exabeam.martini.logging.LogTracing [MartiniContext-akka.actor.pipeline-dispatcher-44] | Received event at node='mainrun-context-enricher', objType=com.exabeam.martini.extractions.Event, obj=Event[time:2014-05-15 23:08:58,eType:vpn-login,source:VPN,vendor:Juniper VPN,id:2885@m,rawLogTime:1400206138000,optRawLogs:None,optRawLogRefs:None,optAlertUniqueId:None,sessionId:user63-20140515230858,sessionLabel:vpn-in,sessionStage:login,lockoutId:NA,lockoutStage:NA,fields:,(count(user, 'vpn-login'),1), +(duration,0), +(count(getvalue('isp', src_ip), 'app-login', 'app-activity', 'authentication-successful', 'vpn-login', 'remote-logon'),1), +(getvalue('country_code', src_ip),IL), +(distinctcount(badge_id, 'physical-access'),0), +(distinctcount(getvalue('zone_info', dest)),0), +(count(event_type, 'remote-access', 'remote-logon', 'local-logon', 'kerberos-logon', 'ntlm-logon', 'vpn-login', 'account-creation', 'account-deleted', 'member-added', 'member-removed', 'account-switch', 'app-login', 'app-activity'),1), +(count(event_type, 'remote-access', 'remote-logon', 'local-logon', 'kerberos-logon', 'ntlm-logon', 'vpn-login', 'account-password-reset', 'account-password-change', 'account-creation', 'account-deleted', 'member-added', 'member-removed', 'account-switch', 'app-login', 'app-activity', 'privileged-access', 'privileged-object-access', 'audit-policy-change', 'audit-log-clear', 'authentication-successful', 'database-login', 'nac-logon', 'physical-access', 'account-unlocked', 'account-unlocked-dup', 'account-enabled', 'account-disabled', 'member-added-dup', 'passwordvault-account-switch-dup', 'dlp-email-alert-out'),1), +(getvalue('isp', src_ip),Bezeq International), +(count(src_ip, 'vpn-login'),1), 
+(activity_types,Set(vpn)), +(count(getvalue('country_code', src_ip), 'app-login', 'app-activity', 'vpn-login', 'authentication-successful', 'remote-logon'),1),
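The shell one-liner above can be mirrored in Python to list exactly which fields enrichment added (a hedged sketch; the field names come from the sample output, and the before/after dictionaries are illustrative):

```python
def added_fields(before, after):
    """Return the fields present after enrichment but not before."""
    return {k: v for k, v in after.items() if k not in before}

before = {"time": "2014-05-15 23:08:58", "eType": "vpn-login", "source": "VPN"}
after = dict(before)  # enrichment keeps the original fields...
after["getvalue('country_code', src_ip)"] = "IL"   # ...and adds context
after["getvalue('isp', src_ip)"] = "Bezeq International"

print(added_fields(before, after))
```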
If the event wasn't enriched with details you expect, check your parsers or enrichment configuration for errors:
If the missing detail is in the raw log, verify if the parser is parsing your logs correctly.
If the detail is enriched from a context table, verify that your enrichment configuration and context management settings are correct.
Identify If an Event Was Discarded
Hardware and Virtual Deployments Only
Verify if an event is appearing incorrectly because the Analytics Engine incorrectly discarded the event.
First, create a JSON file that identifies the logs you're troubleshooting. Then, run a Python script that starts tracking logs as they're ingested into your system. The script saves any additional information logged during this process to exabeam.log.
1. Identify logs to troubleshoot
To troubleshoot an issue, you track logs as they're ingested into Advanced Analytics. If you think that one, specific log may be the issue, track just that log. If you're unsure about which log is causing the issue, track multiple logs based on criteria you specify, like event type, source, and user.
Track specific logs
If you think that a specific set of logs may be the issue, track those logs to verify this.
Add EXA_ENABLE_TRACE to the end of a raw log. EXA_ENABLE_TRACE flags the log and lets the Analytics Engine know that it should log additional data. For example:
filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE
Track logs based on event fields
If you don't know which log may be the issue, track a subset of logs that contain certain event fields. Create a JSON file that specifies these fields, then run a Python script to send the JSON file to Log Ingestion Messaging Engine (LIME) and the Analytics Engine for every node in your cluster.
In a new JSON file, define the type of logs you're troubleshooting by the event fields they contain. Create a key, eventFields. To track a specific subset of logs, under eventFields, list the event fields these logs must contain as key/value pairs:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
You can list any existing event field, including:
event_type
source
vendor
time
user
GetValue ('zone_info', src)
To ensure that you cover an appropriate number of logs, specify at least the event_type and user fields. For example, to track vpn-login and vpn-logout events for users Blake Crouch and Ted Chiang, write:
"eventFields": {
  "user": ["Blake Crouch", "Ted Chiang"],
  "event_type": ["vpn-login", "vpn-logout"]
},
You can't use the event field aliases that are used in MongoDB. When events are saved to MongoDB, the event fields are replaced with aliases; for example, the user event field is replaced with the g alias. In this JSON file, you must use user, not g.
To track all events, leave eventFields empty:
{
  "eventFields": {},
To enable the tracking mechanism, add a key, pipelineFilter, with "enabled" set to true:
"pipelineFilter": {
  "enabled": true
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  }
}
To track whether the specified events are going to be discarded, add a key, eventDroppingFilter, with "enabled" set to true:
"eventDroppingFilter": {
  "enabled": true
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  },
  "eventDroppingFilter": {
    "enabled": true
  }
}
The keys ruleTriggeringFilter and modelingFilter are used to troubleshoot other issues. You must add these keys and set "enabled" to false:
"ruleTriggeringFilter": {
  "enabled": false,
  "names": []
},
"modelingFilter": {
  "enabled": false,
  "models": []
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  },
  "eventDroppingFilter": {
    "enabled": true
  },
  "ruleTriggeringFilter": {
    "enabled": false,
    "names": []
  },
  "modelingFilter": {
    "enabled": false,
    "models": []
  }
}
Add three fields: maxEventsToMark, maxMarkingTimeMs, and maxWorkingTimeMs. To keep your system running correctly, these fields limit how many logs you can track and how long it takes to process them:
"maxEventsToMark": 50,
"maxMarkingTimeMs": 3600000,
"maxWorkingTimeMs": 3600000
maxEventsToMark – The maximum number of logs to track.
maxMarkingTimeMs – How long, in milliseconds, ingested logs are flagged for tracking. After this period, the tracking mechanism may continue flagging logs for a few seconds and may flag more logs than maxEventsToMark allows. If you reach maxEventsToMark before maxMarkingTimeMs, the tracking mechanism stops flagging logs.
maxWorkingTimeMs – How long, in milliseconds, the tracking mechanism is enabled. After this period, the tracking mechanism may continue for an additional 30 seconds.
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  },
  "eventDroppingFilter": {
    "enabled": true
  },
  "ruleTriggeringFilter": {
    "enabled": false,
    "names": []
  },
  "modelingFilter": {
    "enabled": false,
    "models": []
  },
  "maxEventsToMark": 50,
  "maxMarkingTimeMs": 3600000,
  "maxWorkingTimeMs": 3600000
}
Save the file.
2. Ingest the logs
After you identify which logs to troubleshoot, ingest the log so the tracking mechanism returns data about how the log is being processed.
Ingest specific logs
If you're tracking a specific set of logs, ingest the log using the Syslog protocol. Ensure that you ingest one log at a time. If you ingest multiple logs at once, your system may run slower.
If you don't have command line access to your hardware appliance, ingest the log using a host that has a Syslog listener:
cat <log file> | nc localhost 514
If you have command line access to your hardware appliance, ingest the log by appending to the Syslog file:
cat <log file> >> /opt/exabeam/data/input/rsyslog/$(date '+%Y-%m-%dT%k').Internal.syslog.log
If Advanced Analytics fetches the log from another SIEM, send the raw log, with the added EXA_ENABLE_TRACE string, to that SIEM.
Ingest logs based on criteria
If you're tracking logs based on criteria you specified, run a Python script to ingest the log. This script adds a field, EXA_ENABLE_TRACE_FILTERING, to all logs that match the event fields you specified in the JSON file. This field flags the logs and lets the Analytics Engine know that it should record additional data. It also sends the JSON file to LIME and the Analytics Engine for every node in your cluster.
To start tracking the logs, run:
./martini_tracing_filter_control.py -c <yourfile.json>
If you have a multi-node deployment and you want to aggregate and track logs from all nodes, run:
run_shell_cmd_against_nodes.py -t
To see what other commands you can run with run_shell_cmd_against_nodes.py, run:
run_shell_cmd_against_nodes.py -h
The log is tracked as it's ingested through the Analytics Engine only. It is not tracked as it's ingested through LIME.
3. Interpret Returned Messages
After the log is ingested, the tracking mechanism returns messages to exabeam.log describing what happened as the log was ingested. Use these messages to identify whether LIME or the Analytics Engine encountered any errors when ingesting your log.
Ensure that you understand what the returned messages mean.
To view all returned messages for all logs that were tracked, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "objType=<objType>"
If no messages appear, it may be because:
One of the requirements in your JSON file is incorrect. Return to your JSON file and ensure that all fields in your JSON file are correct.
To identify if an event was discarded, run:
cat /opt/exabeam/data/logs/exabeam.log | grep -i 'discard .* due to'
If the event was discarded, you may see messages indicating that the event was discarded and why, including:
discard share-access due to filter, objType=com.exabeam.martini.extractions.Event, obj=Event[...] – The event was discarded because of conditions specified in the DiscardIf hook in /opt/exabeam/config/custom/custom_exabeam_config.conf.
discard EVENT_TYPE due to login-filter, ... – The Analytics Engine couldn't find a session.
discard EVENT_TYPE due to time-before-login, ... – The event happened before the session.
discard EVENT_TYPE due to MaxUsers, ... – MaxUsers ignores a new user's events once the number of existing users reaches a specified threshold. The event was discarded because it met the MaxUsers threshold and the user is not in Active Directory.
discard EVENT_TYPE due to DropEventCondition-$CONDITION, ... – Data Sanity detects misparsed or other erroneous data, then discards the event. The event was discarded because it matches one of the Data Sanity conditions in /opt/exabeam/config/custom/custom_exabeam_config.conf.
discard EVENT_TYPE due to MaxEvents – The event was discarded because the event type was disabled for that user, the user or asset was disabled, or, if this is a user-related event, the user was neither logged in nor logged out of the session.
discard EVENT_TYPE due to default-filter, ... – The event was discarded because this is a computer logon event and asset sequences are disabled, this is a domain controller remote access event and the user field is a computer name ending with $, or the event was duplicated when both the domain controller and host sent the same log.
discard EVENT_TYPE due to error, ... – The event was discarded because of an exception in the Analytics Engine.
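To get a quick overview of why events are being dropped, the grep above can be extended into a small Python tally (a hypothetical helper, not an Exabeam tool; the sample lines mirror the message formats listed above):

```python
import re
from collections import Counter

def discard_reasons(lines):
    """Count how often each discard reason appears in exabeam.log lines."""
    pattern = re.compile(r"discard \S+ due to ([\w$-]+)", re.IGNORECASE)
    return Counter(m.group(1) for line in lines if (m := pattern.search(line)))

sample = [
    "discard share-access due to filter, objType=com.exabeam.martini.extractions.Event",
    "discard vpn-login due to login-filter, ...",
    "discard vpn-login due to login-filter, ...",
]
print(discard_reasons(sample))
```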
If the event was discarded and this isn't what you expect, modify the DiscardIf hook in /opt/exabeam/config/custom/custom_exabeam_config.conf. Modifying the DiscardIf hook may have significant consequences for your system, so it's best that you contact Exabeam Customer Success to assist you.
If you don't see these messages, the event wasn't discarded. Investigate what else may be causing your data ingestion issues.
Identify if a Model is Training
Hardware and Virtual Deployments Only
Verify if an event is appearing incorrectly because a model is training incorrectly on your events.
First, create a JSON file that identifies the logs and models you're troubleshooting. Then, run a Python script that starts tracking logs as they're ingested into your system. The script saves any additional information recorded during this process to exabeam.log. Use the returned messages in exabeam.log to investigate why your events might not be appearing correctly.
1. Identify logs to troubleshoot
To troubleshoot an issue, you track logs as they're ingested into Advanced Analytics. If you think that one, specific log may be the issue, track just that log. If you're unsure about which log is causing the issue, track multiple logs based on criteria you specify, like event type, source, and user.
Track specific logs
If you think that a specific set of logs may be the issue, track those logs to verify this.
Add EXA_ENABLE_TRACE to the end of a raw log. EXA_ENABLE_TRACE flags the log and lets the Analytics Engine know that it should log additional data. For example:
filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE
Track logs based on event fields
If you don't know which log may be the issue, track a subset of logs that contain certain event fields. Create a JSON file that specifies these fields, then run a Python script to send the JSON file to Log Ingestion Messaging Engine (LIME) and the Analytics Engine for every node in your cluster.
In a new JSON file, define the type of logs you're troubleshooting by the event fields they contain. Create a key, eventFields. To track a specific subset of logs, under eventFields, list the event fields these logs must contain as key/value pairs:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
You can list any existing event field, including:
event_type
source
vendor
time
user
GetValue ('zone_info', src)
To ensure that you cover an appropriate number of logs, specify at least the event_type and user fields. For example, to track vpn-login and vpn-logout events for users Blake Crouch and Ted Chiang, write:
"eventFields": {
  "user": ["Blake Crouch", "Ted Chiang"],
  "event_type": ["vpn-login", "vpn-logout"]
},
You can't use the event field aliases that are used in MongoDB. When events are saved to MongoDB, the event fields are replaced with aliases; for example, the user event field is replaced with the g alias. In this JSON file, you must use user, not g.
To track all events, leave eventFields empty:
{
  "eventFields": {},
To enable the tracking mechanism, add a key, pipelineFilter, with "enabled" set to true:
"pipelineFilter": {
  "enabled": true
}
For example:
{
  "eventFields": {
    "event_type": ["eventtype1", "eventtype2"],
    "source": ["source1", "source2"],
    "user": ["user1", "user2"]
  },
  "pipelineFilter": {
    "enabled": true
  }
}
To track whether a model is learning, add a key, modelingFilter, with the value "enabled": true:
"modelingFilter": { "enabled": true }
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "modelingFilter": { "enabled": true } }
Under the modelingFilter key, add another value, "models": []. In the list, enter the models you're troubleshooting as dictionaries, with "name" as the key and the model as the value:
"modelingFilter": { "enabled": true, "models": [] }
For example, to troubleshoot model UA-UI:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "modelingFilter": { "enabled": true, "models": [{ "name": "UA-UI" }] } }
The ruleTriggeringFilter and eventDroppingFilter keys are used to troubleshoot other issues. You must add these keys and set the value "enabled": false:
"ruleTriggeringFilter": { "enabled": false, "names": [] }, "eventDroppingFilter": { "enabled": false }
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "modelingFilter": { "enabled": true, "models": [{ "name": "UA-UI" }] }, "eventDroppingFilter": { "enabled": false }, "ruleTriggeringFilter": { "enabled": false, "names": [] } }
Add three fields: maxEventsToMark, maxMarkingTimeMs, and maxWorkingTimeMs. To keep your system running correctly, these fields limit how many logs you can track and how long it takes to process them:
"maxEventsToMark": 50, "maxMarkingTimeMs": 3600000, "maxWorkingTimeMs": 3600000
maxEventsToMark – The maximum number of logs to track.
maxMarkingTimeMs – How long, in milliseconds, ingested logs are flagged for tracking. After this period, the tracking mechanism may continue flagging logs for a few seconds and may flag more logs than maxEventsToMark allows. If you reach maxEventsToMark before maxMarkingTimeMs, the tracking mechanism stops flagging logs.
maxWorkingTimeMs – How long, in milliseconds, the tracking mechanism is enabled. After this period, the tracking mechanism may continue for an additional 30 seconds.
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "modelingFilter": { "enabled": true, "models": [{ "name": "UA-UI" }] }, "eventDroppingFilter": { "enabled": false }, "ruleTriggeringFilter": { "enabled": false, "names": [] }, "maxEventsToMark": 50, "maxMarkingTimeMs": 3600000, "maxWorkingTimeMs": 3600000 }
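As a sanity check on the limit values in the example above, 3600000 milliseconds is one hour. A quick conversion (plain Python, not an Exabeam tool) confirms this:

```python
# The example limits are expressed in milliseconds.
max_marking_time_ms = 3_600_000

# Convert to minutes: ms -> seconds -> minutes.
minutes = max_marking_time_ms / 1000 / 60
print(minutes)  # 60.0, i.e. each example limit is one hour
```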
Save the file.
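Before running the Python script in the next step, it can help to confirm that the file parses as valid JSON and contains the keys this procedure requires. The sketch below is not an Exabeam utility; the key names are taken from the steps above:

```python
import json

# Keys this procedure expects in the tracing-filter file.
REQUIRED_KEYS = [
    "eventFields", "pipelineFilter", "modelingFilter",
    "ruleTriggeringFilter", "eventDroppingFilter",
    "maxEventsToMark", "maxMarkingTimeMs", "maxWorkingTimeMs",
]

def check_tracing_config(text):
    """Parse the tracing-filter JSON and report missing keys."""
    config = json.loads(text)  # raises json.JSONDecodeError if invalid
    missing = [k for k in REQUIRED_KEYS if k not in config]
    if missing:
        raise ValueError("missing keys: %s" % ", ".join(missing))
    if not config["pipelineFilter"].get("enabled"):
        raise ValueError("pipelineFilter must be enabled for tracking")
    return config

sample = """{
  "eventFields": {"event_type": ["vpn-login"]},
  "pipelineFilter": {"enabled": true},
  "modelingFilter": {"enabled": true, "models": [{"name": "UA-UI"}]},
  "ruleTriggeringFilter": {"enabled": false, "names": []},
  "eventDroppingFilter": {"enabled": false},
  "maxEventsToMark": 50,
  "maxMarkingTimeMs": 3600000,
  "maxWorkingTimeMs": 3600000
}"""
print(check_tracing_config(sample)["maxEventsToMark"])  # 50
```

Catching an invalid file here avoids a round trip through the tracking script with a configuration that silently matches nothing.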
2. Ingest the logs
After you identify which logs to troubleshoot, ingest the log so the tracking mechanism returns data about how the log is being processed.
Ingest specific logs
If you're tracking a specific set of logs, ingest the log using the Syslog protocol. Ensure that you ingest one log at a time. If you ingest multiple logs at once, your system may run slower.
If you have command line access to your hardware appliance, ingest the log by appending to the Syslog file:
cat <log file> >> /opt/exabeam/data/input/rsyslog/$(date '+%Y-%m-%dT%k').Internal.syslog.log
If you don't have command line access to your hardware appliance, ingest the log using a host that has a Syslog listener:
cat <log file> | nc <appliance IP> <port>
If Advanced Analytics fetches the log from another SIEM, send the raw log, with the added
EXA_ENABLE_TRACE
string, to that SIEM.
The log is tracked as it's ingested through LIME and the Analytics Engine.
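If nc isn't available on the host with the Syslog listener, the same one-log-at-a-time send can be sketched in Python. The host and port below are placeholders; 514 is only the conventional syslog port, so use whatever port your listener is actually configured for:

```python
import socket

def append_trace(raw_log):
    """Append the EXA_ENABLE_TRACE marker that flags a log for tracking."""
    return raw_log.rstrip("\n") + " EXA_ENABLE_TRACE\n"

def send_traced_log(raw_log, host, port=514):
    """Send one flagged log line over TCP to a Syslog listener."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(append_trace(raw_log).encode("utf-8"))

# Example (host and port are placeholders for your appliance):
# send_traced_log("filter_instance1[16174]: rprt ...", "10.0.0.5", 514)
```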
Ingest logs based on criteria
If you're tracking logs based on criteria you specified, run a Python script to ingest the log. This script adds a field, EXA_ENABLE_TRACE_FILTERING
, to all logs that match the event fields you specified in the JSON file. This field flags the logs and lets the Analytics Engine know that it should record additional data. It also sends the JSON file to LIME and the Analytics Engine for every node in your cluster.
To start tracking the logs, run:
./martini_tracing_filter_control.py -c <yourfile.json>
If you have a multi-node deployment and you want to aggregate and track logs from all nodes, run:
run_shell_cmd_against_nodes.py -t
To see what other commands you can run with
run_shell_cmd_against_nodes.py
, run:run_shell_cmd_against_nodes.py -h
The log is tracked as it's ingested through the Analytics Engine only. It is not tracked as it's ingested through LIME.
3. Interpret Returned Messages
After the log is ingested, the tracking mechanism returns messages to exabeam.log
describing what happened as the log was ingested. Use these messages to identify if LIME or the Analytics Engine encounters any errors when ingesting your log.
Ensure that you understand what the returned messages mean.
View the returned messages for the tracked logs:
To view all returned messages for all logs that were tracked, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "objType=<objType>"
To view all logs related to the models you specified under modelingFilter, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "filteredBy=<model>"
If no messages appear, it may be because:
One of the requirements in your JSON file is incorrect. Return to your JSON file and ensure that all fields in your JSON file are correct.
The model isn't supposed to train on the event. Verify that the Analytics Engine didn't discard the event. If the Analytics Engine didn't discard the event, the event likely trained a different model than the one you're troubleshooting.
Verify whether the model trained on the logs.
For each returned message, you see different values for obj. If you see a message where obj is Going to update total histogram for model, the model trained on the event. Investigate what else may be causing your data ingestion issues. If you don't see any message where obj is Going to update total histogram for model, the model didn't train on the event. To investigate why, work backwards from the latest returned message. Some messages you may see include:
Might train model using event – This model is relevant to this event.
Going to early-train model – If the model RuleExpression contains num_observation=0, the Analytics Engine might train the model before triggering the associated rule.
Really modeling event_id= with model= – trainIf=true means that the model should train on this event. firstTimeEvtTraining=true means that this is the first time the model is training on this event.
Skipping training model=... because of firstTimeEvtTraining=false and TrainIf=false, TrainifExpr=..., feature=... – firstTimeEvtTraining=false means that the model previously tried training on this event. trainIf=false means that the model shouldn't train on this event.
Calculated fv/gfv for model: featureValues=, groupingFeatureValues= – Displays the featureValue and groupingFeatureValue attributes for the model, which may help you debug whether this model should train on these events.
Going to update total histogram for model, containerID=... – The model trained on the event. At the end of the session, results are written to local_model_db.
Won't do delayed training for event with containedId=... because skipTraining=true and isAfterWarmup=false – At this point, it's unclear which model, if any, will train on the event. If isAfterWarmup=false, no model will train on the event.
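As a rough first pass over the returned messages, you can scan exabeam.log lines for the histogram-update marker described above. This is an illustrative sketch, not an Exabeam utility, and it only matches the message text quoted in this section:

```python
# Message text that indicates the model trained on the event.
TRAINED_MARKER = "Going to update total histogram for model"

def model_trained(log_lines):
    """Return True if any returned message shows the model trained."""
    return any(TRAINED_MARKER in line for line in log_lines)

lines = [
    "Might train model using event",
    "Going to update total histogram for model, containerID=42",
]
print(model_trained(lines))  # True
```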
If a returned message doesn't reflect behavior you expect, investigate other returned messages, your logs, or the model definition to discover if this is an error or if the model was never supposed to train on the event.
If the model isn't training on the events and you think this is an error, verify if the event was enriched with the correct information.
Identify if a Rule is Triggering
Hardware and Virtual Deployments Only
Verify if an event is appearing incorrectly because a rule isn't triggering.
First, create a JSON file that identifies the logs and rules you're troubleshooting. Then, run a Python script that starts tracking logs as they're ingested into your system. The script saves any additional information recorded during this process to exabeam.log
. Use the returned messages in exabeam.log
to investigate why your events might not be appearing correctly.
1. Identify logs to troubleshoot
To troubleshoot an issue, you track logs as they're ingested into Advanced Analytics. If you think that one, specific log may be the issue, track just that log. If you're unsure about which log is causing the issue, track multiple logs based on criteria you specify, like event type, source, and user.
Track specific logs
If you think that a specific set of logs may be the issue, track the log to verify this.
Add EXA_ENABLE_TRACE
to the end of a raw log. EXA_ENABLE_TRACE
flags the log and lets the Analytics Engine know that it should log additional data. For example:
filter_instance1[16174]: rprt s=3021wwe1wv m=1 x=3021wwe1wv-1 mod=session cmd=data [email protected] suborg="" EXA_ENABLE_TRACE
Track logs based on event fields
If you don't know which log may be the issue, track a subset of logs that contain certain event fields. Create a JSON file that specifies these fields, then run a Python script to send the JSON file to Log Ingestion Messaging Engine (LIME) and the Analytics Engine for every node in your cluster.
In a new JSON file, define the type of logs you're troubleshooting by the event fields they contain. Create a key, eventFields. To track a specific subset of logs, under eventFields, list the event fields these logs must contain as key/value pairs:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] },
You can list any existing event field, including:
event_type
source
vendor
time
user
GetValue ('zone_info', src)
To ensure that you cover an appropriate number of logs, specify at least the event_type and user fields. For example, to track vpn-login and vpn-logout events for users Blake Crouch and Ted Chiang, write:
"eventFields": { "user": ["Blake Crouch", "Ted Chiang"], "event_type": ["vpn-login", "vpn-logout"] },
You can't use the event field aliases that are used in MongoDB. When events are saved to MongoDB, the event fields are replaced with aliases; for example, the user event field is replaced with the g alias. In this JSON file, you must use user, not g.
To track all events, leave eventFields empty:
{ "eventFields": {},
To enable the tracking mechanism, add a key, pipelineFilter, with the value "enabled": true:
"pipelineFilter": { "enabled": true }
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true } }
To track whether a rule is triggering, add a key, ruleTriggeringFilter, with the value "enabled": true:
"ruleTriggeringFilter": { "enabled": true }
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "ruleTriggeringFilter": { "enabled": true } }
Under the ruleTriggeringFilter key, add another value, "names": []. In the list, enter the names of the rules you're troubleshooting:
"ruleTriggeringFilter": { "enabled": true, "names": [] }
For example, to troubleshoot rules NEW-USER-F and UA-UC-Two:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "ruleTriggeringFilter": { "enabled": true, "names": ["NEW-USER-F", "UA-UC-Two"] } }
The modelingFilter and eventDroppingFilter keys are used to troubleshoot other issues. You must add these keys and set the value "enabled": false:
"modelingFilter": { "enabled": false, "models": [] }, "eventDroppingFilter": { "enabled": false }
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "ruleTriggeringFilter": { "enabled": true, "names": ["NEW-USER-F", "UA-UC-Two"] }, "modelingFilter": { "enabled": false }, "eventDroppingFilter": { "enabled": false } }
Add three fields: maxEventsToMark, maxMarkingTimeMs, and maxWorkingTimeMs. To keep your system running correctly, these fields limit how many logs you can track and how long it takes to process them:
"maxEventsToMark": 50, "maxMarkingTimeMs": 3600000, "maxWorkingTimeMs": 3600000
maxEventsToMark – The maximum number of logs to track.
maxMarkingTimeMs – How long, in milliseconds, ingested logs are flagged for tracking. After this period, the tracking mechanism may continue flagging logs for a few seconds and may flag more logs than maxEventsToMark allows. If you reach maxEventsToMark before maxMarkingTimeMs, the tracking mechanism stops flagging logs.
maxWorkingTimeMs – How long, in milliseconds, the tracking mechanism is enabled. After this period, the tracking mechanism may continue for an additional 30 seconds.
For example:
{ "eventFields": { "event_type": ["eventtype1", "eventtype2"], "source": ["source1", "source2"], "user": ["user1", "user2"] }, "pipelineFilter": { "enabled": true }, "ruleTriggeringFilter": { "enabled": true, "names": ["NEW-USER-F", "UA-UC-Two"] }, "modelingFilter": { "enabled": false }, "eventDroppingFilter": { "enabled": false }, "maxEventsToMark": 50, "maxMarkingTimeMs": 3600000, "maxWorkingTimeMs": 3600000 }
Save the file.
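Before running the Python script in the next step, it can help to confirm that the file enables rule tracking and actually names the rules to troubleshoot. This sketch is not an Exabeam utility; the key names are taken from the steps above:

```python
import json

def check_rule_tracing(text):
    """Confirm the tracing file enables rule tracking and names rules."""
    config = json.loads(text)  # raises json.JSONDecodeError if invalid
    rtf = config.get("ruleTriggeringFilter", {})
    if not rtf.get("enabled"):
        raise ValueError("ruleTriggeringFilter must be enabled")
    if not rtf.get("names"):
        raise ValueError("list the rules to troubleshoot under names")
    return rtf["names"]

sample = """{
  "eventFields": {},
  "pipelineFilter": {"enabled": true},
  "ruleTriggeringFilter": {"enabled": true, "names": ["NEW-USER-F"]},
  "modelingFilter": {"enabled": false, "models": []},
  "eventDroppingFilter": {"enabled": false},
  "maxEventsToMark": 50,
  "maxMarkingTimeMs": 3600000,
  "maxWorkingTimeMs": 3600000
}"""
print(check_rule_tracing(sample))  # ['NEW-USER-F']
```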
2. Ingest the logs
After you identify which logs to troubleshoot, ingest the log so the tracking mechanism returns data about how the log is being processed.
Ingest specific logs
If you're tracking a specific set of logs, ingest the log using the Syslog protocol. Ensure that you ingest one log at a time. If you ingest multiple logs at once, your system may run slower.
If you have command line access to your hardware appliance, ingest the log by appending to the Syslog file:
cat <log file> >> /opt/exabeam/data/input/rsyslog/$(date '+%Y-%m-%dT%k').Internal.syslog.log
If you don't have command line access to your hardware appliance, ingest the log using a host that has a Syslog listener:
cat <log file> | nc <appliance IP> <port>
If Advanced Analytics fetches the log from another SIEM, send the raw log, with the added
EXA_ENABLE_TRACE
string, to that SIEM.
The log is tracked as it's ingested through LIME and the Analytics Engine.
Ingest logs based on criteria
If you're tracking logs based on criteria you specified, run a Python script to ingest the log. This script adds a field, EXA_ENABLE_TRACE_FILTERING
, to all logs that match the event fields you specified in the JSON file. This field flags the logs and lets the Analytics Engine know that it should record additional data. It also sends the JSON file to LIME and the Analytics Engine for every node in your cluster.
To start tracking the logs, run:
./martini_tracing_filter_control.py -c <yourfile.json>
If you have a multi-node deployment and you want to aggregate and track logs from all nodes, run:
run_shell_cmd_against_nodes.py -t
To see what other commands you can run with
run_shell_cmd_against_nodes.py
, run:run_shell_cmd_against_nodes.py -h
The log is tracked as it's ingested through the Analytics Engine only. It is not tracked as it's ingested through LIME.
3. Interpret Returned Messages
After the log is ingested, the tracking mechanism returns messages to exabeam.log
describing what happened as the log was ingested. Use these messages to identify if LIME or the Analytics Engine encounters any errors when ingesting your log.
Ensure that you understand what the returned messages mean.
View the returned messages for the tracked logs:
To view all returned messages for all logs that were tracked, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "objType=<objType>"
To view all logs related to the rules you specified under ruleTriggeringFilter, run:
cat /opt/exabeam/data/logs/exabeam.log | grep "filteredBy=<rule>"
If no messages appear, it may be because:
One of the requirements in your JSON file is incorrect. Return to your JSON file and ensure that all fields are correct.
The event isn't supposed to trigger this rule. Ensure that the associated model trained on the event. If the associated model didn't train on the event, the event likely triggered a rule other than the one you're troubleshooting.
The rule definition or expression doesn't match the event. Review the rule definition and its expression to confirm that events like this one should trigger the rule.
The log isn't parsed correctly. Check the event against the raw log; a field may not be referenced correctly in the log, which can indicate a parsing problem.
Verify whether the rule triggered on the event.
For each returned message, you see different values for obj. If you see a message where obj is ruleTriggered=true, the rule triggered on the event. Investigate what else may be causing your data ingestion issues. If you see a returned message where obj is ruleTriggered=false, the rule didn't trigger on the event and the RuleExpression was evaluated to false. To investigate, work backwards from the latest returned message to understand why this happened. Some messages include:
Going to evaluate – The rule was evaluated against the event, and your system will run further checks.
Rule was not evaluated because matchesRuleEventType=true and ignoreRule=true – The rule was ignored and not evaluated because ClassifyIf was evaluated to false or the event didn't match the rule type.
Computed ruleEventSeq=Seq(...) for the event. Going to evaluate rule for each entity – For asset-related rules, the rule will be evaluated against each asset.
Evaluating rule – The rule was evaluated against the event and passed all pre-checks. This may appear multiple times with new data; for example, as the rule evaluates against each asset.
Chained rule triggered or Chained rule was not triggered – A chained rule is a rule that depends on another rule. The chained rule's RuleExpression was evaluated to true or false, respectively.
Evaluated rule. Triggered=false – For asset-related rules, the rule was evaluated against each asset.
Skipping classification because of ClassifyIf=false, rule=... – ClassifyIf indicates how often the rule should trigger. The rule wasn't evaluated because ClassifyIf was evaluated to false.
Skipping classification because of unknown model for rule=... – The rule wasn't evaluated because the rule or model wasn't configured correctly.
Rule was not evaluated because matchesRuleEventType=false and ignoreRule=true – ClassifyIf indicates how often the rule should trigger. The rule was not triggered because ClassifyIf=false, or the rule or model wasn't configured correctly.
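As a rough first pass over the returned messages, you can collect the ruleTriggered markers described above from exabeam.log lines. This is an illustrative sketch, not an Exabeam utility:

```python
def rule_outcomes(log_lines):
    """Collect ruleTriggered=true/false markers from returned messages."""
    outcomes = []
    for line in log_lines:
        if "ruleTriggered=true" in line:
            outcomes.append(True)
        elif "ruleTriggered=false" in line:
            outcomes.append(False)
    return outcomes

lines = [
    "Evaluating rule",
    "ruleTriggered=false",
    "ruleTriggered=true",
]
print(rule_outcomes(lines))  # [False, True]
```

An empty result means the tracking mechanism never evaluated the rule at all, which points back to the JSON file or the rule definition rather than the rule expression.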
If a returned message doesn't reflect behavior you expect, investigate other returned messages; or, compare your logs, the Advanced Analytics event, or the rule definition and expression to understand if this is an error or if the rule was never supposed to trigger on the event.
If the rule isn't triggering on the events and you think this is an error, verify if the event was parsed correctly, or verify if the event was enriched with the correct information.