# Exabeam Cloud Platform Auto Parser Generator

## Create a Custom Parser Using Auto Parser Generator

Use Auto Parser Generator to create a custom parser from scratch or from a default parser as a starting point.

You can only create a custom parser for Advanced Analytics. To create a Data Lake parser, create an Advanced Analytics parser, then open a case on the Exabeam Community. Exabeam Customer Success helps you convert the parser so it's compatible with Data Lake.

To create a custom Advanced Analytics parser and corresponding custom event builder, you work through the steps described in the sections below.

Feel free to pause your work at any time. Auto Parser Generator saves your progress after each step and after you change anything. If you leave while creating a parser, the incomplete parser appears in the list of parsers with an In Progress status. To pick up where you left off, edit the parser.

### Prerequisites

• Be familiar with how parsers and event builders work, and how both components work with the Analytics Engine.

• Gather sample logs containing the same format and syntax as the raw data Advanced Analytics typically ingests.

• If you obtain the logs from another system, some log lines may be nested under structures not sent to Advanced Analytics; for example, a syslog may be nested in a CSV file. Ensure that you remove these structures.

• If you obtain the logs in a custom or non-standard way, like using a proprietary script or from an uncommon log system, remove any redundant escape characters; for example, \", \\\", \r, \n, \t, \\\\, or \{.

### 1. Start from scratch or from a default parser

You can create a custom parser from scratch or duplicate a default, out-of-the-box parser. Consider duplicating a default parser to modify a faulty parser or to use it as a helpful starting point.

• To create a custom parser from scratch, ensure you're on the Custom Parsers tab, then click Add a Parser.

• To duplicate a default, out-of-the-box parser, navigate to the Default Parsers tab, then click DUPLICATE for that parser.

### 2. Import sample logs

Import sample logs that represent the type of information Advanced Analytics typically ingests, so you can create a parser that properly extracts this information.

1. Select sample logs to import:

• To select a log file from your file system, select Add a file, then drag and drop a file or click Select a File. You may upload a .gz or .tgz file that is no more than 100 MB.

• To copy and paste logs, select Copy and paste raw logs, then paste the content into the text box. You may enter up to 100 lines.

2. Next, Auto Parser Generator matches your sample logs to the latest available parsers in the content repository. These are parsers created by Exabeam, and can include parsers added or updated more recently than the latest product release. It also uses other custom parsers you or other team members have created.

To match your sample logs only to out-of-the-box parsers and not other custom parsers, click the Match Exabeam default parsers only box.

Click Find Matching Parsers.

### 3. Determine a subset of the sample logs for which to create a parser

After Auto Parser Generator analyzes the sample logs, it compares the conditions in the sample logs with those in existing parsers, and identifies a match if the conditions are similar. Identifying similar existing parsers helps Auto Parser Generator extract information from the sample logs correctly.

Once Auto Parser Generator has analyzed and matched the sample logs to existing parsers, you must decide which subset of the logs you want to create a parser for. You can create a parser for:

• Logs that match an existing parser you want to duplicate and modify.

• Logs that don't match any existing parsers.

#### Duplicate and modify an existing parser

1. On the [#] Parsers Matched tab, review the existing parsers that match the sample logs. If the parser is one that's unavailable to you because you don't have the latest version of Advanced Analytics or Data Lake, upgrade to the latest product version.

2. (Optional) It's important to ensure the matching parser parses your sample logs correctly. Although Auto Parser Generator uses the parser's conditions to match parsers to your sample logs, parsers use regular expressions to extract values from your logs, not conditions. Even if the parser's conditions match your sample logs, its regular expressions may extract values incorrectly.

For a matched parser, click View Parser Details.

• To view which log values match the parser's event type fields, click the FIELDS tab. Each log is numbered. Each field is listed in the top row.

To view the fields that have a matching value in every log, select Matches. Ensure that the parser has extracted log values to the appropriate event type fields; for example, src_ip should contain an IP address. Click the arrow to view the matching values highlighted in the raw log.

To view the fields that don't have a matching value in every log, select Non-matches.

• To view which log values match the parser conditions, select the CONDITIONS tab. Each log is numbered. The matched log values are listed.

• To edit the matched parser, click Edit, then map event type fields to log values.

3. Click the More menu, select Edit, then map event type fields to log values.

#### Create a parser for logs that don't match any existing parsers

1. Click the Raw Log Lines Without Parsers Matched tab.

2. Review the log lines that didn't match any existing parsers.

### 4. Choose conditions

Conditions are a string, or set of strings, that uniquely exist in specific logs. The Log Ingestion and Messaging Engine (LIME) looks for conditions in your logs to identify the correct parser to use. Determine which conditions must be in a log for it to match your custom parser.

Ensure that you understand how LIME identifies the correct parser to use. Keep in mind that a log is evaluated against custom parsers first, then out-of-the-box parsers. LIME uses the first parser that matches all conditions. If a log doesn't include all conditions, it won't match the parser.

To avoid matching a log with the wrong parser, you must carefully choose conditions that uniquely exist in specific logs. Conditions can't be too general or too strict. Let's take this log as an example:

%{host} KAFKA_CONNECT_SYSLOG: <110>1 2020-04-01T02:08:22.073Z 485cafdca7ac Skyformation - 1192192023904365343 - CEF:0|Skyformation|SkyFormation Cloud Apps Security|2.0.0|sk4-audit-event|audit-event|0|cat=audit cfp3=34.0544 cfp3Label=latitude cfp4=-118.244 cfp4Label=longitude cs6Label=raw-event destinationServiceName=Okta flexString1=app.inbound_del_auth.login_success flexString1Label=application-action src=13.108.238.8 suid=system suser=barbara.salazar request=Success deviceInboundInterface=5d6baf21-742d-11ea-9f5a-7fa07153

If conditions are too general and can match with many logs, the parser may parse logs incorrectly, creating Advanced Analytics events with inaccurate data and distorting Smart Timelines™, rules, and models. For example, if you choose just [“CEF:0|Skyformation”] as a condition, the parser matches any log received from Exabeam Cloud Connectors. Choosing two conditions, like [“CEF:0|Skyformation”,  “destinationServiceName=Okta”] is better, but the parser still matches any Okta log received from Exabeam Cloud Connectors.

If conditions are too strict, the parser can't cover all relevant logs. In general, don't use values for log variables like IP address, time, and host name as conditions. For example, if you choose [“CEF:0|Skyformation”, “destinationServiceName=Okta”, “src=13.108.238.8”] as your conditions, the parser only matches Okta logs from source IP address 13.108.238.8 and that were received from Skyformation; only very specific logs would match all these conditions.
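The first-match behavior described above can be sketched as follows. This is an illustrative simplification, not Exabeam's actual LIME implementation, and the parser names and condition lists are hypothetical:

```python
# Illustrative simplification of first-match parser selection, not the
# actual LIME implementation. A log matches a parser only when every one
# of its condition strings appears in the log; parsers are evaluated in
# order (custom before out-of-the-box) and the first full match wins.

PARSERS = [
    # Hypothetical custom parser: conditions specific enough to single
    # out Okta login-success events forwarded by Skyformation.
    ("custom-okta-login-success",
     ["CEF:0|Skyformation",
      "destinationServiceName=Okta",
      "flexString1=app.inbound_del_auth.login_success"]),
    # Too-general parser: matches any Skyformation CEF log at all.
    ("generic-skyformation", ["CEF:0|Skyformation"]),
]

def match_parser(log):
    """Return the first parser whose conditions all occur in the log."""
    for name, conditions in PARSERS:
        if all(condition in log for condition in conditions):
            return name
    return None
```

With this ordering, only the Okta login-success log matches the specific parser; any other Skyformation CEF log falls through to the too-general one, which is exactly the risk described above.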

To ensure that your conditions aren't too general or strict, it's best if you include conditions that indicate:

• Vendor or product that generated the log; for example, "Windows", "Okta", "Cisco ASA v9.8".

• Log format; for example, "CEF:0", "LEEF:1.0", "LEEF:2.0".

• Event type; for example, "app.inbound_del_auth.login_success", "action=security-alert-high", "vpn-session-started".

If your sample logs don't contain this information, choose other conditions that are unique to your sample log's product or vendor, format, and event type. Let's take this log as an example:

<134>Aug 30 22:35:23 DNSLOG: src=192.25.5.12 spt=53 dst=156.140.56.11 dpt=28317 proto=UDP mid=59898 op=0 fl=|QR|RD|RA| rc=SRVFAIL cl=IN tp=PTR name=70.6.29.113.in-addr.arpa

You could select "DNSLOG:", "src=", "spt=", "dst=", "dpt=", "proto=", "mid=", "fl=", "rc=", and "name=" as your conditions because they are unique to your log and unlikely to appear in logs from other vendors or products.

As a last resort, before the log is ingested, append a value to the log so the parser can easily identify it. For example, append a value like "aa_log_source=DNSLOG_FROM_CISCO_UMBRELLA" to the end of the log and use it as your only condition:

<134>Aug 30 22:35:23 DNSLOG: src=192.25.5.12 spt=53 dst=156.140.56.11 dpt=28317 proto=UDP mid=59898 op=0 fl=|QR|RD|RA| rc=SRVFAIL cl=IN tp=PTR name=70.6.29.113.in-addr.arpa aa_log_source=DNSLOG_FROM_CISCO_UMBRELLA

1. In the list of Original Logs, highlight a string with your cursor.

2. Click Add Condition. Under Original Logs, the string is highlighted in yellow so you can see if all sample logs contain the string.

3. Under Extraction Preview, carefully review the logs that match the conditions you added. To view the logs that match every condition, click the Matches tab. To view the logs that don't match every condition, click the Non-matches tab.

4. Click Next.

### 5. Identify the log vendor and product

The custom parser uses the log vendor and product information as metadata. The custom event builder passes this information on to other components in the Analytics Engine.

1. Select the vendor that generated the logs you imported:

• Click Select a vendor, then select from the list of vendors. If Auto Parser Generator identifies a vendor name in the logs, it appears in the list of recommended vendors.

• To search for a vendor, start typing in the Select a vendor field.

• If you don't find the vendor, manually add it:

1. Click + Add a Vendor.

2. Enter the Vendor Name and Product Name.

3. Click Create.

2. To find the product that generated the logs, click Select a product, then select one from the list. To search for a product, start typing in the Select a product field.

3. Click Next.

### 6. Select an event type

An event type is a schema that all events are mapped to so components — including the Analytics Engine, models, rules, and Smart Timelines — process them consistently. To ensure that your custom event builder creates an event from your sample logs that other components can process correctly, you must categorize your logs into an event type.

To simplify this process, you can only choose one event type. To categorize the logs into another event type, you must create another parser. Typically, after a parser parses a log, the event builder matches the log to an appropriate event type in two ways: using fields in the event builder definition, or the parser name. Currently, an event builder you create with Auto Parser Generator only uses the parser name, and not the event builder definition fields.

For example, consider login-success and login-failure events. These events differ only in the result field: for login-success, result = success; for login-failure, result = failure. If the event builder matched logs to event types using event builder definition fields, one parser could parse both events and the event builder would decide on the appropriate event type based on the result field. Since Auto Parser Generator event builders only use the parser name, you must create two different parsers, then associate each event builder with a different event type.

If you can't find the appropriate event type for your sample logs:

• The event type may be named differently in Auto Parser Generator than what you had in mind. It may help to review a list of all supported event types.

• The event type may not be well suited to detection or investigation in Advanced Analytics because it doesn't indicate much useful information about a user or asset; for example, when a user logs out or terminates a process. Instead, consider storing and investigating the log in Exabeam Data Lake.

1. Find the event type that best describes the sample logs.

• To view the fields required for an event type, click Required Fields.

• To search for an event type, enter a query in the search bar.

• If none of the event types describe your sample logs, post a request on the Exabeam Community Ideas board.

2. Select the event type.

3. Click Next.

### 7. Map event type fields to log values

You see all required, extended, and informational fields for the event type you selected.

For each field, enter a JRegex pattern. To help validate the pattern, you can also enter a key. Since required fields are mandatory to create an event, you must enter a JRegex pattern for these fields. Although extended and informational fields are optional, it's best to enter a JRegex pattern for them as well because they help process and display an event.

There are three ways to enter a JRegex pattern for a field:

• Generate a JRegex pattern from a list of keys.

• Generate a JRegex pattern directly from a value in the sample logs.

• Manually enter a JRegex pattern or key. You can also use this method to edit a JRegex pattern you created using the other two methods.

After you enter a JRegex pattern for each field, ensure the fields are in the correct order, review the matching event type fields and log values, then continue.

#### Generate a JRegex pattern from a list of keys

1. To generate a new JRegex pattern, click +Add another regex.

2. Under KEY, click Select.

3. From the list, select a key, then click Select key. Auto Parser Generator recommends keys by comparing the custom parser to similar existing parsers.

4. Under MAPS TO FIELD, select an event type field from the list. The JRegex pattern is automatically populated.

5. Click the check mark.

6. Under % LINES, verify the percentage of log lines from which the JRegex pattern extracted a value. Next to the event type field, verify the number of values extracted, in parentheses.

#### Generate a JRegex pattern directly from a value in the logs

Sometimes, the host and time fields are required but you can't find appropriate values in your sample logs. If you retrieved your sample logs from a SIEM like Advanced Analytics, manually enter a JRegex pattern using special keys.

1. Under Sample Log Lines, click a highlighted value.

A value is highlighted:

• When the log is in a well-structured format, like CEF, LEEF, or JSON, and Auto Parser Generator can tokenize key/value pairs. If a value isn't highlighted, consider configuring the log so it uses one of these formats.

• When a value matches a JRegex pattern that is commonly used in existing parsers to extract this field, for this event type.

2. From the list, select a field, then click MAP EXTRACTION. The JRegex pattern is automatically populated.

3. Under % LINES, verify the percentage of log lines from which the JRegex pattern extracted a value. Next to the event type field, verify the number of values extracted, in parentheses.

#### Manually enter or edit the JRegex pattern

1. To generate a new JRegex pattern, click +Add another regex.

2. In the REGEX field, enter the JRegex pattern.

Keep in mind:

• You should include a value from the log in the pattern. You must include the field name in the capturing group. If you enter a key, include it in the pattern, outside the capturing group.

For example, let's say you have a field called alert-name. In the sample logs, you have a key, foo=, and values 696 and 636. You can enter a JRegex pattern that is more specific; for example, foo=({alert-name}6[39]6). Or, you can enter a JRegex pattern that is more inclusive; for example, foo=({alert-name}\d+).

• You can enter a JRegex pattern that extracts multiple fields. Consider creating a pattern like this if your log values are concatenated; your logs aren't formatted in key-value pairs, so you're identifying values based on location; or your parser covers multiple, slightly different log formats.

For example, in the s-xml-4663 parser, a single JRegex pattern extracts the process, directory, and process_name fields.

• Sometimes, the host and time fields are required but you can't find appropriate values in your sample logs. If you retrieved your sample logs from a SIEM like Advanced Analytics, you can enter a JRegex pattern using special keys, exabeam_host and exabeam_time; for example, exabeam_host=({host}[\w.\-]+). Your SIEM may add a header to the raw log that includes host and time information, and these special keys parse that information.

3. Click the check mark.

4. Under % LINES, verify the percentage of log lines from which the JRegex pattern extracted a value. Next to the event type field, verify the number of values extracted, in parentheses.
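To try out a pattern outside Auto Parser Generator, you can translate the `({field}...)` capture syntax into standard named groups. A minimal sketch in Python; the translation rule is inferred from the examples in this section, and `to_python_regex` is a hypothetical helper, not part of the product:

```python
import re

# Illustrative only: the ({field}...) capture syntax shown above is
# rewritten to Python's (?P<field>...) named groups so you can test
# extraction locally. The translation rule is inferred from the examples
# in this section; Python group names can't contain hyphens, so they are
# mapped to underscores.

def to_python_regex(jregex):
    return re.sub(
        r"\{([\w-]+)\}",
        lambda m: "?P<" + m.group(1).replace("-", "_") + ">",
        jregex,
    )

pattern = to_python_regex(r"foo=({alert-name}\d+)")
match = re.search(pattern, "severity=high foo=696 user=barbara.salazar")
```

Translating `foo=({alert-name}\d+)` this way yields `foo=(?P<alert_name>\d+)`, which you can run against sample lines locally before entering the JRegex version in the REGEX field.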

#### Reorder JRegex patterns

A parser evaluates its JRegex patterns against a log consecutively from top to bottom. Ensure that your JRegex patterns are in an order that correctly parses your logs.

To reorder an event type field, drag the field to a new place in the list.

If you enter multiple JRegex patterns for the same field, the values that are parsed but won't be extracted are outlined under Sample Log Lines.
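The effect of pattern order can be sketched as follows. The patterns, the field name, and the first-match-wins simplification are assumptions for illustration only:

```python
import re

# Hypothetical patterns for a single field, tried top to bottom; this
# sketch assumes the first pattern that extracts a value wins, so the
# more specific pattern must come before the general fallback.
SRC_IP_PATTERNS = [
    r"src=(?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})",  # dotted-quad address only
    r"src=(?P<src_ip>\S+)",                      # fallback: any token after src=
]

def extract_src_ip(log):
    for pattern in SRC_IP_PATTERNS:
        match = re.search(pattern, log)
        if match:
            return match.group("src_ip")
    return None
```

On `src=192.25.5.12:443`, the specific pattern extracts just the address; if the general fallback were listed first, it would capture the port as well.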

#### Review the matching event type fields and log values

Under Sample Log Lines, carefully review the values that match a field's JRegex pattern for each log.

To view the fields that have a matching value in every log, click the Matches tab. To view the fields that don't have a matching value in every log, click the Non-matches tab.

Once you're done, click Next.

### 8. Enter general information about the parser

The parser's name is used to identify and trace the parser as it's used in the Analytics Engine. Identifying the time format helps to map the dates and times in the log to a Unix timestamp that is displayed in Advanced Analytics Smart Timeline™ events. Information about the log management system is useful metadata that documents how the log was formatted coming into the Analytics Engine.

1. Enter the name, time format, and log management system information:

• Name – Name the parser.

• Time format – Select a format that best matches how dates and times are formatted in the sample logs.

When the format you select matches the dates and times in the sample logs, they are highlighted in yellow under Original Logs.

• Log management system – Select the log management system where the logs were stored.

2. Click Next.
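What the time-format selection accomplishes can be illustrated with Python's `strptime` (an analogy, not the parser's actual mechanism): a format pattern maps the log's timestamp text to the Unix epoch value shown in Smart Timeline events. Syslog timestamps carry no year or time zone, so both are assumed here:

```python
from datetime import datetime, timezone

# "Aug 30 22:35:23" comes from the sample DNSLOG line shown earlier.
# Syslog timestamps omit the year and time zone, so 2020 and UTC are
# assumed purely for the illustration.
raw = "Aug 30 22:35:23"
parsed = datetime.strptime(raw, "%b %d %H:%M:%S").replace(
    year=2020, tzinfo=timezone.utc)
unix_ts = int(parsed.timestamp())  # epoch seconds
```

If the format you select doesn't match the timestamps in your sample logs, the resulting events can land at the wrong time in Smart Timelines, so verify the yellow highlighting under Original Logs before continuing.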

### 9. Review the parser

Carefully review the matching event type fields and values for each log.

1. To view a raw log, click the arrow. To view the matching values highlighted in the raw log, toggle Field highlighting on.

2. Click VIEW UNIQUE FIELD VALUES, then select a field from the list to review the matching log values. The Count indicates how many times the value appears across all logs.

3. Click Next.

### 10. Install the parser and event builder

You've created your own custom parser and event builder. Before the Analytics Engine can use them, you must install them in your environment.