Configure the AWS S3 Cloud Collector

Set up the AWS S3 Cloud Collector to continuously ingest events from AWS S3 buckets.

  1. Before you configure the AWS S3 Cloud Collector, ensure that you complete the Prerequisites to Configure the AWS S3 Cloud Collector.

  2. Log in to the New-Scale Security Operations Platform with your registered credentials as an administrator.

  3. Navigate to Collectors > Cloud Collectors.

  4. Click New Collector.

    AWS_S3_collectors_page.png
  5. Click AWS S3.

  6. Enter the following information for the cloud collector:

    AWS_S3_3.png
    • Name – Specify a name for the Cloud Collector instance.

    • Account – Select the AWS account you want to use with this Cloud Collector.

      If you have not yet created an AWS account, in the Account list, select Create Account. For instructions, see Add Accounts for AWS Cloud Collectors.

    • SQS-URL – Enter the SQS URL that you obtained while completing prerequisites.

    • SQS-Region – Select the SQS region. For example, us-east-1.

    • S3-Region – Select the S3 region. For example, us-west-1.

    • SQS Message-Origin – Select how messages reach the queue, based on the integration you are configuring:

      • S3_TO_SNS_TO_SQS – Select this option if S3 sends notifications to SNS, and SNS is configured to forward them to SQS.

      • S3_TO_SQS – Select this option if S3 sends notifications directly to SQS.
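
The two origins differ in the shape of the SQS message body. The following Python sketch is not the collector's implementation; it only illustrates the standard AWS notification shapes, assuming that SNS raw message delivery is disabled so the S3 notification arrives as a JSON string in the Message field of the SNS envelope. The function name s3_objects_from_sqs_body and the sample bucket and key are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body, message_origin):
    """Return (bucket, key) pairs referenced by one SQS message body."""
    payload = json.loads(body)
    if message_origin == "S3_TO_SNS_TO_SQS":
        # SNS wraps the S3 notification as a JSON string in "Message".
        payload = json.loads(payload["Message"])
    elif message_origin != "S3_TO_SQS":
        raise ValueError(f"unknown message origin: {message_origin}")

    objects = []
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical sample notification, reduced to the fields used above.
s3_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "logs/2024/app+log.json"},
            }
        }
    ]
}

direct_body = json.dumps(s3_event)  # S3_TO_SQS
sns_body = json.dumps(              # S3_TO_SNS_TO_SQS
    {"Type": "Notification", "Message": json.dumps(s3_event)}
)

print(s3_objects_from_sqs_body(direct_body, "S3_TO_SQS"))
print(s3_objects_from_sqs_body(sns_body, "S3_TO_SNS_TO_SQS"))
```

Both calls resolve to the same bucket and key. If the selected origin does not match how the queue is actually fed, the expected fields are missing from the parsed body, which is why the setting has to match the integration.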

    • File Processing – By default, the cloud collector processes Events per Line. You can change the file processing to process events embedded in a JSON Array or Object. If you choose JSON Array or Object, you can optionally specify the JSON Path To Events (Period Delimited).

      The AWS S3 Cloud Collector can extract events from JSON Objects or JSON Arrays at any nesting level when you specify a period-delimited path to the event object or event array.

      Refer to the following examples with multiple JSON formats.

      Example 1: In this example, k3.k3_2 is the JSON path to the following embedded JSON array.

      {
          "k1": "v1",
          "k2": [
              "v2_1",
              "v2_2"
          ],
          "k3": {
              "k3_1": "v3_1",
              "k3_2": [
                  {
                      "obj1Key": "obj1Val"
                  },
                  {
                      "obj2Key": "obj2Val"
                  }
              ]
          }
      }

      Example 2: In this example, the top-level JSON array contains multiple JSON objects, each with its own k3.k3_2 JSON array. With k3.k3_2 as the JSON path, the collector ingests the contents of each k3.k3_2 JSON array as individual log events.

      [{
          "k1": "v1",
          "k2": [
              "v2_1",
              "v2_2"],
          "k3": {
              "k3_1": "v3_1",
              "k3_2": [
                  {
                      "obj1Key": "obj1Val"},
                  {
                      "obj2Key": "obj2Val"}
              ]
          }
      },
      {
          "k1": "v1",
          "k2": [
              "v2_1",
              "v2_2"],
          "k3": {
              "k3_1": "v3_1",
              "k3_2": [
                  {
                      "obj1Key": "obj1Val"},
                  {
                      "obj2Key": "obj2Val"}
              ]
          }
      },
      {
          "k1": "v1",
          "k2": [
              "v2_1",
              "v2_2"],
          "k3": {
              "k3_1": "v3_1",
              "k3_2": [
                  {
                      "obj1Key": "obj1Val"},
                  {
                      "obj2Key": "obj2Val"}
              ]
          }
      }]

      Example 3: In the following example, if JSON PATH TO EVENTS is specified as events.payload, the cloud collector extracts payload (JSON Object) from each element of the events (JSON Array) and ingests it as an event.

      {
        "cursor": {
          "partition": "0"
        },
        "events": [
          {
            "metadata": {
              "occurred_at": "2024-02-05T19:30:27.517754Z"
            },
            "payload": {
              "cluster_alias": "bi-kub-test",
              "event": {
                "verb": "get"
              }
            }
          },
          {
            "metadata": {
              "occurred_at": "2024-02-05T19:30:27.515269Z"
            },
            "payload": {
              "cluster_alias": "bi-kub-test",
              "event": {
                "verb": "update"
              }
            }
          }
        ],
        "info": {
          "debug": "Stream started"
        }
      }

      If you do not specify a path for the JSON Array, the cloud collector assumes the events are elements of the array.

      [
          {
              "obj1Key": "obj1Val"
          },
          {
              "obj2Key": "obj2Val"
          }
      ]
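
The path resolution described in these examples can be sketched in Python. This is an illustration rather than the collector's actual code: the function name extract_events is hypothetical, and the sketch assumes that an array encountered along the path is traversed element by element and that an array reached at the end of the path yields one event per element.

```python
import json


def extract_events(document, path=""):
    """Collect events from parsed JSON using a period-delimited path."""
    nodes = [document]
    for key in (path.split(".") if path else []):
        next_nodes = []
        for node in nodes:
            # Fan out over arrays so the key is applied to each element.
            candidates = node if isinstance(node, list) else [node]
            for item in candidates:
                if isinstance(item, dict) and key in item:
                    next_nodes.append(item[key])
        nodes = next_nodes

    events = []
    for node in nodes:
        if isinstance(node, list):
            # Assumption: an array at the end of the path gives one event per element.
            events.extend(node)
        else:
            events.append(node)
    return events


# Same shape as Example 1: the path k3.k3_2 points at an embedded array.
example_1 = json.loads(
    '{"k1": "v1", "k3": {"k3_1": "v3_1",'
    ' "k3_2": [{"obj1Key": "obj1Val"}, {"obj2Key": "obj2Val"}]}}'
)

# Same shape as Example 3: events.payload selects an object per array element.
example_3 = json.loads(
    '{"events": [{"payload": {"event": {"verb": "get"}}},'
    ' {"payload": {"event": {"verb": "update"}}}]}'
)

print(extract_events(example_1, "k3.k3_2"))
print(extract_events(example_3, "events.payload"))
print(extract_events([{"obj1Key": "obj1Val"}, {"obj2Key": "obj2Val"}]))
```

With an empty path and a top-level array, the sketch returns the array elements themselves, which matches the behavior described when no path is specified.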

  7. (Optional) SITE – Select an existing site. To create a new site with a unique ID, click manage your sites. Adding a site name helps you efficiently manage environments with overlapping IP addresses.

    By entering a site name, you associate the logs with a specific independent site. A sitename metadata field is automatically added to all events ingested through this collector. For more information about Site Management, see Define a Unique Site Name.

  8. (Optional) TIMEZONE – Select a time zone applicable to you for accurate detections and event monitoring.

    By entering a time zone, you override the default log time zone. A timezone metadata field is automatically added to all events ingested through this collector.

    Timezone_sitename_site_management_1.png
  9. (Optional) Add filter conditions using regex syntax to include and exclude logs.

    Note

    This feature is available as a part of the early access program. To participate, see Sign Up for the Early Access Program.

    Azure_Event_Hub_Regex_Egress_filtering.png
    • In the Allowed Conditions section, add conditions to include logs to be sent to the New-Scale Security Operations Platform.

      For example, for EventCode that matches 100X, use the Allowed Condition EventCode=100[0-9].

      Raw log:

      LogName=Application EventCode=1001 EventType=4 ComputerName=windows-splunk-forwarder-vp-23 SourceName=Windows Error Reporting Type=Information RecordNumber=168946846 Keywords=Classic TaskCategory=None OpCode=Info Message=Fault bucket , type 0 Event Name: APPCRASH Response: Not available Cab Id: 0
    • In the Deny Conditions section, add conditions for the logs that you don't want to send to the New-Scale Security Operations Platform.

      For example, to exclude logs with an EventType between 1 and 5, use the Deny Condition EventType=[1-5].

      Raw log:

      LogName=Application EventCode=1001 EventType=4 ComputerName=windows-splunk-forwarder-vp-23 SourceName=Windows Error Reporting Type=Information RecordNumber=168946846 Keywords=Classic TaskCategory=None OpCode=Info Message=Fault bucket , type 0 Event Name: APPCRASH Response: Not available Cab Id: 0
      Azure_Event_Hub_Regex_Egress_filtering2.png
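
The interaction of the two example conditions can be approximated with Python's re module. This sketch rests on assumptions that this guide does not confirm: that patterns are matched anywhere in the raw log (search semantics), that a log with no Allowed Conditions defined is included by default, and that a matching Deny Condition takes precedence. The function name passes_filters is hypothetical, and the raw log is shortened from the sample above.

```python
import re


def passes_filters(raw_log, allowed, denied):
    """Return True if the log would be forwarded under the assumed semantics."""
    # Assumption: with no allowed conditions, every log starts out included.
    included = not allowed or any(re.search(p, raw_log) for p in allowed)
    # Assumption: a matching deny condition always wins.
    excluded = any(re.search(p, raw_log) for p in denied)
    return included and not excluded


raw_log = (
    "LogName=Application EventCode=1001 EventType=4 "
    "ComputerName=windows-splunk-forwarder-vp-23"
)

# Allowed condition only: EventCode=1001 matches EventCode=100[0-9].
print(passes_filters(raw_log, [r"EventCode=100[0-9]"], []))

# Adding the deny condition: EventType=4 matches EventType=[1-5].
print(passes_filters(raw_log, [r"EventCode=100[0-9]"], [r"EventType=[1-5]"]))
```

Under search semantics, an unanchored pattern such as EventType=[1-5] also matches EventType=10, because the pattern only needs to match a prefix of the value. Appending a word boundary (for example, EventType=[1-5]\b) is one way to restrict the match to single-digit values. Use Check Filters to confirm how the collector itself evaluates a pattern.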
  10. Click Check Filters to verify that the filters return the expected results. Then add log files or paste raw logs in the right pane to confirm that the filters you set work as intended.

    Azure_Event_Hub_Regex_Egress_filtering_3_1.png
  11. Click Import.

    If you specify the Allowed and Deny conditions, the cloud collector processes logs based on your conditions. Ensure that the regex pattern that you specify is valid.

    Refer to the following screenshot as an example of the logs that are processed based on the Allowed and Deny conditions.

    Azure_Event_Hub_Regex_Egress_filtering_3.png

    Note

    Using egress filters affects the performance of the collector and decreases overall EPS. The more complex the filters, the greater the impact on collector performance. For example, with three filters, overall EPS can decrease by approximately 15% to 20%, depending on filter complexity.
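
To put the 15% to 20% range into numbers, the short Python sketch below applies it to a baseline throughput. The baseline of 10,000 EPS is a hypothetical figure chosen for illustration, not a measured or documented value, and the helper name eps_after_filters is also hypothetical.

```python
def eps_after_filters(baseline_eps, decrease_percent):
    """Return the EPS that remains after a given percentage decrease."""
    # Integer arithmetic keeps the example free of floating-point rounding.
    return baseline_eps * (100 - decrease_percent) // 100


baseline_eps = 10_000  # hypothetical baseline, for illustration only

best_case = eps_after_filters(baseline_eps, 15)
worst_case = eps_after_filters(baseline_eps, 20)
print(f"Expected range with three filters: {worst_case} to {best_case} EPS")
```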

  12. To verify that the cloud collector communicates with the AWS service, click Test Connection.

    Note

    You can start the cloud collector via the user interface only if the test connection during cloud collector configuration is successful.

  13. Click Install.

    AWS_S3_2.png

    A confirmation message informs you that the new Cloud Collector is created.