Cross-cluster Search in Exabeam Data Lake

Hardware and Virtual Deployments Only

Cross-cluster searches enable queries across one or more trusted and connected clusters at a time. Any cluster can be used to issue a search across multiple clusters accessible to it. The querying cluster can execute search requests across a list of selected clusters for each search query. These searches allow you to comprehensively examine data across all clusters.

Cross-cluster search is well suited to customers who have multiple Data Lake clusters, especially those with:

  • Geographically distributed data centers

  • Regional sites (such as satellite offices) that have their own clusters alongside a central cluster

  • A need to manage smaller clusters instead of a single large cluster

Warning

Export of cross-cluster search results is limited to 10 million events per search query for on-premises deployments. Up to 1,000,000 events per search query can be exported in CSV format.

Prerequisites for Exabeam Data Lake Cross-cluster Search

Cross-cluster searches can run only between configured trusted clusters within a routable network. For cross-cluster searches to work, the following requirements must be met:

  • Peer clusters must have secure routing. The CA (Certificate Authority) certificate must be shared amongst querying and remote clusters to enable encrypted communication across clusters.

  • There must be sufficient bandwidth between clusters to support network traffic from results aggregation.

  • The path between peer clusters must be routable, with necessary firewall traversal. (Firewall ports should have network subnet or whitelist restrictions to protect access to the clusters.)

  • Port 9300 must be open on participating clusters so that they are accessible for searches. For more information about port configurations, see Network Ports.

  • The necessary permissions must be set up for users and roles on each participating cluster.
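
The routability and port requirements above can be spot-checked from the querying cluster before any remote cluster is registered. The following is a minimal sketch, not an Exabeam tool: the function names are hypothetical, and it only confirms that a TCP connection to port 9300 can be opened. It does not verify certificates, bandwidth, or permissions.

```python
import socket

CROSS_CLUSTER_PORT = 9300  # port named in the prerequisites above


def is_port_reachable(host: str, port: int = CROSS_CLUSTER_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False


def check_peers(hosts: list[str], port: int = CROSS_CLUSTER_PORT) -> dict[str, bool]:
    """Map each peer host to whether its cross-cluster port is reachable."""
    return {host: is_port_reachable(host, port) for host in hosts}
```

For example, `check_peers(["10.0.1.10", "10.0.2.10"])` (placeholder addresses) returns a dictionary that marks each master node as reachable or not. A `False` result usually points to a firewall rule or routing gap between the clusters.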

Remote Cluster Management for Exabeam Data Lake Cross-cluster Search

Data Lake allows the Administrator to add clusters to perform cross-cluster searches. The cluster list is subject to the following rules:

  • Only registered clusters can be queried.

  • All member clusters must have CA credentials.

Each participating cluster must be set up to allow searches to be issued from it. You need to provide the remote cluster IP address and cross-cluster permissions for cross-cluster connections to work correctly.

Note

The greater the number of clusters, the greater the network load and search latency. For network requirements, see Prerequisites.

Register a Remote Cluster in Exabeam Data Lake for Cross-cluster Search

For remote clusters to be queried using cross-cluster search, each remote cluster must:

  • Have a CA certificate file.

  • Be registered individually in that cluster's registry.

To download a certificate on a remote cluster:

  1. Navigate to Settings > Cluster Management.

  2. Select Download certificate. The download starts automatically. Check your downloads directory for the certificate file.

    DL-Settings-ClusterManagement-DownloadCert.jpg

Note

Do not rename the certificate file to include blank spaces or special characters such as ( ), [ ], @, !, #, etc. Such files will not be accepted. It is advised to append the geo-location or group name of the cluster to the file name to differentiate one certificate from another. For example, ca.pem can be renamed ca-west.pem.
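
The naming rule in the note above can be checked before a certificate is uploaded. This is a minimal sketch under a stated assumption: the exact set of characters that Data Lake accepts is not documented here, so the code conservatively allows only letters, digits, dots, hyphens, and underscores. Both function names are hypothetical.

```python
import re

# Assumption: a conservative allowlist. The product may accept more characters.
_SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_cert_name(filename: str) -> bool:
    """Return True if the name has no spaces or special characters."""
    return _SAFE_NAME.fullmatch(filename) is not None


def tagged_cert_name(filename: str, tag: str) -> str:
    """Append a location or group tag before the file extension."""
    stem, dot, ext = filename.rpartition(".")
    # rpartition returns an empty separator when the name has no extension.
    candidate = f"{stem}-{tag}.{ext}" if dot else f"{filename}-{tag}"
    if not is_safe_cert_name(candidate):
        raise ValueError(f"unsafe certificate file name: {candidate!r}")
    return candidate
```

With this sketch, `tagged_cert_name("ca.pem", "west")` produces `ca-west.pem`, matching the example in the note, while a tag that contains a space raises `ValueError`.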

To register a remote cluster:

  1. Navigate to Settings > Cluster Management > Clusters.

    DL-Settings-ClusterManagement-Clusters.jpg
  2. Click ADD CLUSTERS.

    DL-Settings-Clusters-AddClusters.jpg
  3. Fill in the cluster Name and Hostname or IP of the master node, then click UPLOAD THE CERTIFICATE to attach the CA certificate to the cluster.

    DL-Settings-Clusters-AddClusterUI.jpg

    Note

    Up to two hostnames or IP addresses are supported. Two hostnames or IPs are recommended for mid-sized clusters of fewer than 10 nodes. To determine the identities of the first and second hosts, see host1.yml and host2.yml in /opt/exabeam_installer/host_vars.

  4. Click ADD to apply the configuration.

  5. Repeat steps 2-4 for all remote clusters before proceeding to step 6.

  6. After adding all remote clusters, click APPLY ALL to apply the changes. Some nodes are restarted, so the current cluster may be slow for roughly 10-30 minutes, depending on the cluster size. Adding all remote clusters before clicking APPLY ALL lets a single restart pick up all of the new configurations. It is also recommended to perform this step during scheduled downtime.

Exabeam Data Lake Cross-cluster Health Monitoring and Handling

The availability of a participating remote cluster is shown with color indicators on the cluster list. At the time of a cross-cluster query, only active clusters are queried. Clusters in any state other than available are not queried for results. The status of each cluster is shown in the Availability column and is updated at the time of querying.

Green = enabled and available cluster

Grey = disabled and unavailable cluster; it is passed over during querying

Red = enabled but unavailable cluster; it is passed over during querying

Yellow = enabled cluster whose availability is unknown

Note

The yellow cluster status usually changes to the actual status of green or red within 5 seconds after a cluster is first set up.

Clusters that are not shown as available are those that did not respond to health checks in a timely manner. The cause can be a delayed response due to load issues or no response at all due to an outage. In either case, such clusters should be investigated. The statuses of restored or repaired clusters are updated by the health check before the next cross-cluster search is run.
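
The color indicators and the query-time filtering described above can be summarized in a short model. This is an illustrative sketch only: the `RemoteCluster` type and both functions are hypothetical, not part of Data Lake, and the sketch assumes that a cluster with unknown (yellow) availability is skipped until its status resolves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteCluster:
    name: str
    enabled: bool
    available: Optional[bool]  # None means the health check has not answered yet


def status_color(cluster: RemoteCluster) -> str:
    """Map a cluster's state to the indicator colors listed above."""
    if not cluster.enabled:
        return "grey"
    if cluster.available is None:
        return "yellow"
    return "green" if cluster.available else "red"


def queryable(clusters: list[RemoteCluster]) -> list[RemoteCluster]:
    """Keep only clusters that a cross-cluster query would include.

    Assumption: only green (enabled and available) clusters are queried.
    """
    return [c for c in clusters if status_color(c) == "green"]
```

In this model a disabled cluster is always grey, whatever its health, which mirrors the manual enable and disable controls in the Clusters submenu.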

DL-Settings-Clusters-Availability.jpg

How to Enable/Disable/Delete Exabeam Data Lake Remote Clusters for Cross-cluster Search

A registered cluster in the cross-cluster search list can be manually taken out of or reinstated into the query pool from the Clusters submenu, regardless of the health of the cluster.

To change the availability of a cluster:

  1. Navigate to Settings > Cluster Management > Clusters.

  2. Click the vertical ellipsis to open the status menu.

  3. Select the cluster status:

    • Enable -- Reinstate a cluster into the query pool. Use this status if the cluster was previously removed and has been returned to working order.

    • Disable -- Take a cluster out of use in the query pool. You may use this status to prevent the use of a cluster that is under repair or maintenance.

    • Delete -- Permanently disassociate the cluster from the query pool. You may use this status to remove clusters that will no longer participate in the query pool. The cluster itself is not deleted and can be re-registered.

DL-Settings-Clusters-ClusterStatus-DisableEditDelete.jpg

Exabeam Data Lake Remote Cluster Data Access Permissions for Cross-cluster Search

To run cross-cluster searches, in addition to registering remote clusters, access permissions need to be configured. Access is managed in tiers: globally, and locally at each cluster. Permission to execute cross-cluster searches is configured in two parts:

Role-Based Permissions

You can restrict the user roles that are allowed to perform a cross-cluster search. By default, only users with the Administrator role have full permissions to create and configure which roles can perform cross-cluster searches. Role-based permissions must be configured at each cluster.

Warning

Cross-cluster searches are performance intensive, so Exabeam highly recommends limiting the number of users and roles who can perform cross-cluster searches.

For more information on role-based permissions and configuration, see User Management > Role-based Access Control.

Cluster-Based Permissions

Each cluster in the cluster list is configured individually and is independent of its peers. You can control which user roles are allowed to execute a cross-cluster search on each individual cluster.

Note

Cluster permissions supersede role permissions. Even if your role has permission to execute a cross-cluster search, clusters that are not configured with access for your role will not grant permission for you to perform a search.
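
The interaction between the two permission tiers can be expressed as a simple check. The sketch below is a hypothetical model, not the Data Lake implementation: the function names, the role names, and the data structures are assumptions chosen for illustration. It shows that a role needs both the cross-cluster permission and an access grant on each specific cluster.

```python
def can_search_cluster(role: str, cluster: str,
                       cross_cluster_roles: set[str],
                       cluster_access: dict[str, set[str]]) -> bool:
    """Return True only if both permission tiers allow the search.

    cross_cluster_roles: roles granted the cross-cluster search permission.
    cluster_access: for each cluster name, the roles granted access to it.
    """
    has_role_permission = role in cross_cluster_roles
    has_cluster_access = role in cluster_access.get(cluster, set())
    return has_role_permission and has_cluster_access


def searchable_clusters(role: str,
                        cross_cluster_roles: set[str],
                        cluster_access: dict[str, set[str]]) -> list[str]:
    """List the clusters a role may include in a cross-cluster search."""
    return sorted(
        cluster for cluster in cluster_access
        if can_search_cluster(role, cluster, cross_cluster_roles, cluster_access)
    )
```

For instance, if a hypothetical `analyst` role holds the cross-cluster permission but only the `west` cluster lists `analyst` in its access settings, `searchable_clusters` returns `["west"]` and every other cluster is excluded.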

To configure access per cluster:

  1. Navigate to Settings > Cluster Management > Clusters.

  2. Select the cluster to configure in the left panel. The corresponding Access Management menu will appear to the right.

  3. Select the checkboxes for the user roles that can run cross-cluster searches.

    DL-Settings-Clusters-AccessManagement.jpg
  4. Click SAVE to save the configuration.

When remote clusters have been successfully incorporated, they appear as selectable options in the cluster menu above the Search field.

DL-Search-CrossClusterUI.jpg