Site Collector Administration Guide

Choose the Right Collector based on Data Sources

Which collector you select, File Collector or Archive Collector, depends on how the data source is generated and updated, not on the total size of your data.

  • File Collector – Select File Collector for files that grow continuously, such as log files to which applications keep appending new data. The File Collector tracks the last line it read in each file and continues collecting data from that point during each fetch interval.

    File Collector is designed to:

    • Track updates and continue processing from the last read position in each file.

    • Resume collection from the last read point during each scheduled fetch.

    • Support large files, up to 20 GB per individual file.

    You can choose File Collector for monitoring and collecting data from active, evolving files.
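As an illustration of the "resume from the last read position" behavior described above, here is a minimal Python sketch. It is not the File Collector's actual implementation. The function name `read_new_lines` and the choice to track position as a byte offset are assumptions made for this example.

```python
def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    `offset` is a byte position, assumed here to be saved between fetches.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only complete lines; a partial trailing line (no newline yet)
    # is left in place and picked up on a later fetch.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Each scheduled fetch would call `read_new_lines` with the offset returned by the previous call, so data that was already collected is not read again.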

  • Archive Collector – Use the Archive Collector for files that are static and do not change after being created. These files are typically compressed backup files such as .zip, .gzip, or .tar.

    Archive Collector is designed to:

    • Monitor for new files being added to a specified location.

    • Work with compressed files that are not modified after creation.

    • Support compressed files up to 2 GB each, which can expand to 15–20 GB uncompressed depending on the compression algorithm used.

      Note

      When the collector processes large files, it can process fewer files in parallel.

    You can choose Archive Collector for environments where data is stored in periodic, packaged archives rather than being appended to continuously.
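The "monitor for new files" behavior can be pictured with the following Python sketch. It is a hypothetical illustration, not the Archive Collector's implementation. The function name `find_new_archives` and the in-memory `seen` set are assumptions. The extension list comes from the examples given above.

```python
import os

# Extensions taken from the examples in the text above.
ARCHIVE_SUFFIXES = (".zip", ".gzip", ".tar")

def find_new_archives(directory, seen):
    """Return archive names in `directory` that are not yet in `seen`.

    Newly found names are added to `seen`, so each archive is reported once.
    """
    new = sorted(
        name
        for name in os.listdir(directory)
        if name.endswith(ARCHIVE_SUFFIXES) and name not in seen
    )
    seen.update(new)
    return new
```

Because the files are assumed never to change after creation, remembering the file name is enough in this sketch. No read position has to be tracked, unlike in the continuously growing files case.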

    Note

    If a process generates new files and a script appends them to existing ones, you can use either type of collector with minor adjustments. To use the Archive Collector in this case, compress each newly generated file into a gzip file and, instead of appending it to an existing file, place the new .gzip file in a designated location. The Archive Collector can then detect and process these new files automatically.
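The adjustment described in this note could be scripted roughly as follows. This is a sketch under stated assumptions: the function name `publish_as_gzip`, the `drop_dir` parameter, and the temporary `.tmp` suffix are all hypothetical. Writing to a temporary name and renaming it afterwards is a design choice of this example, so that a watcher never sees a half-written archive. It assumes the collector ignores files that do not have an archive extension.

```python
import gzip
import os
import shutil

def publish_as_gzip(src_path, drop_dir):
    """Compress `src_path` into `drop_dir` as a new .gzip file; return its path."""
    os.makedirs(drop_dir, exist_ok=True)
    final_path = os.path.join(drop_dir, os.path.basename(src_path) + ".gzip")
    tmp_path = final_path + ".tmp"  # hypothetical staging name
    with open(src_path, "rb") as src, gzip.open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Rename only after the archive is complete.
    os.replace(tmp_path, final_path)
    return final_path
```

The generating script would call `publish_as_gzip` once per newly generated file, in place of the step that appends the file to an existing one.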