Chapter 6. Insights client data redaction


Red Hat Insights handles data collection by using Insights core collection. For redaction purposes, core collection replaces the earlier, more limited collection method that relied on JSON files (.cache.json and .uploader.json) and shell scripts. In addition, core collection:

  • Is compatible with Insights client 3.0 and later.
  • Uses YAML files to determine which commands and files to redact.
  • Uses robust Python data cleaning processes.
  • Supports complex data collection by using datasources, which avoids the possible limitations of JSON files (such as uploader.json) and shell scripts.
  • Relies on datasources that are part of insights-core, which makes them easier to maintain.

For more information about Red Hat Insights client core collection, datasources, and collection rules, see the following resources:

Important

Using .cache.json and uploader.json files to redact data is no longer supported by Red Hat. Use Datasources instead.

Preventing the collection of Personally Identifiable Information (PII)

Red Hat Insights for Red Hat Enterprise Linux collects a minimal amount of data, including data that might contain personally identifiable information (PII). To prevent PII (or other configuration data) from being collected, apply data redaction.

For information about how Red Hat Insights for Red Hat Enterprise Linux handles data collection, see Red Hat Insights data and application security.

6.1. Creating and configuring the required redaction YAML files

To redact data in Red Hat Insights, you need Insights client 3.0 or later and the following YAML configuration files to control the redaction actions:

  • file-redaction.yaml
  • file-content-redaction.yaml

You can use one or both files, depending on the content you want to redact.

To find the items to redact, Insights client uses the default settings in the insights-client.conf configuration file to locate the file-redaction.yaml and file-content-redaction.yaml files. The following example shows the default redaction configuration in the insights-client.conf file:

# Location of the redaction file for commands, files, and components
#redaction_file=/etc/insights-client/file-redaction.yaml

# Location of the redaction file for patterns and keywords
#content_redaction_file=/etc/insights-client/file-content-redaction.yaml

While you do not need to change the configuration of the insights-client.conf file, you do need to create the YAML files.

Important

Red Hat no longer supports the use of the remove.conf configuration file to redact data.

How the YAML files work

The Insights client /etc/insights-client/file-redaction.yaml file lists commands and files that you want to redact. A Python data cleaning process runs on the file-redaction.yaml file and redacts the listed commands and files.

Note

When the Python data cleaning process runs, it redacts the specified content before adding it to the archive file.

The /etc/insights-client/file-content-redaction.yaml defines pattern redaction and keyword replacement. For pattern redaction, the process redacts patterns or regular expressions that match those specified in the YAML file. For keyword replacement, the process replaces the specified keywords with generic identifiers.
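
To make the difference between the two methods concrete, the following sketch imitates them with grep and sed on a few sample lines. It is an illustration only, not the insights-core implementation. The sample text and the placeholder names keyword0 and keyword1 are hypothetical.

```shell
#!/bin/sh
# Illustration only: imitates the two redaction behaviors with grep and sed.
# This is not how insights-core is implemented.

sample='host=web01 ip=1.1.1.1
token a_string_1 secret
owner=My Name'

# Pattern redaction: drop every line that contains a listed pattern.
redact_patterns() {
    grep -v -F -e 'a_string_1' -e 'a_string_2'
}

# Keyword replacement: keep the line, but swap each keyword for a
# generic identifier (keyword0 and keyword1 are hypothetical names).
replace_keywords() {
    sed -e 's/1\.1\.1\.1/keyword0/g' -e 's/My Name/keyword1/g'
}

result=$(printf '%s\n' "$sample" | redact_patterns | replace_keywords)
printf '%s\n' "$result"
```

In this sketch, the line that contains a_string_1 disappears entirely, while the other two lines remain with only the keywords replaced.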

6.1.1. Configuring file-redaction.yaml to redact commands and system files

You can create the /etc/insights-client/file-redaction.yaml file and include a list of commands and system files that you want redacted. When the data redaction takes place, a Python data cleaning process runs, and analyzes the contents of the YAML file.

Note

The output of the listed commands and the content of the listed files are not included in the uploaded archive file.

Prerequisites

  • You must be familiar with the basics of YAML syntax. For more information about YAML, see yaml.org.
  • You must have root-level access to the system.

Procedure

  1. Use an editor to create the /etc/insights-client/file-redaction.yaml file.
  2. Enter the strings, files: and commands:, on separate lines in the YAML file.

    files:
    
    
    commands:
  3. Enter the files and commands that you want to redact.

    1. On the line following files:, enter the files that you want to redact. Use the information in the Datasources catalog to identify which files and commands to specify. For example, if you want to redact the auditd.conf file, this is how it would look:

      files:
        - /etc/audit/auditd.conf
    2. On the line following commands:, enter the commands that you want to redact. Use the information in the Datasources catalog to identify which commands to specify. For example, if you want to redact the ethtool -i command, this is how it would look:

      commands:
        - ethtool_i
  4. Save the YAML file in /etc/insights-client/.
  5. Verify that only the root owner can read and write the file-redaction.yaml file. As root, run ll file-redaction.yaml from the /etc/insights-client/ directory.

    [root@insights]# ll file-redaction.yaml
    -rw-------. 1 root root 145 Sep 25 17:39 file-redaction.yaml
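
The procedure above shows the expected permissions but not a command that sets them. The following is a minimal sketch of one way to reach the -rw------- mode shown in the listing. It assumes GNU coreutils (for stat -c) and uses a temporary file as a stand-in so that it can run without root; on a real system, the target would be /etc/insights-client/file-redaction.yaml and the commands would run as root.

```shell
#!/bin/sh
# Sketch only: restrict a redaction file to its owner and confirm the result.
# Assumption: GNU coreutils is available (stat -c).

# Stand-in for /etc/insights-client/file-redaction.yaml
target=$(mktemp)

chmod 600 "$target"              # read and write for the owner only
mode=$(stat -c '%a' "$target")   # numeric permission bits
owner=$(stat -c '%U' "$target")  # owning user name

echo "mode=$mode owner=$owner"
rm -f "$target"
```

A mode of 600 corresponds to the -rw------- string in the ll output. For the real file, the owner is expected to be root.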

Example file-redaction.yaml file with comments

The following example shows a sample file-redaction.yaml file that includes commands and files to redact. Comments, which are lines preceded by a hash symbol (#), also offer guidance to help you configure the YAML file.

# file-redaction.yaml
---
# Redact the entire output of commands
# Specify commands by either full command or by the "symbolic_name", for example "ethtool_i".
# Refer to the "Datasource Catalog" and "General Datasources" at https://insights-core.readthedocs.io/en/latest/specs_catalog.html#general-datasource for a full list of available symbolic_names, and the commands and files they correspond to.

commands:
- /bin/rpm -qa
- /bin/ls
- ethtool_i

# Redact the entire output of files
# Specify files either by full filename or by the "symbolic_name", for example "cluster_conf".
# Refer to the "Datasource Catalog" and "General Datasources" at https://insights-core.readthedocs.io/en/latest/specs_catalog.html#general-datasource for a full list of available symbolic_names, and the commands and files they correspond to.

files:
- /etc/audit/auditd.conf
- cluster_conf

Verification step

To verify that your redaction file is working, you can run the insights-client command with the --no-upload option, then review the output messages on your console or terminal.

  1. On the command line, enter the insights-client command with the --no-upload option, and press Return.

    [root@insights]# insights-client --no-upload
  2. The command runs and displays informational messages. The following example shows the redaction of the dmesg command and the cluster.conf file.

    WARNING: Excluding data from files
    Starting to collect Insights data for I-HOST
    WARNING: Skipping command /bin/dmesg
    WARNING: Skipping file /etc/cluster/cluster.conf
    Archive saved at /var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz

The generated archive file is saved under /var/tmp, but it is not uploaded to Red Hat.

6.1.2. Configuring YAML pattern and keyword redaction

The /etc/insights-client/file-content-redaction.yaml file redacts files using two methods: pattern redaction and keyword replacement. Pattern redaction uses either a pattern match or regular expression match. In keyword replacement, a Python data cleaning process replaces the keyword with a generic identifier.

Prerequisites

  • You must be familiar with the basics of YAML syntax. Explaining YAML is beyond the scope of this procedure.
  • You must have root-level access to the system.

Procedure

  1. Use an editor to create the /etc/insights-client/file-content-redaction.yaml file.

    Example

    # file-content-redaction.yaml
    ---
    # Pattern redaction per matching line
    #  Lines that match a pattern are excluded from files and command output.
    #  Patterns are processed in the order that they are listed.
    # Example
    
    patterns:
     - "a_string_1"
     - "a_string_2"
    
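    # Regular expression pattern redaction per line
    #  Use "regex:" to wrap patterns with regular expressions
    #  A YAML mapping cannot repeat a key, so keep only one "patterns:" block
    #  in the file: either the plain list above or the "regex:" form below
    # Example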
    
    patterns:
     regex:
     - "abc.*def"
     - "localhost[[:digit:]]"
    
    
    # Keyword replacement redaction
    #  Replace keywords in files and command output with generic identifiers
    #  Keyword does not support regex
    # Example
    
    keywords:
    - "1.1.1.1"
    - "My Name"
    - "a_name"

  2. Make sure the file-content-redaction.yaml file permissions are set for root owner only.

    [root@insights]# ll file-content-redaction.yaml
    -rw-------. 1 root root 145 Sep 25 17:39 file-content-redaction.yaml
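
Before adding a regular expression to the YAML file, it can help to preview which lines it would match. The following sketch uses grep -E on hypothetical sample lines with the two expressions from the example. It assumes that the client interprets regex entries in a way that is compatible with POSIX extended regular expressions, which the [[:digit:]] class in the example suggests but which this procedure does not state.

```shell
#!/bin/sh
# Sketch only: preview the lines that a regular expression would match.
# Assumption: regex entries behave like POSIX extended regular expressions.

sample='abc-123-def
localhost7 is up
localhost is up
unrelated line'

# Lines printed here are the ones that pattern redaction would exclude
matched=$(printf '%s\n' "$sample" | grep -E -e 'abc.*def' -e 'localhost[[:digit:]]')
printf '%s\n' "$matched"
```

Only the first two sample lines match. The line "localhost is up" does not match because localhost[[:digit:]] requires a digit immediately after the host name.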

6.2. Verifying the Insights client archive

You can verify the contents of the archive file. By inspecting the archive file, you can confirm what data is sent to Red Hat.

If you use obfuscation or redaction, you can inspect the archive before it uploads. If you want to preserve the archive file, you can keep it on your system.

6.2.1. Verifying the archive before uploading

To inspect the archive before it is uploaded to Red Hat Insights for Red Hat Enterprise Linux, run the insights-client command with the --no-upload option. The client saves the archive locally without uploading it. This allows you to view the information that the client sends to Insights for Red Hat Enterprise Linux, and to verify your obfuscation or redaction settings.

The archive file is stored in the /var/tmp/ directory. When insights-client completes, it displays the file name.

Prerequisites

  • Make sure the /etc/insights-client/insights-client.conf file is correctly configured.

Procedure

  1. Enter the insights-client command with the --no-upload option.

    [root@insights]# insights-client --no-upload

    The command displays informational messages when redaction or obfuscation is applied.

    WARNING: Excluding data from files
    Starting to collect Insights data for ITC-4
    WARNING: Skipping patterns found in remove.conf
    WARNING: Skipping command /bin/dmesg
    WARNING: Skipping command /bin/hostname
    WARNING: Skipping file /etc/cluster/cluster.conf
    WARNING: Skipping file /etc/hosts
    Archive saved at /var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz
  2. Navigate to the temporary storage directory as shown in the Archive saved at message.

    [root@insights]# cd /var/tmp/qsINM9/
  3. Unpack the compressed tar.gz file.

    [root@insights]# tar -xzf insights-ITC-4-20190925180232.tar.gz

    The tar command creates a new directory that contains the files.
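
After unpacking, one way to confirm that keyword replacement worked is to search the extracted files for the original strings. The following is a minimal sketch that uses a throwaway directory with hypothetical file contents in place of the unpacked archive. On a real system, archive_dir would point to the directory that tar created, and the search terms would be the keywords from your file-content-redaction.yaml file.

```shell
#!/bin/sh
# Sketch only: search an unpacked archive for strings that should be redacted.
# A temporary directory with sample files stands in for the unpacked archive.
archive_dir=$(mktemp -d)
printf 'ip=keyword0\n' > "$archive_dir/hosts"
printf 'owner=keyword1\n' > "$archive_dir/notes"

# -r searches recursively, -F treats the terms as fixed strings,
# and -l prints only the names of files that contain a match.
leaks=$(grep -rlF -e '1.1.1.1' -e 'My Name' "$archive_dir" || true)

if [ -z "$leaks" ]; then
    status="clean"
else
    status="found"
    printf '%s\n' "$leaks"
fi
echo "$status"
rm -rf "$archive_dir"
```

If the search prints any file names, those files still contain a string that you intended to redact.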

6.2.2. Verifying the Insights client archive after uploading

To keep a copy of the archive for inspection after it is uploaded to Red Hat Insights for Red Hat Enterprise Linux, run the insights-client command with the --keep-archive option. This allows you to verify the information that the client sends to Insights for Red Hat Enterprise Linux, and to verify your obfuscation or redaction settings.

Prerequisites

  • Make sure the /etc/insights-client/insights-client.conf file is correctly configured.

Procedure

  1. Enter the insights-client command with the --keep-archive option.

    [root@insights]# insights-client --keep-archive

    The command displays informational messages.

    Starting to collect Insights data for ITC-4
    Uploading Insights data.
    Successfully uploaded report from ITC-4 to account 6229994.
    Insights archive retained in /var/tmp/ozM8bY/insights-ITC-4-20190925181622.tar.gz
  2. Navigate to the temporary storage directory displayed in the Insights archive retained in message.

    [root@insights]# cd /var/tmp/ozM8bY/
  3. Unpack the compressed tar.gz file.

    [root@insights]# tar -xzf insights-ITC-4-20190925181622.tar.gz

    The tar command creates a new directory that contains the files.
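
If you only want to see which files the archive contains, you can list its members without unpacking it. The following sketch builds a small stand-in archive so that it is self-contained; the directory and file names are hypothetical. On a real system, you would run tar -tzf against the archive path shown in the Insights archive retained in message.

```shell
#!/bin/sh
# Sketch only: list the members of a tar.gz archive without extracting it.
# A small stand-in archive is created first so that the sketch is self-contained.
work=$(mktemp -d)
mkdir "$work/insights-demo"
printf 'sample\n' > "$work/insights-demo/hostname"
tar -czf "$work/insights-demo.tar.gz" -C "$work" insights-demo

# -t lists member names instead of extracting them
listing=$(tar -tzf "$work/insights-demo.tar.gz")
printf '%s\n' "$listing"
rm -rf "$work"
```

A file or command output that you redacted with file-redaction.yaml is not expected to appear in this listing.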

© 2024 Red Hat, Inc.