Chapter 6. Insights client data redaction
Red Hat Insights handles data collection by using Insights core collection. For redaction purposes, core collection replaces the older, more limited data collection method that relied on JSON files (.cache.json and .uploader.json) and shell scripts. In addition, core collection:
- Is compatible with Insights client 3.0 and later.
- Uses YAML files to determine which commands and files to redact.
- Uses robust Python data cleaning processes.
- Supports complex data collection by using datasources, and avoids the possible limitations of using JSON files (such as uploader.json) and shell scripts.
- Relies on datasources that are part of insights-core, which are easier to maintain and make it clear what data is collected.
For more information about Red Hat Insights client core collection, datasources, and collection rules, see the following resources:
Using .cache.json and uploader.json files to redact data is no longer supported by Red Hat. Use datasources instead.
Preventing the collection of Personally Identifiable Information (PII)
Red Hat Insights for Red Hat Enterprise Linux collects a minimal amount of data, including data that might contain personally identifiable information (PII). To prevent PII (or other configuration data) from being collected, apply data redaction.
For information about how Red Hat Insights for Red Hat Enterprise Linux handles data collection, see Red Hat Insights data and application security.
6.1. Creating and configuring the required redaction YAML files
To redact data in Red Hat Insights, you need Insights client 3.0 or later and the following YAML configuration files to control the redaction actions:
- file-redaction.yaml
- file-content-redaction.yaml
You can use one or both files, depending on the content you want to redact.
To find the items to redact, Insights client uses the default configuration of the insights-client.conf configuration file to call the file-redaction.yaml and file-content-redaction.yaml files. The following example shows the default configuration for redaction in the insights-client.conf file:
    # Location of the redaction file for commands, files, and components
    #redaction_file=/etc/insights-client/file-redaction.yaml
    # Location of the redaction file for patterns and keywords
    #content_redaction_file=/etc/insights-client/file-content-redaction.yaml
While you do not need to change the configuration of the insights-client.conf file, you do need to create the YAML files.
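As a rough illustration of why the commented-out settings need no changes, the following Python sketch resolves the two redaction paths from a configuration text and falls back to the default locations when an option is absent. The `[insights-client]` section name and the `resolve_redaction_paths` helper are assumptions made for this example; the client's own parsing may differ.

```python
import configparser

# Defaults mirror the commented-out values shown in insights-client.conf.
DEFAULTS = {
    "redaction_file": "/etc/insights-client/file-redaction.yaml",
    "content_redaction_file": "/etc/insights-client/file-content-redaction.yaml",
}


def resolve_redaction_paths(conf_text, section="insights-client"):
    """Return the effective redaction file paths for a configuration text.

    Hypothetical helper: the section name is an assumption for illustration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    paths = {}
    for key, default in DEFAULTS.items():
        # A commented-out option is simply absent, so the default applies.
        paths[key] = parser.get(section, key, fallback=default)
    return paths


sample = """[insights-client]
#redaction_file=/etc/insights-client/file-redaction.yaml
#content_redaction_file=/etc/insights-client/file-content-redaction.yaml
"""
print(resolve_redaction_paths(sample))
```

With both options left commented out, the sketch returns the two default paths, which is why only the YAML files themselves have to be created.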
Red Hat no longer supports the use of the remove.conf configuration file to redact data.
How the YAML files work
The Insights client /etc/insights-client/file-redaction.yaml file lists commands and files that you want to redact. A Python data cleaning process reads the file-redaction.yaml file and redacts the listed commands and files.
When the Python data cleaning process runs, it redacts the specified content before adding it to the archive file.
The /etc/insights-client/file-content-redaction.yaml file defines pattern redaction and keyword replacement. For pattern redaction, the process redacts lines that match the patterns or regular expressions specified in the YAML file. For keyword replacement, the process replaces the specified keywords with generic identifiers.
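The difference between the two content redaction methods can be sketched in a few lines of Python. This is a simplified illustration, not the insights-core implementation: the function names are hypothetical, and the `keyword0`, `keyword1` identifier format is an assumption about what a "generic identifier" looks like.

```python
import re


def redact_lines(text, patterns=(), regexes=()):
    """Drop every line that contains a literal pattern or matches a regex."""
    compiled = [re.compile(expr) for expr in regexes]
    kept = []
    for line in text.splitlines():
        if any(pattern in line for pattern in patterns):
            continue  # pattern redaction removes the whole line
        if any(expr.search(line) for expr in compiled):
            continue  # regex redaction also removes the whole line
        kept.append(line)
    return "\n".join(kept)


def replace_keywords(text, keywords):
    """Replace each keyword with a generic identifier (assumed format)."""
    for index, keyword in enumerate(keywords):
        text = text.replace(keyword, f"keyword{index}")
    return text


sample = "host=alpha\npassword=hunter2\nowner=My Name\n"
cleaned = redact_lines(sample, patterns=["password"])
cleaned = replace_keywords(cleaned, ["My Name"])
print(cleaned)
```

In this sketch, pattern redaction removes the matching line entirely, while keyword replacement keeps the line and substitutes only the keyword.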
6.1.1. Configuring file-redaction.yaml to redact commands and system files
You can create the /etc/insights-client/file-redaction.yaml file and include a list of commands and system files that you want redacted. When data redaction takes place, a Python data cleaning process runs and analyzes the contents of the YAML file.
The output of the listed commands and the contents of the listed files are not included in the uploaded archive file.
Prerequisites
- You must be familiar with the basics of YAML syntax. For more information about YAML, see yaml.org.
- You must have root-level access to the system.
Procedure
- Use an editor to create the /etc/insights-client/file-redaction.yaml file. Enter the strings files: and commands: on separate lines in the YAML file.
      files:
      commands:
- Enter the files and commands that you want to redact.
  On the line following files:, enter the files that you want to redact. Use the information in the Datasources catalog to identify which files and commands to specify. For example, if you want to redact the auditd.conf file, this is how it would look:
      files:
      - /etc/audit/auditd.conf
  On the line following commands:, enter the commands that you want to redact. Use the information in the Datasources catalog to identify which commands to specify. For example, if you want to redact the ethtool -i command, this is how it would look:
      commands:
      - ethtool_i
- Save the YAML file in /etc/insights-client/. Verify that the file-redaction.yaml file permissions are root owner only by running ll file-redaction.yaml as root on the command line.
      [root@insights]# ll file-redaction.yaml
      -rw-------. 1 root root 145 Sep 25 17:39 file-redaction.yaml
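The permission check in the last step can also be scripted. The following Python sketch, which assumes a POSIX system, reports whether a file is owned by root and grants no permissions to group or others. The `is_owner_only` helper is a hypothetical name introduced for this example.

```python
import os
import stat


def is_owner_only(path, expected_uid=0):
    """Return True if path is owned by expected_uid and has no
    group or other permission bits set (for example, mode 0600)."""
    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)
    no_group_other = (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return info.st_uid == expected_uid and no_group_other


if __name__ == "__main__":
    target = "/etc/insights-client/file-redaction.yaml"
    # Only report when the file exists and is visible to the current user.
    if os.path.exists(target):
        print(target, "owner-only:", is_owner_only(target))
```

A file listed as -rw-------. 1 root root passes this check; a world-readable file such as -rw-r--r-- does not.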
Example file-redaction.yaml file with comments
The following example shows a sample file-redaction.yaml file that includes commands and files to redact. Comments, which are lines preceded by a hash symbol (#), also offer guidance to help you configure the YAML file.
    # file-redaction.yaml
    ---
    # Redact the entire output of commands
    # Specify commands by either full command or by the "symbolic_name" like "ethtool_i."
    # Refer to the "Datasource Catalog" and "General Datasources" at https://insights-core.readthedocs.io/en/latest/specs_catalog.html#general-datasource for a full list of available symbolic_names, and the commands and files they correspond to.
    commands:
    - /bin/rpm -qa
    - /bin/ls
    - ethtool_i
    # Redact the entire output of files
    # Specify files either by full filename or by the "symbolic_name" for example, "cluster_conf."
    # Refer to the "Datasource Catalog" and "General Datasources" at https://insights-core.readthedocs.io/en/latest/specs_catalog.html#general-datasource for a full list of available symbolic_names, and the commands and files they correspond to.
    files:
    - /etc/audit/auditd.conf
    - cluster_conf
Verification step
To verify that your redaction file is working, you can run the insights-client command with the --no-upload option, then review the output messages on your console or terminal.
- On the command line, enter the insights-client command with the --no-upload option, and press Return.
      [root@insights]# insights-client --no-upload
  The command runs and displays informational messages. The following example shows the redaction of the dmesg command and the cluster.conf file.
      WARNING: Excluding data from files
      Starting to collect Insights data for I-HOST
      WARNING: Skipping command /bin/dmesg
      WARNING: Skipping file /etc/cluster/cluster.conf
      Archive saved at /var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz
The generated archive file is saved to /var/tmp, but the file is not uploaded to Red Hat.
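If you capture the console output, the "Skipping" messages can be collected programmatically and compared with your file-redaction.yaml entries. The following Python sketch assumes the message format shown in the sample output above, which might vary between client versions. The `skipped_items` helper is a hypothetical name.

```python
def skipped_items(console_output):
    """Collect the commands and files reported as skipped in the output."""
    skipped = {"command": [], "file": []}
    for line in console_output.splitlines():
        line = line.strip()
        for kind in skipped:
            prefix = f"WARNING: Skipping {kind} "
            if line.startswith(prefix):
                skipped[kind].append(line[len(prefix):])
    return skipped


sample = """WARNING: Excluding data from files
Starting to collect Insights data for I-HOST
WARNING: Skipping command /bin/dmesg
WARNING: Skipping file /etc/cluster/cluster.conf
Archive saved at /var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz
"""
print(skipped_items(sample))
```

For the sample output, the sketch reports /bin/dmesg as a skipped command and /etc/cluster/cluster.conf as a skipped file.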
6.1.2. Configuring YAML pattern and keyword redaction
The /etc/insights-client/file-content-redaction.yaml file redacts files using two methods: pattern redaction and keyword replacement. Pattern redaction uses either a pattern match or regular expression match. In keyword replacement, a Python data cleaning process replaces the keyword with a generic identifier.
Prerequisites
- You must be familiar with the basics of YAML syntax. Explaining YAML is beyond the scope of this procedure.
- You must have root-level access to the system.
Procedure
- Use an editor to create the /etc/insights-client/file-content-redaction.yaml file.
  Example
      # file-content-redaction.yaml
      ---
      # Pattern redaction per matching line
      # Lines that match a pattern are excluded from files and command output.
      # Patterns are processed in the order that they are listed.
      # Example
      patterns:
      - "a_string_1"
      - "a_string_2"
      # Regular expression pattern redaction per line
      # Use "regex:" to wrap patterns with regular expressions
      # Example
      patterns:
        regex:
        - "abc.*def"
        - "localhost[[:digit:]]"
      # Keyword replacement redaction
      # Replace keywords in files and command output with generic identifiers
      # Keyword does not support regex
      # Example
      keywords:
      - "1.1.1.1"
      - "My Name"
      - "a_name"
- Make sure the file-content-redaction.yaml file permissions are set for root owner only.
      [root@insights]# ll file-content-redaction.yaml
      -rw-------. 1 root root 145 Sep 25 17:39 file-content-redaction.yaml
6.2. Verifying the Insights client archive
You can verify the contents of the archive file. By inspecting the archive file, you can confirm what data is sent to Red Hat.
If you use obfuscation or redaction, you can inspect the archive before it uploads. If you want to preserve the archive file, you can keep it on your system.
6.2.1. Verifying the archive before uploading
To inspect the archive before insights-client uploads it to Red Hat Insights for Red Hat Enterprise Linux, run the insights-client command with the --no-upload option. The client saves the archive file without uploading it. This allows you to view the information that the client sends to Insights for Red Hat Enterprise Linux, and to verify your obfuscation or redaction settings.
The archive file is stored in the /var/tmp/ directory. When insights-client completes, it displays the file name.
Prerequisites
- Make sure the /etc/insights-client/insights-client.conf file is correctly configured.
Procedure
- Enter the insights-client command with the --no-upload option.
      [root@insights]# insights-client --no-upload
  The command displays informational messages when redaction or obfuscation is applied.
      WARNING: Excluding data from files
      Starting to collect Insights data for ITC-4
      WARNING: Skipping patterns found in remove.conf
      WARNING: Skipping command /bin/dmesg
      WARNING: Skipping command /bin/hostname
      WARNING: Skipping file /etc/cluster/cluster.conf
      WARNING: Skipping file /etc/hosts
      Archive saved at /var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz
- Navigate to the temporary storage directory as shown in the Archive saved at message.
      [root@insights]# cd /var/tmp/qsINM9/
- Unpack the compressed tar.gz file.
      [root@insights]# tar -xzf insights-ITC-4-20190925180232.tar.gz
  The tar command creates a new directory that contains the files.
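As an alternative to unpacking and browsing the archive by hand, a short script can search it for a value that should have been redacted. The following Python sketch uses the standard tarfile module to list the archive members that still contain a given string. The `find_in_archive` helper is a hypothetical name, and the archive path in the usage note is taken from the sample output above.

```python
import tarfile


def find_in_archive(archive_path, needle):
    """Return the names of archive members whose content contains needle."""
    needle_bytes = needle.encode("utf-8")
    hits = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-files
            handle = archive.extractfile(member)
            if handle is not None and needle_bytes in handle.read():
                hits.append(member.name)
    return hits
```

For example, calling find_in_archive("/var/tmp/qsINM9/insights-ITC-4-20190925180232.tar.gz", "My Name") and getting an empty list back suggests that the keyword no longer appears in any collected file. The function reads the archive in place, so it does not leave unpacked copies of the data on disk.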
6.2.2. Verifying the Insights client archive after uploading
To keep a copy of the archive for inspection after insights-client uploads it to Red Hat Insights for Red Hat Enterprise Linux, run insights-client with the --keep-archive option. The client retains the archive file on your system after the upload. This allows you to verify the information that the client sends to Insights for Red Hat Enterprise Linux, and to verify your obfuscation or redaction settings.
Prerequisites
- Make sure the /etc/insights-client/insights-client.conf file is correctly configured.
Procedure
- Enter the insights-client command with the --keep-archive option.
      [root@insights]# insights-client --keep-archive
  The command displays informational messages.
      Starting to collect Insights data for ITC-4
      Uploading Insights data.
      Successfully uploaded report from ITC-4 to account 6229994.
      Insights archive retained in /var/tmp/ozM8bY/insights-ITC-4-20190925181622.tar.gz
- Navigate to the temporary storage directory displayed in the Insights archive retained in message.
      [root@insights]# cd /var/tmp/ozM8bY/
- Unpack the compressed tar.gz file.
      [root@insights]# tar -xzf insights-ITC-4-20190925181622.tar.gz
  The tar command creates a new directory that contains the files.