Troubleshooting Collector
Chapter 1. Retrieving and analyzing the Collector logs and pod status
The first step in troubleshooting is to retrieve the Collector logs and pod status. The logs help you identify the root cause of an error. In addition, the pod's most recent status can contain failure messages.
1.1. Retrieving the Collector logs
First, you should examine the logs from failing Collectors. Depending on your environment and access rights, you can obtain these logs in two ways:
1.1.1. Retrieving the logs with the oc or kubectl command
You can use either the oc or kubectl command to obtain logs from your running Collector pod. If your current Collector pod is restarting, you can also check the logs from the previous Collector pod.
Prerequisites
Ensure that you have the authority to list the pods and logs:
$ oc auth can-i get pods && oc auth can-i get pods --subresource=logs
If you use Kubernetes, enter kubectl instead of oc.
Procedure
List all the pods with the label app=collector:
$ oc get pods -n stackrox -l app=collector
If you use Kubernetes, enter kubectl instead of oc.
Example output
collector-vclg5 1/2 CrashLoopBackOff 2 (25s ago) 2m41s
Get the logs for the Collector pod:
$ oc logs -n stackrox <collector_pod_name> collector
If you use Kubernetes, enter kubectl instead of oc. For <collector_pod_name>, specify the name of your Collector pod, for example, collector-vclg5.
(Optional) If the current Collector pod is restarting, you can check the logs for the previous Collector pod:
$ oc logs -n stackrox <collector_pod_name> collector --previous
If you use Kubernetes, enter kubectl instead of oc. For <collector_pod_name>, specify the name of your Collector pod, for example, collector-vclg5.
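When several Collector pods are failing, the two log commands above can be wrapped in a loop. The following is a minimal sketch, not part of the product: the CLI and NAMESPACE variables and the collect_collector_logs function are illustrative names, and the stackrox namespace and collector container name are taken from the procedure above.

```shell
#!/usr/bin/env bash
# Sketch: save current and previous logs for every Collector pod.
# Assumption: CLI, NAMESPACE, and collect_collector_logs are illustrative
# names. Set CLI=kubectl if you use Kubernetes.
CLI="${CLI:-oc}"
NAMESPACE="${NAMESPACE:-stackrox}"

collect_collector_logs() {
  outdir="$1"
  mkdir -p "$outdir"
  # "-o name" prints one "pod/<name>" entry per line.
  for pod in $("$CLI" get pods -n "$NAMESPACE" -l app=collector -o name); do
    pod="${pod#pod/}"
    # Logs from the currently running container.
    "$CLI" logs -n "$NAMESPACE" "$pod" collector \
      > "$outdir/$pod.log" 2>&1 || true
    # Logs from the previous container instance. This command fails if the
    # pod has not restarted, so the failure is tolerated.
    "$CLI" logs -n "$NAMESPACE" "$pod" collector --previous \
      > "$outdir/$pod.previous.log" 2>&1 || true
  done
}

# Example: collect_collector_logs ./collector-logs
```

Each pod produces two files, so you can compare the output of the current container with the output of the container that crashed.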
1.1.2. Retrieving logs from a RHACS diagnostic bundle
You can also access Collector logs by downloading a diagnostic bundle from the Red Hat Advanced Cluster Security for Kubernetes (RHACS) user interface. Once you have downloaded the diagnostic bundle, you can inspect the logs for all the Collector pods. For more information, see Generating a diagnostic bundle.
1.2. Analyzing the Collector pod status
Examining the pod's most recent status is another easy way to determine the cause of a Collector crash. Failure messages are recorded in the most recent status and are accessible by using the kubectl describe pod or oc describe pod command.
Procedure
Describe the Collector pod:
$ oc describe pod -n stackrox <collector_pod_name>
If you use Kubernetes, enter kubectl instead of oc. For <collector_pod_name>, specify the name of your Collector pod, for example, collector-vclg5.
Example output
[...]
    Last State:   Terminated
      Reason:     Error
      Message:    No suitable kernel object downloaded
      Exit Code:  1
      Started:    Fri, 21 Oct 2022 11:50:56 +0100
      Finished:   Fri, 21 Oct 2022 11:51:25 +0100
[...]
In this example, the Message field shows that Collector failed to download a kernel driver.
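If you only need the failure message, you can read it directly from the pod status instead of scanning the full describe output. The following sketch assumes that the container is named collector, as in the log commands earlier in this chapter. The last_collector_failure function and the CLI and NAMESPACE variables are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: print the termination message of the previous "collector"
# container instance. Set CLI=kubectl if you use Kubernetes.
CLI="${CLI:-oc}"
NAMESPACE="${NAMESPACE:-stackrox}"

last_collector_failure() {
  # lastState.terminated.message holds the same text that
  # "describe pod" shows under "Last State" > "Message".
  "$CLI" get pod -n "$NAMESPACE" "$1" \
    -o jsonpath='{.status.containerStatuses[?(@.name=="collector")].lastState.terminated.message}'
}

# Example: last_collector_failure collector-vclg5
```

The output is empty if the container has not terminated before.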
Chapter 2. Commonly occurring error conditions
Most errors occur during Collector startup when Collector configures itself and finds or downloads a kernel driver for the system.
The following diagram describes the main parts of the Collector startup process:
Figure 2.1. Collector pod startup process

If any part of the startup procedure fails, the logs display a diagnostic summary detailing which steps succeeded or failed.
The following log file example shows a successful startup:
[INFO 2022/11/28 13:21:55] == Collector Startup Diagnostics: ==
[INFO 2022/11/28 13:21:55] Connected to Sensor? true
[INFO 2022/11/28 13:21:55] Kernel driver available? true
[INFO 2022/11/28 13:21:55] Driver loaded into kernel? true
[INFO 2022/11/28 13:21:55] ====================================
The log output confirms that Collector connected to Sensor and located and loaded the kernel driver. You can use this log to check for the successful startup of Collector.
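The diagnostic summary can also be checked automatically. The following sketch reads a Collector log on standard input and prints every startup check that reports false. The failed_startup_checks function is an illustrative name, and the parsing assumes the log layout shown in the example above, with one log entry per line.

```shell
#!/usr/bin/env bash
# Sketch: list the startup checks that failed, based on the
# "Collector Startup Diagnostics" block of a Collector log.
failed_startup_checks() {
  awk '
    /== Collector Startup Diagnostics: ==/ { in_block = 1; next }
    in_block && /=====/ { in_block = 0 }     # closing separator line
    in_block && /\? false/ {
      sub(/^\[[^]]*\] */, "")                # drop the "[INFO date time]" prefix
      print
      failed = 1
    }
    END { exit (failed ? 1 : 0) }            # non-zero status if any check failed
  '
}

# Example:
#   oc logs -n stackrox <collector_pod_name> collector | failed_startup_checks
```

The function exits with a non-zero status when at least one check failed, so it can be used as a condition in a script.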
2.1. Unable to connect to the Sensor
When starting, Collector first checks whether it can connect to Sensor. Sensor is responsible for downloading kernel drivers and CIDR blocks for processing network events, making it an essential part of the startup process. The following logs indicate that Collector is unable to connect to Sensor:
Collector Version: 3.12.0
OS: Ubuntu 20.04.4 LTS
Kernel Version: 5.4.0-126-generic
Starting StackRox Collector...
[INFO 2022/10/13 12:20:43] Hostname: 'hostname'
[...]
[INFO 2022/10/13 12:20:43] Sensor configured at address: sensor.stackrox.svc:9998
[INFO 2022/10/13 12:20:43] Attempting to connect to Sensor
[INFO 2022/10/13 12:21:13]
[INFO 2022/10/13 12:21:13] == Collector Startup Diagnostics: ==
[INFO 2022/10/13 12:21:13] Connected to Sensor? false
[INFO 2022/10/13 12:21:13] Kernel driver available? false
[INFO 2022/10/13 12:21:13] Driver loaded into kernel? false
[INFO 2022/10/13 12:21:13] ====================================
[INFO 2022/10/13 12:21:13]
[FATAL 2022/10/13 12:21:13] Unable to connect to Sensor.
This error could mean that Sensor has not started correctly or that the Collector configuration is incorrect. To fix this issue, verify that the Sensor address in the Collector configuration is correct and that the Sensor pod is running correctly.
To check the configured Sensor address, view the Collector logs. Alternatively, you can run the following command:
$ kubectl -n stackrox get pod <collector_pod_name> -o jsonpath='{.spec.containers[0].env[?(@.name=="GRPC_SERVER")].value}'
For <collector_pod_name>, specify the name of your Collector pod, for example, collector-vclg5.
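Both checks, the configured address and the state of the Sensor pod, can be combined in one helper. The following is a sketch under stated assumptions: the sensor_address and check_sensor functions are illustrative names, and the app=sensor label selector is an assumption that is not confirmed by this document, so adjust it to match your deployment.

```shell
#!/usr/bin/env bash
# Sketch: show the Sensor address that Collector uses and list Sensor pods.
# Set CLI=kubectl if you use Kubernetes.
CLI="${CLI:-oc}"
NAMESPACE="${NAMESPACE:-stackrox}"

sensor_address() {
  # Same jsonpath query as in the command above.
  "$CLI" -n "$NAMESPACE" get pod "$1" \
    -o jsonpath='{.spec.containers[0].env[?(@.name=="GRPC_SERVER")].value}'
}

check_sensor() {
  echo "Collector is configured to reach Sensor at: $(sensor_address "$1")"
  # Assumption: Sensor pods carry the label app=sensor.
  "$CLI" -n "$NAMESPACE" get pods -l app=sensor
}

# Example: check_sensor collector-vclg5
```

If the listed Sensor pod is not in the Running state, or the printed address does not match the Sensor service, that points to the cause of the connection failure.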
2.3. Failing to load the kernel driver
Before Collector starts, it loads the kernel driver. In rare cases, Collector cannot load the kernel driver, which results in various error messages or exceptions. In such cases, check the logs to identify why loading the kernel driver failed.
Consider the following Collector log:
[INFO 2022/10/13 14:25:13] Hostname: 'hostname'
[...]
[INFO 2022/10/13 14:25:13] Successfully downloaded and decompressed /module/collector.ko
[INFO 2022/10/13 14:25:13]
[INFO 2022/10/13 14:25:13] This product uses kernel module and ebpf subcomponents licensed under the GNU
[INFO 2022/10/13 14:25:13] GENERAL PURPOSE LICENSE Version 2 outlined in the /kernel-modules/LICENSE file.
[INFO 2022/10/13 14:25:13] Source code for the kernel module and ebpf subcomponents is available upon
[INFO 2022/10/13 14:25:13] request by contacting support@stackrox.com.
[INFO 2022/10/13 14:25:13]
[...]
[INFO 2022/10/13 14:25:13] Inserting kernel module /module/collector.ko with indefinite removal and retry if required.
[ERROR 2022/10/13 14:25:13] Error inserting kernel module: /module/collector.ko: Operation not permitted. Aborting...
[ERROR 2022/10/13 14:25:13] Failed to insert kernel module
[ERROR 2022/10/13 14:25:13] Failed to setup Kernel module
[INFO 2022/10/13 14:25:13]
[INFO 2022/10/13 14:25:13] == Collector Startup Diagnostics: ==
[INFO 2022/10/13 14:25:13] Connected to Sensor? true
[INFO 2022/10/13 14:25:13] Kernel driver available? true
[INFO 2022/10/13 14:25:13] Driver loaded into kernel? false
[INFO 2022/10/13 14:25:13] ====================================
[INFO 2022/10/13 14:25:13]
[FATAL 2022/10/13 14:25:13] Failed to initialize collector kernel components.
If you encounter this kind of error, it is unlikely that you can fix it yourself. Instead, report it to the Red Hat Advanced Cluster Security for Kubernetes (RHACS) support team or create a GitHub issue.
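When you prepare such a report, it can help to include only the error entries from the log. The following sketch filters a Collector log on standard input down to its ERROR and FATAL entries. The collector_errors function is an illustrative name, and the filter assumes the log entry prefix format shown in the example above.

```shell
#!/usr/bin/env bash
# Sketch: print only ERROR and FATAL entries from a Collector log on stdin.
collector_errors() {
  # Entries start with "[LEVEL date time]". "|| true" keeps the exit status
  # zero when the log contains no matching entries.
  grep -E '^\[(ERROR|FATAL) ' || true
}

# Example:
#   oc logs -n stackrox <collector_pod_name> collector --previous | collector_errors
```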