Chapter 1. Troubleshooting Red Hat Edge Manager
When working with devices in Red Hat Edge Manager, troubleshooting begins with interpreting the structured status messages provided by the device. By identifying the specific phase and component where a failure occurred, you can quickly determine whether an issue is caused by local resource constraints, network connectivity, or configuration errors.
1.1. Troubleshooting device error codes
To improve security and performance, Red Hat Edge Manager uses structured error codes in device status responses. These codes replace verbose system logs with categorized, actionable summaries, ensuring sensitive data (like credentials) is never exposed in the API or UI.
1.1.1. Error message anatomy
Every error message follows a standardized format, limited to 250 characters, that helps you quickly pinpoint the phase, component, and specific cause of a failure.
The error message format is as follows:
[timestamp] While <Phase>, <Component> failed [for "<Element>"]: <Category> issue - <STATUS_CODE>
| Field | Description | Examples |
|---|---|---|
| Phase | The stage of the operation where the error occurred. | |
| Component | The specific system area affected. | |
| Element | The specific resource (file, service, or image). | |
| Category | The functional area of the failure. | |
| Status Code | The standardized gRPC-based error code. | |
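As an illustration of how the fields in the format above can be separated, the following shell sketch splits a message with `sed`. The function name `parse_status_message` is a hypothetical helper and the sample message is invented; neither comes from Red Hat Edge Manager. The sketch also assumes that the timestamp is wrapped in literal square brackets and that the `for "<Element>"` part is optional.

```shell
#!/bin/sh
# Sketch: split a status message into its documented fields with sed.
# Assumptions (not confirmed by the documentation): the timestamp is wrapped
# in literal square brackets, and the 'for "<Element>"' part is optional.

parse_status_message() {
  # Prints: timestamp|phase|component|element|category|status_code
  # The element field is empty when the message has no 'for "<Element>"' part.
  # Prints nothing if the message does not match the expected format.
  printf '%s\n' "$1" | sed -n -E \
    's/^\[([^]]+)\] While ([^,]+), (.+) failed( for "([^"]*)")?: (.+) issue - ([A-Z_]+)$/\1|\2|\3|\5|\6|\7/p'
}

# Invented sample message for illustration only; not real agent output.
sample='[2025-01-01T00:00:00Z] While Applying, systemd failed for "example.service": System issue - INTERNAL'
parse_status_message "$sample"
```

Splitting the message this way makes it easier to filter many device statuses by category or status code before looking at individual devices.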
1.1.1.1. Error reference & resolution
Use the table below to identify the root cause of a status code and the recommended next steps for resolution.
| Category | Status Code | Common Causes | Recommended Action |
|---|---|---|---|
| Network | | DNS failure, registry unreachable, or connection timeout. Image non-existent or inaccessible due to registry permissions. | Check device internet connectivity and firewall rules for registry access. Verify the image name/tag and registry-level access permissions. |
| Security | | Invalid credentials, expired tokens, or insufficient permissions. | Verify registry credentials and ensure the device identity is valid. |
| Configuration | | Syntax errors in YAML/JSON or missing mandatory fields. Invalid element, token, or path format. | Validate your configuration spec against the schema. |
| Filesystem | | Missing files, directory conflicts, or path errors. | Verify the existence of required local resources or mount points. |
| Resource | | Disk full, Out of Memory (OOM), or CPU throttling. | Check device telemetry for disk usage and memory pressure. |
| System | | Unexpected system faults or unclassified errors. | See Deep Dive Debugging below to correlate with journal logs. |
1.1.1.2. Deep dive debugging
While API status responses are sanitized for security, full error details—including stack traces and raw Go error chains—are preserved in the local device journal.
Procedure
If you encounter an UNKNOWN or INTERNAL error, or if the status message is truncated, you can map the status code to the detailed log:
1. Retrieve the device status, making sure to note the timestamp and component from the message field:
flightctl get device/<device-name> -o yaml
2. Access the device logs. Search the local journal for the corresponding error context to see the unredacted failure:
journalctl -u fleet-agent | grep "failed to reload systemd daemon"
API responses are limited to 250 characters. For the full diagnostic context, including raw Go error strings and detailed stack traces, refer to the local logs on the device.
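When the exact error string is not known in advance, the journal search can instead be narrowed to a time window around the timestamp noted from the status message. The following is a minimal sketch: `journal_around` is a hypothetical wrapper, the timestamps are placeholders, and the unit name is passed as an argument (the procedure above uses `fleet-agent`). The `--since`, `--until`, and `--no-pager` options are standard `journalctl` options.

```shell
#!/bin/sh
# Sketch: show journal entries for one unit within a time window.
# journal_around is a hypothetical helper, not part of any product tooling.

journal_around() {
  unit="$1"    # systemd unit to inspect, for example fleet-agent
  start="$2"   # window start, "YYYY-MM-DD HH:MM:SS" in the device's local time
  end="$3"     # window end, same format
  journalctl -u "$unit" --since "$start" --until "$end" --no-pager
}

# Example with placeholder timestamps; convert the timestamp from the status
# message to the device's local time before using it here:
# journal_around fleet-agent "2025-01-01 00:00:00" "2025-01-01 00:05:00"
```

A window of a few minutes around the reported timestamp usually keeps the output short enough to read directly; piping the result through `grep` with the component name narrows it further.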
1.2. Generating a device log bundle
Use the integrated flightctl-must-gather script directly on the device to generate a comprehensive bundle of diagnostic logs. This log bundle, in a standard .tar format, provides the necessary data to debug the device agent and assists in efficient troubleshooting and bug reporting.
Run the following command on the device and include the resulting .tar file in the bug report. Extracting the .tar file from the device requires an SSH connection.
sudo flightctl-must-gather
1.3. Viewing a device’s effective target configuration
The device manifest returned by the flightctl get device command contains only references to external configuration and secret objects. The service replaces these references with the actual configuration and secret data only when the device agent queries the service.
While this better protects potentially sensitive data, it also makes troubleshooting faulty configurations harder. For this reason, a user can be authorized to query the effective configuration as the service renders it for the agent.
Procedure
To query the effective configuration, use the following command:
flightctl get device/${device_name} --rendered | jq