Troubleshooting Guide
Troubleshooting OpenShift Enterprise
Abstract
- Configuration of standard Linux components and corresponding log files
- Configuration of OpenShift Enterprise components and corresponding log files
- Recognizing common system problems
- Error messages that may occur when creating applications
Chapter 1. Introduction to OpenShift Enterprise
1.1. What's New in Current Release
Chapter 2. Log Files and Validation Scripts
2.1.1. General Information
The /var/log/messages file is a good starting point to investigate issues that might not be logged anywhere else.
The /var/log/httpd/access_log file shows whether your web request was received by the host.
The /var/log/httpd/error_log file can be helpful in troubleshooting certain problems on broker and node hosts.
The /var/log/audit/audit.log file is useful for finding problems that might be caused by SELinux violations.
The /var/log/secure file logs user and SSH interactions. Because users can SSH into their gears, and all Git requests also authenticate using SSH, this file is useful for checking interaction with gears on node hosts.
2.1.2. Networking
The best place for Linux operators to begin troubleshooting DNS problems on broker, node, or client hosts is the /etc/resolv.conf file. On client hosts running other operating systems, look in the appropriate network configuration file.
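The nameserver for your OpenShift Enterprise installation should be listed in the /etc/resolv.conf file as the first nameserver.
The nameserver in the /etc/resolv.conf file should point to your OpenShift Enterprise installation, either receiving updates from it, or delegating the domain to the nameserver of your installation.
Use the dig command to check that a hostname resolves:
# dig hostname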
If you are running a BIND server on the broker (or supporting) host, the configuration information is contained in the /var/named/dynamic directory. Zone file names take the form domain.com.db.zone; so if the domain of your OpenShift Enterprise installation is example.com, the zone file name would be example.com.db.zone. However, not all changes will be in the zone file. Recent changes can be contained in a binary journal file.
To see the full contents of the zone, request a zone transfer with the dig command:
# dig domain axfr
For broker and node hosts, DHCP is currently only supported if the host IPs are pinned, meaning they do not change during lease renewal. This also applies to nameservers, in that they should also not change if pinned.
Check the /etc/dhcp/dhclient-network-interface.conf file to verify that the nameservers provided by the DHCP service are being overwritten when a new lease is obtained.
If the /etc/resolv.conf file is overwritten with incorrect values, check your configuration in the dhclient-network-interface.conf file.
2.1.3. SELinux
Procedure 2.1. To Troubleshoot SELinux Issues:
- As root, run the following command to set SELinux to permissive mode:
# setenforce 0
- Retry the failing action. If the action succeeds then the issue is SELinux related.
- Run the following command to set SELinux back to enforcing mode:
# setenforce 1
- Check the /var/log/audit/audit.log file for any SELinux violations.
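The last step of the procedure can be narrowed down with a small filter. The following is a minimal sketch, assuming that denials are recorded as lines containing both type=AVC and the word denied; the function name avc_denials is hypothetical and not part of OpenShift Enterprise.

```shell
# avc_denials FILE: print only the SELinux AVC denial records from an audit log.
# Assumption: denials appear as lines containing "type=AVC" and "denied".
avc_denials() {
    grep 'type=AVC' "$1" | grep 'denied'
}

# Example (as root, on the affected host):
#   avc_denials /var/log/audit/audit.log | tail -n 20
```

Running the filter right after retrying the failing action keeps the relevant records near the end of the output.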
2.1.4. Control Groups on Node Hosts
When the cgconfig service is running correctly on a node host, you see the following:
- The /etc/cgconfig.conf file exists with the SELinux label: system_u:object_r:cgconfig_etc_t:s0.
- The /etc/cgconfig.conf file joins cpu, cpuacct, memory, freezer, and net_cls in the /cgroup/all directory.
- The /cgroup directory exists, with the SELinux label: system_u:object_r:cgroup_t:s0.
- The cgconfig service is running.
- The /etc/cgrules.conf file exists with the SELinux label: system_u:object_r:cgrules_etc_t:s0.
- The cgred service is running.
- A line for each gear in the /etc/cgrules.conf file.
- A directory for each gear in the /cgroup/all/openshift directory.
- All processes with the gear UUID are listed in the gear's cgroup.procs file. This file is located in the /cgroup/all/openshift/gear_UUID directory.
Important
In some cases the SELinux user in these labels is unconfined_u and not system_u. For example, the SELinux label in /etc/cgconfig.conf would be unconfined_u:object_r:cgconfig_etc_t:s0.
2.1.5. Pluggable Authentication Modules
OpenShift Enterprise uses the nproc value to control the number of processes a given account can create.
The default value is set in the /etc/openshift/resource_limits.conf file on the node host:
limits_nproc=2048
For each gear, an 84-gear_UUID.conf file is created on the node host, in the /etc/security/limits.d directory. Replace gear_UUID with the UNIX account name for the gear. This file contains a rule set that defines the limits for that UNIX account. The first field of each line in the file is the gear UUID.
The nproc limit for an individual gear is increased by changing the value in the 84-gear_UUID.conf file:
nproc
limit.
2.1.6. Disk Quotas
Verify that the /var/lib/openshift directory has the usrquota option enabled in the /etc/fstab file, and has been mounted. Remount the directory if necessary using the command shown below, and check the output.
# mount -o remount filesystem
Then run the following command to report quota usage:
# repquota -a
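The /etc/fstab check described above can be scripted. The following is a minimal sketch, assuming the standard fstab layout in which the second field is the mount point and the fourth field is a comma-separated option list; the function name has_usrquota is hypothetical, and the mount point to pass depends on which file system actually holds /var/lib/openshift on your host.

```shell
# has_usrquota FSTAB MOUNTPOINT: succeed if the fstab entry for MOUNTPOINT
# lists usrquota among its mount options (the fourth field).
has_usrquota() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "usrquota") found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example:
#   has_usrquota /etc/fstab /var/lib/openshift || echo "usrquota is not enabled"
```

The function only inspects the configuration file; it does not show whether the file system has been remounted with the option, which is what the remount and repquota steps verify.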
2.1.7. iptables
Run the following command to list the current iptables rules:
# iptables -L
The output of the iptables -L command differs between a broker host and a node host.
2.2.1. General Configuration
The /etc/openshift directory contains the most important configuration files for OpenShift Enterprise. These configuration files correspond to the type of installation; for example, a broker host, node host, or a client host. Check the corresponding configuration file to verify that the settings are suitable for your system.
2.2.2. Broker Host Failures
In the /var/log/openshift/broker/httpd/ directory, check the access_log and error_log files when user interactions with the broker host are failing. Verify that the request was authenticated and forwarded to the broker application.
The broker application itself logs to the /var/log/openshift/broker/production.log file.
User actions are logged in the /var/log/openshift/broker/user_action.log file. This log file includes gears created and deleted by a user. However, the logs do not include gear UUIDs.
2.2.3. MCollective
Note
If the oo-mco ping command is not running successfully, it could be that openshift-origin-util-scl is not properly installed on your machine, or that oo-mco ping is missing. Install the openshift-origin-util-scl package in order to run the command.
MCollective logs to the following files:
- /var/log/openshift/node/ruby193-mcollective.log on node hosts
- /var/log/openshift/broker/ruby193-mcollective-client.log on broker hosts
Additional node host logs are in /var/log/openshift/node/platform.log and /var/log/openshift/node/platform-trace.log.
To look up an application in DNS, use the dig or host command, with the application's hostname.
2.2.4. Gears
Each gear resides in the /var/lib/openshift directory on that gear's node host, and is represented by the gear's UUID. This directory contains the following information:
- Gears themselves
- Web server configuration
- Operation directories
Use the ls command to show the contents of the /var/lib/openshift/.httpd.d directory.
# ls /var/lib/openshift/.httpd.d/
aliases.db frontend-mod-rewrite-https-template.erb idler.db nodes.db routes.json sts.txt
aliases.txt geardb.json idler.txt nodes.txt sts.db
Each gear also has an account in the /etc/passwd file.
2.3. Validation Scripts
2.3.1. Broker Host Scripts
2.3.1.1. Verifying Broker Host Configuration
Run the oo-accept-broker script without any options to report potential problems in the broker host configuration. The output from this script indicates how many problems are found.
2.3.1.2. Fixing Gear Discrepancies
Run the oo-admin-chk script without any options to compare gear records in the broker's Mongo datastore to the gears actually present on the node hosts. The script reports any discrepancies that are found.
Example 2.1. Diagnosing Problems Using oo-admin-chk
# oo-admin-chk
Check failed.
FAIL - user user@domain.com has a mismatch in consumed gears (-1) and actual gears (0)!
Correct the consumed gear count with the oo-admin-ctl-user command:
# oo-admin-ctl-user -l user@domain.com --setconsumedgears 0
Example 2.2. Diagnosing Problems Using oo-admin-chk
# oo-admin-chk
Gear 9bb07b76dca44c3b939c9042ecf1e2fe exists on node [node1.example.com, uid:2828] but does not exist in mongo database
This problem is resolved by removing the gear from the /var/lib/openshift directory, and removing the user from the node host.
Other discrepancies may be reported by the oo-admin-chk script. The output should be self-explanatory enough to resolve most problems.
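When oo-admin-chk reports many consumed gear mismatches like the one in Example 2.1, the matching oo-admin-ctl-user commands can be derived from its output. The following is a minimal sketch, assuming the FAIL line keeps exactly the wording shown in Example 2.1; the function name suggest_gear_fixes is hypothetical. It only prints commands for review and does not run them.

```shell
# suggest_gear_fixes: read oo-admin-chk output on stdin and print, for each
# consumed gear mismatch, the oo-admin-ctl-user command that sets the
# consumed gear count to the actual gear count.
# Assumption: mismatch lines use the wording shown in Example 2.1.
suggest_gear_fixes() {
    sed -n 's/^FAIL - user \(.*\) has a mismatch in consumed gears (\(-\{0,1\}[0-9][0-9]*\)) and actual gears (\([0-9][0-9]*\))!.*/oo-admin-ctl-user -l \1 --setconsumedgears \3/p'
}

# Example (on the broker host):
#   oo-admin-chk | suggest_gear_fixes
```

Printing the commands instead of executing them leaves the decision with the administrator, which matters because a mismatch can also point to a gear that needs to be investigated rather than a counter that needs to be reset.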
2.3.2. Node Host Scripts
2.3.2.1. Verifying Node Host Configuration
Run the oo-accept-node script without any options to report potential problems in the node host configuration. The output from this script indicates how many problems are found.
2.3.3. Additional Diagnostics
The oo-diagnostics script can be run on any OpenShift Enterprise host to diagnose common problems and provide potential solutions. It can also be helpful for gathering information (particularly when run with the -v option for verbose output) to provide to Red Hat Support when opening a support case.
Chapter 3. Recognizing System Problems
3.1. Missing Repositories
Name of Repository | Description |
---|---|
Red Hat OpenShift Enterprise Infrastructure | Broker / BIND / Mongo hosts |
Red Hat OpenShift Enterprise Application Node | Node hosts |
Red Hat OpenShift Enterprise Client Tools | Client hosts |
Red Hat OpenShift Enterprise JBoss EAP add-on | Included with EAP support purchase. See note below. |
Red Hat OpenShift Enterprise Application Platform | Included with EAP support purchase. See note below. |
Red Hat OpenShift Enterprise Web Server | Included with bundle purchase. See note below. |
Note
Important
3.2. Missing Node Host
When you run the oo-mco ping command on the broker host, all node hosts should be listed in the output. Although applications on an unlisted node host can continue to operate without problems, the unlisted node hosts are not controlled by the broker host.
A node host can be missing from the output of the oo-mco ping command if the clock on the broker host is not synchronized with the clock on the node host. MCollective messages have a TTL of 60 seconds. Therefore, if the clocks are not synchronized, the MCollective messages can be dropped, causing communication issues. Verify that the broker host and node host clocks are synchronized, and that the ntpd service is enabled. All configured hosts must use the same NTP server.
This problem is logged in the /var/log/openshift/node/ruby193-mcollective.log file on the node host, and could look like the following sample screen output:
W, [2012-08-10T14:27:01.526544 #12179] WARN -- : runner.rb:62:in `run' Message 8beea9354f9784de939ec5693940d5ce from uid=48@broker.example.com created at 1344622854 is 367 seconds old, TTL is 60
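To see how far the clocks have drifted, the message age and TTL can be pulled out of these warnings. The following is a minimal sketch, assuming the warnings keep the "is N seconds old, TTL is M" wording of the sample above; the function name stale_messages is hypothetical.

```shell
# stale_messages: read MCollective log lines on stdin and print
# "<age> <ttl>" for each message reported as older than its TTL.
# Assumption: warnings use the "is N seconds old, TTL is M" wording.
stale_messages() {
    sed -n 's/.* is \([0-9][0-9]*\) seconds old, TTL is \([0-9][0-9]*\).*/\1 \2/p'
}

# Example (on the node host):
#   stale_messages < /var/log/openshift/node/ruby193-mcollective.log | sort -n | tail -n 1
```

An age far above the TTL, such as the 367 seconds in the sample, suggests a clock offset between hosts rather than ordinary message delay.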
A node host can also be missing from the output of the oo-mco ping command if ActiveMQ on the broker host cannot communicate with MCollective on the node host. Verify that the ruby193-mcollective service is running on the node host, and that it can communicate with ActiveMQ on the broker host. If a configuration has been modified recently, use the following command to restart the ruby193-mcollective service:
# service ruby193-mcollective restart
3.3. Broker Application Response Failure
The broker application depends on the Passenger service. There can be cases when the broker host service appears to be running, but in reality is not. If the Passenger service fails to start for some reason, the broker host service will not start, even if the httpd service is running. So even though the service openshift-broker start command reports success, the service may not actually be running.
Errors from the Passenger service are logged in the /var/www/openshift/broker/httpd/logs/error_log file on the broker host, as shown in the following screen output:
[Wed Oct 17 23:48:04 2012] [error] *** Passenger could not be initialized because of this error: Unable to start the Phusion Passenger watchdog (/usr/lib64/gems/exts/passenger-3.0.17/agents/PassengerWatchdog): Permission denied (13)
This output shows that the Passenger service has failed to start. This can be caused by dependency issues with the RubyGems package, which often occur when Bundler attempts to regenerate the /var/www/openshift/broker/Gemfile.lock file.
Run the following commands to check for missing dependencies:
# cd /var/www/openshift/broker/
# bundle --local
Could not find rack-1.3.0 in any of the sources
Restarting the openshift-broker service could resolve this issue.
3.3.1. Missing Gems with Validation Scripts
Validation scripts can also fail because of problems with Bundler and RubyGems dependencies. This is because the validation scripts, such as oo-admin-chk, use the broker Rails configuration and also depend on the /var/www/openshift/broker/Gemfile.lock file, as shown in the following sample output:
# oo-admin-chk
Could not find rack-1.3.0 in any of the sources
Run `bundle install` to install missing gems.
Restarting the openshift-broker service will regenerate the Gemfile.lock file, and could solve this issue. Be sure to run the yum update command before restarting the openshift-broker service.
Warning
Do not run the bundle install command as the output asks you to do. Running this command will download and install unsupported and untested software packages, resulting in problems with your OpenShift Enterprise installation.
3.4. DNS Propagation Fails when Creating an Application
DNS propagation can fail when the PUBLIC_HOSTNAME setting in the /etc/openshift/node.conf file on the node host is incorrectly configured.
Note
The oo-admin-chk script on the broker host can help detect this problem.
When a git clone is performed, a developer can authenticate successfully, but then be disconnected by the remote host. This could be due to PAM being misconfigured.
To correct this, pam_selinux should be changed to pam_openshift in /etc/pam.d/sshd, and a line with pam_namespace.so should be at the end of each file modified. If your change management system overwrote these settings, ensure that your system will retain the correctly modified files in the future.
3.6. Gears Not Idling
The oddjob daemon must be running on node hosts for gear idling to work correctly. Error messages for gear idling issues are logged in the /var/log/httpd/error_log file on the node host. The following error message, from the error_log file, shows that the oddjob daemon is not running.
org.freedesktop.DBus.Error.ServiceUnknown: The name com.redhat.oddjob_openshift was not provided by any .service files
Start the oddjob daemon, and enable it to start at boot:
# service oddjobd start
# chkconfig oddjobd on
3.7. cgconfig Service Fails to Start
If the cgconfig service fails to start, look for AVC messages in the /var/log/audit/audit.log SELinux audit log file. The error messages could indicate incorrect SELinux labels in the following files and directories:
- /etc/cgconfig.conf
- /etc/cgrules.conf
- /cgroup
Use the restorecon -v filename command to restore the correct SELinux labels for each of the files:
# restorecon -v /etc/cgconfig.conf
# restorecon -v /etc/cgrules.conf
# restorecon -rv /cgroup
Also check the contents of the /etc/cgrules.conf file.
Start the cgconfig service using the following command:
# service cgconfig start
3.8. MongoDB Failures
Run the following command to check whether the mongod service is running:
# service mongod status
If the mongod service is not running, look in the /var/log/mongodb/mongodb.log file for information. Look for duplicate configuration lines, which cause problems with MongoDB, and result in the multiple_occurences error message. Verify that there are no duplicate configuration lines in the /etc/mongodb.conf file to enable the mongod service to start.
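Duplicate lines can be hard to spot by eye in a long configuration file. The following is a minimal sketch for listing repeated settings, assuming the /etc/mongodb.conf file uses key = value lines with # comments; the function name duplicate_keys is hypothetical.

```shell
# duplicate_keys FILE: print each setting name that appears more than once
# in a "key = value" style configuration file.
# Comment lines and blank lines are ignored.
duplicate_keys() {
    awk -F= '
        /^[ \t]*(#|$)/ { next }
        {
            key = $1
            gsub(/[ \t]/, "", key)
            if (++seen[key] == 2) print key
        }
    ' "$1"
}

# Example (on the host running MongoDB):
#   duplicate_keys /etc/mongodb.conf
```

Each name is printed once, the second time it is seen, so an empty result means no setting is repeated.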
Check the /etc/openshift/broker.conf file for MongoDB configuration details such as database host, port, name, user, and password.
Example 3.1. Example MongoDB Configuration
With the mongod service running, use the following command to connect to the database, replacing configuration settings accordingly:
# mongo localhost:27017/openshift_broker -u mongouser -p mongopassword
3.9. Jenkins Build Failures
If the AUTH_SALT setting is changed in the /etc/openshift/broker.conf file, subsequent Jenkins builds will initially fail.
To address this, restart the broker service:
# service openshift-broker restart
Then use the oo-admin-broker-auth tool to rekey the gears' authorization tokens. To rekey the tokens for all applicable gears, run the tool with the --rekey-all option:
# oo-admin-broker-auth --rekey-all
See the --help output and man page for additional options and more detailed use cases.
3.10. Outdated Cartridge List
If the cartridge list is outdated, clear the broker cache:
# oo-admin-broker-cache --clear
Alternatively, run the command with the --console option (the --console option implies --clear):
# oo-admin-broker-cache --console
Chapter 4. Error Messages when Creating Applications
4.1. cpu.cfs_quota_us: No such file
The rhc app create command can fail to create an application if cgroups are not working properly. These error messages are logged in the /var/log/openshift/node/ruby193-mcollective.log file on the node host, and can look like the following:
/cgroup/all/openshift/*/cpu.cfs_quota_us: No such file
4.2. Password Prompt
PUBLIC_HOSTNAME must be configured correctly in the /etc/openshift/node.conf file on the node host. Otherwise, creating an application can end in a password prompt like the following:
The authenticity of host 'myapp-domain.example.com (::1)' can't be established.
RSA key fingerprint is 88:49:43:d2:e9:b4:4d:84:a1:d6:8a:30:85:73:d7:7f.
Are you sure you want to continue connecting (yes/no)? yes
e9bdfc309bef4c13889a21ddbea45f@myapp-domain.example.com's password:
This happens when PUBLIC_HOSTNAME resolves to the wrong IP address. In this case, PUBLIC_HOSTNAME is set to localhost.localdomain, as shown in the sample screen output below.
PUBLIC_HOSTNAME=localhost.localdomain
This results in localhost.localdomain being used as the hostname for the node host. When Git attempts to authenticate using the gear user ID and SSH key, the SSH authentication fails because the application gear does not exist on localhost.localdomain, and you are prompted for a password.
In the sample prompt, the application's hostname resolves to (::1), which is pointing to localhost, and is not a valid IP for an application's gear. Verify that the IP address of an application's gear is a valid IP address of the node host.
If PUBLIC_HOSTNAME fails to resolve at all as a FQDN, DNS resolution times out and the Git clone process fails.
Note
The oo-admin-chk script on the broker host can help detect this problem.
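The localhost.localdomain case described in this section can be caught with a quick check of the node configuration. The following is a minimal sketch, assuming the setting appears on its own line as PUBLIC_HOSTNAME=value, optionally wrapped in double quotes; the function names public_hostname and is_loopback_name are hypothetical.

```shell
# public_hostname FILE: print the PUBLIC_HOSTNAME value from a node.conf style
# file, with any surrounding double quotes removed.
# If the setting is repeated, the last occurrence is printed.
public_hostname() {
    sed -n 's/^PUBLIC_HOSTNAME=//p' "$1" | tr -d '"' | tail -n 1
}

# is_loopback_name NAME: succeed if NAME is an obvious loopback hostname.
is_loopback_name() {
    case "$1" in
        localhost|localhost.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (on the node host):
#   name=$(public_hostname /etc/openshift/node.conf)
#   is_loopback_name "$name" && echo "PUBLIC_HOSTNAME points at localhost: $name"
```

A name that passes this check can still resolve to the wrong address, so it is worth following up with the dig or host command for the value that is printed.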
4.3. Communication Issue after Node Host Reboot
After a node host reboot, the rhc app create command can fail to create an application, resulting in the following error:
An error occurred while communicating with the server. This problem may only be temporary. Check that you have correctly specified your OpenShift server
'https://broker.example.com/broker/rest/domain/domain-name/applications'.
At the same time, oo-mco commands on the broker host may continue to find the rebooted node host without any issues.
To resolve this, restart the activemq service:
# service activemq restart
Note
Chapter 5. Debugging Problems with Specific Applications
5.1. Common Resources
On the node host, check the /etc/passwd file for information unique to that particular gear. You will see an account for the gear, represented with the gear's UUID. This file also provides the path to the login shell for the application's gear.
- The /var/lib/openshift/gear_UUID directory on the node host is the home directory for each application gear. Check the SELinux contexts.
- The /var/lib/openshift/.httpd.d/gear_UUID* directory on the node host is the operations directory for each application gear. It contains the httpd configuration for that particular application gear.
- The /var/log/openshift/node directory on the node host contains the ruby193-mcollective.log file.
- Searching the /var/log/openshift directory on the node host for the gear's UUID using grep could help you find problems with application gears that generate error messages.
- The /var/log/openshift/broker/user_action.log file on the broker host contains logs of user actions.
5.2. Rails Applications
Do not run the Rails console as root, because some OpenShift API calls are cached under /var/www/openshift/broker/tmp/cache and are owned by the user who runs the console. When the cache expires, the broker attempts to invalidate the cache. Since the broker runs as the apache user, it is unable to clear the root-owned files and returns 500 errors.
Instead, run the console as the apache user:
# su --shell=/bin/bash -l apache
$ cd /var/www/openshift/console
$ ./script/rails console production
Chapter 6. Technical Support
6.1. Reporting Bugs
6.2. Getting Help
Install the sos RPM, and use the following command to create an archive of relevant host information to include with your support request.
# sosreport
6.3. Participating in Development
Appendix A. Revision History
Revision | Date |
---|---|
2.1-1 | Wed Jun 11 2014 |
2.1-0 | Fri May 16 2014 |
2.0-1 | Tue Jan 14 2014 |
2.0-0 | Tue Dec 10 2013 |
Legal Notice
Copyright © 2025 Red Hat
OpenShift documentation is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0).
Modified versions must remove all Red Hat trademarks.
Portions adapted from https://github.com/kubernetes-incubator/service-catalog/ with modifications by Red Hat.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.