Chapter 6. Troubleshooting common problems in Jupyter for administrators
If your users are experiencing errors in Red Hat OpenShift Data Science relating to Jupyter, their notebooks, or their notebook server, read this section to understand what could be causing the problem and how to resolve it.
If the problem is not described here or in the release notes, contact Red Hat Support.
6.1. A user receives a 404: Page not found error when logging in to Jupyter
Problem
If you have configured specialized OpenShift Data Science user groups, the user name might not be added to the default user group for OpenShift Data Science.
Diagnosis
Check whether the user is part of the default user group.
- Find the names of groups allowed access to Jupyter.
- Log in to the OpenShift Dedicated web console.
- Click User Management → Groups. Click the name of your user group, for example, rhods-users. The Group details page for that group appears.
- Click the Details tab for the group and confirm that the Users section for the relevant group contains the users who have permission to access Jupyter.
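The same membership check can also be scripted. The following is a minimal Python sketch, assuming the oc CLI is installed and logged in with permission to list groups. The function names are illustrative and are not part of OpenShift Data Science. It reads the JSON printed by oc get groups -o json, in which each Group object lists its members under users.

```python
import json
import subprocess


def groups_containing(user, groups_doc):
    """Return the names of the groups whose `users` list contains `user`."""
    names = []
    for group in groups_doc.get("items", []):
        # `users` can be null or missing when a group has no members.
        if user in (group.get("users") or []):
            names.append(group["metadata"]["name"])
    return names


def fetch_groups():
    """Run `oc get groups -o json` and return the parsed JSON.

    Requires a logged-in `oc` session; not called on import.
    """
    result = subprocess.run(
        ["oc", "get", "groups", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling groups_containing("<username>", fetch_groups()) returns the groups that the user belongs to. An empty list, or a list without any group that is allowed access to Jupyter, points to the resolution below.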
Resolution
- If the user is not added to any of the groups allowed access to Jupyter, follow Adding users for OpenShift Data Science to add them.
- If the user is already added to a group that is allowed to access Jupyter, contact Red Hat Support.
6.2. A user’s notebook server does not start
Problem
The OpenShift Dedicated cluster that hosts the user’s notebook server might not have access to enough resources, or the Jupyter pod might have failed.
Diagnosis
- Log in to the OpenShift Dedicated web console.
- Check whether the notebook server pod for this user exists:
  - Click Workloads → Pods and set the Project to rhods-notebooks.
  - Search for the notebook server pod that belongs to this user, for example, jupyter-nb-<username>-*.
  If the notebook server pod exists, an intermittent failure may have occurred in the notebook server pod.
  If the notebook server pod for the user does not exist, continue with diagnosis.
- Check the resources currently available in the OpenShift cluster against the resources required by the selected notebook server image.
  If worker nodes with sufficient CPU and RAM are available for scheduling in the cluster, continue with diagnosis.
- Check the state of the Jupyter pod.
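The pod lookup and state check can be sketched in Python as follows. This assumes you have already loaded the output of oc get pods -n rhods-notebooks -o json into a dictionary. The function names and the wording of the returned hints are illustrative only. Kubernetes reports the pod phase as Failed, so the comparison ignores case.

```python
import fnmatch


def notebook_pod_phases(pods_doc, name_pattern):
    """Map each pod whose name matches `name_pattern` to its status phase."""
    phases = {}
    for pod in pods_doc.get("items", []):
        name = pod["metadata"]["name"]
        if fnmatch.fnmatchcase(name, name_pattern):
            phases[name] = pod.get("status", {}).get("phase", "Unknown")
    return phases


def next_step(phases):
    """Suggest which part of the diagnosis applies (illustrative wording)."""
    if not phases:
        # No pod for this user: move on to the cluster resource check.
        return "no pod found: check cluster resources"
    if any(phase.lower() == "failed" for phase in phases.values()):
        return "pod failed: collect logs, then delete the pod"
    return "pod exists: possible intermittent failure, delete and restart"
```

For example, next_step(notebook_pod_phases(pods, "jupyter-nb-<username>-*")) with the user name substituted into the pattern returns one of the three hints above.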
Resolution
- If there was an intermittent failure of the notebook server pod:
  - Delete the notebook server pod that belongs to the user.
  - Ask the user to start their notebook server again.
- If the notebook server does not have sufficient resources to run the selected notebook server image, either add more resources to the OpenShift cluster, or choose a smaller image size.
- If the Jupyter pod is in a FAILED state:
  - Retrieve the logs for the jupyter-nb-* pod and send them to Red Hat Support for further evaluation.
  - Delete the jupyter-nb-* pod.
- If none of the previous resolutions apply, contact Red Hat Support.
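For the FAILED case, the order matters: save the logs before deleting the pod, because the logs are lost once the pod is gone. A minimal Python sketch of that sequence, assuming a logged-in oc CLI, is shown here. The helper names and the log_path argument are hypothetical, and --all-containers is included on the assumption that the pod may run more than one container.

```python
import subprocess

NAMESPACE = "rhods-notebooks"


def logs_command(pod_name, namespace=NAMESPACE):
    """Build the `oc logs` command for every container in the pod."""
    return ["oc", "logs", pod_name, "-n", namespace, "--all-containers"]


def delete_command(pod_name, namespace=NAMESPACE):
    """Build the `oc delete pod` command."""
    return ["oc", "delete", "pod", pod_name, "-n", namespace]


def collect_logs_then_delete(pod_name, log_path, run=subprocess.run):
    """Write the pod logs to `log_path`, then delete the pod.

    `run` is injectable so the sequence can be exercised without a cluster.
    """
    logs = run(logs_command(pod_name), check=True, capture_output=True, text=True)
    with open(log_path, "w", encoding="utf-8") as handle:
        handle.write(logs.stdout)
    # Only reached if the logs were retrieved and saved successfully.
    run(delete_command(pod_name), check=True)
```

The saved file is what you would attach when contacting Red Hat Support.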
6.3. The user receives a database or disk is full error or a no space left on device error when they run notebook cells
Problem
The user might have run out of storage space on their notebook server.
Diagnosis
- Log in to Jupyter and start the notebook server that belongs to the user having problems. If the notebook server does not start, follow these steps to check whether the user has run out of storage space:
  - Log in to the OpenShift Dedicated web console.
  - Click Workloads → Pods and set the Project to rhods-notebooks.
  - Click the notebook server pod that belongs to this user, for example, jupyter-nb-<idp>-<username>-*.
  - Click Logs. The user has exceeded their available capacity if you see lines similar to the following:
Unexpected error while saving file: XXXX database or disk is full
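If the pod log is long, you can search it for the two error messages named in the title of this section instead of reading it line by line. The following Python sketch assumes the log text has already been copied or downloaded from the Logs tab. The function name is illustrative.

```python
import re

# The two messages this section describes, matched without regard to case.
STORAGE_ERROR = re.compile(
    r"database or disk is full|no space left on device",
    re.IGNORECASE,
)


def storage_error_lines(log_text):
    """Return the log lines that indicate the user's storage is full."""
    return [line for line in log_text.splitlines() if STORAGE_ERROR.search(line)]
```

An empty result does not rule out a storage problem. It only means that neither message appears in the portion of the log you supplied.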
Resolution
- Increase the user’s available storage by expanding their persistent volume. See Expanding persistent volumes.
- Work with the user to identify files that can be deleted from the /opt/app-root/src directory on their notebook server to free up their existing storage space.
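To help the user decide what to delete, it can be useful to list the largest files first. The following Python sketch is one way to do that, assuming it is run from a terminal or notebook cell on the user's notebook server. The function name and the default limit are illustrative.

```python
import heapq
import os


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) tuples under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip broken symlinks and files removed during the scan.
                continue
    return heapq.nlargest(limit, sizes)
```

For example, largest_files("/opt/app-root/src") returns the ten largest files in the user's working directory. The sketch only reports sizes and does not delete anything.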