Search

Chapter 7. Troubleshooting common problems in Jupyter for administrators

download PDF

If your users are experiencing errors in Red Hat OpenShift AI relating to Jupyter, their notebooks, or their notebook server, read this section to understand what could be causing the problem, and how to resolve the problem.

If you cannot see the problem here or in the release notes, contact Red Hat Support.

7.1. A user receives a 404: Page not found error when logging in to Jupyter

Problem

If you have configured specialized user groups for OpenShift AI, the user name might not be added to the default user group for OpenShift AI.

Diagnosis

Check whether the user is part of the default user group.

  1. Find the names of groups allowed access to Jupyter.

    1. Log in to the OpenShift Container Platform web console.
    2. Click User Management Groups.
    3. Click the name of your user group, for example, rhoai-users.

      The Group details page for that group appears.

  2. Click the Details tab for the group and confirm that the Users section for the relevant group contains the users who have permission to access Jupyter.

Resolution

  • If the user is not added to any of the groups with permission access to Jupyter, follow Adding users to add them.
  • If the user is already added to a group with permission to access Jupyter, contact Red Hat Support.

7.2. A user’s notebook server does not start

Problem

The OpenShift Container Platform cluster that hosts the user’s notebook server might not have access to enough resources, or the Jupyter pod may have failed.

Diagnosis

  1. Log in to the OpenShift Container Platform web console.
  2. Delete and restart the notebook server pod for this user.

    1. Click Workloads Pods and set the Project to rhods-notebooks.
    2. Search for the notebook server pod that belongs to this user, for example, jupyter-nb-<username>-*.

      If the notebook server pod exists, an intermittent failure may have occurred in the notebook server pod.

      If the notebook server pod for the user does not exist, continue with diagnosis.

  3. Check the resources currently available in the OpenShift Container Platform cluster against the resources required by the selected notebook server image.

    If worker nodes with sufficient CPU and RAM are available for scheduling in the cluster, continue with diagnosis.

  4. Check the state of the Jupyter pod.

Resolution

  • If there was an intermittent failure of the notebook server pod:

    1. Delete the notebook server pod that belongs to the user.
    2. Ask the user to start their notebook server again.
  • If the notebook server does not have sufficient resources to run the selected notebook server image, either add more resources to the OpenShift Container Platform cluster, or choose a smaller image size.
  • If the Jupyter pod is in a FAILED state:

    1. Retrieve the logs for the jupyter-nb-* pod and send them to Red Hat Support for further evaluation.
    2. Delete the jupyter-nb-* pod.
  • If none of the previous resolutions apply, contact Red Hat Support.

7.3. The user receives a database or disk is full error or a no space left on device error when they run notebook cells

Problem

The user might have run out of storage space on their notebook server.

Diagnosis

  1. Log in to Jupyter and start the notebook server that belongs to the user having problems. If the notebook server does not start, follow these steps to check whether the user has run out of storage space:

    1. Log in to the OpenShift Container Platform web console.
    2. Click Workloads Pods and set the Project to rhods-notebooks.
    3. Click the notebook server pod that belongs to this user, for example, jupyter-nb-<idp>-<username>-*.
    4. Click Logs. The user has exceeded their available capacity if you see lines similar to the following:

      Unexpected error while saving file: XXXX database or disk is full

Resolution

  • Increase the user’s available storage by expanding their persistent volume: Expanding persistent volumes
  • Work with the user to identify files that can be deleted from the /opt/app-root/src directory on their notebook server to free up their existing storage space.
Note

When you delete files using the JupyterLab file explorer, the files move to the hidden /opt/app-root/src/.local/share/Trash/files folder in the persistent storage for the notebook. To free up storage space for notebooks, you must permanently delete these files.

Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.