Chapter 2. Working in JupyterLab
JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data. You can configure and arrange workflows in data science and machine learning. JupyterLab is an open source web application that supports over 40 programming languages, including Python and R.
2.1. Creating and importing Jupyter notebooks
You can create a blank Jupyter notebook or import a Jupyter notebook in JupyterLab from several different sources.
2.1.1. Creating a Jupyter notebook
You can create a Jupyter notebook from an existing notebook container image to access its resources and properties. The Workbench control panel contains a list of available container images that you can run as a single-user workbench.
Prerequisites
- Ensure that you have logged in to Red Hat OpenShift AI.
- Ensure that you have launched your workbench and logged in to JupyterLab.
- The workbench image exists in a registry or image stream and is accessible.
Procedure
- Click File → New Notebook.
- If prompted, select a kernel for your Jupyter notebook from the list. If you want to use a kernel, click Select. If you do not want to use a kernel, click No Kernel.
Verification
- Check that the notebook file is visible in the JupyterLab interface.
2.1.2. Uploading an existing notebook file to JupyterLab from local storage
You can load an existing notebook file from local storage into JupyterLab to continue work, or adapt a project for a new use case.
Prerequisites
- You have credentials for logging in to JupyterLab.
- You have a launched and running workbench based on a JupyterLab image.
- A notebook file exists in your local storage.
Procedure
- In the File Browser in the left sidebar of the JupyterLab interface, click Upload Files.
- Locate and select the notebook file and then click Open. The file is displayed in the File Browser.
Verification
- The notebook file appears in the File Browser in the left sidebar of the JupyterLab interface.
- You can open the notebook file in JupyterLab.
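As an optional extra check, the following sketch can be run in a notebook cell to list the .ipynb files in a directory and confirm that each one parses as JSON with a top-level nbformat field. The helper name find_notebooks is hypothetical, and the check assumes the standard JSON notebook file format. It does not replace opening the file in JupyterLab.

```python
import json
from pathlib import Path


def find_notebooks(directory="."):
    """Return {filename: nbformat major version or None} for .ipynb files.

    Hypothetical helper for illustration. None means the file could not be
    read as a notebook (invalid JSON or no top-level "nbformat" field).
    """
    results = {}
    for path in sorted(Path(directory).glob("*.ipynb")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # ValueError covers both JSON and text decoding errors.
            results[path.name] = None
            continue
        results[path.name] = data.get("nbformat") if isinstance(data, dict) else None
    return results


# Inspect the current working directory of the workbench.
print(find_notebooks("."))
```

A value of None for the uploaded file suggests that the upload was incomplete or that the file is not a notebook.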
2.2. Collaborating on Jupyter notebooks by using Git
If your files are stored in Git version control, you can clone a Git repository to work with them in JupyterLab. When you are ready, you can push your changes back to the Git repository so that others can review or use your models.
2.2.1. Uploading an existing notebook file from a Git repository by using JupyterLab
You can use the JupyterLab user interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.
Prerequisites
- You have a launched and running workbench based on a JupyterLab image.
- You have read access to the Git repository that you want to clone.
Procedure
- Copy the HTTPS URL for the Git repository.
  - In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
  - In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
- In the JupyterLab interface, click the Git Clone button.
  You can also click Git → Clone a repository in the menu, or click the Git icon and click the Clone a repository button.
  The Clone a repo dialog appears.
- Enter the HTTPS URL of the repository that contains your notebook file.
- Click CLONE.
- If prompted, enter your username and password for the Git repository.
Verification
- Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository shows as a directory.
2.2.2. Uploading an existing notebook file to JupyterLab from a Git repository by using the CLI
You can use the command line interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.
Prerequisites
- You have a launched and running workbench based on a JupyterLab image.
Procedure
- Copy the HTTPS URL for the Git repository.
  - In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
  - In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
- In JupyterLab, click File → New Terminal to open a terminal window.
- Enter the git clone command:
    git clone <git-clone-URL>
  Replace <git-clone-URL> with the HTTPS URL, for example:
    [1234567890@jupyter-nb-jdoe ~]$ git clone https://github.com/example/myrepo.git
    Cloning into myrepo...
    remote: Enumerating objects: 11, done.
    remote: Counting objects: 100% (11/11), done.
    remote: Compressing objects: 100% (10/10), done.
    remote: Total 2821 (delta 1), reused 5 (delta 1), pack-reused 2810
    Receiving objects: 100% (2821/2821), 39.17 MiB | 23.89 MiB/s, done.
    Resolving deltas: 100% (1416/1416), done.
Verification
- Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository shows as a directory.
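The clone and verification steps can also be scripted from a notebook cell. The following is a minimal sketch, assuming that git is available on the workbench image and that the repository does not prompt for credentials. The function names repo_dir_name and clone_and_verify are hypothetical.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def repo_dir_name(clone_url):
    """Derive the directory that `git clone <url>` creates by default."""
    name = urlparse(clone_url).path.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_and_verify(clone_url, parent="."):
    """Clone into `parent` and return True if the repository directory exists."""
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(["git", "clone", clone_url], cwd=parent, check=True)
    return (Path(parent) / repo_dir_name(clone_url)).is_dir()


# The example URL from this section maps to the directory "myrepo".
print(repo_dir_name("https://github.com/example/myrepo.git"))
```

Calling clone_and_verify with your HTTPS URL performs the same check as running ls after the clone.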
2.2.3. Updating your project with changes from a remote Git repository
You can pull changes made by other users into your data science project from a remote Git repository.
Prerequisites
- You have configured the remote Git repository.
- You have imported the Git repository into JupyterLab, and the contents of the repository are visible in the file browser in JupyterLab.
- You have permissions to pull files from the remote Git repository to your local repository.
- You have credentials for logging in to Jupyter.
- You have a launched and running workbench based on a JupyterLab image.
Procedure
- In the JupyterLab interface, click the Git button.
- Click the Pull latest changes button.
Verification
- You can view the changes pulled from the remote repository on the History tab in the Git pane.
2.2.4. Pushing project changes to a Git repository
To build and deploy your application in a production environment, upload your work to a remote Git repository.
Prerequisites
- You have opened a Jupyter notebook in the JupyterLab interface.
- You have added the relevant Git repository to your workbench.
- You have permission to push changes to the relevant Git repository.
- You have installed the Git version control extension.
Procedure
- Click File → Save All to save any unsaved changes.
- Click the Git icon to open the Git pane in the JupyterLab interface.
- Confirm that your changed files appear under Changed.
  If your changed files appear under Untracked, click Git → Simple Staging to enable a simplified Git process.
- Commit your changes.
  - Ensure that all files under Changed have a blue checkmark beside them.
  - In the Summary field, enter a brief description of the changes you made.
  - Click Commit.
- Click Git → Push to Remote to push your changes to the remote repository.
- When prompted, enter your Git credentials and click OK.
Verification
- Your most recently pushed changes are visible in the remote Git repository.
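For reference, the stage, commit, and push steps in the Git pane correspond roughly to three git commands. The following sketch builds those commands so that you can preview them before running anything. It assumes that the current branch already tracks a remote branch and that credentials are supplied without an interactive prompt. The function names, the file name analysis.ipynb, and the commit summary are hypothetical.

```python
import shlex
import subprocess


def build_push_commands(summary, paths=(".",)):
    """Return the git commands for stage, commit, and push, in order."""
    if not summary.strip():
        raise ValueError("commit summary must not be empty")
    return [
        ["git", "add", "--", *paths],
        ["git", "commit", "-m", summary],
        ["git", "push"],
    ]


def run_push(summary, repo_dir=".", paths=(".",)):
    """Run each command in repo_dir, stopping at the first failure."""
    for command in build_push_commands(summary, paths):
        subprocess.run(command, cwd=repo_dir, check=True)


# Preview the commands without running them.
for command in build_push_commands("Update analysis notebook", ["analysis.ipynb"]):
    print(shlex.join(command))
```

Building the command list separately from running it keeps the preview free of side effects. Call run_push only after you have reviewed the output.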
2.3. Managing Python packages
In JupyterLab, you can view the Python packages that are installed on your workbench image and install additional packages.
2.3.1. Viewing Python packages installed on your workbench
You can check which Python packages are installed on your workbench, and which version of each package you have, by running the pip tool in a notebook cell.
Prerequisites
- You have logged in to JupyterLab and opened a Jupyter notebook.
Procedure
- Enter the following in a new cell in your Jupyter notebook:
    !pip list
- Run the cell.
Verification
The output shows an alphabetical list of all installed Python packages and their versions. For example, if you use the pip list command immediately after creating a workbench that uses the Minimal image, the first packages shown are similar to the following:
    Package                           Version
    --------------------------------- ----------
    aiohttp                           3.7.3
    alembic                           1.5.2
    appdirs                           1.4.4
    argo-workflows                    3.6.1
    argon2-cffi                       20.1.0
    async-generator                   1.10
    async-timeout                     3.0.1
    attrdict                          2.0.1
    attrs                             20.3.0
    backcall                          0.2.0
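If you need the package list as data rather than as printed text, a similar listing can be produced with the Python standard library. The following is a minimal sketch that uses importlib.metadata. The helper name installed_packages is hypothetical, and the result reflects the environment of the running kernel, which might differ from the environment that the pip command inspects.

```python
from importlib.metadata import distributions


def installed_packages():
    """Return a list of (name, version) tuples sorted by name, ignoring case."""
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            found[name] = dist.metadata.get("Version")
    return sorted(found.items(), key=lambda item: item[0].lower())


# Show the first ten packages in a layout similar to pip list.
for name, version in installed_packages()[:10]:
    print(f"{name:<34}{version}")
```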
2.3.2. Installing Python packages on your workbench
You can install Python packages that are not part of the default workbench by adding the package and the version to a requirements.txt file and then running the pip install command in a notebook cell.
Although you can install packages directly, it is recommended that you use a requirements.txt file so that the packages stated in the file can be easily reused across different workbenches.
Prerequisites
- You have logged in to JupyterLab and opened a Jupyter notebook.
Procedure
- Create a new text file using one of the following methods:
  - Click + to open a new launcher and then click Text file.
  - Click File → New Text File.
- Rename the text file to requirements.txt.
  - Right-click the name of the file and then click Rename Text. The Rename File dialog opens.
  - Enter requirements.txt in the New Name field and then click Rename.
- Add the packages to install to the requirements.txt file.
    altair
  You can specify the exact version to install by using the == (equal to) operator, for example:
    altair==4.1.0
  Note: Red Hat recommends specifying exact package versions to enhance the stability of your workbench over time. New package versions can introduce undesirable or unexpected changes in your environment’s behavior.
  To install multiple packages at the same time, place each package on a separate line.
- Install the packages in requirements.txt to your server by using a notebook cell.
  - Create a new notebook cell and enter the following command:
      !pip install -r requirements.txt
  - Run the cell by pressing Shift and Enter.
  Important: The pip install command installs the package on your workbench. However, you must run the import statement in a code cell to use the package in your code.
      import altair
Verification
- Confirm that the packages in the requirements.txt file appear in the list of packages installed on the workbench. See Viewing Python packages installed on your workbench for details.
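The verification can also be scripted. The following sketch parses requirement lines and compares each pinned version with the version installed in the running kernel. It handles only plain name and name==version lines, as used in this section, and treats anything else as a package name that is not installed. The helper names parse_pins and check_installed are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Return {package: pinned version or None} for simple requirement lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip() if sep else None
    return pins


def check_installed(pins):
    """Map each package to "ok", "missing", or "version mismatch"."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if pinned in (None, installed) else "version mismatch"
    return report


# Check the example requirement from this section.
print(check_installed(parse_pins("altair==4.1.0\n")))
```

To check your own file, pass its contents to parse_pins, for example by reading requirements.txt with pathlib.Path("requirements.txt").read_text().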
2.4. Troubleshooting common problems in workbenches for users
If you are seeing errors in Red Hat OpenShift AI related to Jupyter, your Jupyter notebooks, or your workbench, read this section to understand what could be causing the problem.
If you cannot see your problem here or in the release notes, contact Red Hat Support.
- I see a 403: Forbidden error when I log in to Jupyter
Problem
If your cluster administrator has configured OpenShift AI user groups, your username might not be added to the default user group or the default administrator group for OpenShift AI.
Resolution
Contact your cluster administrator so that they can add you to the correct group or groups.
- My workbench does not start
Problem
The OpenShift cluster that hosts your workbench might not have access to enough resources, or the workbench pod may have failed.
Resolution
Check the logs in the Events section in OpenShift for error messages associated with the problem. For example:
    Server requested 2021-10-28T13:31:29.830991Z [Warning] 0/7 nodes are available: 2 Insufficient memory, 2 node(s) had taint {node-role.kubernetes.io/infra: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Contact your cluster administrator with details of any relevant error messages so that they can perform further checks.
- I see a database or disk is full error or a no space left on device error when I run my notebook cells
Problem
You might have run out of storage space on your workbench.
Resolution
Contact your cluster administrator so that they can perform further checks.
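Before you contact your cluster administrator, it can help to note how much space is left. The following sketch reports free space for the file system that contains a given path by using the Python standard library. It assumes that your home directory is on the workbench storage, which might not match your configuration, and the helper name free_space_report is hypothetical.

```python
import shutil
from pathlib import Path


def free_space_report(path=None):
    """Return (free_gib, percent_used) for the file system containing path."""
    usage = shutil.disk_usage(path or Path.home())
    free_gib = usage.free / 1024**3
    percent_used = 100 * usage.used / usage.total if usage.total else 0.0
    return round(free_gib, 2), round(percent_used, 1)


free_gib, percent_used = free_space_report()
print(f"Free: {free_gib} GiB, used: {percent_used}%")
```

Including these figures in your request gives the administrator a starting point for further checks.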