Chapter 3. Requirements for OpenShift Data Science self-managed
Your environment must meet certain requirements to receive support for Red Hat OpenShift Data Science.
Installation requirements
You must meet the following requirements before you can install OpenShift Data Science on your Red Hat OpenShift Container Platform cluster.
Product subscriptions
A subscription for Red Hat OpenShift Data Science self-managed
Contact your Red Hat account manager to purchase new subscriptions. If you do not yet have an account manager, complete the form at https://www.redhat.com/en/contact to request one.
An OpenShift Container Platform cluster 4.11 or greater
Use an existing cluster or create a new cluster by following the OpenShift Container Platform documentation: OpenShift Container Platform installation overview.
Your cluster must have at least 2 worker nodes with at least 8 CPUs and 32 GiB RAM available for OpenShift Data Science to use when you install the Operator. The installation process fails to start and an error is displayed if this requirement is not met. To ensure that OpenShift Data Science is usable, additional cluster resources are required beyond the minimum requirements.
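As an optional check, and as a minimal sketch that assumes your worker nodes carry the standard node-role.kubernetes.io/worker label, you can review the allocatable CPU and memory on each worker node with a command similar to the following:
oc get nodes -l node-role.kubernetes.io/worker -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory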
A default storage class that can be dynamically provisioned must be configured.
Confirm that a default storage class is configured by running the oc get storageclass command. If no storage class is listed with (default) beside its name, follow the OpenShift Container Platform documentation to configure a default storage class: Changing the default storage class. For more information about dynamic provisioning, see Dynamic provisioning.
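If your cluster already has a suitable storage class that is not yet marked as the default, one common approach is to set the storageclass.kubernetes.io/is-default-class annotation on it; the <storage-class-name> placeholder below is illustrative and must be replaced with a storage class that exists on your cluster:
oc patch storageclass <storage-class-name> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'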
Open Data Hub must not be installed on the cluster.
For more information about managing the machines that make up an OpenShift cluster, see Overview of machine management.
An identity provider configured for OpenShift Container Platform
Access to the cluster as a user with the cluster-admin role; the kubeadmin user is not allowed.
Red Hat OpenShift Data Science supports the same authentication systems as Red Hat OpenShift Container Platform. See Understanding identity provider configuration for more information on configuring identity providers.
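As a hedged example, assuming an identity provider is already configured and <username> is a placeholder for an existing user, a cluster administrator can grant that user the cluster-admin role with:
oc adm policy add-cluster-role-to-user cluster-admin <username>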
Internet access
Along with Internet access, the following domains must be accessible during the installation of OpenShift Data Science self-managed:
For CUDA-based images, the following domains must be accessible:
OpenShift Pipelines operator installation
The Red Hat OpenShift Pipelines Operator enables support for installing and running data science pipelines in a self-managed environment.
Before you use data science pipelines in OpenShift Data Science, you must install the Red Hat OpenShift Pipelines Operator. For more information, see Installing OpenShift Pipelines. If your deployment is in a disconnected self-managed environment, see Red Hat OpenShift Pipelines Operator in a restricted environment.
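As a sketch of a CLI-based installation, you can create a Subscription for the Operator in the openshift-operators namespace; the channel name shown here is an assumption and can differ between releases, so verify it against the Installing OpenShift Pipelines documentation before applying:
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-pipelines-operator-rh
  namespace: openshift-operators
spec:
  channel: latest
  name: openshift-pipelines-operator-rh
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF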
Before you can execute a pipeline in a disconnected environment, you must mirror any images used by your pipelines to a private registry.
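For example, a single image can be copied to a private registry with the oc image mirror command; the registry, repository, image, and tag names below are placeholders only and do not refer to specific pipeline images:
oc image mirror <source-registry>/<repository>/<image>:<tag> <private-registry-host>/<repository>/<image>:<tag>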
You can store your pipeline artifacts in an Amazon Web Services (AWS) Simple Storage Service (S3) bucket to ensure that you do not consume local storage. To do this, you must first configure write access to your S3 bucket on your AWS account.
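As a quick sanity check of write access, assuming the AWS CLI is configured with the credentials you plan to use and <bucket-name> is a placeholder for your bucket, you can upload a small test object:
aws s3 cp ./artifact-test.txt s3://<bucket-name>/artifact-test.txt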
If you do not have access to Amazon S3 storage, you must configure your own storage solution for use with pipelines.