Chapter 1. Architecture of OpenShift Data Science self-managed
Red Hat OpenShift Data Science self-managed is delivered as an Operator that you install in a self-managed environment, such as Red Hat OpenShift Container Platform.
OpenShift Data Science integrates the following components and services:
At the service layer:
- OpenShift Data Science dashboard
  - A customer-facing dashboard that shows available and installed applications for the OpenShift Data Science environment as well as learning resources such as tutorials, quick starts, and documentation. Administrative users can access functionality to manage users, clusters, notebook images, and model-serving runtimes. Data scientists can use the dashboard to create projects to organize their data science work.
- Model serving
  - Data scientists can deploy trained machine-learning models to serve intelligent applications in production. After deployment, applications can send requests to the model using its deployed API endpoint.
- Data science pipelines
  - Data scientists can build portable machine learning (ML) workflows with data science pipelines, using Docker containers. This enables your data scientists to automate workflows as they develop their data science models.
- Jupyter (self-managed)
  - A self-managed application that allows data scientists to configure their own notebook server environment and develop machine learning models in JupyterLab.
- Distributed workloads
  - Data scientists can use multiple nodes in parallel to train machine-learning models or process data more quickly. This approach significantly reduces the task completion time, and enables the use of larger datasets and more complex models.
The distributed workloads feature is currently available in Red Hat OpenShift Data Science 2.4 as a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
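As an illustration of the model serving component described above, the following Python sketch shows how a client application might build a request for a deployed model. It assumes that the model is exposed over a REST endpoint that accepts a KServe v2-style JSON payload. The endpoint URL, model name, input name, and tensor shape are hypothetical placeholders, and the actual request format depends on the model-serving runtime that you deploy.

```python
import json
import urllib.request


def build_inference_request(endpoint_url, input_name, values, token=None):
    """Build (but do not send) a POST request carrying one FP32 input tensor.

    The payload layout is an assumption (KServe v2-style JSON); adjust it to
    match the runtime that serves your model.
    """
    payload = {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(values)],
                "datatype": "FP32",
                "data": [float(v) for v in values],
            }
        ]
    }
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed if the endpoint requires token authentication.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical endpoint and input name, for illustration only.
request = build_inference_request(
    "https://example-model.apps.example.com/v2/models/example-model/infer",
    "input-0",
    [5.1, 3.5, 1.4, 0.2],
)
# To send the request against a real endpoint:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

The request is built separately from being sent so that the payload can be inspected or logged before it reaches the model.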
At the management layer:
- The Red Hat OpenShift Data Science Operator
  - A meta-operator that deploys and maintains all components and sub-operators that are part of OpenShift Data Science.
- Monitoring services
  - Prometheus gathers metrics from OpenShift Data Science for monitoring purposes.
When you install the OpenShift Data Science Operator in the OpenShift Container Platform cluster, the following new projects are created:
- The redhat-ods-operator project contains the OpenShift Data Science Operator.
- The redhat-ods-applications project contains the dashboard and other required components of OpenShift Data Science.
- The redhat-ods-monitoring project contains services for monitoring.
- The rhods-notebooks project is where notebook environments are deployed by default.
You or your data scientists must create additional projects for the applications that will use your machine learning models.
Do not install independent software vendor (ISV) applications in namespaces associated with OpenShift Data Science.
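The project layout and the ISV restriction above can be expressed as a small pre-flight check. The following Python sketch is illustrative only: the function names are hypothetical, and the list of existing projects is passed in as plain strings rather than queried from a cluster.

```python
# Projects created when the OpenShift Data Science Operator is installed,
# as listed above, with a short description of each.
ODS_PROJECTS = {
    "redhat-ods-operator": "OpenShift Data Science Operator",
    "redhat-ods-applications": "dashboard and other required components",
    "redhat-ods-monitoring": "monitoring services",
    "rhods-notebooks": "default location for notebook environments",
}


def missing_ods_projects(existing_projects):
    """Return the expected OpenShift Data Science projects that are absent."""
    existing = set(existing_projects)
    return sorted(name for name in ODS_PROJECTS if name not in existing)


def is_reserved_for_ods(namespace):
    """Return True if the namespace belongs to OpenShift Data Science.

    ISV applications should not be installed in these namespaces.
    """
    return namespace in ODS_PROJECTS


# Example: a cluster where only the Operator project exists so far.
print(missing_ods_projects(["default", "redhat-ods-operator"]))
print(is_reserved_for_ods("rhods-notebooks"))
```

A check like this could run before deploying an application, to confirm that the target project is one you created yourself and not one of the projects that the Operator manages.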