Chapter 5. Installing the single-model serving platform
5.1. About the single-model serving platform
For deploying large models such as large language models (LLMs), OpenShift AI includes a single-model serving platform that is based on the KServe component. To install the single-model serving platform, the following components are required:
- KServe: A Kubernetes custom resource definition (CRD) that orchestrates model serving for all types of models. KServe includes model-serving runtimes that implement the loading of given types of model servers. KServe also handles the lifecycle of the deployment object, storage access, and networking setup.
- Red Hat OpenShift Serverless: A cloud-native development model that allows for serverless deployments of models. OpenShift Serverless is based on the open source Knative project.
- Red Hat OpenShift Service Mesh: A service mesh networking layer that manages traffic flows and enforces access policies. OpenShift Service Mesh is based on the open source Istio project.
Note: Currently, only OpenShift Service Mesh v2 is supported. For more information, see Supported Configurations.
You can install the single-model serving platform manually or in an automated fashion:
- Automated installation: If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies. For more information, see Configuring automated installation of KServe.
- Manual installation: If you have already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you cannot configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies. In this situation, you must install KServe manually. For more information, see Manually installing KServe.
Note: You can run KServe in Unmanaged mode during manual installations of the single-model serving platform. This mode is useful when you need more control over KServe components, such as modifying resource limits for the KServe controller.
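For a concrete sense of what Unmanaged mode allows, the following sketch first marks the kserve component as Unmanaged in the DataScienceCluster object and then raises the controller's resource limits directly; the container name manager and the limit values are illustrative assumptions, not prescribed values:

spec:
  components:
    kserve:
      managementState: Unmanaged

$ oc patch deployment kserve-controller-manager -n redhat-ods-applications \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","resources":{"limits":{"cpu":"1","memory":"4Gi"}}}]}}}}'

Because the component is Unmanaged, the Red Hat OpenShift AI Operator does not reconcile the deployment back to its defaults, so a manual change like this persists.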
5.2. Configuring automated installation of KServe
If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies.
You can configure KServe in advanced or standard deployment mode. For more information, see About KServe deployment modes. If you configure KServe for advanced deployment mode, you can set up your data science project to serve models in both advanced and standard deployment mode. However, if you configure KServe for only standard deployment mode, you can only use standard deployment mode.
If you have created a ServiceMeshControlPlane or KNativeServing resource on your cluster, the Red Hat OpenShift AI Operator cannot install KServe and configure its dependencies, and the installation does not proceed. In this situation, you must follow the manual installation instructions to install KServe.
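A quick way to check whether these resources already exist is to list them across all namespaces; smcp is the short name that the Service Mesh Operator registers for ServiceMeshControlPlane, and each command returns an error if the corresponding Operator (and therefore the CRD) is not installed:

$ oc get smcp --all-namespaces
$ oc get knativeserving --all-namespaces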
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- Your cluster has a node with 4 CPUs and 16 GB memory.
- You have downloaded and installed the OpenShift command-line interface (CLI). For more information, see Installing the OpenShift CLI.
- (Advanced deployment mode): You have installed the Red Hat OpenShift Service Mesh Operator and dependent Operators.
Note: To enable automated installation of KServe, install only the required Operators for Red Hat OpenShift Service Mesh. Do not perform any additional configuration or create a ServiceMeshControlPlane resource.
- (Advanced deployment mode): You have installed the Red Hat OpenShift Serverless Operator.
Note: To enable automated installation of KServe, install only the Red Hat OpenShift Serverless Operator. Do not perform any additional configuration or create a KNativeServing resource.
- You have installed the Red Hat OpenShift AI Operator and created a DataScienceCluster object.
- (Advanced deployment mode): To add Authorino as an authorization provider so that you can enable token authentication for deployed models, you have installed the Red Hat - Authorino Operator. See Installing the Authorino Operator.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Install OpenShift Service Mesh as follows:
- Click the DSC Initialization tab.
- Click the default-dsci object.
- Click the YAML tab.
- (Advanced deployment mode): In the spec section, validate that the value of the managementState field for the serviceMesh component is set to Managed, as shown:

spec:
  applicationsNamespace: redhat-ods-applications
  monitoring:
    managementState: Managed
    namespace: redhat-ods-monitoring
  serviceMesh:
    controlPlane:
      metricsCollection: Istio
      name: data-science-smcp
      namespace: istio-system
    managementState: Managed
- (Standard deployment mode): In the spec section, validate that the value of the managementState field for the serviceMesh component is set to Removed, as shown:

spec:
  applicationsNamespace: redhat-ods-applications
  monitoring:
    managementState: Managed
    namespace: redhat-ods-monitoring
  serviceMesh:
    controlPlane:
      metricsCollection: Istio
      name: data-science-smcp
      namespace: istio-system
    managementState: Removed
Note: Do not change the istio-system namespace that is specified for the serviceMesh component by default. Other namespace values are not supported.
- Click Save.
Based on the configuration that you added to the DSCInitialization object, the Red Hat OpenShift AI Operator installs OpenShift Service Mesh.
- (Standard deployment mode only): Install KServe as follows:
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- Click the default-dsc DSC object.
- Click the YAML tab.
- In the spec.components section, configure the kserve component as shown:

kserve:
  defaultDeploymentMode: RawDeployment
  rawDeploymentServiceConfig: Headed
  managementState: Managed
  serving:
    managementState: Removed
    name: knative-serving

- Click Save.
The preceding configuration installs KServe in standard deployment mode, which is based on the KServe RawDeployment feature. In this configuration, defaultDeploymentMode: RawDeployment sets standard mode as the default deployment mode, rawDeploymentServiceConfig: Headed lets the cluster perform normal load balancing over workload replicas, and serving.managementState: Removed omits OpenShift Serverless, which standard mode does not require.
- (Advanced deployment mode): Install both KServe and OpenShift Serverless as follows:
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- Click the default-dsc DSC object.
- Click the YAML tab.
- In the spec.components section, configure the kserve component as shown:

spec:
  components:
    kserve:
      managementState: Managed
      defaultDeploymentMode: Serverless
      rawDeploymentServiceConfig: Headed
      serving:
        ingressGateway:
          certificate:
            secretName: knative-serving-cert
            type: OpenshiftDefaultIngress
        managementState: Managed
        name: knative-serving

- Click Save.
The preceding configuration creates an ingress gateway for OpenShift Serverless to receive traffic from OpenShift Service Mesh. In this configuration, you can use both standard and advanced deployment modes. Observe the following details:
- defaultDeploymentMode: Sets the default deployment mode that is selected when you create and deploy a model with KServe. To use standard mode as the default, set defaultDeploymentMode to RawDeployment. To use advanced mode as the default, set defaultDeploymentMode to Serverless.
- rawDeploymentServiceConfig: The Headed configuration shown lets the cluster perform normal load balancing over workload replicas. For environments where inference request load balancing is done on the client side, set rawDeploymentServiceConfig to Headless.
. - 3
- The configuration shown uses the default ingress certificate configured for OpenShift to secure incoming traffic to your OpenShift cluster and stores the certificate in the
knative-serving-cert
secret that is specified in thesecretName
field. ThesecretName
field can only be set at the time of installation. The default value of thesecretName
field isknative-serving-cert
. Subsequent changes to the certificate secret must be made manually. If you did not use the defaultsecretName
value during installation, create a new secret namedknative-serving-cert
in theistio-system
namespace, and then restart theistiod-datascience-smcp-<suffix>
pod. - 4
- You can specify the following certificate types by updating the value of the
type
field:-
Provided
-
SelfSigned
OpenshiftDefaultIngress
To use a self-signed certificate or to provide your own, update the value of the
secretName
field to specify your secret name and change the value of thetype
field toSelfSigned
orProvided
.NoteIf you provide your own certificate, the certificate must specify the domain name used by the ingress controller of your OpenShift cluster. You can check this value by running the following command:
$ oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'
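The command prints the cluster's ingress domain; for example (a placeholder value):

apps.mycluster.example.com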
You must set the value of the managementState field to Managed for both the kserve and serving components. Setting kserve.managementState to Managed triggers automated installation of KServe. Setting serving.managementState to Managed triggers automated installation of OpenShift Serverless. However, installation of OpenShift Serverless is not triggered if kserve.managementState is not also set to Managed.
Verification
Verify installation of KServe as follows:
- In the web console, click Workloads → Pods.
- From the project list, select redhat-ods-applications. This is the project in which OpenShift AI components, including KServe, are installed.
- Confirm that the project includes a running pod for the KServe controller manager, similar to the following example:
NAME                                         READY   STATUS    RESTARTS   AGE
kserve-controller-manager-7fbb7bccd4-t4c5g   1/1     Running   0          22h
odh-model-controller-6c4759cc9b-cftmk        1/1     Running   0          129m
odh-model-controller-6c4759cc9b-ngj8b        1/1     Running   0          129m
odh-model-controller-6c4759cc9b-vnhq5        1/1     Running   0          129m
(Advanced deployment mode only): Verify installation of OpenShift Service Mesh as follows:
- In the web console, click Workloads → Pods.
- From the project list, select istio-system. This is the project in which OpenShift Service Mesh is installed.
- Confirm that there are running pods for the service mesh control plane, ingress gateway, and egress gateway. These pods have the naming patterns shown in the following example:
NAME                                       READY   STATUS    RESTARTS   AGE
istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
(Advanced deployment mode only): Verify installation of OpenShift Serverless as follows:
- In the web console, click Workloads → Pods.
- From the project list, select knative-serving. This is the project in which OpenShift Serverless is installed.
- Confirm that there are numerous running pods in the knative-serving project, including activator, autoscaler, controller, and domain mapping pods, as well as pods for the Knative Istio controller, which controls the integration of OpenShift Serverless and OpenShift Service Mesh. An example is shown:

NAME                                     READY   STATUS    RESTARTS   AGE
activator-7586f6f744-nvdlb               2/2     Running   0          22h
activator-7586f6f744-sd77w               2/2     Running   0          22h
autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h
5.3. Manually installing KServe
If you have already installed the Red Hat OpenShift Service Mesh Operator and created a ServiceMeshControlPlane resource, or if you have installed the Red Hat OpenShift Serverless Operator and created a KNativeServing resource, the Red Hat OpenShift AI Operator cannot install KServe and configure its dependencies. In this situation, you must install KServe manually.
The procedures in this section show how to perform a new installation of KServe and its dependencies and are intended as a complete installation and configuration reference. If you have already installed and configured OpenShift Service Mesh or OpenShift Serverless, you might not need to follow all steps. If you are unsure about what updates to apply to your existing configuration to use KServe, contact Red Hat Support.
You can run KServe in Unmanaged mode during manual installations of the single-model serving platform. This mode is useful when you need more control over KServe components, such as modifying resource limits for the KServe controller.
5.3.1. Installing KServe dependencies
Before you install KServe, you must install and configure some dependencies. Specifically, you must create Red Hat OpenShift Service Mesh and Knative Serving instances and then configure secure gateways for Knative Serving.
Note: Currently, only OpenShift Service Mesh v2 is supported. For more information, see Supported Configurations.
5.3.2. Creating an OpenShift Service Mesh instance
The following procedure shows how to create a Red Hat OpenShift Service Mesh instance.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- Your cluster has a node with 4 CPUs and 16 GB memory.
- You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
- You have installed the Red Hat OpenShift Service Mesh Operator and dependent Operators.
Procedure
In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:
$ oc login <openshift_cluster_url> -u <admin_username> -p <password>

Create the required namespace for Red Hat OpenShift Service Mesh:
$ oc create ns istio-system

You see the following output:
namespace/istio-system created

Define a ServiceMeshControlPlane object in a YAML file named smcp.yaml with the following contents:

apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: minimal
  namespace: istio-system
spec:
  tracing:
    type: None
  addons:
    grafana:
      enabled: false
    kiali:
      name: kiali
      enabled: false
    prometheus:
      enabled: false
    jaeger:
      name: jaeger
  security:
    dataPlane:
      mtls: true
    identity:
      type: ThirdParty
  techPreview:
    meshConfig:
      defaultConfig:
        terminationDrainDuration: 35s
  gateways:
    ingress:
      service:
        metadata:
          labels:
            knative: ingressgateway
  proxy:
    networking:
      trafficControl:
        inbound:
          excludedPorts:
            - 8444
            - 8022

For more information about the values in the YAML file, see the Service Mesh control plane configuration reference.
Create the service mesh control plane:

$ oc apply -f smcp.yaml
Verification
Verify creation of the service mesh instance as follows:
In the OpenShift CLI, enter the following command:
$ oc get pods -n istio-system

The preceding command lists all running pods in the istio-system project. This is the project in which OpenShift Service Mesh is installed.

Confirm that there are running pods for the service mesh control plane, ingress gateway, and egress gateway. These pods have the following naming patterns:
NAME                                       READY   STATUS    RESTARTS   AGE
istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
5.3.3. Creating a Knative Serving instance
The following procedure shows how to install Knative Serving and then create an instance.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- Your cluster has a node with 4 CPUs and 16 GB memory.
- You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
- You have created a Red Hat OpenShift Service Mesh instance.
- You have installed the Red Hat OpenShift Serverless Operator.
Procedure
In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:
$ oc login <openshift_cluster_url> -u <admin_username> -p <password>

Check whether the required project (that is, namespace) for Knative Serving already exists:
$ oc get ns knative-serving

If the project exists, you see output similar to the following example:
NAME              STATUS   AGE
knative-serving   Active   4d20h

If the knative-serving project does not already exist, create it:

$ oc create ns knative-serving
You see the following output:

namespace/knative-serving created
Define a ServiceMeshMember object in a YAML file named default-smm.yaml with the following contents:

apiVersion: maistra.io/v1
kind: ServiceMeshMember
metadata:
  name: default
  namespace: knative-serving
spec:
  controlPlaneRef:
    namespace: istio-system
    name: minimal
Create the ServiceMeshMember object in the knative-serving namespace:

$ oc apply -f default-smm.yaml

You see the following output:

servicemeshmember.maistra.io/default created
Define a KnativeServing object in a YAML file named knativeserving-istio.yaml with the following contents:

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
  annotations:
    serverless.openshift.io/default-enable-http2: "true"
spec:
  workloads:
    - name: net-istio-controller
      env:
        - container: controller
          envVars:
            - name: ENABLE_SECRET_INFORMER_FILTERING_BY_CERT_UID
              value: 'true'
    - annotations:
        sidecar.istio.io/inject: "true"
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
      name: activator
    - annotations:
        sidecar.istio.io/inject: "true"
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
      name: autoscaler
  ingress:
    istio:
      enabled: true
  config:
    features:
      kubernetes.podspec-affinity: enabled
      kubernetes.podspec-nodeselector: enabled
      kubernetes.podspec-tolerations: enabled
The preceding file defines a custom resource (CR) for a KnativeServing object. The CR also adds the following annotations to each of the activator and autoscaler pods:

- sidecar.istio.io/inject: "true" injects an Istio sidecar into the pod, making the pod part of the service mesh.
- sidecar.istio.io/rewriteAppHTTPProbers: "true" configures the Istio sidecar to rewrite the HTTP liveness and readiness probes for the pod.

Note: If you configure a custom domain for a Knative service, you can use a TLS certificate to secure the mapped service. To do this, you must create a TLS secret, and then update the DomainMapping CR to use the TLS secret that you have created. For more information, see Securing a mapped service using a TLS certificate in the Red Hat OpenShift Serverless documentation.

Optional: You can configure Knative Serving to allow zero initial scale for revisions by using the allow-zero-initial-scale autoscaler configuration:

spec:
  config:
    autoscaler:
      allow-zero-initial-scale: "true"
Optional: You can enable initContainers for your Knative Serving instance by adding the kubernetes.podspec-init-containers feature flag, as shown in the following example:

spec:
  config:
    features:
      kubernetes.podspec-init-containers: "enabled"

You can also add the flag in the config-features ConfigMap in the knative-serving namespace. For more information, see Feature and extension flags.

Create the KnativeServing object in the specified knative-serving namespace:

$ oc apply -f knativeserving-istio.yaml

You see the following output:

knativeserving.operator.knative.dev/knative-serving created
Verification
Review the default ServiceMeshMemberRoll object in the istio-system namespace:

$ oc describe smmr default -n istio-system

In the description of the ServiceMeshMemberRoll object, locate the Status.Members field and confirm that it includes the knative-serving namespace.

Verify creation of the Knative Serving instance as follows:
In the OpenShift CLI, enter the following command:
$ oc get pods -n knative-serving

The preceding command lists all running pods in the knative-serving project. This is the project in which you created the Knative Serving instance.

Confirm that there are numerous running pods in the knative-serving project, including activator, autoscaler, controller, and domain mapping pods, as well as pods for the Knative Istio controller, which controls the integration of OpenShift Serverless and OpenShift Service Mesh. An example is shown:

NAME                                     READY   STATUS    RESTARTS   AGE
activator-7586f6f744-nvdlb               2/2     Running   0          22h
activator-7586f6f744-sd77w               2/2     Running   0          22h
autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h
5.3.4. Creating secure gateways for Knative Serving
To secure traffic between your Knative Serving instance and the service mesh, you must create secure gateways for your Knative Serving instance.
The following procedure shows how to use OpenSSL version 3 or later to generate a wildcard certificate and key and then use them to create local and ingress gateways for Knative Serving.
If you have your own wildcard certificate and key to specify when configuring the gateways, you can skip to step 11 of this procedure.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
- You have created a Red Hat OpenShift Service Mesh instance.
- You have created a Knative Serving instance.
- If you intend to generate a wildcard certificate and key, you have downloaded and installed OpenSSL version 3 or later.
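If you are unsure which OpenSSL version is installed, you can check it before you begin:

$ openssl version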
Procedure
In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:
$ oc login <openshift_cluster_url> -u <admin_username> -p <password>

Important: If you have your own wildcard certificate and key to specify when configuring the gateways, skip to step 11 of this procedure.
Set environment variables to define base directories for generation of a wildcard certificate and key for the gateways.
$ export BASE_DIR=/tmp/kserve
$ export BASE_CERT_DIR=${BASE_DIR}/certs

Set an environment variable to define the common name used by the ingress controller of your OpenShift cluster:
$ export COMMON_NAME=$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}' | awk -F'.' '{print $(NF-1)"."$NF}')

Set an environment variable to define the domain name used by the ingress controller of your OpenShift cluster:
$ export DOMAIN_NAME=$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')

Create the required base directories for the certificate generation, based on the environment variables that you previously set:
$ mkdir ${BASE_DIR}
$ mkdir ${BASE_CERT_DIR}

Create the OpenSSL configuration for generation of a wildcard certificate:
$ cat <<EOF> ${BASE_DIR}/openssl-san.config
[ req ]
distinguished_name = req
[ san ]
subjectAltName = DNS:*.${DOMAIN_NAME}
EOF

Generate a root certificate:
$ openssl req -x509 -sha256 -nodes -days 3650 -newkey rsa:2048 \
    -subj "/O=Example Inc./CN=${COMMON_NAME}" \
    -keyout ${BASE_CERT_DIR}/root.key \
    -out ${BASE_CERT_DIR}/root.crt

Generate a wildcard certificate signed by the root certificate:
$ openssl req -x509 -newkey rsa:2048 \
    -sha256 -days 3560 -nodes \
    -subj "/CN=${COMMON_NAME}/O=Example Inc." \
    -extensions san -config ${BASE_DIR}/openssl-san.config \
    -CA ${BASE_CERT_DIR}/root.crt \
    -CAkey ${BASE_CERT_DIR}/root.key \
    -keyout ${BASE_CERT_DIR}/wildcard.key \
    -out ${BASE_CERT_DIR}/wildcard.crt

$ openssl x509 -in ${BASE_CERT_DIR}/wildcard.crt -text

Verify the wildcard certificate:
$ openssl verify -CAfile ${BASE_CERT_DIR}/root.crt ${BASE_CERT_DIR}/wildcard.crt

Export the wildcard key and certificate that were created by the preceding commands to new environment variables:
$ export TARGET_CUSTOM_CERT=${BASE_CERT_DIR}/wildcard.crt
$ export TARGET_CUSTOM_KEY=${BASE_CERT_DIR}/wildcard.key

Optional: To export your own wildcard key and certificate to new environment variables, enter the following commands:
$ export TARGET_CUSTOM_CERT=<path_to_certificate>
$ export TARGET_CUSTOM_KEY=<path_to_key>

Note: In the certificate that you provide, you must specify the domain name used by the ingress controller of your OpenShift cluster. You can check this value by running the following command:
$ oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'
Create a TLS secret in the istio-system namespace using the environment variables that you set for the wildcard certificate and key:

$ oc create secret tls wildcard-certs --cert=${TARGET_CUSTOM_CERT} --key=${TARGET_CUSTOM_KEY} -n istio-system
Create a gateways.yaml YAML file with the following contents:

apiVersion: v1
kind: Service
metadata:
  labels:
    experimental.istio.io/disable-gateway-port-translation: "true"
  name: knative-local-gateway
  namespace: istio-system
spec:
  ports:
    - name: http2
      port: 80
      protocol: TCP
      targetPort: 8081
  selector:
    knative: ingressgateway
  type: ClusterIP
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: knative-ingress-gateway
  namespace: knative-serving
spec:
  selector:
    knative: ingressgateway
  servers:
    - hosts:
        - '*'
      port:
        name: https
        number: 443
        protocol: HTTPS
      tls:
        credentialName: wildcard-certs
        mode: SIMPLE
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: knative-local-gateway
  namespace: knative-serving
spec:
  selector:
    knative: ingressgateway
  servers:
    - port:
        number: 8081
        name: https
        protocol: HTTPS
      tls:
        mode: ISTIO_MUTUAL
      hosts:
        - "*"
In this file, note the following details:
- The Service defines a service in the istio-system namespace for the Knative local gateway.
- The knative-ingress-gateway Gateway defines an ingress gateway in the knative-serving namespace. The gateway uses the TLS secret that you created earlier in this procedure. The ingress gateway handles external traffic to Knative.
- The knative-local-gateway Gateway defines a local gateway for Knative in the knative-serving namespace.
Apply the gateways.yaml file to create the defined resources:

$ oc apply -f gateways.yaml

You see the following output:
service/knative-local-gateway created
gateway.networking.istio.io/knative-ingress-gateway created
gateway.networking.istio.io/knative-local-gateway created
Verification
Review the gateways that you created:

$ oc get gateway --all-namespaces

Confirm that you see the local and ingress gateways that you created in the knative-serving namespace, as shown in the following example:
NAMESPACE         NAME                      AGE
knative-serving   knative-ingress-gateway   69s
knative-serving   knative-local-gateway     2m
5.3.5. Installing KServe
To complete manual installation of KServe, you must install the Red Hat OpenShift AI Operator. Then, you can configure the Operator to install KServe.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- Your cluster has a node with 4 CPUs and 16 GB memory.
- You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
- You have created a Red Hat OpenShift Service Mesh instance.
- You have created a Knative Serving instance.
- You have created secure gateways for Knative Serving.
- You have installed the Red Hat OpenShift AI Operator and created a DataScienceCluster object.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- For installation of KServe, configure the OpenShift Service Mesh component as follows:
- Click the DSC Initialization tab.
- Click the default-dsci object.
- Click the YAML tab.
- In the spec section, add and configure the serviceMesh component as shown:

spec:
  serviceMesh:
    managementState: Unmanaged

- Click Save.
For installation of KServe, configure the KServe and OpenShift Serverless components as follows:
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- Click the default-dsc DSC object.
- Click the YAML tab.
- In the spec.components section, configure the kserve component as shown:

spec:
  components:
    kserve:
      managementState: Managed

- Within the kserve component, add the serving component and configure it as shown:

spec:
  components:
    kserve:
      managementState: Managed
      serving:
        managementState: Unmanaged

- Click Save.
5.3.6. Configuring persistent volume claims (PVC) on KServe
Enable persistent volume claims (PVC) on your inference service so that you can provision persistent storage. For more information about PVCs, see Understanding persistent storage.
To enable PVC, from the OpenShift AI dashboard, select the Project drop-down and click knative-serving. Then, follow the steps in Enabling PVC support.
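For orientation, PVC support in Knative Serving is typically backed by feature flags of the following form; treat this snippet as an illustrative sketch only, and follow Enabling PVC support for the authoritative steps:

spec:
  config:
    features:
      kubernetes.podspec-persistent-volume-claim: "enabled"
      kubernetes.podspec-persistent-volume-write: "enabled"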
Verification
Verify that the inference service allows PVC as follows:
- In the OpenShift web console, switch to the Administrator perspective.
- Click Home → Search.
- In Resources, search for InferenceService.
. - Click the name of the inference service.
- Click the YAML tab.
Confirm that volumeMounts appears, similar to the following output:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    model:
      runtime: kserve-mlserver
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
      volumeMounts:
        - name: my-dynamic-volume
          mountPath: /tmp/data
    volumes:
      - name: my-dynamic-volume
        persistentVolumeClaim:
          claimName: my-dynamic-pvc
5.3.7. Disabling KServe dependencies
If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed), you must also disable the dependent Service Mesh component to avoid errors.
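For reference, disabling the KServe component in the DataScienceCluster object looks like the following (consistent with the kserve configuration shown in the preceding sections):

spec:
  components:
    kserve:
      managementState: Removed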
Prerequisites
- You have used the OpenShift command-line interface (CLI) or web console to disable the KServe component.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Disable the OpenShift Service Mesh component as follows:
- Click the DSC Initialization tab.
- Click the default-dsci object.
- Click the YAML tab.
- In the spec section, add the serviceMesh component (if it is not already present) and configure the managementState field as shown:

spec:
  serviceMesh:
    managementState: Removed

- Click Save.
Verification
In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator. The Operator details page opens.
- In the Conditions section, confirm that there is no ReconcileComplete condition with a status value of Unknown.
5.4. Adding an authorization provider for the single-model serving platform
You can add Authorino as an authorization provider for the single-model serving platform. Adding an authorization provider allows you to enable token authentication for models that you deploy on the platform, which ensures that only authorized parties can make inference requests to the models.
The method that you use to add Authorino as an authorization provider depends on how you install the single-model serving platform. The installation options for the platform are described as follows:
- Automated installation: If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and its dependencies. You can include Authorino as part of the automated installation process. For more information about automated installation, including Authorino, see Configuring automated installation of KServe.
- Manual installation: If you have already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you cannot configure the Red Hat OpenShift AI Operator to install KServe and its dependencies. In this situation, you must install KServe manually. You must also manually configure Authorino. For more information about manual installation, including Authorino, see Manually installing KServe.
Note: You can run KServe in Unmanaged mode during manual installations of the single-model serving platform. This mode is useful when you need more control over KServe components, such as modifying resource limits for the KServe controller.
5.4.1. Manually adding an authorization provider
You can add Authorino as an authorization provider for the single-model serving platform. Adding an authorization provider allows you to enable token authentication for models that you deploy on the platform, which ensures that only authorized parties can make inference requests to the models.
To manually add Authorino as an authorization provider, you must install the Red Hat - Authorino Operator, create an Authorino instance, and then configure the OpenShift Service Mesh and KServe components to use the instance.
To manually add an authorization provider, you must make configuration updates to your OpenShift Service Mesh instance. To ensure that your OpenShift Service Mesh instance remains in a supported state, make only the updates shown in this section.
Prerequisites
- You have reviewed the options for adding Authorino as an authorization provider and identified manual installation as the appropriate option. See Adding an authorization provider.
- You have manually installed KServe and its dependencies, including OpenShift Service Mesh. See Manually installing KServe.
- When you manually installed KServe, you set the value of the managementState field for the serviceMesh component to Unmanaged. This setting is required for manually adding Authorino. See Installing KServe.
5.4.2. Installing the Red Hat Authorino Operator
Before you can add Authorino as an authorization provider, you must install the Red Hat - Authorino Operator on your OpenShift cluster.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → OperatorHub.
- On the OperatorHub page, in the Filter by keyword field, type Red Hat - Authorino.
- Click the Red Hat - Authorino Operator.
- On the Red Hat - Authorino Operator page, review the Operator information and then click Install.
- On the Install Operator page, keep the default values for Installation mode, Installed Namespace, and Update Approval.
- For Update channel, select Stable.
- For Version, select 1.2.1 or later.
- Click Install.
Verification
In the OpenShift web console, click Operators → Installed Operators and confirm that the Red Hat - Authorino Operator shows one of the following statuses:
- Installing - installation is in progress; wait for this to change to Succeeded. This might take several minutes.
- Succeeded - installation is successful.
5.4.3. Creating an Authorino instance
When you have installed the Red Hat - Authorino Operator on your OpenShift cluster, you must create an Authorino instance.
Prerequisites
- You have installed the Red Hat - Authorino Operator.
- You have privileges to add resources to the project in which your OpenShift Service Mesh instance was created. See Creating an OpenShift Service Mesh instance. For more information about OpenShift permissions, see Using RBAC to define and apply permissions.
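To confirm in advance that your account can create the required resources, you can run a check such as the following; servicemeshmembers is the resource behind the ServiceMeshMember objects used later in this procedure, and the namespace is an example:

$ oc auth can-i create servicemeshmembers -n istio-system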
Procedure
- Open a new terminal window.
Log in to the OpenShift command-line interface (CLI) as follows:
$ oc login <openshift_cluster_url> -u <username> -p <password>

Create a namespace to install the Authorino instance:
$ oc new-project <namespace_for_authorino_instance>

Note: The automated installation process creates a namespace called redhat-ods-applications-auth-provider for the Authorino instance. Consider using the same namespace name for the manual installation.

To enroll the new namespace for the Authorino instance in your existing OpenShift Service Mesh instance, create a new YAML file with the following contents:
apiVersion: maistra.io/v1
kind: ServiceMeshMember
metadata:
  name: default
  namespace: <namespace_for_authorino_instance>
spec:
  controlPlaneRef:
    namespace: <namespace_for_service_mesh_instance>
    name: <name_of_service_mesh_instance>

- Save the YAML file.
Create the ServiceMeshMember resource on your cluster:

$ oc create -f <file_name>.yaml

To configure an Authorino instance, create a new YAML file as shown in the following example:
apiVersion: operator.authorino.kuadrant.io/v1beta1
kind: Authorino
metadata:
  name: authorino
  namespace: <namespace_for_authorino_instance>
spec:
  authConfigLabelSelectors: security.opendatahub.io/authorization-group=default
  clusterWide: true
  listener:
    tls:
      enabled: false
  oidcServer:
    tls:
      enabled: false

- Save the YAML file.
Create the Authorino resource on your cluster:

$ oc create -f <file_name>.yaml

Patch the Authorino deployment to inject an Istio sidecar, which makes the Authorino instance part of your OpenShift Service Mesh instance:
$ oc patch deployment <name_of_authorino_instance> -n <namespace_for_authorino_instance> -p '{"spec": {"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"true"}}}} }'
Verification
Confirm that the Authorino instance is running as follows:
Check the pods (and containers) that are running in the namespace that you created for the Authorino instance, as shown in the following example:
$ oc get pods -n redhat-ods-applications-auth-provider -o="custom-columns=NAME:.metadata.name,STATUS:.status.phase,CONTAINERS:.spec.containers[*].name"

Confirm that the output resembles the following example:
NAME                         STATUS    CONTAINERS
authorino-6bc64bd667-kn28z   Running   authorino,istio-proxy

As shown in the example, there is a single running pod for the Authorino instance. The pod has containers for Authorino and for the Istio sidecar that you injected.
5.4.4. Configuring an OpenShift Service Mesh instance to use Authorino
When you have created an Authorino instance, you must configure your OpenShift Service Mesh instance to use Authorino as an authorization provider.
To ensure that your OpenShift Service Mesh instance remains in a supported state, make only the configuration updates shown in the following procedure.
Prerequisites
- You have created an Authorino instance and enrolled the namespace for the Authorino instance in your OpenShift Service Mesh instance.
- You have privileges to modify the OpenShift Service Mesh instance. See Creating an OpenShift Service Mesh instance.
Procedure
In a terminal window, if you are not already logged in to your OpenShift cluster as a user that has privileges to update the OpenShift Service Mesh instance, log in to the OpenShift CLI as shown in the following example:
$ oc login <openshift_cluster_url> -u <username> -p <password>

Create a new YAML file with the following contents:
spec:
  techPreview:
    meshConfig:
      extensionProviders:
        - name: redhat-ods-applications-auth-provider
          envoyExtAuthzGrpc:
            service: <name_of_authorino_instance>-authorino-authorization.<namespace_for_authorino_instance>.svc.cluster.local
            port: 50051

- Save the YAML file.
Use the oc patch command to apply the YAML file to your OpenShift Service Mesh instance:

$ oc patch smcp <name_of_service_mesh_instance> --type merge -n <namespace_for_service_mesh_instance> --patch-file <file_name>.yaml

Important: You can apply the configuration shown as a patch only if you have not already specified other extension providers in your OpenShift Service Mesh instance. If you have already specified other extension providers, you must manually edit your ServiceMeshControlPlane resource to add the configuration.
Verification
Verify that your Authorino instance has been added as an extension provider in your OpenShift Service Mesh configuration as follows:
Inspect the ConfigMap object for your OpenShift Service Mesh instance:

$ oc get configmap istio-<name_of_service_mesh_instance> -n <namespace_for_service_mesh_instance> --output=jsonpath={.data.mesh}

Confirm that you see output similar to the following example, which shows that the Authorino instance has been successfully added as an extension provider:
defaultConfig:
  discoveryAddress: istiod-data-science-smcp.istio-system.svc:15012
  proxyMetadata:
    ISTIO_META_DNS_AUTO_ALLOCATE: "true"
    ISTIO_META_DNS_CAPTURE: "true"
    PROXY_XDS_VIA_AGENT: "true"
  terminationDrainDuration: 35s
  tracing: {}
dnsRefreshRate: 300s
enablePrometheusMerge: true
extensionProviders:
- envoyExtAuthzGrpc:
    port: 50051
    service: authorino-authorino-authorization.opendatahub-auth-provider.svc.cluster.local
  name: opendatahub-auth-provider
ingressControllerMode: "OFF"
rootNamespace: istio-system
trustDomain: null
5.4.5. Configuring authorization for KServe
To configure the single-model serving platform to use Authorino, you must create a global AuthorizationPolicy resource that is applied to the KServe predictor pods that are created when you deploy a model. In addition, to account for the multiple network hops that occur when you make an inference request to a model, you must create an EnvoyFilter resource that continually resets the HTTP host header to the one initially included in the inference request.
Prerequisites
- You have created an Authorino instance and configured your OpenShift Service Mesh to use it.
- You have privileges to update the KServe deployment on your cluster.
- You have privileges to add resources to the project in which your OpenShift Service Mesh instance was created. See Creating an OpenShift Service Mesh instance.
Procedure
In a terminal window, if you are not already logged in to your OpenShift cluster as a user that has privileges to update the KServe deployment, log in to the OpenShift CLI as shown in the following example:
$ oc login <openshift_cluster_url> -u <username> -p <password>

Create a new YAML file with the following contents:
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: kserve-predictor
spec:
  action: CUSTOM
  provider:
    name: redhat-ods-applications-auth-provider
  rules:
    - to:
        - operation:
            notPaths:
              - /healthz
              - /debug/pprof/
              - /metrics
              - /wait-for-drain
  selector:
    matchLabels:
      component: predictor

The provider name that you specify must match the name of the extension provider that you added to your OpenShift Service Mesh instance.
- Save the YAML file.
Create the AuthorizationPolicy resource in the namespace for your OpenShift Service Mesh instance:

$ oc create -n <namespace_for_service_mesh_instance> -f <file_name>.yaml

Create another new YAML file with the following contents:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: activator-host-header
spec:
  priority: 20
  workloadSelector:
    labels:
      component: predictor
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.lua
          typed_config:
            '@type': type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
            inlineCode: |
              function envoy_on_request(request_handle)
                local headers = request_handle:headers()
                if not headers then
                  return
                end
                local original_host = headers:get("k-original-host")
                if original_host then
                  -- strip any port suffix before restoring the host header
                  port_separator = string.find(original_host, ":", 7)
                  if port_separator then
                    original_host = string.sub(original_host, 0, port_separator - 1)
                  end
                  headers:replace('host', original_host)
                end
              end

The EnvoyFilter resource shown continually resets the HTTP host header to the one initially included in any inference request.

Create the EnvoyFilter resource in the namespace for your OpenShift Service Mesh instance:

$ oc create -n <namespace_for_service_mesh_instance> -f <file_name>.yaml
Verification
Check that the AuthorizationPolicy resource was successfully created:

$ oc get authorizationpolicies -n <namespace_for_service_mesh_instance>

Confirm that you see output similar to the following example:
NAME               AGE
kserve-predictor   28h
EnvoyFilter
resource was successfully created.oc get envoyfilter -n <namespace_for_service_mesh_instance>
$ oc get envoyfilter -n <namespace_for_service_mesh_instance>
Copy to Clipboard Copied! Confirm that you see output similar to the following example:
NAME                    AGE
activator-host-header   28h