Chapter 5. Installing the single-model serving platform


5.1. About the single-model serving platform

For deploying large models such as large language models (LLMs), OpenShift AI includes a single-model serving platform that is based on the KServe component. To install the single-model serving platform, the following components are required:

  • KServe: A Kubernetes custom resource definition (CRD) that orchestrates model serving for all types of models. KServe includes model-serving runtimes that implement the loading of given types of model servers. KServe also handles the lifecycle of the deployment object, storage access, and networking setup.
  • Red Hat OpenShift Serverless: A cloud-native development model that allows for serverless deployments of models. OpenShift Serverless is based on the open source Knative project.
  • Red Hat OpenShift Service Mesh: A service mesh networking layer that manages traffic flows and enforces access policies. OpenShift Service Mesh is based on the open source Istio project.

    Note

    Currently, only OpenShift Service Mesh v2 is supported. For more information, see Supported Configurations.

You can install the single-model serving platform manually or in an automated fashion:

Automated installation
If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies. For more information, see Configuring automated installation of KServe.
Manual installation
If you have already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you cannot configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies. In this situation, you must install KServe manually. For more information, see Manually installing KServe.
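Before choosing a path, you can check whether either resource already exists on your cluster; a quick sketch, assuming the `oc` CLI and cluster administrator access:

```shell
# If either command returns resources, the automated installation path is
# unavailable and you must install KServe manually.
oc get servicemeshcontrolplane --all-namespaces
oc get knativeserving --all-namespaces
```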

5.2. Configuring automated installation of KServe

If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and configure its dependencies.

You can configure KServe in advanced or standard deployment mode. For more information, see About KServe deployment modes. If you configure KServe for advanced deployment mode, you can set up your data science project to serve models in both advanced and standard deployment mode. However, if you configure KServe for only standard deployment mode, you can only use standard deployment mode.
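As a reference sketch (not a complete DataScienceCluster spec), the mode is selected by the defaultDeploymentMode field of the kserve component, as detailed in the procedures that follow:

```yaml
# Sketch only: Serverless selects advanced deployment mode,
# RawDeployment selects standard deployment mode.
spec:
  components:
    kserve:
      managementState: Managed
      defaultDeploymentMode: Serverless   # or RawDeployment
```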

Important

If you have created a ServiceMeshControlPlane or KNativeServing resource on your cluster, the Red Hat OpenShift AI Operator cannot install KServe and configure its dependencies and the installation does not proceed. In this situation, you must follow the manual installation instructions to install KServe.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • Your cluster has a node with 4 CPUs and 16 GB memory.
  • You have downloaded and installed the OpenShift command-line interface (CLI). For more information, see Installing the OpenShift CLI.
  • (Advanced deployment mode): You have installed the Red Hat OpenShift Service Mesh Operator and dependent Operators.

    Note

    To enable automated installation of KServe, install only the required Operators for Red Hat OpenShift Service Mesh. Do not perform any additional configuration or create a ServiceMeshControlPlane resource.

  • (Advanced deployment mode): You have installed the Red Hat OpenShift Serverless Operator.

    Note

    To enable automated installation of KServe, install only the Red Hat OpenShift Serverless Operator. Do not perform any additional configuration or create a KNativeServing resource.

  • You have installed the Red Hat OpenShift AI Operator and created a DataScienceCluster object.
  • (Advanced deployment mode): To add Authorino as an authorization provider so that you can enable token authentication for deployed models, you have installed the Red Hat - Authorino Operator. See Installing the Authorino Operator.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Install OpenShift Service Mesh as follows:

    1. Click the DSC Initialization tab.
    2. Click the default-dsci object.
    3. Click the YAML tab.
    4. (Advanced deployment mode): In the spec section, validate that the value of the managementState field for the serviceMesh component is set to Managed, as shown:

      spec:
        applicationsNamespace: redhat-ods-applications
        monitoring:
          managementState: Managed
          namespace: redhat-ods-monitoring
        serviceMesh:
          controlPlane:
            metricsCollection: Istio
            name: data-science-smcp
            namespace: istio-system
          managementState: Managed
    5. (Standard deployment mode): In the spec section, validate that the value of the managementState field for the serviceMesh component is set to Removed, as shown:

      spec:
        applicationsNamespace: redhat-ods-applications
        monitoring:
          managementState: Managed
          namespace: redhat-ods-monitoring
        serviceMesh:
          controlPlane:
            metricsCollection: Istio
            name: data-science-smcp
            namespace: istio-system
          managementState: Removed
      Note

      Do not change the istio-system namespace that is specified for the serviceMesh component by default. Other namespace values are not supported.

    6. Click Save.

      Based on the configuration you added to the DSCInitialization object, the Red Hat OpenShift AI Operator installs OpenShift Service Mesh.

  4. (Standard deployment mode only): Install KServe as follows:

    1. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
    2. Click the Data Science Cluster tab.
    3. Click the default-dsc DSC object.
    4. Click the YAML tab.
    5. In the spec.components section, configure the kserve component as shown:

      kserve:
        defaultDeploymentMode: RawDeployment
        RawDeploymentServiceConfig: Headed 1
        managementState: Managed
        serving:
          managementState: Removed 2
          name: knative-serving
    6. Click Save.

      The preceding configuration installs KServe in standard deployment mode, which is based on the KServe RawDeployment feature. In this configuration, observe the following details:

      1
      The Headed setting lets the cluster perform normal load balancing across workload replicas. For environments where inference request load balancing is done on the client side, set RawDeploymentServiceConfig to Headless.
      2
      The managementState of the serving component is set to Removed because OpenShift Serverless is not used in standard deployment mode.
  5. (Advanced deployment mode): Install both KServe and OpenShift Serverless as follows:

    1. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
    2. Click the Data Science Cluster tab.
    3. Click the default-dsc DSC object.
    4. Click the YAML tab.
    5. In the spec.components section, configure the kserve component as shown:

      spec:
        components:
          kserve:
            managementState: Managed
            defaultDeploymentMode: Serverless 1
            RawDeploymentServiceConfig: Headed 2
            serving:
              ingressGateway:
                certificate:
                  secretName: knative-serving-cert 3
                  type: OpenshiftDefaultIngress 4
              managementState: Managed
              name: knative-serving
    6. Click Save.

      The preceding configuration creates an ingress gateway for OpenShift Serverless to receive traffic from OpenShift Service Mesh. In this configuration, both standard and advanced modes can be used.

      1
      The defaultDeploymentMode field sets the deployment mode that is used by default when you create and deploy a model with KServe. To use standard mode as the default, set defaultDeploymentMode to RawDeployment. To use advanced mode as the default, set defaultDeploymentMode to Serverless.
      2
      The Headed setting lets the cluster perform normal load balancing across workload replicas. For environments where inference request load balancing is done on the client side, set RawDeploymentServiceConfig to Headless.
      3
      The configuration shown uses the default ingress certificate configured for OpenShift to secure incoming traffic to your OpenShift cluster and stores the certificate in the knative-serving-cert secret that is specified in the secretName field. The secretName field can only be set at the time of installation. The default value of the secretName field is knative-serving-cert. Subsequent changes to the certificate secret must be made manually. If you did not use the default secretName value during installation, create a new secret named knative-serving-cert in the istio-system namespace, and then restart the istiod-data-science-smcp-<suffix> pod.
      4
      You can specify the following certificate types by updating the value of the type field:
      • Provided
      • SelfSigned
      • OpenshiftDefaultIngress

        To use a self-signed certificate or to provide your own, update the value of the secretName field to specify your secret name and change the value of the type field to SelfSigned or Provided.

        Note

        If you provide your own certificate, the certificate must specify the domain name used by the ingress controller of your OpenShift cluster. You can check this value by running the following command:

        $ oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'

      • You must set the value of the managementState field to Managed for both the kserve and serving components. Setting kserve.managementState to Managed triggers automated installation of KServe. Setting serving.managementState to Managed triggers automated installation of OpenShift Serverless. However, installation of OpenShift Serverless will not be triggered if kserve.managementState is not also set to Managed.

Verification

  • Verify installation of KServe as follows:

    • In the web console, click Workloads → Pods.
    • From the project list, select redhat-ods-applications. This is the project in which OpenShift AI components are installed, including KServe.
    • Confirm that the project includes a running pod for the KServe controller manager, similar to the following example:

      NAME                                          READY   STATUS    RESTARTS   AGE
      kserve-controller-manager-7fbb7bccd4-t4c5g    1/1     Running   0          22h
      odh-model-controller-6c4759cc9b-cftmk         1/1     Running   0          129m
      odh-model-controller-6c4759cc9b-ngj8b         1/1     Running   0          129m
      odh-model-controller-6c4759cc9b-vnhq5         1/1     Running   0          129m
  • (Advanced deployment mode only): Verify installation of OpenShift Service Mesh as follows:

    • In the web console, click Workloads → Pods.
    • From the project list, select istio-system. This is the project in which OpenShift Service Mesh is installed.
    • Confirm that there are running pods for the service mesh control plane, ingress gateway, and egress gateway. These pods have the naming patterns shown in the following example:

      NAME                                       READY   STATUS    RESTARTS   AGE
      istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
      istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
      istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h
  • (Advanced deployment mode only): Verify installation of OpenShift Serverless as follows:

    • In the web console, click Workloads → Pods.
    • From the project list, select knative-serving. This is the project in which OpenShift Serverless is installed.
    • Confirm that there are numerous running pods in the knative-serving project, including activator, autoscaler, controller, and domain mapping pods, as well as pods for the Knative Istio controller (which controls the integration of OpenShift Serverless and OpenShift Service Mesh). An example is shown.

      NAME                                     READY   STATUS    RESTARTS   AGE
      activator-7586f6f744-nvdlb               2/2     Running   0          22h
      activator-7586f6f744-sd77w               2/2     Running   0          22h
      autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
      autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
      autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
      autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
      controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
      controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
      domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
      domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
      domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
      domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
      net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
      net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
      net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
      net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
      webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
      webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h

5.3. Manually installing KServe

If you have already installed the Red Hat OpenShift Service Mesh Operator and created a ServiceMeshControlPlane resource or if you have installed the Red Hat OpenShift Serverless Operator and created a KNativeServing resource, the Red Hat OpenShift AI Operator cannot install KServe and configure its dependencies. In this situation, you must install KServe manually.

Important

The procedures in this section show how to perform a new installation of KServe and its dependencies and are intended as a complete installation and configuration reference. If you have already installed and configured OpenShift Service Mesh or OpenShift Serverless, you might not need to follow all steps. If you are unsure about what updates to apply to your existing configuration to use KServe, contact Red Hat Support.

5.3.1. Installing KServe dependencies

Before you install KServe, you must install and configure some dependencies. Specifically, you must create Red Hat OpenShift Service Mesh and Knative Serving instances and then configure secure gateways for Knative Serving.

Note

Currently, only OpenShift Service Mesh v2 is supported. For more information, see Supported Configurations.

5.3.2. Creating an OpenShift Service Mesh instance

The following procedure shows how to create a Red Hat OpenShift Service Mesh instance.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • Your cluster has a node with 4 CPUs and 16 GB memory.
  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
  • You have installed the Red Hat OpenShift Service Mesh Operator and dependent Operators.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. Create the required namespace for Red Hat OpenShift Service Mesh.

    $ oc create ns istio-system

    You see the following output:

    namespace/istio-system created
  3. Define a ServiceMeshControlPlane object in a YAML file named smcp.yaml with the following contents:

    apiVersion: maistra.io/v2
    kind: ServiceMeshControlPlane
    metadata:
      name: minimal
      namespace: istio-system
    spec:
      tracing:
        type: None
      addons:
        grafana:
          enabled: false
        kiali:
          name: kiali
          enabled: false
        prometheus:
          enabled: false
        jaeger:
          name: jaeger
      security:
        dataPlane:
          mtls: true
        identity:
          type: ThirdParty
      techPreview:
        meshConfig:
          defaultConfig:
            terminationDrainDuration: 35s
      gateways:
        ingress:
          service:
            metadata:
              labels:
                knative: ingressgateway
      proxy:
        networking:
          trafficControl:
            inbound:
              excludedPorts:
                - 8444
                - 8022

    For more information about the values in the YAML file, see the Service Mesh control plane configuration reference.

  4. Create the service mesh control plane.

    $ oc apply -f smcp.yaml

Verification

  • Verify creation of the service mesh instance as follows:

    • In the OpenShift CLI, enter the following command:

      $ oc get pods -n istio-system

      The preceding command lists all running pods in the istio-system project. This is the project in which OpenShift Service Mesh is installed.

    • Confirm that there are running pods for the service mesh control plane, ingress gateway, and egress gateway. These pods have the following naming patterns:

      NAME                                       READY   STATUS    RESTARTS   AGE
      istio-egressgateway-7c46668687-fzsqj       1/1     Running   0          22h
      istio-ingressgateway-77f94d8f85-fhsp9      1/1     Running   0          22h
      istiod-data-science-smcp-cc8cfd9b8-2rkg4   1/1     Running   0          22h

5.3.3. Creating a Knative Serving instance

The following procedure shows how to install Knative Serving and then create an instance.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • Your cluster has a node with 4 CPUs and 16 GB memory.
  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
  • You have created a Red Hat OpenShift Service Mesh instance.
  • You have installed the Red Hat OpenShift Serverless Operator.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. Check whether the required project (that is, namespace) for Knative Serving already exists.

    $ oc get ns knative-serving

    If the project exists, you see output similar to the following example:

    NAME              STATUS   AGE
    knative-serving   Active   4d20h
  3. If the knative-serving project doesn’t already exist, create it.

    $ oc create ns knative-serving

    You see the following output:

    namespace/knative-serving created
  4. Define a ServiceMeshMember object in a YAML file called default-smm.yaml with the following contents:

    apiVersion: maistra.io/v1
    kind: ServiceMeshMember
    metadata:
      name: default
      namespace: knative-serving
    spec:
      controlPlaneRef:
        namespace: istio-system
        name: minimal
  5. Create the ServiceMeshMember object in the istio-system namespace.

    $ oc apply -f default-smm.yaml

    You see the following output:

    servicemeshmember.maistra.io/default created
  6. Define a KnativeServing object in a YAML file called knativeserving-istio.yaml with the following contents:

    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
      namespace: knative-serving
      annotations:
        serverless.openshift.io/default-enable-http2: "true"
    spec:
      workloads:
        - name: net-istio-controller
          env:
            - container: controller
              envVars:
                - name: ENABLE_SECRET_INFORMER_FILTERING_BY_CERT_UID
                  value: 'true'
        - annotations:
            sidecar.istio.io/inject: "true" 1
            sidecar.istio.io/rewriteAppHTTPProbers: "true" 2
          name: activator
        - annotations:
            sidecar.istio.io/inject: "true"
            sidecar.istio.io/rewriteAppHTTPProbers: "true"
          name: autoscaler
      ingress:
        istio:
          enabled: true
      config:
        features:
          kubernetes.podspec-affinity: enabled
          kubernetes.podspec-nodeselector: enabled
          kubernetes.podspec-tolerations: enabled

    The preceding file defines a custom resource (CR) for a KnativeServing object. The CR also applies the following configuration to each of the activator and autoscaler pods:

    1
    Injects an Istio sidecar to the pod. This makes the pod part of the service mesh.
    2
    Enables the Istio sidecar to rewrite the HTTP liveness and readiness probes for the pod.
    Note

    If you configure a custom domain for a Knative service, you can use a TLS certificate to secure the mapped service. To do this, you must create a TLS secret, and then update the DomainMapping CR to use the TLS secret that you have created. For more information, see Securing a mapped service using a TLS certificate in the Red Hat OpenShift Serverless documentation.
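    As a sketch of that approach, a DomainMapping CR references the TLS secret in its spec.tls section. The names below are placeholders, not values from this procedure:

    ```yaml
    apiVersion: serving.knative.dev/v1beta1
    kind: DomainMapping
    metadata:
      name: <custom-domain>           # the custom domain (placeholder)
      namespace: <namespace>
    spec:
      ref:
        name: <knative-service-name>  # the Knative service to map (placeholder)
        kind: Service
        apiVersion: serving.knative.dev/v1
      tls:
        secretName: <tls-secret-name> # the TLS secret you created (placeholder)
    ```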

  7. Create the KnativeServing object in the specified knative-serving namespace.

    $ oc apply -f knativeserving-istio.yaml

    You see the following output:

    knativeserving.operator.knative.dev/knative-serving created

Verification

  • Review the default ServiceMeshMemberRoll object in the istio-system namespace.

    $ oc describe smmr default -n istio-system

    In the description of the ServiceMeshMemberRoll object, locate the Status.Members field and confirm that it includes the knative-serving namespace.

  • Verify creation of the Knative Serving instance as follows:

    • In the OpenShift CLI, enter the following command:

      $ oc get pods -n knative-serving

      The preceding command lists all running pods in the knative-serving project. This is the project in which you created the Knative Serving instance.

    • Confirm that there are numerous running pods in the knative-serving project, including activator, autoscaler, controller, and domain mapping pods, as well as pods for the Knative Istio controller, which controls the integration of OpenShift Serverless and OpenShift Service Mesh. An example is shown.

      NAME                                     READY   STATUS    RESTARTS   AGE
      activator-7586f6f744-nvdlb               2/2     Running   0          22h
      activator-7586f6f744-sd77w               2/2     Running   0          22h
      autoscaler-764fdf5d45-p2v98              2/2     Running   0          22h
      autoscaler-764fdf5d45-x7dc6              2/2     Running   0          22h
      autoscaler-hpa-7c7c4cd96d-2lkzg          1/1     Running   0          22h
      autoscaler-hpa-7c7c4cd96d-gks9j          1/1     Running   0          22h
      controller-5fdfc9567c-6cj9d              1/1     Running   0          22h
      controller-5fdfc9567c-bf5x7              1/1     Running   0          22h
      domain-mapping-56ccd85968-2hjvp          1/1     Running   0          22h
      domain-mapping-56ccd85968-lg6mw          1/1     Running   0          22h
      domainmapping-webhook-769b88695c-gp2hk   1/1     Running   0          22h
      domainmapping-webhook-769b88695c-npn8g   1/1     Running   0          22h
      net-istio-controller-7dfc6f668c-jb4xk    1/1     Running   0          22h
      net-istio-controller-7dfc6f668c-jxs5p    1/1     Running   0          22h
      net-istio-webhook-66d8f75d6f-bgd5r       1/1     Running   0          22h
      net-istio-webhook-66d8f75d6f-hld75       1/1     Running   0          22h
      webhook-7d49878bc4-8xjbr                 1/1     Running   0          22h
      webhook-7d49878bc4-s4xx4                 1/1     Running   0          22h

5.3.4. Creating secure gateways for Knative Serving

To secure traffic between your Knative Serving instance and the service mesh, you must create secure gateways for your Knative Serving instance.

The following procedure shows how to use OpenSSL version 3 or later to generate a wildcard certificate and key and then use them to create local and ingress gateways for Knative Serving.

Important

If you have your own wildcard certificate and key to specify when configuring the gateways, you can skip to step 11 of this procedure.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
  • You have created a Red Hat OpenShift Service Mesh instance.
  • You have created a Knative Serving instance.
  • If you intend to generate a wildcard certificate and key, you have downloaded and installed OpenSSL version 3 or later.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
    Important

    If you have your own wildcard certificate and key to specify when configuring the gateways, skip to step 11 of this procedure.

  2. Set environment variables to define base directories for generation of a wildcard certificate and key for the gateways.

    $ export BASE_DIR=/tmp/kserve
    $ export BASE_CERT_DIR=${BASE_DIR}/certs
  3. Set an environment variable to define the common name used by the ingress controller of your OpenShift cluster.

    $ export COMMON_NAME=$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}' | awk -F'.' '{print $(NF-1)"."$NF}')
  4. Set an environment variable to define the domain name used by the ingress controller of your OpenShift cluster.

    $ export DOMAIN_NAME=$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')
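    The awk filter in the COMMON_NAME step keeps only the last two DNS labels of the ingress domain. An illustration with a sample domain, not taken from a real cluster:

    ```shell
    # Sample values only: a real cluster domain typically looks like
    # apps.<cluster-name>.<base-domain>.
    DOMAIN_NAME="apps.mycluster.example.com"
    COMMON_NAME=$(echo "${DOMAIN_NAME}" | awk -F'.' '{print $(NF-1)"."$NF}')
    echo "${COMMON_NAME}"   # example.com
    ```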
  5. Create the required base directories for the certificate generation, based on the environment variables that you previously set.

    $ mkdir ${BASE_DIR}
    $ mkdir ${BASE_CERT_DIR}
  6. Create the OpenSSL configuration for generation of a wildcard certificate.

    $ cat <<EOF> ${BASE_DIR}/openssl-san.config
    [ req ]
    distinguished_name = req
    [ san ]
    subjectAltName = DNS:*.${DOMAIN_NAME}
    EOF
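    With the same sample domain used for illustration, the heredoc expands to a configuration whose subjectAltName covers one wildcard level under the ingress domain:

    ```shell
    # Sample expansion only; the procedure writes this file to ${BASE_DIR}.
    BASE_DIR=$(mktemp -d)
    DOMAIN_NAME="apps.mycluster.example.com"
    cat <<EOF> ${BASE_DIR}/openssl-san.config
    [ req ]
    distinguished_name = req
    [ san ]
    subjectAltName = DNS:*.${DOMAIN_NAME}
    EOF
    grep subjectAltName ${BASE_DIR}/openssl-san.config
    ```

    The resulting entry is subjectAltName = DNS:*.apps.mycluster.example.com, so the certificate matches hosts one level under the ingress domain.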
  7. Generate a root certificate.

    $ openssl req -x509 -sha256 -nodes -days 3650 -newkey rsa:2048 \
    -subj "/O=Example Inc./CN=${COMMON_NAME}" \
    -keyout ${BASE_CERT_DIR}/root.key \
    -out ${BASE_CERT_DIR}/root.crt
  8. Generate a wildcard certificate signed by the root certificate.

    $ openssl req -x509 -newkey rsa:2048 \
    -sha256 -days 3650 -nodes \
    -subj "/CN=${COMMON_NAME}/O=Example Inc." \
    -extensions san -config ${BASE_DIR}/openssl-san.config \
    -CA ${BASE_CERT_DIR}/root.crt \
    -CAkey ${BASE_CERT_DIR}/root.key \
    -keyout ${BASE_CERT_DIR}/wildcard.key \
    -out ${BASE_CERT_DIR}/wildcard.crt

    $ openssl x509 -in ${BASE_CERT_DIR}/wildcard.crt -text
  9. Verify the wildcard certificate.

    $ openssl verify -CAfile ${BASE_CERT_DIR}/root.crt ${BASE_CERT_DIR}/wildcard.crt
  10. Export the wildcard key and certificate that you created in the preceding steps to new environment variables.

    $ export TARGET_CUSTOM_CERT=${BASE_CERT_DIR}/wildcard.crt
    $ export TARGET_CUSTOM_KEY=${BASE_CERT_DIR}/wildcard.key
  11. Optional: To export your own wildcard key and certificate to new environment variables, enter the following commands:

    $ export TARGET_CUSTOM_CERT=<path_to_certificate>
    $ export TARGET_CUSTOM_KEY=<path_to_key>
    Note

    In the certificate that you provide, you must specify the domain name used by the ingress controller of your OpenShift cluster. You can check this value by running the following command:

    $ oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'

  12. Create a TLS secret in the istio-system namespace using the environment variables that you set for the wildcard certificate and key.

    $ oc create secret tls wildcard-certs --cert=${TARGET_CUSTOM_CERT} --key=${TARGET_CUSTOM_KEY} -n istio-system
  13. Create a gateways.yaml YAML file with the following contents:

    apiVersion: v1
    kind: Service 1
    metadata:
      labels:
        experimental.istio.io/disable-gateway-port-translation: "true"
      name: knative-local-gateway
      namespace: istio-system
    spec:
      ports:
        - name: http2
          port: 80
          protocol: TCP
          targetPort: 8081
      selector:
        knative: ingressgateway
      type: ClusterIP
    ---
    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: knative-ingress-gateway 2
      namespace: knative-serving
    spec:
      selector:
        knative: ingressgateway
      servers:
        - hosts:
            - '*'
          port:
            name: https
            number: 443
            protocol: HTTPS
          tls:
            credentialName: wildcard-certs
            mode: SIMPLE
    ---
    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: knative-local-gateway 3
      namespace: knative-serving
    spec:
      selector:
        knative: ingressgateway
      servers:
        - port:
            number: 8081
            name: https
            protocol: HTTPS
          tls:
            mode: ISTIO_MUTUAL
          hosts:
            - "*"
    1
    Defines a service in the istio-system namespace for the Knative local gateway.
    2
    Defines an ingress gateway in the knative-serving namespace. The gateway uses the TLS secret you created earlier in this procedure. The ingress gateway handles external traffic to Knative.
    3
    Defines a local gateway for Knative in the knative-serving namespace.
  14. Apply the gateways.yaml file to create the defined resources.

    $ oc apply -f gateways.yaml

    You see the following output:

    service/knative-local-gateway created
    gateway.networking.istio.io/knative-ingress-gateway created
    gateway.networking.istio.io/knative-local-gateway created

Verification

  • Review the gateways that you created.

    $ oc get gateway --all-namespaces

    Confirm that you see the local and ingress gateways that you created in the knative-serving namespace, as shown in the following example:

    NAMESPACE         NAME                      AGE
    knative-serving   knative-ingress-gateway   69s
    knative-serving   knative-local-gateway     2m

5.3.5. Installing KServe

To complete manual installation of KServe, you must install the Red Hat OpenShift AI Operator. Then, you can configure the Operator to install KServe.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • Your cluster has a node with 4 CPUs and 16 GB memory.
  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
  • You have created a Red Hat OpenShift Service Mesh instance.
  • You have created a Knative Serving instance.
  • You have created secure gateways for Knative Serving.
  • You have installed the Red Hat OpenShift AI Operator and created a DataScienceCluster object.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. For installation of KServe, configure the OpenShift Service Mesh component as follows:

    1. Click the DSC Initialization tab.
    2. Click the default-dsci object.
    3. Click the YAML tab.
    4. In the spec section, add and configure the serviceMesh component as shown:

      spec:
        serviceMesh:
          managementState: Unmanaged
    5. Click Save.
  4. For installation of KServe, configure the KServe and OpenShift Serverless components as follows:

    1. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
    2. Click the Data Science Cluster tab.
    3. Click the default-dsc DSC object.
    4. Click the YAML tab.
    5. In the spec.components section, configure the kserve component as shown:

      spec:
        components:
          kserve:
            managementState: Managed
    6. Within the kserve component, add the serving component, and configure it as shown:

      spec:
        components:
          kserve:
            managementState: Managed
            serving:
              managementState: Unmanaged
    7. Click Save.

5.3.6. Configuring persistent volume claims (PVC) on KServe

Enable persistent volume claims (PVCs) on your inference service so that you can provision persistent storage. For more information about PVCs, see Understanding persistent storage.

To enable PVC, from the OpenShift AI dashboard, select the Project drop-down and click knative-serving. Then, follow the steps in Enabling PVC support.
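For reference, a claim that an inference service can mount might look like the following. This is a minimal sketch: the claim name matches the my-dynamic-pvc value shown in the verification output below, but the access mode and storage size are example values that you should adjust for your workload.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce      # example value; use the mode your storage class supports
  resources:
    requests:
      storage: 2Gi       # example value; size for your model artifacts
```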

Verification

Verify that the inference service allows PVC as follows:

  • In the OpenShift web console, switch to the Administrator perspective.
  • Click Home → Search.
  • In Resources, search for InferenceService.
  • Click the name of the inference service.
  • Click the YAML tab.
  • Confirm that volumeMounts appears, similar to the following output:

    apiVersion: "serving.kserve.io/v1beta1"
    kind: "InferenceService"
    metadata:
      name: "sklearn-iris"
    spec:
      predictor:
        model:
          runtime: kserve-mlserver
          modelFormat:
            name: sklearn
          storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
          volumeMounts:
            - name: my-dynamic-volume
              mountPath: /tmp/data
        volumes:
          - name: my-dynamic-volume
            persistentVolumeClaim:
              claimName: my-dynamic-pvc

5.3.7. Disabling KServe dependencies

If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed), you must also disable the dependent Service Mesh component to avoid errors.

Prerequisites

  • You have used the OpenShift command-line interface (CLI) or web console to disable the KServe component.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Disable the OpenShift Service Mesh component as follows:

    1. Click the DSC Initialization tab.
    2. Click the default-dsci object.
    3. Click the YAML tab.
    4. In the spec section, add the serviceMesh component (if it is not already present) and configure the managementState field as shown:

      spec:
        serviceMesh:
          managementState: Removed
    5. Click Save.

Verification

  1. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.

    The Operator details page opens.

  2. In the Conditions section, confirm that there is no ReconcileComplete condition with a status value of Unknown.

5.4. Adding an authorization provider for the single-model serving platform

You can add Authorino as an authorization provider for the single-model serving platform. Adding an authorization provider allows you to enable token authentication for models that you deploy on the platform, which ensures that only authorized parties can make inference requests to the models.
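After token authentication is enabled, an inference request must carry a valid token. The following is a hypothetical request shape only: the endpoint, model name, token, and request body are placeholders, and the /v2/models path assumes a runtime that serves the KServe v2 inference protocol.

```
# Hypothetical example: endpoint, model name, and token are placeholders.
# A request without a valid Authorization header is rejected once token
# authentication is enabled for the deployed model.
curl -s https://<inference_endpoint>/v2/models/<model_name>/infer \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"inputs": [...]}'
```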

The method that you use to add Authorino as an authorization provider depends on how you install the single-model serving platform. The installation options for the platform are described as follows:

Automated installation

If you have not already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you can configure the Red Hat OpenShift AI Operator to install KServe and its dependencies. You can include Authorino as part of the automated installation process.

For more information about automated installation, including Authorino, see Configuring automated installation of KServe.

Manual installation

If you have already created a ServiceMeshControlPlane or KNativeServing resource on your OpenShift cluster, you cannot configure the Red Hat OpenShift AI Operator to install KServe and its dependencies. In this situation, you must install KServe manually. You must also manually configure Authorino.

For more information about manual installation, including Authorino, see Manually installing KServe.

5.4.1. Manually adding an authorization provider

You can add Authorino as an authorization provider for the single-model serving platform. Adding an authorization provider allows you to enable token authentication for models that you deploy on the platform, which ensures that only authorized parties can make inference requests to the models.

To manually add Authorino as an authorization provider, you must install the Red Hat - Authorino Operator, create an Authorino instance, and then configure the OpenShift Service Mesh and KServe components to use the instance.

Important

To manually add an authorization provider, you must make configuration updates to your OpenShift Service Mesh instance. To ensure that your OpenShift Service Mesh instance remains in a supported state, make only the updates shown in this section.

Prerequisites

  • You have reviewed the options for adding Authorino as an authorization provider and identified manual installation as the appropriate option. See Adding an authorization provider.
  • You have manually installed KServe and its dependencies, including OpenShift Service Mesh. See Manually installing KServe.
  • When you manually installed KServe, you set the value of the managementState field for the serviceMesh component to Unmanaged. This setting is required for manually adding Authorino. See Installing KServe.

5.4.2. Installing the Red Hat Authorino Operator

Before you can add Authorino as an authorization provider, you must install the Red Hat - Authorino Operator on your OpenShift cluster.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → OperatorHub.
  3. On the OperatorHub page, in the Filter by keyword field, type Red Hat - Authorino.
  4. Click the Red Hat - Authorino Operator.
  5. On the Red Hat - Authorino Operator page, review the Operator information and then click Install.
  6. On the Install Operator page, keep the default values for Update channel, Version, Installation mode, Installed Namespace, and Update Approval.
  7. Click Install.

Verification

  • In the OpenShift web console, click Operators → Installed Operators and confirm that the Red Hat - Authorino Operator shows one of the following statuses:

    • Installing - installation is in progress; wait for this to change to Succeeded. This might take several minutes.
    • Succeeded - installation is successful.

5.4.3. Creating an Authorino instance

When you have installed the Red Hat - Authorino Operator on your OpenShift cluster, you must create an Authorino instance.

Prerequisites

  • You have installed the Red Hat - Authorino Operator on your OpenShift cluster.

Procedure

  1. Open a new terminal window.
  2. Log in to the OpenShift command-line interface (CLI) as follows:

    $ oc login <openshift_cluster_url> -u <username> -p <password>
  3. Create a namespace to install the Authorino instance.

    $ oc new-project <namespace_for_authorino_instance>
    Note

    The automated installation process creates a namespace called redhat-ods-applications-auth-provider for the Authorino instance. Consider using the same namespace name for the manual installation.

  4. To enroll the new namespace for the Authorino instance in your existing OpenShift Service Mesh instance, create a new YAML file with the following contents:

    apiVersion: maistra.io/v1
    kind: ServiceMeshMember
    metadata:
      name: default
      namespace: <namespace_for_authorino_instance>
    spec:
      controlPlaneRef:
        namespace: <namespace_for_service_mesh_instance>
        name: <name_of_service_mesh_instance>
  5. Save the YAML file.
  6. Create the ServiceMeshMember resource on your cluster.

    $ oc create -f <file_name>.yaml
  7. To configure an Authorino instance, create a new YAML file as shown in the following example:

    apiVersion: operator.authorino.kuadrant.io/v1beta1
    kind: Authorino
    metadata:
      name: authorino
      namespace: <namespace_for_authorino_instance>
    spec:
      authConfigLabelSelectors: security.opendatahub.io/authorization-group=default
      clusterWide: true
      listener:
        tls:
          enabled: false
      oidcServer:
        tls:
          enabled: false
  8. Save the YAML file.
  9. Create the Authorino resource on your cluster.

    $ oc create -f <file_name>.yaml
  10. Patch the Authorino deployment to inject an Istio sidecar, which makes the Authorino instance part of your OpenShift Service Mesh instance.

    $ oc patch deployment <name_of_authorino_instance> -n <namespace_for_authorino_instance> -p '{"spec": {"template":{"metadata":{"labels":{"sidecar.istio.io/inject":"true"}}}} }'

Verification

  • Confirm that the Authorino instance is running as follows:

    1. Check the pods (and containers) that are running in the namespace that you created for the Authorino instance, as shown in the following example:

      $ oc get pods -n redhat-ods-applications-auth-provider -o="custom-columns=NAME:.metadata.name,STATUS:.status.phase,CONTAINERS:.spec.containers[*].name"
    2. Confirm that the output resembles the following example:

      NAME                         STATUS    CONTAINERS
      authorino-6bc64bd667-kn28z   Running   authorino,istio-proxy

      As shown in the example, there is a single running pod for the Authorino instance. The pod has containers for Authorino and for the Istio sidecar that you injected.

5.4.4. Configuring an OpenShift Service Mesh instance to use Authorino

When you have created an Authorino instance, you must configure your OpenShift Service Mesh instance to use Authorino as an authorization provider.

Important

To ensure that your OpenShift Service Mesh instance remains in a supported state, make only the configuration updates shown in the following procedure.

Prerequisites

  • You have created an Authorino instance and enrolled the namespace for the Authorino instance in your OpenShift Service Mesh instance.
  • You have privileges to modify the OpenShift Service Mesh instance. See Creating an OpenShift Service Mesh instance.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a user that has privileges to update the OpenShift Service Mesh instance, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <username> -p <password>
  2. Create a new YAML file with the following contents:

    spec:
      techPreview:
        meshConfig:
          extensionProviders:
          - name: redhat-ods-applications-auth-provider
            envoyExtAuthzGrpc:
              service: <name_of_authorino_instance>-authorino-authorization.<namespace_for_authorino_instance>.svc.cluster.local
              port: 50051
  3. Save the YAML file.
  4. Use the oc patch command to apply the YAML file to your OpenShift Service Mesh instance.

    $ oc patch smcp <name_of_service_mesh_instance> --type merge -n <namespace_for_service_mesh_instance> --patch-file <file_name>.yaml
    Important

    You can apply the configuration shown as a patch only if you have not already specified other extension providers in your OpenShift Service Mesh instance. If you have already specified other extension providers, you must manually edit your ServiceMeshControlPlane resource to add the configuration.

Verification

  • Verify that your Authorino instance has been added as an extension provider in your OpenShift Service Mesh configuration as follows:

    1. Inspect the ConfigMap object for your OpenShift Service Mesh instance:

      $ oc get configmap istio-<name_of_service_mesh_instance> -n <namespace_for_service_mesh_instance> --output=jsonpath={.data.mesh}
    2. Confirm that you see output similar to the following example, which shows that the Authorino instance has been successfully added as an extension provider.

      defaultConfig:
        discoveryAddress: istiod-data-science-smcp.istio-system.svc:15012
        proxyMetadata:
          ISTIO_META_DNS_AUTO_ALLOCATE: "true"
          ISTIO_META_DNS_CAPTURE: "true"
          PROXY_XDS_VIA_AGENT: "true"
        terminationDrainDuration: 35s
        tracing: {}
      dnsRefreshRate: 300s
      enablePrometheusMerge: true
      extensionProviders:
      - envoyExtAuthzGrpc:
          port: 50051
          service: authorino-authorino-authorization.opendatahub-auth-provider.svc.cluster.local
        name: opendatahub-auth-provider
      ingressControllerMode: "OFF"
      rootNamespace: istio-system
      trustDomain: null

5.4.5. Configuring authorization for KServe

To configure the single-model serving platform to use Authorino, you must create a global AuthorizationPolicy resource that is applied to the KServe predictor pods that are created when you deploy a model. In addition, to account for the multiple network hops that occur when you make an inference request to a model, you must create an EnvoyFilter resource that continually resets the HTTP host header to the one initially included in the inference request.

Prerequisites

  • You have created an Authorino instance and configured your OpenShift Service Mesh to use it.
  • You have privileges to update the KServe deployment on your cluster.
  • You have privileges to add resources to the project in which your OpenShift Service Mesh instance was created. See Creating an OpenShift Service Mesh instance.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a user that has privileges to update the KServe deployment, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <username> -p <password>
  2. Create a new YAML file with the following contents:

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
      name: kserve-predictor
    spec:
      action: CUSTOM
      provider:
        name: redhat-ods-applications-auth-provider 1
      rules:
        - to:
            - operation:
                notPaths:
                  - /healthz
                  - /debug/pprof/
                  - /metrics
                  - /wait-for-drain
      selector:
        matchLabels:
          component: predictor
    1
    The name that you specify must match the name of the extension provider that you added to your OpenShift Service Mesh instance.
  3. Save the YAML file.
  4. Create the AuthorizationPolicy resource in the namespace for your OpenShift Service Mesh instance.

    $ oc create -n <namespace_for_service_mesh_instance> -f <file_name>.yaml
  5. Create another new YAML file with the following contents:

    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: activator-host-header
    spec:
      priority: 20
      workloadSelector:
        labels:
          component: predictor
      configPatches:
      - applyTo: HTTP_FILTER
        match:
          listener:
            filterChain:
              filter:
                name: envoy.filters.network.http_connection_manager
        patch:
          operation: INSERT_BEFORE
          value:
            name: envoy.filters.http.lua
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
              inlineCode: |
                function envoy_on_request(request_handle)
                  local headers = request_handle:headers()
                  if not headers then
                    return
                  end
                  local original_host = headers:get("k-original-host")
                  if original_host then
                    local port_separator = string.find(original_host, ":", 7)
                    if port_separator then
                      original_host = string.sub(original_host, 1, port_separator - 1)
                    end
                    headers:replace('host', original_host)
                  end
                end

    The EnvoyFilter resource shown continually resets the HTTP host header to the one initially included in any inference request.
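    The host-rewrite logic in the Lua snippet can be illustrated in Python (a sketch for explanation only, not part of the deployment). It mirrors the filter's behavior: search for a ":" starting at the seventh character, which skips a scheme separator such as "http://", and strip any ":port" suffix found after that point.

```python
def strip_port(original_host: str) -> str:
    """Mirror the Lua filter's port stripping.

    Lua's string.find(host, ":", 7) searches from 1-based index 7,
    which corresponds to 0-based index 6 in Python. Any ":" found at
    or after that position marks a port suffix to remove.
    """
    separator = original_host.find(":", 6)
    if separator != -1:
        return original_host[:separator]
    return original_host

# The ":8012" port suffix is removed; a host without a port is unchanged.
print(strip_port("model-predictor.project.svc.cluster.local:8012"))
print(strip_port("model-predictor.project.svc.cluster.local"))
```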

  6. Create the EnvoyFilter resource in the namespace for your OpenShift Service Mesh instance.

    $ oc create -n <namespace_for_service_mesh_instance> -f <file_name>.yaml

Verification

  • Check that the AuthorizationPolicy resource was successfully created.

    $ oc get authorizationpolicies -n <namespace_for_service_mesh_instance>

    Confirm that you see output similar to the following example:

    NAME               AGE
    kserve-predictor   28h
  • Check that the EnvoyFilter resource was successfully created.

    $ oc get envoyfilter -n <namespace_for_service_mesh_instance>

    Confirm that you see output similar to the following example:

    NAME                                          AGE
    activator-host-header                         28h