
Chapter 2. Configuring Feature Store


As a cluster administrator, you can install and manage Feature Store as a component in the Red Hat OpenShift AI Operator configuration.

2.1. Setting up Feature Store

As a cluster administrator, you must complete the following tasks to set up Feature Store:

  1. Enable the Feature Store component.
  2. Create a project and add a Feature Store instance.
  3. Initialize the Feature Store instance.
  4. Set up Feature Store so that ML Engineers and data scientists can push and retrieve features to use for model training and inference.

2.1.1. Before you begin

Before you implement Feature Store in your machine learning workflow, you must have the following information:

Knowledge of your data and use case

You must know your use case and your raw underlying data so that you can identify the properties or attributes that you want to define as features. For example, if you are developing machine learning (ML) models that detect possible credit card fraud transactions, you would identify data such as purchase history, transaction location, transaction frequency, or credit limit.

With Feature Store, you define each of those attributes as a feature. You group features that share a conceptual link or relationship together to define an entity. You define entities to map to the domain of your use case. Not all features must be in an entity.

Knowledge of your data source

You must know the source of the raw data that you want to use in your ML workflow. When you configure the Feature Store online and offline stores and the feature registry, you must specify an environment that is compatible with the data source. Also, when you define features, you must specify the data source for the features.

Feature Store uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or materialize features into an online store.

You can connect to the following types of data sources:

Batch data source
A method of collecting and processing data in discrete chunks or batches, rather than continuously streaming it. This approach is commonly used for large datasets or when real-time processing is not essential. In a data processing context, a batch data source defines the connection to the data-at-rest source, allowing you to access and process data in batches. Examples of batch data sources include data warehouses (for example, BigQuery, Snowflake, and Redshift) or data lakes (for example, S3 and GCS). Typically, you define a batch data source when you configure the Feature Store offline store.
Stream data source
The origin of data that is continuously flowing or emitted for online, real-time processing. Feature Store does not have native streaming integrations, but it facilitates push sources that allow you to push features into Feature Store. You can use Feature Store for training or batch scoring (offline), for real-time feature serving (online), or for both. Typically, you define a stream data source when you configure the Feature Store online store.

You can use the following data sources with Feature Store:

Data sources for online stores

  • SQLite
  • Snowflake
  • Redis
  • Dragonfly
  • IKV
  • Datastore
  • DynamoDB
  • Bigtable
  • PostgreSQL
  • Cassandra + Astra DB
  • Couchbase
  • MySQL
  • Hazelcast
  • ScyllaDB
  • Remote
  • SingleStore

For details on how to configure these online stores, see the Feast reference documentation for online stores.

Data sources for offline stores

  • Dask
  • Snowflake
  • BigQuery
  • Redshift
  • DuckDB

An offline store is an interface for working with historical time-series feature values that are stored in data sources. Each offline store implementation is designed to work only with the corresponding data source.

Offline stores are useful for the following purposes:

  • To build training datasets from time-series features.
  • To materialize (load) features into an online store to serve those features at low-latency in a production setting.

You can use only a single offline store at a time. Offline stores are not compatible with all data sources; for example, the BigQuery offline store cannot be used to query a file-based data source.
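As an illustrative sketch, you select an offline store in the feature_store.yaml file by setting the offline store type and its connection parameters. The following example assumes a Snowflake offline store; every value is a placeholder, and the full list of options is in the Feast reference documentation:

    offline_store:
      type: snowflake.offline
      account: <snowflake-account>
      user: <user>
      password: <password>
      role: <role>
      warehouse: <warehouse>
      database: <database>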

For details on how to configure these offline stores, see the Feast reference documentation for offline stores.

Data sources for the feature registry

  • Local
  • S3
  • GCS
  • SQL
  • Snowflake

For details on how to configure these registry options, see the Feast reference documentation for the registry.
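For example, to keep a file-based registry in an S3 bucket, the registry section of the feature_store.yaml file can point at an s3:// path (a sketch; the bucket name is a placeholder):

    registry:
      registry_type: file
      path: s3://<bucket>/registry.db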

2.1.2. Enabling the Feature Store component

To allow the ML engineers and data scientists in your organization to work with machine learning features, you must enable the Feature Store component in Red Hat OpenShift AI.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed OpenShift AI.

Procedure

  1. In the OpenShift console, click Operators → Installed Operators.
  2. Click the Red Hat OpenShift AI Operator.
  3. Click the Data Science Cluster tab.
  4. Click the default instance name (for example, default-dsc) to open the instance details page.
  5. Click the YAML tab.
  6. Edit the spec:components section. For the feastoperator component, set the managementState field to Managed:

    spec:
      components:
        feastoperator:
          managementState: Managed
  7. Click Save.
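In context, the edited DataScienceCluster resource looks similar to the following sketch (the apiVersion and the instance name can vary with your OpenShift AI version):

    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        feastoperator:
          managementState: Managed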

Verification

Check the status of the feast-operator-controller-manager-<pod-id> pod:

  1. Click Workloads → Deployments.
  2. From the Project list, select redhat-ods-applications.
  3. Search for the feast-operator-controller-manager deployment.
  4. Click the feast-operator-controller-manager deployment name to open the deployment details page.
  5. Click the Pods tab.
  6. View the pod status.

When the status of the feast-operator-controller-manager-<pod-id> pod is Running, Feature Store is enabled.

Next Step

  • Create a Feature Store instance in a project.

2.1.3. Creating a Feature Store instance in a project

You can add an instance of Feature Store to a project by creating a custom resource (CR) in the OpenShift console.

The following example shows the minimum requirements for a Feature Store CR YAML file:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_feast_project

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have enabled the Feature Store component, as described in Enabling the Feature Store component.
  • You have set up your database infrastructure for the online store, offline store, and registry.

    For an example of setting up and running PostgreSQL (for the registry) and Redis (for the online store), see the Feature Store Operator quick start example: https://github.com/feast-dev/feast/tree/stable/examples/operator-quickstart.

  • You have created a project, as described in Creating a project. In the following procedure, my-project is the name of the project.

Procedure

  1. In the OpenShift console, click the Quick Create (+) icon and then click the Import YAML option.
  2. Verify that your project is the selected project.
  3. Copy the following code and paste it into the YAML editor:

    apiVersion: feast.dev/v1alpha1
    kind: FeatureStore
    metadata:
      name: sample-git
    spec:
      feastProject: credit_scoring_local
      feastProjectDir:
        git:
          url: https://github.com/feast-dev/feast-credit-score-local-tutorial
          ref: 598a270

    The spec.feastProjectDir section references a Feature Store project that is in the Git repository for a credit scoring tutorial.

  4. Optionally, change the metadata.name for the Feature Store instance.
  5. Optionally, edit feastProject, which is the namespace for organizing your Feature Store instance. Note that this project is not the OpenShift AI project.
  6. Click Create.

When you create the Feature Store CR in OpenShift, Feature Store starts a remote online feature server, and configures a default registry and an offline store with the local provider.

A provider is a customizable interface that provides default Feature Store components, such as the registry, offline store, and online store, that target a specific environment, ensuring that these components can work together seamlessly. The local provider uses the following default settings:

  • Registry: A SQL registry or local file
  • Offline store: A Parquet file
  • Online store: SQLite

Verification

  1. In the OpenShift console, select Workloads → Pods.
  2. Make sure that your project (for example, my-project) is selected.
  3. Find the pod that has the feast- prefix, followed by the metadata.name that you specified in the CRD configuration, for example, sample-git.
  4. Verify that the pod status is Running.
  5. Click the feast pod and then select Pod details.
  6. Scroll down to see the online container. This container is the deployment for the online server. It makes the feature server REST API available in the OpenShift cluster.
  7. Scroll up and then click Terminal.
  8. Run the following command to verify that the feast CLI is installed correctly:

    $ feast --help
  9. To view the files for the Feature Store project, enter the following command:

    $ ls -la

    You should see output similar to the following:

    .
    ..
    data
    example_repo.py
    feature_store.yaml
    __init__.py
    __pycache__
    test_workflow.py
  10. To view the feature_store.yaml configuration file, enter the following command:

    $ cat feature_store.yaml

    You should see output similar to the following:

    project: my_feast_project
    provider: local
    online_store:
      path: /feast-data/online_store.db
      type: sqlite
    registry:
      path: /feast-data/registry.db
      registry_type: file
    auth:
      type: no_auth
    entity_key_serialization_version: 3

The feature_store.yaml file defines the following components:

  • project — The namespace for the Feature Store instance. Note that this project refers to the feature project rather than the OpenShift AI project.
  • provider — The environment in which Feature Store deploys and operates.
  • registry — The location of the feature registry.
  • online_store — The location of the online store.
  • auth — The type of authentication and authorization (no_auth, kubernetes, or oidc).
  • entity_key_serialization_version — The serialization scheme that Feature Store uses when writing data to the online store.

NOTE: Although the offline_store location is not included in the feature_store.yaml file, the Feature Store instance uses a Dask file-based offline store. In the feature_store.yaml file, the registry type is file, but the registry is a simple SQLite database.
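If the implicit default described in the note were written out, the offline store entry in the feature_store.yaml file would look similar to the following sketch (dask is the Feast type string for the file-based offline store):

    offline_store:
      type: dask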

Next steps

  • Optionally, you can customize the default configurations for the offline store, online store, or registry by editing the YAML configuration for the Feature Store CR, as described in Customizing your Feature Store configuration.
  • Give your ML engineers and data scientists access to the project so that they can create a workbench, and provide them with a copy of the feature_store.yaml file so that they can add it to their workbench IDE, such as Jupyter.

2.1.4. Configuring and managing Role-Based Access Control

You can set permissions by using Role-Based Access Control (RBAC) to manage user access to Feature Store. RBAC grants access to actions such as creating, reading, updating, and deleting namespaces.

Prerequisites

  • You have Administrator access.
  • You have created a Feature Store instance.
Procedure

  1. Configure and deploy the Feature Store custom resource (CR):

    1. Locate the Feature Store CR YAML file, which in this example is named feature-store-cr.yaml, and set the authorization type to Kubernetes:

      apiVersion: feast.dev/v1alpha1
      kind: FeatureStore
      metadata:
        name: <feature-store-name>
      spec:
        # ... other configurations ...
        authz:
          type: Kubernetes

    2. Open your command line interface (CLI) and deploy the CR by running the following command:

      kubectl apply -f feature-store-cr.yaml
  2. Verify that your Feature Store projects were created:

    kubectl get feast

    <project name>

    kubectl get configmaps -l feast.dev/service-type=client

    <your-project-name> <feast project name> <number of data entries> <time since created>
  3. Configure data science project permissions. You must create a permissions.py file in the Feature Store pod terminal. This file must reside in the feature_store directory. You can use a role-based policy, a group-based policy, a combined group and namespace policy, or read and write permissions.

    Note

    For an example of a permissions.py file, see the Feast Operator RBAC with TLS example.

  4. Transfer your local permissions.py file to the remote container filesystem. In a Kubernetes or OpenShift environment, use a command-line tool such as the OpenShift CLI (oc) or kubectl:

    oc cp <local-file> <remote-pod>:<remote-path>
  5. Configure and set up the Feature Store server. If a cron job has been run previously, run feast apply in the online container. Open your command line interface (CLI) and run the following commands:

    oc create job --from=cronjob/<feast-project-name> cronjob-manual-$(date +%s) -n <project-name>

    oc exec -it deployments/<feast-deployment-name> -c online -- feast apply
  6. Configure authentication in the OpenShift web console. You have full control over access to your data science project: you can grant and revoke access for users and groups instantly.

    1. Log in to the OpenShift AI or OpenShift console.
    2. Navigate to the Data Science Projects tab and select the appropriate project.
    3. Click the Permissions tab and locate the Users and Groups sections.
    4. Name your group.
    5. Under Permissions, choose a predefined role, add permissions, and click Save.
Note

The name of your group must exist in your identity provider. The identity provider is configured at the OpenShift cluster level, outside of the specific project you are working in.

Verification

The deployment pod is running, and you can see the project details in the Feature Store UI on the Integration tab.

2.1.5. Initializing the Feature Store instance

Initialize the Feature Store instance to start using it.

When you initialize the Feature Store instance, Feature Store completes the following tasks:

  • Scans the Python files in your feature repository and finds all Feature Store object definitions, such as feature views, entities, and data sources.

    Note: Feature Store reads all Python files recursively, including subdirectories, even if they do not contain feature definitions. For information on identifying Python files, such as imperative scripts that you want Feature Store to ignore, see Specifying files to ignore.

  • Validates your feature definitions, for example, by checking for uniqueness of features within a feature view.
  • Syncs the metadata about objects to the feature registry. If a registry does not exist, Feature Store creates one. The default registry is a simple Protobuf binary file on disk (locally or in an object store).
  • Creates or updates all necessary Feature Store infrastructure. The exact infrastructure that Feature Store creates depends on the provider configuration that you have set in feature_store.yaml. For example, when you specify local as your provider, Feature Store creates the infrastructure on the local cluster.

    Note: When you use a cloud provider, such as Google Cloud Platform or Amazon Web Services, the feast apply command creates cloud infrastructure that might incur costs for your organization.

Prerequisites

  • An ML engineer on your team has given you a Python file that defines features. For more information about how to define features, see Defining features.
  • If you want to store the feature registry in cloud storage or in a database, you have configured storage for the feature registry. For example, if the provider is GCP, you have created a Cloud Storage bucket for the feature registry.
  • You have the cluster-admin role in OpenShift.
  • You have created a Feature Store instance in your project.

Procedure

  1. In the OpenShift console, select Workloads → Pods.
  2. Make sure that your project is the current project.
  3. Click the feast pod and then select Pod details.
  4. Scroll down to see the online container. This container is the deployment for the online server, and it makes the feature server REST API available in the OpenShift cluster.
  5. Scroll up and then click Terminal.
  6. Copy the feature definition (.py) file to your Feature Store directory.
  7. To create a feature registry and add the feature definitions to the registry, run the following command:

    feast apply

Verification

  • You should see output similar to the following that indicates that the features in the feature definition file were successfully added to the registry:

    Created project credit_scoring_local
    Created entity zipcode
    Created entity dob_ssn
    Created feature view zipcode_features
    Created feature view credit_history
    Created on demand feature view total_debt_calc
    
    Created sqlite table credit_scoring_local_credit_history
    Created sqlite table credit_scoring_local_zipcode_features
  • In the OpenShift console, select Workloads → Deployments to view the deployment pod.

2.1.5.1. Specifying files to ignore

When you run the feast apply command, Feature Store reads all Python files recursively, including Python files in subdirectories, even if the Python files do not contain feature definitions.

If you have Python files, such as imperative scripts, in your registry folder that you want Feature Store to ignore when you run the feast apply command, you should create a .feastignore file and add a list of paths to all files that you want Feature Store to ignore.

Example .feastignore file

# Ignore virtual environment
venv

# Ignore a specific Python file
scripts/foo.py

# Ignore all Python files directly under scripts directory
scripts/*.py

# Ignore all "foo.py" anywhere under scripts directory
scripts/**/foo.py

2.1.6. Viewing Feature Store objects in the web-based UI

You can use the Feature Store Web UI to view all registered features, data sources, entities, and feature services.

Prerequisites

  • You can access the OpenShift console.
  • You have installed the OpenShift CLI (oc), as described in the appropriate documentation for your cluster.

  • You have enabled the Feature Store component, as described in Enabling the Feature Store component.
  • You have created a Feature Store CRD, as described in Creating a Feature Store instance in a project.

Procedure

  1. In the OpenShift console, select Administration → CustomResourceDefinitions.
  2. To filter the list, in the Search by Name field, enter feature.
  3. Click the FeatureStore CRD and then click Instances.
  4. Click the name of the instance that corresponds to the metadata name you specified when you created the Feature Store instance.
  5. Edit the YAML to include a reference to services.ui in the spec section, as shown in the following example:

    spec:
      feastProject: credit_scoring_local
      feastProjectDir:
        git:
          ref: 598a270
          url: 'https://github.com/feast-dev/feast-credit-score-local-tutorial'
      services:
        ui: {}
  6. Click Save and then click Reload.

    The Feature Store Operator starts a container for the web-based Feature Store UI and creates an OpenShift route that provides the URL so that you can access it.

  7. In the OpenShift console, select Workloads → Pods.
  8. Make sure that your project (for example, my-project) is selected.

    You should see a deployment for the web-based UI. Note that OpenShift enables TLS by default at runtime.

  9. To populate the web-based UI with the objects in your Feature Store instance:

    1. In the OpenShift console, select Workloads → Pods.
    2. Make sure that your project (for example, my-project) is selected.
    3. Click the feast pod and then select Pod details.
    4. Click Terminal.
    5. To update the Feature Store instance, enter the following command:

      feast apply
  10. To find the URL for the Feature Store UI, in the OpenShift console, click Networking → Routes.

    In the row for the Feature Store UI, for example feast-sample-ui, the URL is in the Location column.

  11. Click the URL link to open it in your default web browser.

Verification

The Feature Store Web UI is displayed and shows the feature objects in your project as shown in the following figure:

Figure 2.1. The Feature Store Web UI


2.2. Customizing your Feature Store configuration

Optionally, you can apply the following configurations to your Feature Store instance:

  • Configure an offline store
  • Configure an online store
  • Configure the feature registry
  • Configure persistent volume claims (PVCs)
  • Configure role-based access control (RBAC)

The examples in the following sections describe how to customize a Feature Store instance by creating a new custom resource definition (CRD). Alternatively, you can customize an existing Feature Store instance as described in Editing an existing Feature Store instance.

For more information about how you can customize your feature store configuration, see the Feast API documentation.

2.2.1. Configuring an offline store

When you create a Feature Store instance that uses the minimal configuration, by default, Feature Store uses a Dask file-based offline store.

The example in the following procedure shows how to configure DuckDB for the offline store.

You can configure other offline stores, such as Snowflake, BigQuery, and Redshift, as detailed in the Feast reference documentation for offline stores.
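For a database-backed offline store, the CR follows the same store and secretRef pattern that this document uses for online stores. The following sketch assumes a Snowflake offline store and a secret named feast-data-stores; both the type string and the secret name are assumptions to adapt to your environment and Feast version:

    services:
      offlineStore:
        persistence:
          store:
            type: snowflake.offline
            secretRef:
              name: feast-data-stores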

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have enabled the Feature Store component, as described in Enabling the Feature Store component.
  • You have created a project, as described in Creating a project. In the following procedure, my-project is the name of the project.
  • Your project includes an existing secret that provides credentials for accessing the database that you want to use for the offline store. The example in the following procedure requires that you have configured DuckDB.

Procedure

  1. In the OpenShift console, click the Quick Create (+) icon and then click the Import YAML option.
  2. Verify that your project is the selected project.
  3. Copy the following code and paste it into the YAML editor:

    apiVersion: feast.dev/v1alpha1
    kind: FeatureStore
    metadata:
      name: sample-db-persistence
    spec:
      feastProject: my_project
      services:
        offlineStore:
          persistence:
            file:
              type: duckdb
  4. Edit the services.offlineStore section to specify values specific to your use case.
  5. Click Create.

Verification

  1. In the OpenShift console, select Workloads → Pods.
  2. Make sure that your project (for example, my-project) is selected.
  3. Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-db-persistence.
  4. Verify that the status is Running.

2.2.2. Configuring an online store

When you create a Feature Store instance using the minimal configuration, by default, the online store is a SQLite database.

The example in the following procedure shows how to configure a PostgreSQL database for the online store.

You can configure other online stores, such as Snowflake, Redis, and DynamoDB, as detailed in the Feast reference documentation for online stores.
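For example, a Redis online store can be sketched with the same store and secretRef structure that the PostgreSQL example in this procedure uses (the secret name feast-data-stores is a placeholder for a secret that holds your Redis connection details):

    services:
      onlineStore:
        persistence:
          store:
            type: redis
            secretRef:
              name: feast-data-stores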

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have enabled the Feature Store component, as described in Enabling the Feature Store component.
  • You have created a project, as described in Creating a project. In the following procedure, my-project is the name of the project.
  • Your project includes an existing secret that provides credentials for accessing the database that you want to use for the online store. The example in the following procedure requires that you have configured a PostgreSQL database.
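    For example, the secret can hold the connection parameters for the store type as a YAML block under a key that matches the type name. The following sketch assumes a PostgreSQL database that is reachable from the cluster; every value is a placeholder to replace with your own:

      apiVersion: v1
      kind: Secret
      metadata:
        name: feast-data-stores
      stringData:
        postgres: |
          host: <postgres-host>
          port: 5432
          database: <database>
          db_schema: <schema>
          user: <user>
          password: <password>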

Procedure

  1. In the OpenShift console, click the Quick Create (+) icon and then click the Import YAML option.
  2. Verify that your project is the selected project.
  3. Copy the following code and paste it into the YAML editor:

    apiVersion: feast.dev/v1alpha1
    kind: FeatureStore
    metadata:
      name: sample-db-persistence
    spec:
      feastProject: my_project
      services:
        onlineStore:
          persistence:
            store:
              type: postgres
              secretRef:
                name: feast-data-stores
  4. Edit the services.onlineStore section to specify values that are specific to your use case.
  5. Click Create.

Verification

  1. In the OpenShift console, select Workloads → Pods.
  2. Make sure that your project (for example, my-project) is selected.
  3. Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-db-persistence.
  4. Verify that the status is Running.

2.2.3. Configuring the feature registry

By default, when you create a feature instance using the minimal configuration, the registry is a simple SQLite database.

The example in the following procedure shows how to configure an S3 registry.

You can configure other types of registries, such as GCS, SQL, and Snowflake, as detailed in the Feast reference documentation for registries.
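As a sketch, a SQL-backed registry can be selected with the same store and secretRef pattern used for the online and offline stores. The type string and the secret name here are assumptions to adapt to your environment and Feast version:

    services:
      registry:
        local:
          persistence:
            store:
              type: sql
              secretRef:
                name: feast-registry-secret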

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have enabled the Feature Store component, as described in Enabling the Feature Store component.
  • You have created a project, as described in Creating a project. In the following procedure, my-project is the name of the project.
  • Your project includes an existing secret that provides credentials for accessing the database that you want to use for the registry. The example in the following procedure requires that you have configured S3.

Procedure

  1. In the OpenShift console, click the Quick Create (+) icon and then click the Import YAML option.
  2. Verify that your project is the selected project.
  3. Copy the following code and paste it into the YAML editor:

    apiVersion: feast.dev/v1alpha1
    kind: FeatureStore
    metadata:
      name: sample-s3-registry
    spec:
      feastProject: my_project
      services:
        registry:
          local:
            server:
              restAPI: true
            persistence:
              file:
                path: s3://bucket/registry.db
                s3_additional_kwargs:
                  ServerSideEncryption: AES256
                  ACL: bucket-owner-full-control
                  CacheControl: max-age=3600
  4. Edit the services.registry section to specify values that are specific to your use case.
  5. Click Create. You have now configured your registry service and enabled the REST APIs.

Verification

  1. In the OpenShift console, select Workloads → Pods.
  2. Make sure that your project (for example, my-project) is selected.
  3. Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, sample-s3-registry.
  4. Click the feast pod and then select Pod details.
  5. Click Terminal.
  6. In the Terminal window, enter the following command to view the configuration, including the S3 registry:

    $ cat feature_store.yaml

2.2.4. Example PVC configuration

When you configure the online store, offline store, or registry, you can also configure persistent volume claims (PVCs) as shown in the following Feature Store custom resource definition (CRD) example.

Note

The following example code requires that you edit it with values that are specific to your use case.

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-pvc-persistence
spec:
  feastProject: my_project
  services:
    onlineStore: # (1)
      persistence:
        file:
          path: online_store.db
          pvc:
            ref:
              name: online-pvc
            mountPath: /data/online
    offlineStore: # (2)
      persistence:
        file:
          type: duckdb
          pvc:
            create:
              storageClassName: standard
              resources:
                requests:
                  storage: 5Gi
            mountPath: /data/offline
    registry: # (3)
      local:
        persistence:
          file:
            path: registry.db
            pvc:
              create: {}
              mountPath: /data/registry
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: online-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

(1) The online store specifies a PVC that must already exist.
(2) The offline store specifies a storage class name and storage size.
(3) The registry configuration specifies that the Feature Store Operator creates a PVC with default settings.

2.2.5. Editing an existing Feature Store instance

The examples in this document describe how to customize a Feature Store instance by creating a new custom resource definition (CRD). Alternatively, you can customize an existing Feature Store instance.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have created a Feature Store instance, as described in Creating a Feature Store instance in a project.

Procedure

  1. In the OpenShift console, select Administration → CustomResourceDefinitions.
  2. To filter the list, in the Search by Name field, enter feature.
  3. Click the FeatureStore CRD and then click Instances.
  4. Select the instance that you want to edit, and then click YAML.
  5. In the YAML editor, edit the configuration.
  6. Click Save and then click Reload.

Verification

The Feature Store instance CRD deploys successfully.
