Configuring Feature Store

Red Hat OpenShift AI Self-Managed 2.22

Install and manage Feature Store as a component in the Red Hat OpenShift AI Operator configuration

Abstract

As a cluster administrator, you can install and manage Feature Store as a component in the Red Hat OpenShift AI Operator configuration. Feature Store provides an interface between machine learning models and data.

Chapter 1. Overview of machine learning features and Feature Store

Important

Feature Store is currently available in Red Hat OpenShift AI 2.22 as a Technology Preview feature. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

A machine learning (ML) feature is a measurable property or attribute within a dataset that a machine learning model can analyze to learn patterns and make decisions. Examples of features include a customer’s purchase history, demographic data like age and location, weather conditions, and financial market data. You can use these features to train models for tasks such as personalized product recommendations, fraud detection, and predictive maintenance.

Feature Store is a Red Hat OpenShift AI component that provides a centralized repository that stores, manages, and serves machine learning features for both training and inference purposes.

1.1. Overview of machine learning features

In machine learning, a feature, also referred to as a field, is an individual measurable property. A feature is used as an input signal to a predictive model. For example, if a bank’s loan department is trying to predict whether an applicant should be approved for a loan, a useful feature might be whether they have filed for bankruptcy in the past or how much credit card debt they currently carry.

Table 1.1. A feature represents a column in a data table
customer_id	avg_cc_balance	credit_score	bankruptcy
1005	500.00	730	0
982	20000.00	570	2
1001	1400.00	600	0

Features are prepared data that help machine learning models understand patterns in the world. Feature engineering is the process of selecting, manipulating, and transforming raw data into features that can be used in supervised learning. As shown in Table 1.1, a feature refers to an entire column in a dataset, for example, credit_score, and a feature value refers to a single value in a feature column, such as 730.
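
The distinction between a feature (a whole column) and a feature value (one cell) can be sketched in a few lines of Python. This is only an illustration that reuses the rows from Table 1.1; the helper names feature_column and feature_value are hypothetical and are not part of Feature Store.

```python
# Rows from Table 1.1, keyed by column (feature) name.
rows = [
    {"customer_id": 1005, "avg_cc_balance": 500.00, "credit_score": 730, "bankruptcy": 0},
    {"customer_id": 982, "avg_cc_balance": 20000.00, "credit_score": 570, "bankruptcy": 2},
    {"customer_id": 1001, "avg_cc_balance": 1400.00, "credit_score": 600, "bankruptcy": 0},
]


def feature_column(rows, feature):
    """Return every value of one feature, that is, the whole column."""
    return [row[feature] for row in rows]


def feature_value(rows, customer_id, feature):
    """Return the single feature value recorded for one customer."""
    for row in rows:
        if row["customer_id"] == customer_id:
            return row[feature]
    raise KeyError(customer_id)


print(feature_column(rows, "credit_score"))       # [730, 570, 600]
print(feature_value(rows, 1005, "credit_score"))  # 730
```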

1.2. Overview of Feature Store

Feature Store is an OpenShift AI component that provides an interface between models and data. It is based on the Feast open source project. Feature Store provides a framework for storing, managing, and serving features to machine learning models by using your existing infrastructure and data stores. It facilitates the retrieval of feature data from different data sources to generate and manage features by providing unified feature management capabilities.

The following figure shows where Feature Store fits in the ML workflow. In an ML workflow, features are inputs to ML models. The ML workflow starts with many types of relevant data, such as transactional data, customer references, and product data. The data comes from a variety of databases and data sources. From this data, ML engineers use Feature Store to curate features. The features are input to models and the models can then use the data from the features to make predictions.

Figure 1.1. Feature Store in the ML workflow

Feature Store is a machine learning data system that provides the following capabilities:

Runs data pipelines that transform raw data into feature values
Stores and manages feature data
Serves feature data consistently for training and inference purposes
Manages features consistently across offline and online environments
Powers one model or thousands simultaneously with fresh, reusable features, on demand

Feature Store is a centralized hub for storing, processing, and accessing commonly used features, which enables users in your ML organization to collaborate. When you register a feature in a Feature Store, it becomes available for immediate reuse by other models across your organization. The Feature Store registry reduces duplication of data engineering efforts and allows new ML projects to bootstrap with a library of curated, production-ready features.

Feature Store provides consistency in model training and inference, promotes collaboration and usability across multiple projects, monitors lineage and versioning of models for data drift, data leakage, and training-serving skew, and integrates with other MLOps tools. Feature Store remotely manages data stored in other systems, such as BigQuery, Snowflake, DynamoDB, and Redis, to make features consistently available at training and serving time.

Feature Store performs the following tasks:

Stores features in offline and online stores
Registers features in the registry for sharing
Serves features to ML models

ML platform teams use Feature Store to store and serve features consistently for offline use, such as model training and batch scoring, and for online real-time model inference.

Feature Store consists of the following key components:

Registry

A central catalog of all feature definitions and their related metadata. It allows data scientists to search, discover, and collaborate on new features. The registry exposes methods to apply, list, retrieve, and delete features.

Offline Store

The data store that contains historical data for scale-out batch scoring or model training. The offline store persists batch data that has been ingested into Feature Store. This data is used for producing training datasets. Examples of offline stores include Dask, Snowflake, BigQuery, Redshift, and DuckDB.

Online Store

The data store that is used for low-latency feature retrieval. The online store is used for real-time inference. Examples of online stores include Redis, GCP Datastore, and DynamoDB.

Server

A feature server that serves pre-computed features online. There are three Feature Store servers:

The online feature server - A Python feature server that is an HTTP endpoint that serves features with JSON I/O. You can write features to and read features from the online store by using any programming language that can make HTTP requests.
The offline feature server - An Apache Arrow Flight Server that uses the gRPC communication protocol to exchange data. This server wraps calls to existing offline store implementations and exposes interfaces as Arrow Flight endpoints.
The registry server - A server that uses the gRPC communication protocol to exchange data. You can communicate with the server using any programming language that can make gRPC requests.

UI

A web-based graphical user interface (UI) for viewing all the feature store objects and their relationships with each other.

Feature Store provides the following software capabilities:

A Python SDK for programmatically defining features and data sources
A Python SDK for reading and writing features to offline and online data stores
An optional feature server for reading and writing features by using APIs (useful for non-Python languages)
A web-based UI for viewing and exploring information about features defined in the project
A command line interface (CLI) for viewing and updating feature information
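
Because the online feature server exchanges JSON over HTTP, a client in any language only needs to build a POST request. The following sketch constructs such a request with the Python standard library without sending it. The /get-online-features path and the features and entities keys follow the upstream Feast feature server documentation as best understood here and should be checked against your deployed version; the host name and the customer_stats:credit_score feature reference are hypothetical.

```python
import json
import urllib.request


def build_online_features_request(base_url, features, entities):
    """Build (but do not send) a JSON POST request for online features."""
    body = json.dumps({"features": features, "entities": entities}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/get-online-features",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_online_features_request(
    "http://feature-server.example.com",  # hypothetical host
    ["customer_stats:credit_score"],      # hypothetical feature reference
    {"customer_id": [1005, 982]},
)
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a reachable feature server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```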

1.3. Audience for Feature Store

The target audience for Feature Store is ML platform and MLOps teams with DevOps experience in deploying real-time models to production. Feature Store also helps these teams build a feature platform that improves collaboration between data engineers, software engineers, machine learning engineers, and data scientists.

For Data Scientists: Feature Store is a tool where you can define, store, and retrieve your features for both model development and model deployment. By using Feature Store, you can focus on what you do best: build features that power your AI/ML models and maximize the value of your data.
For MLOps Engineers: Feature Store is a library that connects your existing infrastructure, such as your online database, application server, microservices, analytical database, and orchestration tooling. By using Feature Store, you can focus on maintaining a resilient system, instead of implementing features for data scientists.
For Data Engineers: Feature Store provides a centralized catalog for storing feature definitions, allowing you to maintain a single source of truth for feature data. It provides the abstraction for reading and writing to many different types of offline and online data stores. Using the provided Python SDK or the feature server service, you can write data to the online and offline stores and then read that data out again in either batch scenarios for model training or low-latency online scenarios for model inference.
For AI Engineers: Feature Store provides a platform designed to scale your AI applications by enabling seamless integration of richer data and facilitating fine-tuning. With Feature Store, you can optimize the performance of your AI models while ensuring a scalable and efficient data pipeline.

Chapter 2. Before you begin

Before you implement Feature Store in your machine learning workflow, you must have the following information:

Knowledge of your data and use case

You must know your use case and your raw underlying data so that you can identify the properties or attributes that you want to define as features. For example, if you are developing machine learning (ML) models that detect potentially fraudulent credit card transactions, you would identify data such as purchase history, transaction location, transaction frequency, or credit limit.

With Feature Store, you define each of those attributes as a feature. You group features that share a conceptual link or relationship together to define an entity. You define entities to map to the domain of your use case. Not all features must be in an entity.

Knowledge of your data source

You must know the source of the raw data that you want to use in your ML workflow. When you configure the Feature Store online and offline stores and the feature registry, you must specify an environment that is compatible with the data source. Also, when you define features, you must specify the data source for the features.

Feature Store uses a time-series data model to represent data. This data model is used to interpret feature data in data sources in order to build training datasets or materialize features into an online store.
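
To make the time-series model concrete, the following sketch shows the idea behind building a training dataset from time-stamped feature rows: for each training event, take the latest feature value recorded at or before the event time. It is a simplified illustration in plain Python, not the Feature Store implementation; the sample rows and the value_as_of helper are hypothetical.

```python
from datetime import datetime

# Hypothetical time-stamped feature rows: (customer_id, event_timestamp, credit_score)
feature_rows = [
    (1005, datetime(2024, 1, 1), 700),
    (1005, datetime(2024, 3, 1), 730),
    (982, datetime(2024, 2, 1), 570),
]


def value_as_of(rows, customer_id, as_of):
    """Return the latest value recorded at or before `as_of`, or None if there is none."""
    candidates = [(ts, value) for cid, ts, value in rows if cid == customer_id and ts <= as_of]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


# Training events: (customer_id, time at which the label was observed)
training_events = [
    (1005, datetime(2024, 2, 15)),
    (1005, datetime(2024, 3, 15)),
    (982, datetime(2024, 1, 15)),
]

# Join each event with the feature value that was known at that time.
training_set = [(cid, ts, value_as_of(feature_rows, cid, ts)) for cid, ts in training_events]
for row in training_set:
    print(row)
```

Looking up values as of the event time, rather than taking the newest value, keeps information from the future out of the training data.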

You can connect to the following types of data sources:

Batch data source: A source of data at rest that is collected and processed in discrete chunks, or batches, rather than streamed continuously. This approach is commonly used for large datasets or when real-time processing is not essential. A batch data source defines the connection to the data-at-rest source so that you can access and process the data in batches. Examples of batch data sources include data warehouses (for example, BigQuery, Snowflake, and Redshift) and data lakes (for example, S3 and GCS). Typically, you define a batch data source when you configure the Feature Store offline store.
Stream data source: The origin of data that is continuously flowing or emitted for online, real-time processing. Feature Store does not have native streaming integrations, but it facilitates push sources that allow you to push features into Feature Store. You can use Feature Store for training or batch scoring (offline), for real-time feature serving (online), or for both. Typically, you define a stream data source when you configure the Feature Store online store.
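
The push-based approach described for stream data sources can be pictured as a writer that updates an online key-value store whenever a new event arrives. The sketch below is a conceptual model only and does not use the Feature Store push API; the in-memory dictionary, the push function, and the rule of keeping only the newest event per entity are assumptions made for illustration.

```python
from datetime import datetime

online_store = {}  # entity key -> (event_timestamp, feature values)


def push(store, entity_key, event_timestamp, features):
    """Write pushed features, keeping only the most recent event per entity.

    Returns True if the store was updated, False if the event was stale.
    """
    current = store.get(entity_key)
    if current is None or event_timestamp >= current[0]:
        store[entity_key] = (event_timestamp, dict(features))
        return True
    return False


push(online_store, 1005, datetime(2024, 3, 1), {"credit_score": 730})
push(online_store, 1005, datetime(2024, 1, 1), {"credit_score": 700})  # stale, ignored
print(online_store[1005][1])  # {'credit_score': 730}
```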

You can use the following data sources with Feature Store:

Data sources for online stores

SQLite
Snowflake
Redis
Dragonfly
IKV
Datastore
DynamoDB
Bigtable
PostgreSQL
Cassandra + Astra DB
Couchbase
MySQL
Hazelcast
ScyllaDB
Remote
SingleStore

For details on how to configure these online stores, see the Feast reference documentation for online stores.

Data sources for offline stores

Dask
Snowflake
BigQuery
Redshift
DuckDB

An offline store is an interface for working with historical time-series feature values that are stored in data sources. Each offline store implementation is designed to work only with the corresponding data source.

Offline stores are useful for the following purposes:

To build training datasets from time-series features.
To materialize (load) features into an online store to serve those features at low latency in a production setting.
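
Materialization can be thought of as copying the newest feature values per entity from the historical (offline) data into a lookup structure that supports fast reads. The following sketch shows that idea with plain Python data structures. It is not the Feature Store implementation; the materialize function, its time window arguments, and the sample rows are hypothetical.

```python
from datetime import datetime


def materialize(offline_rows, start, end):
    """Return {entity_key: features} holding the newest row per entity within [start, end]."""
    latest = {}
    for entity_key, ts, features in offline_rows:
        if not (start <= ts <= end):
            continue  # skip rows outside the requested time window
        if entity_key not in latest or ts > latest[entity_key][0]:
            latest[entity_key] = (ts, features)
    return {key: features for key, (ts, features) in latest.items()}


# Hypothetical historical rows: (customer_id, event_timestamp, feature values)
offline_rows = [
    (1005, datetime(2024, 1, 1), {"credit_score": 700}),
    (1005, datetime(2024, 3, 1), {"credit_score": 730}),
    (982, datetime(2023, 6, 1), {"credit_score": 570}),
]

online_view = materialize(offline_rows, datetime(2024, 1, 1), datetime(2024, 12, 31))
print(online_view)  # {1005: {'credit_score': 730}}
```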

You can use only a single offline store at a time. Offline stores are not compatible with all data sources; for example, the BigQuery offline store cannot be used to query a file-based data source.

For details on how to configure these offline stores, see the Feast reference documentation for offline stores.

Data sources for the feature registry

Local
S3
GCS
SQL
Snowflake

For details on how to configure these registry options, see the Feast reference documentation for the registry.

Chapter 3. Enabling the Feature Store component

To allow the data scientists in your organization to work with machine learning features, you must enable the Feature Store component in Red Hat OpenShift AI.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have installed OpenShift AI.

Procedure

In the OpenShift console, click Operators → Installed Operators.
Click the Red Hat OpenShift AI Operator.
Click the Data Science Cluster tab.
Click the default instance name (for example, default-dsc) to open the instance details page.
Click the YAML tab.
Edit the spec:components section. For the feastoperator component, set the managementState field to Managed:
```
spec:
  components:
    feastoperator:
      managementState: Managed
```
Click Save.
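
If you prefer to script this change rather than edit the YAML in the console, the same edit can be expressed as a JSON merge patch. The sketch below only builds and prints an oc patch command and does not run it. It assumes that the DataScienceCluster instance is named default-dsc, as in the example in the procedure, and that your oc client is logged in with cluster administrator privileges.

```python
import json
import shlex

# The same structure as the YAML edit in the procedure, as a JSON merge patch.
patch = {"spec": {"components": {"feastoperator": {"managementState": "Managed"}}}}

dsc_name = "default-dsc"  # assumption: the default instance name from the procedure

command = [
    "oc", "patch", "datasciencecluster", dsc_name,
    "--type", "merge",
    "--patch", json.dumps(patch),
]

# Print a shell-safe command line to review before running it.
print(shlex.join(command))
```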

Verification

Check the status of the feast-operator-controller-manager-<pod-id> pod:

Click Workloads → Deployments.
From the Project list, select redhat-ods-applications.
Search for the feast-operator-controller-manager deployment.
Click the feast-operator-controller-manager deployment name to open the deployment details page.
Click the Pods tab.
View the pod status.

When the status of the feast-operator-controller-manager-<pod-id> pod is Running, Feature Store is enabled.

Next Step

Deploy a feature store instance in a data science project.

Chapter 4. Deploying a feature store instance in a data science project

You can add an instance of Feature Store to a data science project by creating a custom resource definition (CRD) in the OpenShift console.

The following example shows the minimum requirements for a Feature Store CRD YAML file:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_feast_project

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.
You have created a data science project, as described in Creating a data science project. In the following procedure, my-ds-project is the name of the data science project.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_feast_project

Optionally, change the metadata.name for the feature store instance.
Optionally, edit the name of the feature project that you want to use for organizing your Feature Store code.
Click Create.
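
If you create several feature store instances, you might generate the minimal configuration programmatically instead of copying it by hand. The following sketch builds the same fields as the example in the procedure and prints them as JSON, which oc apply -f accepts in place of YAML. The function names are hypothetical; the feast- pod name prefix follows the naming described in the verification steps of this guide.

```python
import json


def feature_store_manifest(name, feast_project):
    """Return the minimal FeatureStore configuration as a Python dictionary."""
    return {
        "apiVersion": "feast.dev/v1alpha1",
        "kind": "FeatureStore",
        "metadata": {"name": name},
        "spec": {"feastProject": feast_project},
    }


def expected_pod_prefix(name):
    """Return the pod name prefix to look for after the instance is created."""
    return f"feast-{name}"


manifest = feature_store_manifest("sample", "my_feast_project")
print(json.dumps(manifest, indent=2))
print(expected_pod_prefix(manifest["metadata"]["name"]))  # feast-sample
```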

When you create the Feature Store CRD file in OpenShift, Feature Store starts a remote online feature server, and configures a default registry and an offline store with the local provider.

A provider is a customizable interface that provides default Feature Store components, such as the registry, offline store, and online store, that target a specific environment, ensuring that these components can work together seamlessly. The local provider uses the following default settings:

Registry: A SQL registry or local file
Offline store: A Parquet file
Online store: SQLite

Verification

In the OpenShift console, select Workloads → Pods.
Make sure that your data science project (for example, my-ds-project) is selected.
Click the feast pod and then select Pod details.
Scroll down to see the online container. This container is the deployment for the online server. It makes the feature server REST API available in the OpenShift cluster.
Scroll up and then click Terminal.
To view the files for the feature store project, enter the following command:
```
ls -la
```
You should see output similar to the following:
```
.
..
data
example_repo.py
feature_store.yaml
__init__.py
__pycache__
test_workflow.py
```

To view the feature_store.yaml configuration file, enter the following command:

cat feature_store.yaml

You should see output similar to the following:

project: my_feast_project
provider: local
online_store:
  path: /feast-data/online_store.db
  type: sqlite
registry:
  path: /feast-data/registry.db
  registry_type: file
auth:
  type: no_auth
entity_key_serialization_version: 3

Note

Although the offline_store service is not shown in the feature_store.yaml file, the feature store instance uses a Dask file-based offline store. In the feature_store.yaml file, the registry type is file, but the registry uses a simple SQLite database.

Next steps

Optionally, you can customize the default configurations for the offline store, online store, or registry by editing the YAML configuration for the Feature Store CRD, as described in Customizing your feature store configuration.

Chapter 5. Customizing your feature store configuration

Optionally, you can apply the following configurations to your feature store instance:

Specify to use an existing feature project in a Git repository
Configure an offline store
Configure an online store
Configure the feature registry
Configure PVCs
Configure role-based access control

The examples in the following sections describe how to customize a feature store instance by creating a new custom resource definition (CRD). Alternatively, you can customize an existing feature store instance as described in Editing an existing feature store instance.

For more information about how you can customize your feature store configuration, see the Feast API documentation.

5.1. Specifying to use a feature project from a Git repository

If you want to start with a pre-existing feature project that exists in a Git repository, create a feature store instance that includes a reference to the feature project location in the Git repository.

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.
You have created a data science project, as described in Creating a data science project. In the following procedure, my-ds-project is the name of the data science project.
You have an existing feature store project in an existing Git repository.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following example code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-git-repopath
spec:
  feastProject: feast_demo_odfv
  feastProjectDir:
    git:
      url: https://github.com/feast-dev/feast-workshop 
      ref: e959053    
      featureRepoPath: module_2/feature_repo

url: The URL for the Git repository.
ref: The Git commit ID or branch.
featureRepoPath: The path to the feature store repository that you want to use.

Edit the Git repository URL, the reference (commit ID or branch), and the path to specify values that are specific to your use case.
Click Create.

Verification

In the OpenShift console, select Workloads → Pods.
Make sure that your project (for example, my-ds-project) is selected.
Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-git-repopath.
Verify that the status is Running.

5.2. Configuring an offline store

When you create a feature store instance that uses the minimal configuration, by default, Feature Store uses a Dask file-based store for the offline store.

The example in the following procedure shows how to configure DuckDB for the offline store.

You can configure other offline stores, such as Snowflake, BigQuery, and Redshift, as detailed in the Feast reference documentation for offline stores.

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.
You have created a data science project, as described in Creating a data science project. In the following procedure, my-ds-project is the name of the data science project.
Your data science project includes an existing secret that provides credentials for accessing the database that you want to use for the offline store. The example in the following procedure requires that you have configured DuckDB.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-db-persistence
spec:
  feastProject: my_project
  services:
    offlineStore:
      persistence:
        file:
          type: duckdb

Edit the services.offlineStore section to specify values specific to your use case.
Click Create.

Verification

In the OpenShift console, select Workloads → Pods.
Make sure that your project (for example, my-ds-project) is selected.
Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-db-persistence.
Verify that the status is Running.

5.3. Configuring an online store

When you create a feature store instance using the minimal configuration, by default, the online store is a SQLite database.

The example in the following procedure shows how to configure a PostgreSQL database for the online store.

You can configure other online stores, such as Snowflake, Redis, and DynamoDB, as detailed in the Feast reference documentation for online stores.

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.
You have created a data science project, as described in Creating a data science project. In the following procedure, my-ds-project is the name of the data science project.
Your data science project includes an existing secret that provides credentials for accessing the database that you want to use for the online store. The example in the following procedure requires that you have configured a PostgreSQL database.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-db-persistence
spec:
  feastProject: my_project
  services:
    onlineStore:
      persistence:
        store:
          type: postgres
          secretRef:
            name: feast-data-stores

Edit the services.onlineStore section to specify values that are specific to your use case.
Click Create.

Verification

In the OpenShift console, select Workloads → Pods.
Make sure that your project (for example, my-ds-project) is selected.
Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-db-persistence.
Verify that the status is Running.

5.4. Configuring the feature registry

By default, when you create a feature store instance using the minimal configuration, the registry is a simple SQLite database.

The example in the following procedure shows how to configure an S3 registry.

You can configure other types of registries, such as GCS, SQL, and Snowflake, as detailed in the Feast reference documentation for registries.

Note

The example code in the following procedure requires that you edit it with values that are specific to your use case.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.
You have created a data science project, as described in Creating a data science project. In the following procedure, my-ds-project is the name of the data science project.
Your data science project includes an existing secret that provides credentials for accessing the storage that you want to use for the registry. The example in the following procedure requires that you have configured S3.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-s3-registry
spec:
  feastProject: my_project
  services:
    registry:
      local:
        persistence:
          file:
            path: s3://bucket/registry.db
            s3_additional_kwargs:
              ServerSideEncryption: AES256
              ACL: bucket-owner-full-control
              CacheControl: max-age=3600

Edit the services.registry section to specify values that are specific to your use case.
Click Create.

Verification

In the OpenShift console, select Workloads → Pods.
Make sure that your project (for example, my-ds-project) is selected.
Find the pod that has the feast- prefix, followed by the metadata name that you specified in the CRD configuration, for example, feast-sample-s3-registry.
Click the feast pod and then select Pod details.
Click Terminal.
In the Terminal window, enter the following command to view the configuration, including the S3 registry:
```
cat feature_store.yaml
```

5.5. Example PVC configuration

When you configure the online store, offline store, or registry, you can also configure persistent volume claims (PVCs) as shown in the following Feature Store custom resource definition (CRD) example.

Note

The following example code requires that you edit it with values that are specific to your use case.

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-pvc-persistence
spec:
  feastProject: my_project
  services:
    onlineStore:   
      persistence:
        file:
          path: online_store.db
          pvc:
            ref:
              name: online-pvc
            mountPath: /data/online
    offlineStore:   
      persistence:
        file:
          type: duckdb
          pvc:
            create:
              storageClassName: standard
              resources:
                requests:
                  storage: 5Gi
            mountPath: /data/offline
    registry:   
      local:
        persistence:
          file:
            path: registry.db
            pvc:
              create: {}
              mountPath: /data/registry
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: online-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

onlineStore: The online store references a PVC (online-pvc) that must already exist.
offlineStore: The offline store specifies a storage class name and storage size for a PVC that is created for it.
registry: The registry configuration specifies that the Feature Store Operator creates a PVC with default settings.
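
The three PVC variants in the example differ only in whether the pvc section references an existing claim (ref) or asks for one to be created (create, optionally with settings). The following sketch classifies a pvc section accordingly. It is an illustration, not operator code; the describe_pvc helper is hypothetical, and the rule that exactly one of ref or create must be set is an assumption that you should confirm against the Feast API documentation.

```python
def describe_pvc(pvc):
    """Summarize how a pvc section from the example would obtain its storage."""
    has_ref = "ref" in pvc
    has_create = "create" in pvc
    if has_ref == has_create:
        # Assumption: exactly one of 'ref' or 'create' is expected.
        raise ValueError("set exactly one of 'ref' or 'create'")
    mount_path = pvc["mountPath"]
    if has_ref:
        return f"use existing PVC '{pvc['ref']['name']}' at {mount_path}"
    create = pvc["create"]
    if not create:
        return f"create PVC with default settings at {mount_path}"
    storage_class = create.get("storageClassName", "default storage class")
    size = create.get("resources", {}).get("requests", {}).get("storage", "default size")
    return f"create PVC ({storage_class}, {size}) at {mount_path}"


online = {"ref": {"name": "online-pvc"}, "mountPath": "/data/online"}
offline = {
    "create": {"storageClassName": "standard", "resources": {"requests": {"storage": "5Gi"}}},
    "mountPath": "/data/offline",
}
registry = {"create": {}, "mountPath": "/data/registry"}

for section in (online, offline, registry):
    print(describe_pvc(section))
```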

5.6. Configuring role-based access control

Role-Based Access Control (RBAC) is a security mechanism that restricts access to resources based on the roles of individual users within an organization. Feature Store RBAC ensures that only authorized users or groups can access or modify specific resources, thereby maintaining data security and operational integrity.

The RBAC implementation in Feature Store is designed to provide the following capabilities:

Assign permissions - Allow administrators to assign permissions for various operations and resources to users or groups based on their roles.
Seamless integration - Integrate smoothly with existing business code without requiring significant modifications.
Backward compatibility - Maintain support for non-authorized models as the default to ensure backward compatibility.

Feature Store RBAC provides the following benefits:

Feature sharing - Enable multiple teams to share the feature store while ensuring controlled access. This capability allows for collaborative work without compromising data security.
Access control management - Prevent unauthorized access to team-specific resources and spaces, governing the operations that each user or group can perform.

The Feature Store permissions model allows you to configure granular permission policies for all the resources defined in a feature store.

The permission authorization enforcement is performed when requests are executed through one of the Feature Store (Python) servers:

The online feature server (REST)
The offline feature server (Apache Arrow Flight) uses the gRPC communication protocol to exchange data. This server wraps calls to existing offline store implementations and exposes interfaces as Arrow Flight endpoints.
The registry server (gRPC)

Note

If you configure the feature store with a local provider (the default), there is no permission enforcement when accessing the Feature Store API.

5.6.1. Default authorization configuration

In the default configuration, no permission enforcement is applied. The following example Feature Store does not include a spec.authz section, which indicates no authorization.

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-no-auth
spec:
  feastProject: my_project

Optionally, you can configure either OIDC or Kubernetes RBAC authorization.

5.6.2. Example OIDC Authorization configuration

The following example shows an OIDC authorization configuration in the Feature Store custom resource (CR):

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-oidc-auth
spec:
  feastProject: my_project
  authz:
    oidc:
      secretRef:
        name: oidc-secret

---
kind: Secret
apiVersion: v1
metadata:
  name: oidc-secret
stringData:
  client_id: client_id
  auth_discovery_url: auth_discovery_url
  client_secret: client_secret
  username: username
  password: password

Note

This example code requires that you edit it with values that are specific to your use case.

For more information, see OIDC configuration in the Feast documentation.
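
As a rough illustration of what the edited Secret might look like, the following sketch replaces some of the placeholders with hypothetical values. The host name keycloak.example.com, the realm name feast-realm, and the client ID feast-client are assumptions made for this example only. The /.well-known/openid-configuration suffix is the standard OIDC discovery path. Substitute the values that your own identity provider issues.

```yaml
# Sketch only: every value below is hypothetical and must be replaced.
kind: Secret
apiVersion: v1
metadata:
  name: oidc-secret   # must match spec.authz.oidc.secretRef.name
stringData:
  # Client registered with your OIDC provider (assumed name)
  client_id: feast-client
  # Discovery document URL of your OIDC provider (assumed host and realm)
  auth_discovery_url: https://keycloak.example.com/realms/feast-realm/.well-known/openid-configuration
  # Credentials are left as placeholders on purpose
  client_secret: "<client_secret>"
  username: "<username>"
  password: "<password>"
```

Because the values are supplied under stringData, you can enter them as plain text; Kubernetes encodes them when it stores the Secret.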

5.6.3. Example Kubernetes Authorization configuration

The following example shows a Kubernetes authorization configuration in the Feature Store custom resource (CR):

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample-kubernetes-auth
spec:
  feastProject: feast_rbac
  authz:
    kubernetes:
      roles:
      - feast-writer
      - feast-reader

Note

The example code requires that you edit it with values that are specific to your use case.

For more information, see Kubernetes RBAC configuration in the Feast documentation.

For an example of how to implement Kubernetes RBAC Authorization, see Running the Feast RBAC example on Kubernetes using the Feast Operator.
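
To give a user one of the roles listed in the example above, you would typically bind the role to that user with a standard Kubernetes RoleBinding. The following is a minimal sketch, assuming that the roles named under spec.authz.kubernetes.roles exist as Role objects in the same project as the feature store. The binding name, the project name my-ds-project, and the user name alice are hypothetical.

```yaml
# Sketch only: grants the feast-reader role to a hypothetical user.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: feast-reader-binding   # hypothetical name
  namespace: my-ds-project     # assumed: the project that contains the FeatureStore
subjects:
  - kind: User
    name: alice                # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: feast-reader           # one of the roles listed in the FeatureStore example
  apiGroup: rbac.authorization.k8s.io
```

A binding like this only associates the user with the role. Which feature store resources and actions the role permits is determined by the permissions defined in your feature store.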

5.7. Editing an existing feature store instance

The examples in this document describe how to customize a feature store instance by creating a new custom resource (CR). Alternatively, you can customize an existing feature store instance.

Prerequisites

You have cluster administrator privileges for your OpenShift cluster.
You have created a feature store instance, as described in Deploying a feature store instance in a data science project.

Procedure

In the OpenShift console, select Administration → CustomResourceDefinitions.
To filter the list, in the Search by Name field, enter feature.
Click the FeatureStore CRD and then click Instances.
Select the instance that you want to edit, and then click YAML.
In the YAML editor, edit the configuration.
Click Save and then click Reload.

Verification

The updated feature store instance deploys successfully.
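
As an illustration of the kind of change you might make in the YAML editor, the following sketch shows an instance after an authz section has been added to turn on Kubernetes authorization. The instance name sample and the role names are reused from other examples in this document and are assumptions here. Fields that the Operator adds to the live resource, such as status, are omitted.

```yaml
apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_project
  # Added during the edit: enable Kubernetes RBAC authorization
  authz:
    kubernetes:
      roles:
      - feast-writer
      - feast-reader
```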

Chapter 6. Viewing feature store objects in the web-based UI

You can use the Feature Store Web UI to view all registered features, data sources, entities, and feature services.

Prerequisites

You can access the OpenShift console.
You have installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
You have enabled the Feature Store component, as described in Enabling the Feature Store component.

Procedure

In the OpenShift console, click the Quick Create icon and then click the Import YAML option.
Verify that your data science project is the selected project.

Copy the following code and paste it into the YAML editor:

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: sample
spec:
  feastProject: my_project
  services:
    ui: {} # 1

1: Creates a web UI for the feature store instance.

Click Create.
The Feature Store Operator starts a container for the web-based Feature Store UI and creates an OpenShift route that provides the URL so that you can access it.
In the OpenShift console, select Workloads → Pods.
Make sure that your project (for example, my-ds-project) is selected.
You should see a deployment for the web-based UI. Note that OpenShift enables TLS by default at runtime.
To populate the web-based UI with the objects in your feature store, run a job from the feature store cron job:
1. Open a terminal window.
2. If you are not already logged in to your OpenShift cluster as a cluster administrator, log in as shown in the following example:
  $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
3. Make sure that you are using the data science project in which you enabled Feature Store, for example, my-ds-project.
  $ oc project my-ds-project
4. Create a job from the cron job by running a command with the following syntax:
  $ oc create job --from=cronjob/feast-<FeatureStore CR name> feast-apply
  For example, if the metadata.name in the Feature Store CR is sample, run the following command:
  $ oc create job --from=cronjob/feast-sample feast-apply
5. Run the following commands:
  $ oc wait --for=condition=complete job/feast-apply
  $ oc logs job/feast-apply --all-containers=true
To find the URL for the Feature Store UI, in the OpenShift console, click Networking → Routes.
In the row for the Feature Store UI, for example feast-sample-ui, the URL is in the Location column.
Click the URL link to open it in your default web browser.

Verification

The Feature Store Web UI appears and shows the feature objects in your project as shown in the following figure:

Figure 6.1. The Feature Store Web UI

Chapter 7. Additional resources

For example Feature Store CRD configurations, see the Feast Operator configuration samples.
For details about the Feast API, see the Feast API documentation.
For information on how to implement machine learning features, see the Feast documentation.

Legal Notice

The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.

Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.

Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.

Linux® is the registered trademark of Linus Torvalds in the United States and other countries.

Java® is a registered trademark of Oracle and/or its affiliates.

XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.

MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.

Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.

The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.

All other trademarks are the property of their respective owners.
