Chapter 3. Data virtualization operator


The data-virtualization operator helps to automate the configuration and deployment of virtual databases. After you add the operator to your OpenShift cluster, you can use it to build and deploy virtual database images from a range of data sources.

In this Technology Preview release, the preferred method for deploying virtual databases to OpenShift is to use the data-virtualization operator. The method of using creating and deploying virtual databases with Fabric8 and the Spring Boot Maven plugin is deprecated. Currently, virtual databases that you create from the data virtualization operator are not available in Fuse Online.

You can install the data virtualization operator on OpenShift 3.11 or greater. On OpenShift 4.2 and later the Operator is available in the OperatorHub.

Prerequisites

  • You are a cluster administrator. You must have cluster-admin rights to add operators to the OpenShift cluster.
  • You have the 2019-12 release of Red Hat Integration.
  • You have Developer access to an OpenShift server and you are familiar with using the OpenShift console and CLI.
  • You have a DDL file for the virtual database that you want to create, or you know how to write SQL code and create DDL files.
  • You have Maven 3.0 installed.
  • You have JDK 8 (Java Platform, Standard Edition 8 Development Kit) or greater installed.
  • You can connect to one of the following types of data sources:

    • Relational database
    • mongoDB
    • Salesforce
    • REST
    • OData
    • OpenAPI
  • You have administrator access to the configured data source.

3.1. Installing the data virtualization operator on OpenShift

Install the data virtualization operator so that you can use it to deploy virtual database images to OpenShift from YAML-based custom resources (CRs).

Prerequisites

  • You have cluster-admin access to an OpenShift 3.11 or 4.2 or greater cluster.
  • You can use the oc command-line tool to connect to interact with your OpenShift 3.11 cluster, or you have access to the OpenShift 4.2 or greater web console.

Procedure

  • Install the operator using one of the following methods, depending on the version of OpenShift that you are running.

Installing on OpenShift 3.11

  1. From a terminal window, log in to the OpenShift cluster as a cluster administrator.

    oc login
  2. Create or open a project where you want to deploy a virtual database.
  3. Type the following commands:

    export OP_ROOT=https://raw.githubusercontent.com/teiid/teiid-operator/7.5-0.0.x/deploy
    oc create -f $OP_ROOT/crds/virtualdatabase.crd.yaml  1
    oc create -f $OP_ROOT/service_account.yaml
    oc create -f $OP_ROOT/role.yaml
    oc create -f $OP_ROOT/role_binding.yaml
    oc create -f $OP_ROOT/operator.yaml
    1
    If you previously created a CRD in the cluster, the command returns an error, reporting that the CRD already exists. You can ignore the message.
  4. Type the following commands to create a pull secret that you can use to access the Red Hat image registry:

    oc create secret docker-registry dv-pull-secret /
    --docker-server=registry.redhat.io /
    --docker-username=$username / 1
    --docker-password=$password /
    --docker-email=$email_address
    oc secrets link builder dv-pull-secret
    oc secrets link builder dv-pull-secret --for=pull
    1
    Substitute the user name and password that you use to log in to the Red Hat Customer Portal.

    If the command completes with no errors, the operator is deployed to your OpenShift instance within the current OpenShift project.

  5. To enable the data virtualization operator to retrieve images from the Red Hat registry so that you can create virtual databases, link the secret that you created in Step 4 to the service account for the operator.

    oc secrets link dv-operator dv-pull-secret --for=pull

Installing on OpenShift 4.2 or greater

  1. From a terminal window, type the following commands to log in to the OpenShift cluster and create a pull secret that you can use to access the Red Hat image registry:

    oc login
    oc create secret docker-registry dv-pull-secret /
    --docker-server=registry.redhat.io /
    --docker-username=$username /  1
    --docker-password=$password /
    --docker-email=$email_address
    oc secrets link builder dv-pull-secret
    oc secrets link builder dv-pull-secret --for=pull
    1
    Use your Red Hat Customer Portal login credentials.
  2. Log in to the OpenShift web console as a cluster administrator.
  3. From the OpenShift menu, expand Operators and click OperatorHub.
  4. Click Data Virtualization Operator 7.5.0 provided by Red Hat, Inc., and then click Install.
  5. From the Create Operator Subscription page, verify that the selected namespace matches the name of the project where you want to install the operator, and then click Subscribe.

    The Installed Operators page lists the Data Virtualization Operator and reports the status of the installation.

  6. From the OpenShift menu, expand Workloads and click Pods to check the status of the operator pod. After a few minutes, the pod for the operator service begins to run.
  7. To enable the data virtualization operator to retrieve images from the Red Hat registry so that you can create virtual databases, link the secret that you created in Step 1 to the service account for the operator.

    oc secrets link dv-operator dv-pull-secret --for=pull

3.2. Running the data virtualization operator to deploy a virtual database

After a cluster administrator adds the data virtualization operator to an OpenShift cluster, other users can run the operator to create a virtual database.

The data virtualization operator processes a virtual database custom resource (CR) to create and deploy a virtual database object on OpenShift. You specify the configuration properties for the data source in the CR. Each type of data source requires a specific set of configuration properties. When you run the operator, you provide it with a .yaml file that contains the CR. By running the operator with different CRs, you can create virtual databases from a range of data sources. For more information about specifying data source properties, see Section 3.4, “Custom resources to support data sources”

Note

In this Technology Preview, the data virtualization operator can create virtual databases from the following data sources only:

  • Relational databases
  • Salesforce databases
  • MongoDB
  • Open API
  • OData

Prerequisites

  • You have access to an OpenShift cluster in which the data virtualization operator is installed.
  • The operator has access to the Maven repositories that contain the dependencies that the build requires.
  • OpenShift can access a supported data source that runs on your network.
  • You have the login credentials to access the data source.
  • You have a CR in .yaml format that provides information about how to create the virtual database.

Procedure

  1. From a terminal window, log in to OpenShift and open the project where you want to create the virtual database.
  2. On you computer, change to the directory that contains the .yaml file that contains the CR.
  3. Type the following command to run the operator to create the virtual database:

    oc create -f <cr_filename.yaml>

    Replace <cr_filename.yaml> with the name of the CR file for your data source. For example,

    oc create -f dv-customer.yaml

    After the deployment completes, a service is added to the OpenShift cluster. The name of the service matches the name of the custom resource.

  4. Type the following command to verify that the virtual database is created:

    oc get vdbs

    OpenShift returns the list of virtual databases in the project.

The deployed service supports connections from the following clients:

  • JDBC clients through port 31000.
  • postgreSQL clients, including ODBC clients, through port 35432.
  • OData clients, through an HTTP endpoint and route.
Note

For OpenShift to create an HTTP endpoint, the value of the property spec/exposeVia3scale must be set to false in the CR. If the value is set to true it is assumed that 3scale manages the endpoint, and no HTTP endpoint is created.

3.3. Creating a virtual database from the OpenShift web console

After a cluster administrator adds the data virtualization operator from the OperatorHub, other users can run the operator from web console to create a virtual database.

When you run the operator from the web console, you can use the embedded editor to create the custom resource (CR) that defines the virtual database. The CR includes properties that specify how to connect to the data source, and a DDL section that determines how information in the data source is used in your virtual database. You use YAML or JSON to define connection properties, and you use SQL-MED in the DDL section to specify the virtual database schema.

The editor includes a sample CR that you can edit in-place, or download to edit it off-line. You can also use drag and drop to upload local copies of properties definitions or DDL files. For information about the properties to use for different data sources, see Section 3.4, “Custom resources to support data sources”

Note

In this Technology Preview, the data virtualization operator can create virtual databases from the following data sources only:

  • Relational databases
  • Salesforce databases
  • MongoDB
  • Open API
  • OData

Prerequisites

  • You have web console access to an OpenShift cluster in which the data virtualization operator is installed.
  • The operator has access to the Maven repositories that contain the dependencies that the build requires.
  • OpenShift can access a supported data source that runs on your network.
  • You have the login credentials to access the data source.
  • You have a CR in .yaml format that provides information about how to create the virtual database.

Procedure

  1. From the OpenShift menu, expand Operators and click Installed Operators.
  2. Click Data Virtualization Operator.
  3. Click the VirtualDatabase tab.
  4. Click Create Virtual Database.

    OpenShift displays a sample custom resource.

  5. Edit the sample CR as needed to specify the details of the virtualization, and then click Create.

    OpenShift lists the virtual database service on the VirtualDatabase tab.

  6. From the OpenShift menu, expand Workloads and click Pods to check the status of the pod that runs the new virtualization. After it is running, you are ready to use the new virtualization.
  7. To view the protocols that the virtualization exposes, from the Openshift menu, expand Networking, click Services, and then click the service with the name of the virtualization.

    The deployed service automatically supports the following connection types:

    • JDBC - Provides access for SQL clients over port 31000.
    • postgreSQL - Provides access for ODBC clients and other postgreSQL clients over port 35432.
    • HTTP - Provides access for OData and REST clients over port 8080.

3.4. Custom resources to support data sources

Before you can use the data virtualization operator to create a virtual database, you must specify properties for the data source in a custom resource (CR) file.

By providing configuration information in a CR, you provide the operator with directions for how to create virtual databases from any of multiple types of data sources. When you run the data virtualization operator, it reads information from the CR to determine the type of the data source, its schema structure, and how to connect to it and authenticate with it.

The CR uses SQL Data Definition Language (DDL) commands to describe the schema of the virtual database and the data source, the data that you want to import into the virtual database, and the mapping between the source schema and the virtual schema. The CR also specifies the translator to use to convert the format of commands and data that pass between the virtual database and the data source.

You can specify the values of configuration properties directly in the CR file, or you can reference values that you define in OpenShift secrets. For more information about creating secrets, see Section 4.2, “Creating and deploying secrets”.

Note

Period characters (.) are not valid for use in environment variables. When you add variables to the CR, replace period characters with underscore characters (_). At runtime, underscores in the variables are converted automatically to periods.

3.4.1. Settings to configure relational databases as data sources

The custom resource (CR) for a relational database management system (RDBMS), such as PostgreSQL or Microsoft SQL Server, must contains specific information so that the data virtualization operator can create a virtual database from the database.

The following tables list some of the properties that are required in a CR to create a virtual database that is based on a relational database. The values that you assign to these properties vary according to your environment and the specific database technology that you use.

Table 3.1. Connection properties in the spec/env section of an RDBMS CR
Property NameDescriptionRequiredDefault value

SPRING_DATASOURCE_<NAME>_JDBCURL

URL For the connection

Yes

n/a

SPRING_DATASOURCE_<NAME>_USERNAME

User Name

Yes

n/a

SPRING_DATASOURCE_<NAME>_PASSWORD

Password

Yes

n/a

SPRING_DATASOURCE_<NAME>_DRIVER_CLASS_NAME

Driver Name

No

n/a

SPRING_DATASOURCE_<NAME>_IMPORTER_SCHEMA_NAME

Schema Name for import

Yes

n/a

In the preceding table, NAME specifies a name, in upper case, that refers to the data source. This same name string is used in the DDL that defines the virtual database to represent the data source server.

The JDBC driver class that you reference must be available as a Maven artifact from a repository that is listed in the spec/dependencies section of the CR file. The virtualization operator must have access to download artifacts from the specified repository.

Along with the connection properties You can define properties to control the behavior of the JDBC translator and to provide more control over how you import data from the source database.

Table 3.2. Foreign data wrapper settings in the build/source/ddl section of an RDBMS CR
Property NameDescriptionRequiredValue

FOREIGN DATA WRAPPER

Translator

Yes

<translator-name>

The translator value depends on the type of database. Translators are available for the following relational databases.

  • db2
  • derby
  • h2
  • hana (Connects to SAP HANA through a JDBC driver)
  • hive*jdbc (Connects to Apache Hive through the Hive JDBC driver)
  • hsql
  • impala (Connects to Apache Impala)
  • informix
  • ingres
  • jtds (Connects to Microsoft SQL Server through Java tabular data stream (JTDS) drivers)
  • mysql5
  • netezza
  • oracle
  • osisoft-pi (Connects to an OSIsoft PI Data Archive through the OSIsoft JDBC driver)
  • phoenix (Connects to Apache Phoenix using the Phoenix JDBC driver)
  • postgresql
  • presto-jdbc (Connects to PrestoDB through a JDBC driver)
  • sqlserver (Connects to Microsoft SQL Server)
  • sybase
  • teradata
  • teiid (Connecting to a Teiid virtual database through a JDBC driver)
  • vertica

For a complete list of the translator properties that you can define for JDBC data sources, see JDBC Translators in the Teiid Reference Guide.

3.4.2. Settings to configure MongoDB as a data source

The custom resource (CR) for a MongoDB database must contain specific information so that the data virtualization operator can create a virtual database from the database.

Note

The current release supports MongoDB release 3.

The following tables list the properties that are required in the CR to create a virtual database that is based on a MongoDB database:

Table 3.3. Connection properties in the spec/env section of the MongoDB CR
Property NameDescriptionRequiredDefault value

SPRING_TEIID_DATA_MONGODB_{NAME}_REMOTE_SERVER_LIST

List of MongoDb servers ex:(localhost:27012)

Yes

n/a

SPRING_TEIID_DATA_MONGODB_{NAME}_USER

User Name

Yes

n/a

SPRING_TEIID_DATA_MONGODB_{NAME}_PASSWORD

Password

Yes

n/a

SPRING_TEIID_DATA_MONGODB_{NAME}_DATABASE

Database name to connect to

Yes

n/a

SPRING_TEIID_DATA_MONGODB_{NAME}_AUTH_DATABASE

Database name for authorization

No

n/a

SPRING_TEIID_DATA_MONGODB_{NAME}_SSL

Use SSL Connection

No

n/a

In the preceding table, NAME specifies a name, in upper case, that refers to the data source. This same name string is used in the DDL that defines the virtual database to represent the data source server.

Table 3.4. Foreign data wrapper settings in the build/source/ddl section of the MongoDB CR
Property NameDescriptionRequiredValue

FOREIGN DATA WRAPPER

Translator

Yes

mongodb

For a complete list of the properties that you can set to control how data is translated between MongoDB and a virtual database, see the Teiid Reference Guide.

Note

You are not required to list any build dependencies in the CR for a MongoDB virtual database.

3.4.3. Settings to configure REST, OData, and OpenAPI data sources

The custom resource (CR) for a REST-based data source must contain specific information so that the data virtualization operator can create a virtual database from the source.

The set of connection properties that are required for all REST-based data sources is the same. However, services that are based on standard specifications, such as OData or OpenAPI, require specific translators. For information about translator properties, see Table 3.6, “Foreign data wrapper settings in the build/source/ddl section of a REST-based CR”.

For generic REST web services that are not based on a standard specification, the data virtualization service cannot automatically convert query criteria into a query parameter. Because these services lack built-in mechanisms to pass SQL query conditions to a REST API endpoint, you must use the invokeHttp procedure to pass the query as an XML or JSON payload, and you must specify all query strings and headers.

By default, translators are unable to parse the security configuration of a secured API. To enable translators to access data for a secured API, the CR must specify the security properties for the API.

Table 3.5. Properties in the spec/env section of REST-based CRs
Property NameDescriptionRequiredDefault value

SPRING_TEIID_REST_<NAME>_ENDPOINT

Endpoint for the service

Yes

n/a

SPRING_TEIID_REST_<NAME>_SECURITY_TYPE

Security type used. Available options are http-basic, openid-connect or empty

No

no security

SPRING_TEIID_REST_<NAME>_USER_NAME

User Name

Yes

n/a

SPRING_TEIID_REST_<NAME>_PASSWORD

Password

Yes

n/a

SPRING_TEIID_REST_<NAME>_CLIENT_ID

ClientId from connected app

Yes, when the security type is defined as openid-connect

n/a

SPRING_TEIID_REST_<NAME>_CLIENT_SECRET

clientSecret from connected app

Yes, when the security type is defined as openid-connect

n/a

SPRING_TEIID_REST_<NAME>_AUTHORIZE_URL

clientSecret from connected app

Yes, when the security type is defined as openid-connect

n/a

SPRING_TEIID_REST_<NAME>_ACCESS_TOKEN_URL

clientSecret from connected app

Yes, when the security type is defined as openid-connect

n/a

SPRING_TEIID_REST_<NAME>_SCOPE

clientSecret from connected app

No. Applies when the security type is defined as openid-connect

n/a

SPRING_TEIID_REST_<NAME>_REFRESH_TOKEN

Refresh token for the virtual database service

No

n/a

When security is set to openid-connect, you can use refresh tokens to authenticate, instead of using name and password authentication. Information about obtaining refresh tokens is beyond the scope of this document.

In the Table 3.5, “Properties in the spec/env section of REST-based CRs”, NAME indicates the name, in upper case, that represents the data source. The NAME value is also used in lower case in the build/source/ddl section of the CR to represent the data source server.

Table 3.6. Foreign data wrapper settings in the build/source/ddl section of a REST-based CR
Data sourceDescriptionRequiredValue

REST

Translator

Yes

rest

OData

Translator

Yes

odata

OData4

Translator

Yes

odata4

OpenAPI

Translator

Yes

openapi

For a complete list of the properties that you can set to control how data is translated between REST-based services and a virtual database, see the OData, OData V4, OpenAPI, and Web Services Translator sections in the Translators section of the Teiid Reference Guide.

The OpenAPI translator assumes that the endpoint in the API document is set to the target location /openapi, and it builds a source model that is based on that assumption. If the API endpoint is set to a different target, a configuration setting must be specified to enable the translator to locate the endpoint and import data correctly. The following examples show a DDL schema statement and an environment variable that you can set to specify a non-standard endpoint.

Sample DDL statement to specify a non-standard OpenAPI endpoint

CREATE SCHEMA sourceModel SERVER oService OPTIONS ("importer.metadataUrl" '/swagger.json');

Sample environment property to specify a non-standard OpenAPI endpoint

SPRING_TEIID_REST_OSERVICE_IMPORTER_METADATAURL=/swagger.json

3.4.4. Settings to configure Salesforce as a data source

The custom resource (CR) for a Salesforce database must contain specific information so that the data virtualization operator can create a virtual database from the database.

Salesforce uses OAuth 2.0 for authentication and authorization. Before a virtual database can import and query Salesforce data, you must obtain OAuth credentials for the virtual database from Salesforce. For information about how to set up OAuth, see Section 3.4.5, “Setting up an OAuth connection to Salesforce”

The following tables list the properties that are required in the CR to create a virtual database that is based on a Salesforce database:

Table 3.7. Properties in the spec/env section of the Salesforce CR
Property NameDescriptionRequiredDefault value

SPRING_TEIID_DATA_SALESFORCE_<NAME>_USER_NAME

User for salesforce.com

Yes

n/a

SPRING_TEIID_DATA_SALESFORCE_<NAME>_PASSWORD

Password for salesforce.com

Yes

n/a

SPRING_TEIID_DATA_SALESFORCE_<NAME>_CLIENT_ID

ClientId from connected app

Yes

n/a

SPRING_TEIID_DATA_SALESFORCE_<NAME>_CLIENT_SECRET

clientSecret from connected app

No

n/a

SPRING_TEIID_DATA_SALESFORCE_<NAME>_REFRESH_TOKEN

Refresh token for connected app

No

n/a

If your connected app uses refresh tokens to authenticate, rather than name and password, your CR must specify the refresh token property, in place of the user name and password properties. Information about obtaining refresh tokens is beyond the scope of this document. For information about how to obtain a refresh token for your connected app, see the Salesforce documentation.

In Table 1, NAME indicates the name, in upper case, that refers to the data source. The NAME value is also used in lower case in the build/source/ddl section of the CR to represent the data source server.

Table 3.8. Properties in the build/source/ddl section of the Salesforce CR
Property NameDescriptionRequiredValue

FOREIGN DATA WRAPPER

Translator

Yes

salesforce

For a complete list of the properties that you can set to control how data is translated between Salesforce and a virtual database, see the Teiid Reference Guide.

3.4.5. Setting up an OAuth connection to Salesforce

Before the data virtualization service can retrieve data from a Salesforce database, you must enable configure it as a connected app in Salesforce that is OAuth-enabled. After you configure OAuth, Salesforce generates a client ID and client secret that you must add to the CR file that defines the connection from the virtual database to Salesforce.

To configure OAuth you create a connected app in Salesforce that can request access to Salesforce data on behalf of the data virtualization service. In the settings for the connected app, you enable integration with the Salesforce API by using the OAuth 2.0.

Prerequisites

  • You have a Salesforce.com account that has access to the data that you want to integrate in a virtual database.
Note

The following steps are based on Salesforce Classic. If you use a different version of Salesforce, you might use a different procedure. For more information about creating connected apps in Salesforce, see the Salesforce documentation.

Procedure

  1. From Salesforce, log into your account.
  2. Click SetUp in the profile menu.
  3. In the Build section of the navigation sidebar, expand Create, and then click Apps.
  4. In the Connected Apps section, click New.
  5. Complete the required fields.
  6. In the section API (Enable OAuth Settings), select Enable OAuth Settings to display the OAuth settings.
  7. Complete the required OAuth fields. In the OAuth Scopes field, you must select the following scopes:

    • Access and manage your data (api).
    • Access your basic information (id, profile, email, address, phone).
    • Allow access to your unique identifier (openid).
    • Full access (full).
    • Perform requests on your behalf at any time (refresh_token, offline_access).
  8. Select Require Secret for Web Server Flow.
  9. Click Save and then click Continue.
  10. Make a note of the values in the Consumer Key and Consumer Secret fields. These values are required for properties in the CR that specifies how the virtual database connects to Salesforce.
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.