Chapter 9. Translators
Data Virtualization uses the Teiid Connector Architecture (TCA), which provides a robust mechanism for integrating with external systems. The TCA defines a common client interface between Data Virtualization and an external system that includes metadata as to what SQL constructs are available for pushdown and the ability to import metadata from the external system.
A Translator is the heart of the TCA and acts as the bridge logic between Data Virtualization and an external system.
Translators can have a number of configurable properties. These are broken down into execution properties, which determine aspects of how data is retrieved, and import settings, which determine what metadata is read for import.
The execution properties for a translator typically have reasonable defaults. For specific translator types, such as the Derby translator, base execution properties are already tuned to match the source. In most cases the user will not need to adjust their values.
Name | Description | Default |
---|---|---|
Immutable |
Set to | false |
RequiresCriteria |
Set to | false |
SupportsOrderBy |
Set to | false |
SupportsOuterJoins |
Set to | false |
SupportsFullOuterJoins |
If | false |
SupportsInnerJoins |
Set to | false |
SupportedJoinCriteria | If join capabilities are enabled, defines the criteria that can be used as the join criteria. May be one of (ANY, THETA, EQUI, or KEY). | ANY |
MaxInCriteriaSize |
If the use of | -1 |
MaxDependentInPredicates |
If the use of | -1 |
DirectQueryProcedureName |
If | native |
SupportsDirectQueryProcedure |
Set to | false |
ThreadBound |
Set to | false |
CopyLobs |
If | false |
TransactionSupport |
The highest level of transaction capability. Used by the engine as a hint to determine if a transaction is needed for | XA |
Only a subset of the available metadata can be set through execution properties on the base ExecutionFactory. All methods are available on the BaseDelegatingExecutionFactory.
There are no base importer settings.
Override execution properties
For all translators, you can override Execution Properties in the main vdb file.
Example: Overriding a translator property
.
CREATE FOREIGN DATA WRAPPER "oracle-override" TYPE oracle OPTIONS (RequiresCriteria 'true'); CREATE SERVER ora FOREIGN DATA WRAPPER "oracle-override" OPTIONS ("resource-name" 'java:/oracle'); CREATE SCHEMA ora SERVER ora; SET SCHEMA ora; IMPORT FROM SERVER ora INTO ora;
The preceding example overrides the oracle translator and sets the RequiresCriteria property to true. The modified translator is only available in the scope of this VDB. As many properties as desired may be overriden together.
See also VDB Definition.
Parameterizable native queries
In some situations the teiid_rel:native-query
property and native procedures accept parameterizable strings that can positionally reference IN parameters. A parameter reference has the form `$integer
, for example, $1
. Note that one-based indexing is used and that only IN parameters may be referenced. Dollar-sign integer is therefore reserved, but may be escaped with another $`
, for example, $$1
. The value will be bound as a prepared value or a literal is a source specific manner. The native query must return a result set that matches the expectation of the calling procedure.
For example the native-query select c from g where c1 = $1 and c2 = '$$1'
results in a JDBC source query of select c from g where c1 = ? and c2 = '$1'
, where ?`
will be replaced with the actual value bound to parameter 1.
General import properties
Several import properties are shared by all translators.
When specifying an importer property, it must be prefixed with importer.
. For example, importer.tableTypes
.
Name | Description | Default |
---|---|---|
autoCorrectColumnNames |
Replace any usage of | true |
renameDuplicateColumns |
If true, rename duplicate columns caused by either mixed case collisions or | false |
renameDuplicateTables |
If true, rename duplicate tables caused by mixed case collisions. A suffix | false |
renameAllDuplicates |
If true, rename all duplicate tables, columns, procedures, and parameters caused by mixed case collisions. A suffix | false |
nameFormat |
Set to a Java string format to modify table and procedure names on import. The only argument will be the original name Data Virtualization name. For example use |
9.1. Amazon S3 Translator
The Amazon Simple Storage Service (S3) translator, known by the type name amazon-s3, exposes stored procedures to leverage Amazon S3 object resources.
This translator is typically used with the TEXTTABLE
or XMLTABLE
functions to consume CSV or XML formatted data, or to read Microsoft Excel files or other object files that are stored in Amazon S3. The S3 translator can access Amazon S3 by using an AWS access key ID and secret access key.
Usage
In the following example, a virtual database reads a CSV file with the name g2.txt
from an Amazon S3 bucket called teiidbucket
:
e1,e2,e3 5,'five',5.0 6,'six',6.0 7,'seven',7.0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <vdb name="example" version="1"> <model name="s3"> <source name="web-connector" translator-name="user-s3" connection-jndi-name="java:/amazon-s3"/> </model> <model name="Stocks" type="VIRTUAL"> <metadata type="DDL"><![CDATA[ CREATE VIEW G2 (e1 integer, e2 string, e3 double,PRIMARY KEY (e1)) AS SELECT SP.e1, SP.e2,SP.e3 FROM (EXEC s3.getTextFile(name=>'g2.txt')) AS f, TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP; ]]> </metadata> </model> <translator name="user-s3" type="amazon-s3"> <property name="accesskey" value="xxxx"/> <property name="secretkey" value="xxxx"/> <property name="region" value="us-east-1"/> <property name="bucket" value="teiidbucket"/> </translator> </vdb>
Execution properties
Use the translator override mechanism to supply the following properties.
Name | Description | Default |
---|---|---|
Encoding | The encoding that should be used for CLOBs returned by the getTextFiles procedure. The value should match an encoding known to the JRE. | The system default encoding. |
Accesskey | Amazon security access key. Log in to Amazon console to find your security access key. When provided, this becomes the default access key. | n/a |
Secretkey | Amazon security secret key. Log in to Amazon console to find your security secret key. When provided, this becomes the default secret key. | n/a |
Region | Amazon region to be used with the request. When provided, this will be default region used. | n/a |
Bucket | Amazon S3 bucket name. If provided, this will serve as default bucket to be used for all the requests | n/a |
Encryption | When server-side encryption with customer-provided encryption keys (SSE-C) is used, the key is used to define the "type" of encryption algorithm used. You can configure the translator to use the AES-256 or AWS-KMS encryption algorithms. If provided, this will be used as default algorithm for all "get" based calls. | n/a |
Encryptionkey | When SSE-C type encryption used, where customer supplies the encryption key, this key will be used for defining the "encryption key". If provided, this will be used as default key for all "get" based calls. | n/a |
For information about setting properties, see Override execution property in Translators, and review the examples in the sections that follow.
Procedures exposed by translator
When you add the a model (schema) like above in the example, the following procedure calls are available for user to execute against Amazon S3.
bucket
, region
, accesskey
, secretkey
, encryption
and encryptionkey
are optional or nullable parameters in most of the methods provided. Provide them only if they are not already configured by using translator override properties as shown in preceding example.
getTextFile(…)
Retrieves the given named object as a text file from the specified bucket and region by using the provided security credentials as CLOB.
getTextFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey,string encryption, string encryptionkey, boolean stream default false) returns TABLE(file blob, endpoint string, lastModified string, etag string, size long);
endpoint
is optional. When provided the endpoint URL is used instead of the one constructed by the supplied properties. Use encryption
and encryptionkey
only in when server side security with customer supplied keys (SSE-C) in force.
If the value of stream
is true, then returned LOBs are read only once and are not typically buffered to disk.
Examples
exec getTextFile(name=>'myfile.txt'); SELECT SP.e1, SP.e2,SP.e3, f.lastmodified FROM (EXEC getTextFile(name=>'myfile.txt')) AS f, TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP;
getFile(…)
Retrieves the given named object as binary file from specified bucket and region using the provided security credentials as BLOB.
getFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey, string encryption, string encryptionkey, boolean stream default false) returns TABLE(file blob, endpoint string, lastModified string, etag string, size long)
endpoint
is optional. When provided the endpoint URL is used instead of the one constructed by the supplied properties. Use encryption
and encryptionkey
only in when server side security with customer supplied keys (SSE-C) in force.
If the value of stream
is true, then returned lOBs are read once and are not typically buffered to disk.
Examples
exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx'); select b.* from (exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx')) as a, XMLTABLE('/contents' PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS e1 integer, e2 string, e3 double) as b;
saveFile(…)
Save the CLOB, BLOB, or XML value to given name and bucket. In the following procedure signature, the contents parameter can be any of the LOB types.
call saveFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey, contents object)
You cannot use saveFile
to stream or chunk uploads of a file’s contents. If you try to load very large objects, out-of-memory issues can result. You cannot configure saveFile
to use SSE-C encryption.
Examples
exec saveFile(name=>'g4.txt', contents=>'e1,e2,e3\n1,one,1.0\n2,two,2.0');
deleteFile(…)
Delete the named object from the bucket.
call deleteFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey)
Examples
exec deleteFile(name=>'myfile.txt');
list(…)
Lists the contents of the bucket.
call list(string bucket, string region, string accesskey, string secretkey, nexttoken string) returns Table(result clob)
The result is the XML file that Amazon S3 provides in the following format
<?xml version="1.0" encoding="UTF-8"?>/n <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> <Name>teiidbucket</Name> <Prefix></Prefix> <KeyCount>1</KeyCount> <MaxKeys>1000</MaxKeys> <IsTruncated>false</IsTruncated> <Contents> <Key>g2.txt</Key> <LastModified>2017-08-08T16:53:19.000Z</LastModified> <ETag>"fa44a7893b1735905bfcce59d9d9ae2e"</ETag> <Size>48</Size> <StorageClass>STANDARD</StorageClass> </Contents> </ListBucketResult>
You can parse this into a view by using a query similar to the one in the following example:
select b.* from (exec list(bucket=>'mybucket', region=>'us-east-1')) as a, XMLTABLE(XMLNAMESPACES(DEFAULT 'http://s3.amazonaws.com/doc/2006-03-01/'), '/ListBucketResult/Contents' PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS Key string, LastModified string, ETag string, Size string, StorageClass string, NextContinuationToken string PATH '../NextContinuationToken') as b;
If all properties (bucket
, region
, accesskey
, and secretkey
) are defined as translator override properties, you can run the following simple query:
SELECT * FROM Bucket
If there are more then 1000 object in the bucket, then the value 'NextContinuationToken' need to be supplied as 'nexttoken' into the list call to fetch the next batch of objects. This can be automated in Data Virtualization with enhancement request.
9.2. Delegator translators
You can use the delegator translator, which is available in the core Data Virtualization installation, to modify the capabilities of a existing translator. Often times for debugging purposes, or in special situations, you might want to turn certain capabilities of a translator on or off. For example, say that the latest version of a Hive database supports the ORDER BY
construct, but the current Data Virtualization version of the Hive translator does not. You could use the delegator translator to enable ORDER BY
compatibility without actually writing any code. Similarly, you could do the reverse, and turn off certain capabilities to produce a better plan.
To use the delegator translator, you must define it in the DDL. The following example shows how to override the "hive" translator and turn off the ORDER BY
capability.
CREATE DATABASE myvdb; USE DATABASE myvdb; CREATE FOREIGN DATA WRAPPER "hive-delegator" TYPE delegator OPTIONS (delegateName 'hive', supportsOrderBy 'false'); CREATE SERVER source FOREIGN DATA WRAPPER "hive-delegator" OPTIONS ("resource-name" 'java:hive-ds'); CREATE SCHEMA mymodel SERVER source; SET SCHEMA mymodel; IMPORT FROM SERVER source INTO mymodel;
For more information about the translator capabilities that you can override by using execution properties, see Translator_Capabilities in the Translator Development Guide. The preceding example shows how you might modify the default ORDER BY
compatibility of the Hive translator.
9.2.1. Extending the delegator translator
You can create a delegating translator by extending the org.teiid.translator.BaseDelegatingExecutionFactory
. After your classes are packaged as a custom translator, you can wire another translator instance into your delegating translator at runtime in order to intercept all of the calls to the delegate. This base class does not provide any functionality on its own, other than delegation. You can hard code capabilities into the translator instead of defining them as part of the DDL configuration. You can also override methods to provide alternate behavior.
Name | Description | Default |
delegateName | Translator instance name to delegate to. | n/a |
cachePattern | Regex pattern of queries that should be cached using the translator caching API. | n/a |
cacheTtl | Time to live in milliseconds for queries matching the cache pattern. | n/a |
For example, if you use the oracle translator in your virtual database, and you want to intercept calls that go through the translator, you could write a custom delegating translator, as in the following example:
@Translator(name="interceptor", description="interceptor") public class InterceptorExecutionFactory extends org.teiid.translator.BaseDelegatingExecutionFactory{ @Override public void getMetadata(MetadataFactory metadataFactory, C conn) throws TranslatorException { // do intercepting code here.. // If you want call the original delegate, do not call if do not need to. // but if you did not call the delegate fullfill the method contract super.getMetadata(metadataFactory, conn); // do more intercepting code here.. } }
You could then deploy this translator in the Data Virtualization engine. Then in your DDL file, define an interceptor translator as in the following example:
CREATE DATABASE myvdb VERSION '1'; USE DATABASE myvdb VERSION '1'; CREATE FOREIGN DATA WRAPPER "oracle-interceptor" TYPE interceptor OPTIONS (delegateName 'oracle'); CREATE SERVER source FOREIGN DATA WRAPPER "oracle-interceptor" OPTIONS ("resource-name" 'java:oracle-ds'); CREATE SCHEMA mymodel SERVER source; SET SCHEMA mymodel; IMPORT FROM SERVER source INTO mymodel;
We have defined a "translator" override called oracle-interceptor
, which is based on the custom translator "interceptor" from above, and supplied the translator it needs to delegate to "oracle" as its delegateName. Then, we used this override translator oracle-interceptor
in the VDB. Future calls going into this VDB model’s translator are intercepted by your code to do whatever you want to do.
9.3. File translator
The file translator, known by the type name file, exposes stored procedures to leverage file resources. The translator is typically used with the TEXTTABLE
or XMLTABLE
functions to consume CSV or XML formatted data.
Name | Description | Default |
---|---|---|
Encoding | The encoding that should be used for CLOBs returned by the getTextFiles procedure. The value should match an encoding known to Data Virtualization. For more information, see TO_CHARS and TO_BYTES in String functions. | The system default encoding. |
ExceptionIfFileNotFound | Throw an exception in getFiles or getTextFiles if the specified file/directory does not exist. | true |
For information about how to set properties, see the following example, and Override execution properties in Translators.
Example: Virtual datbase DDL override
CREATE SERVER "file-override" FOREIGN DATA WRAPPER file OPTIONS( Encoding 'ISO-8859-1', "ExceptionIfFileNotFound" false ); CREATE SCHEMA file SERVER "file-override";
getFiles
getFiles(String pathAndPattern) returns TABLE(file blob, filePath string, lastModified timestamp, created timestamp, size long)
Retrieve all files as BLOBs matching the given path and pattern.
call getFiles('path/*.ext')
If the path is a directory, then all files in the directory are returned. If the path matches a single file, the file is returned.
The *
character is treated as a wildcard to match any number of characters in the path name. Zero or matching files will be returned.
If *’ is not used, and if the path doesn’t exist and `ExceptionIfFileNotFound
is true, then an exception is raised.
getTextFiles
getTextFiles(String pathAndPattern) returns TABLE(file clob, filePath string, lastModified timestamp, created timestamp, size long)
The size reports the number of bytes.
Retrieve all files as CLOBs matching the given path and pattern.
call getTextFiles('path/*.ext')
Retrieves the same files getFiles
, but with the difference that the results are CLOB values that use the encoding execution property as the character set.
saveFile
Save the CLOB, BLOB, or XML value to given path
call saveFile('path', value)
deleteFile
Delete the file at the given path
call deleteFile('path')
The path should reference an existing file. If the file does not exist and ExceptionIfFileNotFound
is true, then an exception will be thrown. An exception is also thrown if the file cannot be deleted.
This feature is not applicable for the File translator.
This feature is not applicable for the File translator.
9.4. Google spreadsheet translator
The google-spreadsheet translator is used to connect to a Google Sheets spreadsheet.
The query approach expects that the data in the worksheet has the following characteristics:
- All columns that contains data can be queried.
- Any column with an empty cell has the value retrieved as null. However, differentiating between null string and empty string values may not always be possible as Google treats them interchangeably. Where possible, the translator may provide a warning or throw an exception if it cannot differentiate between null and empty strings.
- If the first row is present and contains string values, then the row is assumed to represent the column labels.
If you are using the default native metadata import, the metadata for your Google account (worksheets and information about columns in worksheets) is loaded upon translator start up. If you make any changes in data types, it is advisable to restart your virtual database.
The translator can submit queries against a single sheet only. It provides ordering, aggregation, basic predicates, and most of the functions available in the spreadsheet query language.
The google-spreadsheet translator does not provide importer settings, but it can provide metadata for VDBs.
If you remove all data rows from a sheet with a header that is defined in Data Virtualization, you can no longer access the sheet through Data Virtualization. The Google API will treat the header as a data row at that point, and queries to it will no longer be valid.
Non-string fields are updated using the canonical Data Virtualization SQL value. In cases where the spreadsheet is using a non-conforming locale, consider disallowing updates. For more information, see TEIID-4854 and the following information about the allTypesUpdatable
import property.
Importer properties
- allTypesUpdatable- Set to true to mark all columns as updatable. Set to false to enable update only on string or Boolean columns that are not affected by TEIID-4854. Defaults to true.
Native queries
Google spreadsheet source procedures may be created using the teiid_rel:native-query
extension. For more information, see Parameterizable native queries in Translators. The procedure will invoke the native-query similar to an native procedure call, with the benefits that the query is predetermined, and that result column types are known, rather than requiring the use of ARRAYTABLE or similar functionality. For more information, see the Select section that follows.
This feature is turned off by default, because of the security risk in permitting any command to execute against the data source. To enable this feature, set the property SupportsDirectQueryProcedure to true. For more information, see Override execution properties in Translators.
By default the name of the procedure that executes the queries directly is called native. You can change its name by overriding the execution property DirectQueryProcedureName. For more information, see Override execution properties in Translators.
The Google spreadsheet translator provides a procedure to execute any ad-hoc query directly against the source without any Data Virtualization parsing or resolving. Because the metadata of this procedure’s execution results are not known to Data Virtualization, they are returned as an object array. You can use ARRAYTABLE to construct tabular output for consumption by client applications. For more information, see ARRAYTABLE.
Data Virtualization exposes this procedure with a simple query structure as shown in the following example:
Select example
SELECT x.* FROM (call google_source.native('worksheet=People;query=SELECT A, B, C')) w, ARRAYTABLE(w.tuple COLUMNS "id" string , "type" string, "name" String) AS x
The first argument takes semicolon-separated (;) name-value pairs of the following properties to execute the procedure:
Property | Description | Required |
---|---|---|
worksheet | Google spreadsheet name. | yes |
query | Spreadsheet query. | yes |
limit | Number of rows to fetch. | no |
offset | Offset of rows to fetch from limit or beginning. | no |
9.5. JDBC translators
The JDBC translators bridge the SQL semantics and data type differences between Data Virtualization and a target RDBMS. Data Virtualization has a range of specific translators that target the most popular open source and proprietary relational databases.
Usage
Usage of a JDBC source is straight-forward. Using Data Virtualization SQL, the source can be queried as if the tables and procedures were local to the Data Virtualization system.
If you are using a relational data source, or a data source that has a JDBC driver, and you do not find a specific translator available for that data source type, then start with the JDBC ANSI translator. The JDBC ANSI translator should enable you to perform the SQL basics. If there specific data source capabilities that are not available, you can define a custom translator that does what you need. For more information, see Translator Development.
Type conventions
UID types including UUID, GUID, or UNIQUEIDENTIFIER are typically mapped to the Data Virtualization string type. JDBC data sources treat UID strings as non-case sensitive, but they are case-sensitive in Data Virtualization. If the source does not support the implicit conversion to the string type, then usage in functions that expect a string value might fail at the source.
The following table lists the execution properties that are shared by all JDBC translators.
Name | Description | Default |
---|---|---|
DatabaseTimeZone | The time zone of the database. Used when fetching date, time, or timestamp values. | The system default time zone |
DatabaseVersion | The specific database version. Used to further tune the use of pushdown operations. |
The base compatible version, or the version that is derived from the DatabaseMetadata.getDatabaseProductVersion string. Automatic detection requires a connection. If there are circumstances where you are getting an exception due to capabilities being unavailable (for example, because a connection is not available), then set |
TrimStrings |
| false |
RemovePushdownCharacters |
Set to a regular expression to remove characters that not allowed or undesirable for the source. For example | |
UseBindVariables |
| true |
UseCommentsInSourceQuery | This will embed a leading comment with session/request id in the source SQL for informational purposes. Can be customized with the CommentFormat property. | false |
CommentFormat |
MessageFormat string to be used if
|
|
MaxPreparedInsertBatchSize | The max size of a prepared insert batch. | 2048 |
StructRetrieval | Specify one of the following Struct retrieval modes:
| OBJECT |
EnableDependentJoins | Allow dependent join pushdown for sources that use temporary tables (DB2, Derby, H2, HSQL 2.0+, MySQL 5.0+, Oracle, PostgreSQL, SQLServer, SQP IQ, Sybase). | false |
Importer properties — Shared by all JDBC translators
When specifying the importer property, it must be prefixed with importer.
. Example: importer.tableTypes
Name | Description | Default |
---|---|---|
catalog | See DatabaseMetaData.getTables [1] | null |
schemaName | Recommended setting to import from a single schema. The schema name will be converted into an escaped pattern,overriding schemaPattern if it is also set. | null |
schemaPattern | See DatabaseMetaData.getTables [1] | null |
tableNamePattern | See DatabaseMetaData.getTables [1] | null |
procedureNamePattern | See DatabaseMetaData.getProcedures [1] | null |
tableTypes |
Comma separated list — without spaces — of imported table types. See | null |
excludeTables | A case-insensitive regular expression that when matched against a fully qualified table name [2] will exclude it from import. Applied after table names are retrieved. Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter. | null |
excludeProcedures | A case-insensitive regular expression that when matched against a fully qualified procedure name [2] will exclude it from import. Applied after procedure names are retrieved. Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter. | null |
importKeys |
NOTE: Foreign keys to tables that are not imported will be ignored. | true |
autoCreateUniqueConstraints |
| true |
importIndexes |
| false |
importApproximateIndexes |
WARNING: Setting to | true |
importProcedures |
| false |
importSequences |
| false |
sequenceNamePattern |
LIKE pattern string to use when importing sequences. Null or | null |
useFullSchemaName |
When | false (only change when importing from multiple foreign schema). |
useQualifiedName |
WARNING: When you set this option to | true (rarely needs changed) |
useCatalogName |
| true (rarely needs changed) |
widenUnsignedTypes |
| true |
useIntegralTypes |
| false |
quoteNameInSource |
| true |
useAnyIndexCardinality |
| false |
importStatistics |
| false |
importRowIdAsBinary |
| false |
importLargeAsLob |
| false |
[1] JavaDoc for DatabaseMetaData
[2] The fully qualified name for exclusion is based upon the settings of the translator and the particulars of the database. All of the applicable name parts used by the translator settings (see useQualifiedName
and useCatalogName
) including catalog, schema, table will be combined as catalogName.schemaName.tableName
with no quoting. For example, Oracle does not report a catalog, so the name used with default settings for comparison would be just schemaName.tableName
.
The default import settings will crawl all available metadata. This import process is time-consuming, and full metadata import is not needed in most situations. Most commonly you’ll want to limit the import by at least schemaName
or schemaPattern
and tableTypes
.
Example: Importer settings to import only tables and views from my-schema.
SET SCHEMA ora; IMPORT FOREIGN SCHEMA "my-schema" FROM SERVER ora INTO ora OPTIONS ("importer.tableTypes" 'TABLE,VIEW');
For more information about importer settings, see Virtual databases.
Native queries
Physical tables, functions, and procedures may optionally have native queries associated with them. No validation of the native query is performed, it is simply used in a straight-forward manner to generate the source SQL. For a physical table setting the teiid_rel:native-query
extension metadata will execute the native query as an inline view in the source query. This feature should only be used against sources that provide inline views. The native query is used as is and is not treated as a parameterized string. For example, on a physical table y
with nameInSource="x"`
and teiid_rel:native-query="select c from g"
, the Data Virtualization source query "SELECT c FROM y"
would generate the SQL query "SELECT c FROM (select c from g) as x"
. Note that the column names in the native query must match the nameInSource
of the physical table columns for the resulting SQL to be valid.
For physical procedures you may also set the teiid_rel:native-query
extension metadata to a desired query string with the added ability to positionally reference IN parameters . For more information, see Parameterizable native queries in Translators. The teiid_rel:non-prepared
extension metadata property can be set to false
to turn off parameter binding.
Be careful in setting this option, because inbound allows for SQL injection attacks if not properly validated. The native query does not need to call a stored procedure. Any SQL that returns a result set that positionally matches the result set that is expected by the physical stored procedure metadata will work. For example on a stored procedure x
with teiid_rel:native-query="select c from g where c1 = $1 and c2 = `$$1"'
, the Data Virtualization source query `"CALL x(?)"`
would generate the SQL query `"select c from g where c1 = ? and c2 = `$1"'`
. Note that ?
in this example will be replaced with the actual value bound to parameter 1.
Direct query procedure
This feature is turned off by default, because of the inherent security risk in allowing any command to be run against the source. To enable this feature, override the execution property called SupportsDirectQueryProcedure and set it to true
. For more information, see Override execution properties in Translators.
By default, the name of the procedure that executes the queries directly is native. To change the name, override the execution property DirectQueryProcedureName.
The JDBC translator provides a procedure to execute any ad-hoc SQL query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array. ARRAYTABLE can be used construct tabular output for consumption by client applications. For more information, see arraytable.
SELECT example
SELECT x.* FROM (call jdbc_source.native('select * from g1')) w, ARRAYTABLE(w.tuple COLUMNS "e1" integer , "e2" string) AS x
INSERT example
SELECT x.* FROM (call jdbc_source.native('insert into g1 (e1,e2) values (?, ?)', 112, 'foo')) w, ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x
UPDATE example
SELECT x.* FROM (call jdbc_source.native('update g1 set e2=? where e1 = ?','blah', 112)) w, ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x
DELETE example
SELECT x.* FROM (call jdbc_source.native('delete from g1 where e1 = ?', 112)) w, ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x
9.5.1. Actian Vector translator (actian-vector)
Also see common JDBC Translators information.
The Actian Vector translator, known by the type name actian-vector, is for use with Actian Vector in Hadoop.
Download the JDBC driver at http://esd.actian.com/platform. Note the port number in the connection URL is "AH7", which maps to 16967.
9.5.2. Apache Phoenix Translator (phoenix)
Also see common JDBC Translators information.
The Apache Phoenix translator, known by the type name phoenix, exposes querying functionality to HBase tables. Apache Phoenix is a JDBC SQL interface for HBase that is required for this translator as it pushes down commands into Phoenix SQL.
The translator is also known by the deprecated name hbase. The name change reflects that the translator is specific to Phoenix and that there could be other translators introduced in the future to connect to HBase.
Do not use the DatabaseTimezone
property with this translator.
The HBase translator cannot process Join commands. Phoenix uses the HBase Table Row ID as the Primary Key. This Translator is developed with Phoenix 4.3 or greater for HBase 0.98.1 or greater.
The translator implements INSERT
/UPDATE
through the Phoenix UPSERT
operation. This means you can see different behavior than with standard INSERT
/UPDATE
. For example, repeated inserts will not throw a duplicate key exception, but will instead update the row in question.
Due to Phoenix driver limitations, the importer does not look for unique constraints, and defaults to not importing foreign keys.
The translator can process SQL OFFSET
arguments and other features starting with Phoenix 4.8. The Phoenix driver hard codes the server version in PhoenixDatabaseMetaData
, and does not otherwise provide a way to detect the server version at runtime. If a newer driver is used with an older server, set the database version translator property manually.
The Phoenix driver does not have robust handling of time values. If your time values are normalized to use a date component of 1970-01-01, then the default handling will work correctly. If not, then the time column should be modeled as timestamp instead.
9.5.3. Cloudera Impala translator (impala)
Also see common JDBC Translators information.
The Cloudera Impala translator, known by the type name impala, is for use with Cloudera Impala 1.2.1 or later.
Impala has limited support for data types. It is does not have native support for time/date/xml or LOBs. These limitations are reflected in the translator capabilities. A Data Virtualization view can use these types, however the transformation would need to specify the necessary conversions. Note that in those situations, the evaluations will be done in the Data Virtualization engine.
Do not use the DatabaseTimeZone
translator property with the Impala translator.
Impala only supports EQUI join, so using any other joins types on its source tables will result in inefficient queries.
To write criteria based on partitioned columns, model them on the source table, but do not include them in selection columns.
Impala Hive importer does not have concept of catalog or source schema, nor does it import keys, procedures, indexes, etc.
Impala specific importer properties
- useDatabaseMetaData
- Set to true to use the normal import logic with the option to import index information disabled. Defaults to false.
If the value of useDatabaseMetaData
is false, the typical JDBC DatabaseMetaData
calls are not used, so not all of the common JDBC importer properties are applicable to Impala. You may still use excludeTables
, regardless.
Some versions of Impala require the use of a LIMIT
when performing an ORDER BY
. If no default is configured in Impala, an exception can occur when a Data Virtualization query with an ORDER BY
but no LIMIT
is issued. You should set an Impala-wide default, or configure the connection pool to use a new connection SQL string to issue a SET DEFAULT_ORDER_BY_LIMIT
statement. For more information about Impala limit options, such as how to control what happens when the limit is exceeded, see the Cloudera documentation.
If the Impala JDBC driver has problems processing PreparedStatements
or parsing statements in general, try disabling useBindVariables
. For more information, see https://issues.redhat.com/browse/TEIID-4610.
9.5.4. Db2 Translator (db2)
Also see common JDBC Translators information.
The Db2 translator, known by the type name db2, is for use with IBM Db2 V8 or later, or IBM Db2 for i V5.4 or later.
Db2 execution properties
- DB2ForI
-
Indicates that the the Db2 instance is Db2 for i. Defaults to
false
. - supportsCommonTableExpressions
-
Indicates that the Db2 instance supports common table expressions (CTEs). Defaults to
true
. Common table expression are not fully supported on some older versions of Db2, and on instances of Db2 that run in a conversion mode. If you encounter errors working with CTEs in these environments, set the CTE property tofalse
.
9.5.5. Derby translator (derby)
Also see common JDBC Translators information.
The Derby translator, known by the type name derby, is for use with Derby 10.1 or later.
9.5.6. Exasol translator (exasol)
Also see common JDBC Translators information.
The Exasol translator, known by the type name exasol, is for use with Exasol version 6 or later.
Usage
The Exasol database has the NULL HIGH default ordering, whereas the Data Virtualization engine works in the NULL LOW mode. As a result, depending on whether the ordering is pushed down to Exasol or done by the engine, you might observe NULLs at either the beginning or end of returned results. To enforce consistency, you can run Data Virtualization with org.teiid.pushdownDefaultNullOrder=true
to specify NULL LOW ordering. Enforcing NULL LOW ordering can result in decreased performance.
9.5.7. Greenplum Translator (greenplum)
Also see common JDBC Translators information.
The Greenplum translator, known by the type name greenplum, is for use with the Greenplum database. This translator is an extension of the PostgreSQL translator, and inherits its options.
9.5.8. H2 Translator (h2)
Also see common JDBC Translators information.
The H2 Translator, known by the type name h2, is for use with H2 version 1.1 or later.
9.5.9. Hive Translator (hive)
Also see common JDBC Translators information.
The Hive translator, known by the type name hive, is for use with Hive v.10 and SparkSQL v1.0 and later.
Capabilities
Hive is compatible with a limited set of data types. It does not have native support for time/XML or large objects (LOBs). These limitations are reflected in the translator capabilities. Although a Data Virtualization view can use these types, the transformation must specify the necessary conversions. Note that in those situations, evaluations are processed in Data Virtualization engine.
Do not use the DatabaseTimeZone
translator property with the Hive translator.
Hive only supports EQUI join, so using any other joins types on its source tables will result in inefficient queries.
To write criteria based on partitioned columns, model them on the source table, but do not include them in selection columns.
The Hive importer does not have the concept of catalog or source schema, nor does it import keys, procedures, indexes, and so forth.
Import properties
- trimColumnNames
-
For Hive 0.11.0 and later, the
DESCRIBE
command metadata is inappropriately returned with padding. Set this property totrue
to remove white space from column names. Defaults tofalse
. - useDatabaseMetaData
-
For Hive 0.13.0 and later, the normal JDBC
DatabaseMetaData
facilities are sufficient to perform an import. Set totrue
to use the normal import logic with the option to import index information disabled. Defaults tofalse
. When true,trimColumnNames
has no effect. If it is set to false, the typical JDBC DatabaseMetaData calls are not used, so not all of the common JDBC importer properties are applicable to Hive. You can still use excludeTables anyway.
"Database Name"
When the database name used in Hive differs from default
, the metadata retrieval and execution of queries does not work as expected in Data Virtualization. The Hive JDBC driver seems to be implicitly connecting (tested with < 0.12) to "default" database, thus ignoring the database name mentioned on connection URL. You can work around this issue if you configure your connection source to send the command use {database-name}
.
This is fixed in version 0.13 and later of the Hive JDBC driver. For more information, see https://issues.apache.org/jira/browse/HIVE-4256.
Limitations
Empty tables might report their description without datatype information. To work around this problem when importing, you can exclude empty tables, or use the useDatabaseMetaData
option.
9.5.10. HSQL Translator (hsql)
Also see common JDBC Translators information.
The HSQL Translator, known by the type name hsql, is for use with HSQLDB 1.7 or later.
9.5.11. Informix translator (informix)
Also see common JDBC Translators information.
The Informix translator, known by the type name informix, is for use with any Informix version.
Known issues
- TEIID-3808
-
The Informix driver’s handling of timezone information is inconsistent, even if the
databaseTimezone
translator property is set. Verify that the Informix server and the application server are in the same time zone.
9.5.12. Ingres translators (ingres / ingres93)
Also see common JDBC Translators information.
You can use one of the following Ingres translators, depending on your Ingres version:
- ingres
- The Ingres translator, known by the type name ingres, is for use with Ingres 2006 or later.
- ingres93
- The Ingres93 translator, known by the type name ingres93, is for use with Ingres 9.3 or later.
9.5.13. Intersystems Caché translator (intersystems-cache)
Also see common JDBC Translators information.
The Intersystem Caché translator, known by the type name intersystems-cache, is for use with Intersystems Caché Object database (relational aspects only).
9.5.14. JDBC ANSI translator (jdbc-ansi)
Also see common JDBC Translators information.
The JDBC ANSI translator, known by the type name jdbc-ansi, works with most of the SQL constructs used in Data Virtualization, except for row LIMIT/OFFSET and EXCEPT/INTERSECT. It translates source SQL into ANSI compliant syntax. This translator should be used when another more specific type is not available. If source exceptions arise due to the use of incompatible SQL constructs, then consider using the JDBC simple translator to further restrict capabilities, or create a custom translator. For more information, see the Custom Translator documentation in the Teiid community.
9.5.15. JDBC simple translator (jdbc-simple)
Also see common JDBC Translators information.
The JDBC Simple translator, known by the type name jdbc-simple, is the same as the jdbc-ansi-translator, except that, to provide maximum compatibility, it does not handle most pushdown constructs.
9.5.16. Microsoft Access translators
Also see common JDBC Translators information.
- access
The Microsoft Access translator known by the type name access is for use with Microsoft Access 2003 or later via the JDBC-ODBC bridge.
If you are using the default native metadata import, or the Data Virtualization connection importer, the importer defaults to
importKeys=false
andexcludeTables=.[.]MSys.
to avoid issues with the metadata provided by the JDBC ODBC bridge. You might need to adjust these values if you use a different JDBC driver.- ucanaccess
- The Microsoft Access translator known by the type name ucanaccess is for use with Microsoft Access 2003 or later via the UCanAccess driver.
9.5.17. Microsoft SQL Server translator (sqlserver)
Also see common JDBC translators information.
The Microsoft SQL Server translator, known by the type name sqlserver, is for use with SQL Server 2000 or later. A SQL Server JDBC driver version 2.0 or later (or a compatible driver such as, JTDS 1.2 or later) should be used. The SQL Server DatabaseVersion
property can be set to 2000
, 2005
, 2008
, or 2012
, but otherwise expects a standard version number, for example, 10.0
.
Sequences
With Data Virtualization 8.5+, sequence operations may be modeled as source functions.
With Data Virtualization 10.0+, sequences may be imported automatically import properties.
Example: Sequence native query
CREATE FOREIGN FUNCTION seq_nextval () returns integer OPTIONS ("teiid_rel:native-query" 'NEXT VALUE FOR seq');
Execution properties
SQL Server specific execution properties:
- JtdsDriver
- Specifies that use of the open source JTDS driver. Defaults to false.
9.5.18. MySQL translator (mysql/mysql5)
Also see common JDBC translators information.
You can use the following translators with MySQL and MariaDB:
- mysql
- The MySQL translator, known by the type name mysql, is for use with MySQL version 4.x.
- mysql5
- The MySQL5 translator, known by the type name mysql5, is for use with MySQL version 5 or later. The translator also works with other compatible MySQL derivatives, such as MariaDB.
Usage
The MySQL translators expect the database or session to be using ANSI mode. If the database is not using ANSI mode, you can set ANSI mode on the pool by submitting the following initialization query:
set SESSION sql_mode = 'ANSI'
When data includes null timestamp values, Data Virtualization generates the following conversion error: 0000-00-00 00:00:00 cannot be converted to a timestamp
. To avoid error, if you expect data with null timestamp values, set the connection property zeroDateTimeBehavior=convertToNull
.
If you must retrieve large result sets, consider setting the connection property useCursorFetch=true
. Otherwise MySQL will fully fetch result sets into memory on the Data Virtualization instance.
MySQL reports TINYINT(1) columns as a JDBC BIT type - however the value range is not actually restricted and may cause issues if for example you are relying on -1 being recognized as a true value. If not using the native importer, change the BOOLEAN columns in the affected source to have a native type of "TINYINT(1)" rather than BIT so that the translator can appropriately handle the Boolean conversion.
9.5.19. Netezza translator (netezza)
Also see common JDBC translators information.
The Netezza translator, known by the type name netezza, is for use with any version of the IBM Netezza appliance.
Usage
The current vendor-supplied JDBC driver for Netezza performs poorly with single transactional updates. It is best to perform batched updates whenever possible.
Execution properties
Netezza-specific execution properties:
- SqlExtensionsInstalled
-
Indicates that SQL extensions, including the ability to process Netezza
REGEXP_LIKE
functions, are installed. All other REGEXP functions are then available as pushdown functions. Defaults tofalse
.
9.5.20. Oracle translator (oracle)
Also see common JDBC translators information.
The Oracle translator, known by the type name oracle, is for use with Oracle Database 9i or later.
The Oracle-provided JDBC driver uses large amounts of memory. Because the driver caches a high volume of data in the buffer, problems can occur on computers that lack sufficient memory allocation.
For more information, see the following resources:
Importer properties
- useGeometryType
Use the Data Virtualization Geometry type when importing columns with a source type of SDO_GEOMETRY. Defaults to false.
NoteMetadata import from Oracle may be slow. It is recommended that at least a schema name filter is specified. There is also the
useFetchSizeWithLongColumn=true
connection property that can increase the fetch size for metadata queries. It significantly improves the metadata load process, especially when there are a large number of tables in a schema.
Execution properties
- OracleSuppliedDriver
-
Indicates that the Oracle supplied driver (typically prefixed by ojdbc) is being used. Defaults to true. Set to
false
when using DataDirect or other Oracle JDBC drivers.
Oracle-specific metadata
- Sequences
You can use sequences with the Oracle translator. You can model a sequence as a table with a name in source of DUAL, and setting column names in the source set to
<sequence name>.[nextval|currval]
With Data Virtualization 10.0+, you can import sequences automatically.
For more information, see Importer properties in JDBC translators. Data Virtualization 8.4 and Prior Oracle Sequence DDL
CREATE FOREIGN TABLE seq (nextval integer OPTIONS (NAMEINSOURCE 'seq.nextval'), currval integer options (NAMEINSOURCE 'seq.currval') ) OPTIONS (NAMEINSOURCE 'DUAL')
With Data Virtualization 8.5 it’s no longer necessary to rely on a table representation and Oracle-specific handling for sequences.
For information about representing currval
and nextval
as source functions, see DDL metadata for schema objects
8.5 Example: Sequence native query
CREATE FOREIGN FUNCTION seq_nextval () returns integer OPTIONS ("teiid_rel:native-query" 'seq.nextval');
You can also use a sequence as the default value for insert columns by setting the column to autoincrement, and setting the name in source to <element name>:SEQUENCE=<sequence name>.<sequence value>
.
Rownum
A rownum
column can also be added to any Oracle physical table to enable use of the rownum pseudo-column. A rownum column should have a name in source of rownum
. These rownum columns do not have the same semantics as the Oracle rownum construct so care must be taken in their usage.
Out parameter result set
Out parameters for procedures may also be used to return a result set, if this is not represented correctly by the automatic import you need to manually create a result set and represent the output parameter with native type REF CURSOR
.
DDL for out parameter result set
create foreign procedure proc (in x integer, out y object options (native_type 'REF CURSOR')) returns table (a integer, b string)
Geospatial functions
You can use the following geospatial functions with the translator for Oracle:
- Relate = sdo_relate
CREATE FOREIGN FUNCTION sdo_relate (arg1 string, arg2 string, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_relate (arg1 Object, arg2 Object, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_relate (arg1 string, arg2 Object, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_relate (arg1 Object, arg2 string, arg3 string) RETURNS string;
- Nearest_Neighbor = sdo_nn
CREATE FOREIGN FUNCTION sdo_nn (arg1 string, arg2 Object, arg3 string, arg4 integer) RETURNS string; CREATE FOREIGN FUNCTION sdo_nn (arg1 Object, arg2 Object, arg3 string, arg4 integer) RETURNS string; CREATE FOREIGN FUNCTION sdo_nn (arg1 Object, arg2 string, arg3 string, arg4 integer) RETURNS string;
- Within_Distance = sdo_within_distance
CREATE FOREIGN FUNCTION sdo_within_distance (arg1 Object, arg2 Object, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_within_distance (arg1 string, arg2 Object, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_within_distance (arg1 Object, arg2 string, arg3 string) RETURNS string;
- Nearest_Neigher_Distance = sdo_nn_distance
CREATE FOREIGN FUNCTION sdo_nn_distance (arg integer) RETURNS integer;
- Filter = sdo_filter
CREATE FOREIGN FUNCTION sdo_filter (arg1 Object, arg2 string, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_filter (arg1 Object, arg2 Object, arg3 string) RETURNS string; CREATE FOREIGN FUNCTION sdo_filter (arg1 string, arg2 object, arg3 string) RETURNS string;
Pushdown functions
Depending on the Oracle version, the Oracle translator, registers the following non-geospatial pushdown functions with the engine:
- TRUNC
- Both numeric and timestamp versions.
- LISTAGG
- Requires the Data Virtualization SQL syntax "LISTAGG(arg [, delim] ORDER BY …)"
SQLXML
If you need to retrieve SQLXML values from Oracle and are getting oracle.xdb.XMLType or OPAQUE instances instead, you make the following changes:
- Use client driver version 11, or later.
-
Place the
xdb.jar
andxmlparserv2.jar
files in the classpath. Set the system property
oracle.jdbc.getObjectReturnsXMLType="false"
.For more information, see the Oracle documentation.
9.5.21. PostgreSQL translator (postgresql)
Also see common JDBC translators information.
The PostgreSQL translator, known by the type name postgresql, is for use with the following PostgreSQL client and server versions: * Client — 8.0 or later * Server — 7.1 or later.
Execution properties
PostgreSQL-specific execution properties:
- PostGisVersion
- Indicates the PostGIS version in use. Defaults to 0, which means that PostGIS is not installed. Will be set automatically if the database version is not set.
- ProjSupported
- Boolean that indicates if the PostGis version supports PROJ coordinate transformation software. Will be set automatically if the database version is not set.
Some driver versions of PostgreSQL will not associate columns to "INDEX" type tables. The current version of Data Virtualization omits such tables automatically.
Older versions of Data Virtualization may need the importer.tableType property or other filtering set.
9.5.22. PrestoDB translator (prestodb)
Also see common JDBC translators information.
The PrestoDB translator, known by the type name prestodb, exposes querying functionality to Presto data sources. In data integration respect, PrestoDB has capabilities that are similar to Data Virtualization, however it goes beyond in terms of distributed query execution with multiple worker nodes. Data Virtualization’s execution model is limited to single execution node and focuses more on pushing the query down to sources. Data Virtualization provides more complete querying capabilities and many enterprise features.
Capabilities
You can use the PrestoDB translator only with SELECT
statements. The translator provides a restricted set of capabilities.
Because PrestoDB exposes a relational model, Data Virtualization can use it as it does other RDBMS sources, such as Oracle, Db2, and so forth. For information about configuring PrestoDB, see the Presto documentation.
In SQL JOIN operations, PrestoDB does not support multiple ORDER BY
columns well. If errors occur during JOIN operations that involve more than one ORDER BY
column, set the translator property supportsOrderBy
to disable the use of the ORDER BY
clause.
Some versions of Presto generate errors when you include null values in subqueries.
PrestoDB does not support transactions. To overcome issues caused by this limitation, define the data source as non-transactional.
By default, every catalog in PrestoDB has an information_schema
. If you have to configure multiple catalogs, duplicate table errors can cause deployment of a virtual database to fail. To prevent duplicate table errors, use import options to filter the schemas.
If you want to configure multiple Presto catalogs, set one of the following import options to filter the schemas and tables in the source:
-
Set
catalog
to a specific catalog name to match the name of the source catalog in Presto. -
Set
schemaName
to a regular expression to filter schemas by matching result. -
Set
excludeTables
to a regular expression to filter tables by matching results.
9.5.23. Redshift translator (redshift)
Also see common JDBC translators information.
The Redshift translator, known by the type name redshift, is for use with the Amazon Redshift database. This translator is an extension of the PostgreSQL translator and inherits its options.
9.5.24. SAP HANA translator (hana)
Also see common JDBC translators information.
The SAP HANA translator, known by the name of hana, is for use with SAP HANA.
Known issues
- TEIID-3805
- The pushdown of the SUBSTRING function is inconsistent with the Data Virtualization SUBSTRING function when the FROM index exceeds the length of the string. SAP HANA will return an empty string, while Data Virtualization produces a null value.
9.5.25. SAP IQ translator (sap-iq)
Also see common JDBC translators information.
The SAP IQ translator, known by the type name sap-iq, is for use with SAP IQ version 15.1 or later. The translator name sybaseiq has been deprecated.
9.5.26. Sybase translator (sybase)
Also see common JDBC Translators information.
The Sybase translator, known by the type name sybase, is for use with SAP ASE (Adaptive Server Enterprise), formerly known as Sybase SQL Server, version 12.5 or later.
If you use the default native import, you can avoid exceptions during the retrieval of system table information, if you specify import properties. If errors occur when retrieving table information, specify a schemaName
or schemaPattern
, or use excludeTables
to exclude system tables. For more information about using import properties, see Importer properties in JDBC translators.
If the name in the source metadata contains quoted identifiers (such as reserved words, or words that contain characters that would not otherwise be allowed), and you are using a jConnect Sybase driver, you must first configure the connection pool to enable quoted_identifier:
Example: Driver URL with SQLINITSTRING
jdbc:sybase:Tds:host.at.some.domain:5000/db_name?SQLINITSTRING=set quoted_identifier on
If you are using a jConnect Sybase driver and will target the source for dependent joins, set the JCONNECT_VERSION
to 6
or later to increase the number of values that the translator can send. If you do not set the JCONNECT_VERSION
, an exception occurs with statements that have more than 481 bind values.
Example: Driver URL with JCONNECT_VERSION
jdbc:sybase:Tds:host.at.some.domain:5000/db_name?SQLINITSTRING=set quoted_identifier on&JCONNECT_VERSION=6
Execution properties specific to Sybase
- JtdsDriver_
- Indicates that the open source JTDS driver is being used. Defaults to false.
9.5.27. Data Virtualization translator (teiid)
Also see common JDBC translators information.
Use the Teiid translator, known by the type name teiid, when creating a virtual database from a Teiid data source.
9.5.28. Teradata translator (teradata)
Also see common JDBC translators information.
The Teradata translator, known by the type name teradata, is for use with Teradata Database V2R5.1 or later.
By default, Teradata driver version 15, adjusts date, time, and timestamp values to match the Data Virtualization server timezone. To remove this adjustment, set the translator DatabaseTimezone
property to GMT or whatever the Teradata server defaults to.
9.5.29. Vertica translator (vertica)
Also see common JDBC translators information.
The Vertica translator, known by the type name vertica, is for use with Vertica 6 or later.
9.6. Loopback translator
The Loopback translator, known by the type name loopback, provides a quick testing solution. It works with all SQL constructs and returns default results, with some configurable behavior.
Name | Description | Default |
---|---|---|
ThrowError |
| false |
RowCount | Rows returned for non-update queries. | 1 |
WaitTime | Wait randomly up to this number of milliseconds with each source query. | 0 |
PollIntervalInMilli |
If positive, results will be | -1 |
DelegateName | Set to the name of the translator which is to be mimicked. | na |
You can also use the Loopback translator to mimic how a real source query would be formed for a given translator (although loopback will still return dummy data that might not be useful for your situation). To enable this behavior, set the DelegateName
property to the name of the translator that you want to mimic. For example, to disable all capabilities, set the DelegateName
property to jdbc-simple
.
9.7. Microsoft Excel translator
The Microsoft Excel Translator, known by the type name excel, exposes querying functionality to a Microsoft Excel document. This translator provides an easy way read a Excel spreadsheet and provide the contents of the spreadsheet in a tabular form that can be integrated with other sources in Data Virtualization.
This translator works on all platforms, including Windows and Linux. The translator uses Apache POI libraries to access the Excel documents which are platform independent.
Translation mapping
The following table describes how Excel translator interprets the data in Excel document into relational terms.
Excel Term | Relational term |
---|---|
Workbook | schema |
Sheet | Table |
Row | Row of data |
Cell | Column Definition or Data of a column |
The Excel translator provides a "source metadata" feature, where for a given Excel workbook, it can introspect and build the schema based on the worksheets that are defined within it. There are options available to detect header columns and data columns in a worksheet to define the correct metadata of a table.
DDL example
The following example shows how to expose an Excel spreadsheet in a virtual database.
CREATE DATABASE excelvdb; USE DATABASE excelvdb; CREATE SERVER connector FOREIGN DATA WRAPPER excel OPTIONS ("resource-name" 'java:/fileDS'); CREATE SCHEMA excel SERVER connector; SET SCHEMA excel; IMPORT FROM SERVER connector INTO excel OPTIONS ( "importer.headerRowNumber" '1', "importer.ExcelFileName" 'names.xls');
Headers in document
If the Excel document contains headers, you can guide the import process to select the cell headers as the column names in the table creation process. For information about defining import properties, see the following table, and also see Importer Properties in JDBC translators.
Import properties
Import properties guide the schema generation part during the deployment of the VDB. This can be used in a native import.
Property Name | Description | Default |
---|---|---|
importer.excelFileName | Defines the name of the Excel Document to import metadata. This can be defined as a file pattern (*.xls), however when defined as pattern all files must be of same format, and the translator will choose an arbitrary file to import metadata from. Use file patterns to read data from multiple Excel documents in the same directory. In the case of a single file, specify the absolute name. | Required |
importer.headerRowNumber | Defines the cell header information to be used as column names. | Optional. Default is first data row of sheet |
importer.dataRowNumber | Defines the row number where the data rows start. | Optional. Default is first data row of sheet. |
To enable information in the Excel spreadsheet to be interpreted correctly, it is best to define all the preceding importer properties.
Purely numerical cells in a column contain containing mixed types will have a string form matching their decimal representation, thus integral values will have .0
appended. If you need the exact text representation, then the cell must be a string value. You can force a string value by preceding the numeric text of a cell with a single quote ('
), or a single space.
Translator extension properties
Excel specific execution properties
- FormatStrings
- Format non-string cell values in a string column according to the worksheet format. Defaults to false.
Metadata extension properties
Properties that are defined on schema artifacts, such as Table, Column, Procedure and so forth. These properties describe how the translator interacts with or interprets source systems. All the properties are defined with the following namespace:
"http://www.teiid.org/translator/excel/2014[http://www.teiid.org/translator/excel/2014\]"
, which also has a recognized aliasteiid_excel
.
Property Name | Schema item property belongs to | Description | Mandatory |
---|---|---|---|
FILE | Table | Defines Excel Document name or name pattern (*.xls). File pattern can be used to read data from multiple files. | Yes |
FIRST_DATA_ROW_NUMBER | Table | Defines the row number where records start in the sheet (applies to every sheet). | Optional |
CELL_NUMBER | Column of Table | Defines cell number to use for reading data of particular column. | Yes |
The following example shows a table that is defined by using the extension metadata properties.
CREATE DATABASE excelvdb; USE DATABASE excelvdb; CREATE SERVER connector FOREIGN DATA WRAPPER excel OPTIONS ("resource-name" 'java:/fileDS'); CREATE SCHEMA excel SERVER connector; SET SCHEMA excel; CREATE FOREIGN TABLE Person ( ROW_ID integer OPTIONS (SEARCHABLE 'All_Except_Like', "teiid_excel:CELL_NUMBER" 'ROW_ID'), FirstName string OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '1'), LastName string OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '2'), Age integer OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '3'), CONSTRAINT PK0 PRIMARY KEY(ROW_ID) ) OPTIONS ("NAMEINSOURCE" 'Sheet1',"teiid_excel:FILE" 'names.xlsx', "teiid_excel:FIRST_DATA_ROW_NUMBER" '2')
Extended capabilities using ROW_ID column
If you define a column that has extension metadata property CELL_NUMBER
with value ROW_ID
, then that column value contains the row information from Excel document. You can mark this column as Primary Key. You can use this column in SELECT
statements with a restrictive set of capabilities including: comparison predicates, IN
predicates and LIMIT
. All other columns cannot be used as predicates in a query.
Importing source metadata is not the only way to create the schema of an Excel document. You can also create a source table manually, and then add the extension properties that you need to create a fully functional model. Metadata imports result in schema models similar to the one in the preceding example.
The Excel translator processes updates with the following limitations:
-
The
ROW_ID
cannot be directly modified or used as an insert value. - UPDATE and INSERT values must be literals.
- UPDATEs are not transactional. That is, the write lock is held while the file is written, but not throughout the entire update. As a result, it is possible for one update to overwrite another.
The ROW_ID
of an inserted row can be returned as a generated key.
This feature is not applicable for the Excel translator.
This feature is not applicable for the Excel translator.
9.8. MongoDB Translator
The MongoDB translator, known by the type name mongodb, provides a relational view of data that resides in a MongoDB database. This translator is capable of converting Data Virtualization SQL queries into MongoDB based queries. It provides for a full range of SELECT, INSERT, UPDATE and DELETE calls.
MongoDB is a document based "schema-less" database with it own query language. It does not map perfectly with relational concepts or the SQL query language. More and more systems are using NOSQL stores such as MongoDB to improve scalability and performance. For example, applications like storing audit logs, or managing web site data, are well-suited to MongoDB, and do not require the structure of relational databases. MongoDB uses JSON documents as its primary storage unit, and those documents can have additional embedded documents inside the parent document. By using embedded documents, MongoDB co-locates related information to achieve de-normalization that typically requires either duplicate data or joins to achieve querying in a relational database.
For MongoDB to work with Data Virtualization the challenge for the MongoDB translator is to design a MongoDB store that can achieve the balance between relational and document based storage. The advantages of "schema-less" design are great at development time. But "schema-less" design can pose problems during migration between application versions, and when querying data, and making effective use of the returned information.
Since it is hard and may be impossible in certain situations to derive a schema based on existing the MongoDB collection(s), Data Virtualization approaches the problem in reverse compared to other translators. When working with MongoDB, Data Virtualization requires you to define the MongoDB schema upfront, by using Data Virtualization metadata. Because Data Virtualization only allows relational schema as its metadata, you must define your MongoDB schema in relational terms, using tables, procedures, and functions. For the purposes of MongoDB, the Data Virtualization metadata has been extended to provide extension properties that can be defined on a table to convert it into a MongoDB based document. These extension properties let you define how a MongoDB document is structured and stored. Based on the relationships (primary-key, foreign-key) that are defined on a table, and their cardinality (ONE-to-ONE, ONE-to-MANY, MANY-to-ONE), relations between tables are mapped such that related information can be embedded along with the parent document for co-location (as mentioned earlier in this topic). Thus, a relational schema-based design, but document-based storage in MongoDB.
Who is the primary audience for the MongoDB translator?
The above may not satisfy every user’s needs. The document structure in MongoDB can be more complex than what Data Virtualization can currently define. We hope this will eventually catch up in future versions of Data Virtualization. This is currently designed for:
- Users who are using relational databases and would like to move/migrate their data to MongoDB to take advantage of scaling and performance without modifying end user applications that they currently run.
- Users who are seasoned SQL developers, but do not have experience with MongoDB. This provides a low barrier of entry compared to using MongoDB directly as an application developer.
- Users who want to integrate MongoDB-based data with data from other enterprise data sources.
Usage
The name of the translator to use in a virtual database DDL is "mongodb". For example:
CREATE DATABASE nothwind; USE DATABASE nothwind; CREATE SERVER local FOREIGN DATA WRAPPER mongodb OPTIONS ("resource-name" 'java:/mongoDS'); CREATE SCHEMA northwind SERVER local; SET SCHEMA northwind; IMPORT FROM SERVER local INTO northwind;
The MongoDB translator can derive the metadata based on existing document collections in some scenarios. However, when working with complex documents the interpretation of metadata can be inaccurate. In such cases, you must define the metadata. For example, you can define a schema using DDL, as shown in the following example:
<vdb name="nothwind" version="1"> <model name="northwind"> <source name="local" translator-name="mongodb" connection-jndi-name="java:/mongoDS"/> <metadata type="DDL"><![CDATA[ CREATE FOREIGN TABLE Customer ( customer_id integer, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); ]]> </metadata> </model> <vdb>
When the following INSERT operation is executed against a table using Data Virtualization, the MongoDB translator creates a document in the MongoDB:
INSERT INTO Customer(customer_id, FirstName, LastName) VALUES (1, 'John', 'Doe');
{ _id: ObjectID("509a8fb2f3f4948bd2f983a0"), customer_id: 1, FirstName: "John", LastName: "Doe" }
If a PRIMARY KEY is defined on the table, then that column name is automatically used as "_id"
field in the MongoDB collection, and then the document structure is stored in the MongoDB, as shown in the following examples:
CREATE FOREIGN TABLE Customer ( customer_id integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE');
{ _id: 1, FirstName: "John", LastName: "Doe" }
If you defined the composite PRIMARY KEY on Customer table, the document structure that results is shown in the following example:
CREATE FOREIGN TABLE Customer ( customer_id integer, FirstName varchar(25), LastName varchar(25), PRIMARY KEY (FirstName, LastName) ) OPTIONS(UPDATABLE 'TRUE');
{ _id: { FirstName: "John", LastName: "Doe" }, customer_id: 1, }
Data types
The MongoDB translator provides automatic mapping of Data Virtualization data types into MongoDB data types, including BLOBS, CLOBS and XML. The LOB mapping is based on GridFS in MongoDB. Arrays are in the following form:
{ _id: 1, FirstName: "John", LastName: "Doe" Score: [89, "ninety", 91.0] }
Users can get individual items in the array using the function array_get
, or can transform the array into tabular structure using ARRAYTABLE.
Note that even though embedded documents can also be in arrays, the handling of embedded documents is different from array with scalar values.
The translator does not work with regular Expressions, MongoDB::Code, MongoDB::MinKey, MongoDB::MaxKey, and MongoDB::OID.
In documents that contain values of mixed types for the same key, you must mark the column as unsearchable, or MongoDB will not correctly match predicates against the column. A key is used as a mixed type of it is represented as a string value in one document, and an integer in another. For more information, see the importer.sampleSize property
in the following table.
Importer Properties
Importer properties define the behavior of the translator during the metadata import from the physical source.
Importer Properties
Name | Description | Default |
---|---|---|
excludeTables | Regular expression to exclude the tables from import. | null |
includeTables | Regular expression to include the tables from import. | null |
sampleSize | Number of documents to sample to determine the structure. If documents have different fields, or fields with different types, this should be greater than 1. | 1 |
fullEmbeddedNames | Whether to prefix embedded table names with their parents, e.g. parent_embedded. If false the name of the table will just be the name of the field - which may lead to conflicts with existing tables or other embedded tables. | false |
MongoDB metadata extension properties for building complex documents
Using the preceding DDL, or any other metadata facility, you can map a table in a relational store into a document in MongoDB. However, to make effective use of MongoDB, you must be able to build complex documents that can co-locate related information, so that data can queried in a single MongoDB query. Unlike a relational database, you cannot run join operations in MongoDB. As as a result, unless you can build complex documents, you would have to issue multiple queries to retrieve data and then join it manually. The power of MongoDB comes from its "embedded" documents, its support for complex data types, such as arrays, and its use of an aggregation framework to query them. This translator provides a way to achieve the goals.
When you do not define the complex embedded documents in MongoDB, Data Virtualization can step in for join processing and provide that functionality. However, if you want to make use of the power of MongoDB itself in querying the data and avoid bringing the unnecessary data and improve performance, you need to look into building these complex documents.
MongoDB translator defines two additional metadata properties along with other Teiid metadata properties to aid in building the complex "embedded" documents. For more information about Data Virtualization schema metadata, see Section 2.2, “DDL metadata for schema objects”. You can use the following metadata properties in your DDL:
- teiid_mongo:EMBEDDABLE
- Means that data defined in this table is allowed to be included as an "embeddable" document in any parent document. The parent document is referenced by the foreign key relationships. In this scenario, Data Virtualization maintains more than one copy of the data in MongoDB store, one in its own collection, and also a copy in each of the parent tables that have relationship to this table. You can even nest embeddable table inside another embeddable table with some limitations. Use this property on table, where table can exist, encompass all its relations on its own. For example, a "Category" table that defines a "Product"’s category is independent of Product, which can be embeddable in "Products" table.
- teiid_mongo:MERGE
- Means that data of this table is merged with the defined parent table. There is only a single copy of the data that is embedded in the parent document. Parent document is defined using the foreign key relationships.
Using the above properties and FOREIGN KEY relationships, we will illustrate how to build complex documents in MongoDB.
A given table can contain either the teiid_mongo:EMBEDDABLE
property or the teiid_mongo:MERGE
property defining the type of nesting in MongoDB. You cannot use both properties within one table.
ONE-2-ONE Mapping
If your current DDL structure representing ONE-2-ONE relationship is like
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Address ( CustomerId integer, Street varchar(50), City varchar(25), State varchar(25), Zipcode varchar(6), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE');
By default, this will produce two different collections in MongoDB, like with sample data it will look like
Customer { _id: 1, FirstName: "John", LastName: "Doe" } Address { _id: ObjectID("..."), CustomerId: 1, Street: "123 Lane" City: "New York", State: "NY" Zipcode: "12345" }
You can enhance the storage in MongoDB to a single collection by using teiid_mongo:MERGE
extension property on the table’s OPTIONS clause.
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Address ( CustomerId integer PRIMARY KEY, Street varchar(50), City varchar(25), State varchar(25), Zipcode varchar(6), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Customer');
this will produce single collection in MongoDB, like
Customer { _id: 1, FirstName: "John", LastName: "Doe", Address: { Street: "123 Lane", City: "New York", State: "NY", Zipcode: "12345" } }
With the above both tables are merged into a single collection that can be queried together using the JOIN clause in the SQL command. Since the existence of child/additional record has no meaning with out parent table using the "teiid_mongo:MERGE" extension property is right choice in this situation.
The Foreign Key defined on a child table must refer to Primary Keys on both the parent and child tables to form a One-2-One relationship.
ONE-2-MANY Mapping.
Typically there can be more than two (2) tables involved in this relationship. If MANY side is only associated single table, then use teiid_mongo:MERGE
property on MANY side of table and define ONE as the parent. If associated with more than single table then use teiid_mongo:EMBEDDABLE
.
For example, if you define a virtual database as in the following DDL:
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE');
then a Single Customer can have MANY Orders. There are two options to define the how we store the MongoDB document. If in your schema, the Customer table’s CustomerId is only referenced in Order table (i.e. Customer information used for only Order purposes), you can use
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Customer');
that will produce a single document for Customer table like
{ _id: 1, FirstName: "John", LastName: "Doe", Order: [ { _id: 100, OrderDate: ISODate("2000-01-01T06:00:00Z") Status: 2 }, { _id: 101, OrderDate: ISODate("2001-03-06T06:00:00Z") Status: 5 } ... ] }
If Customer table is referenced in more tables other than Order table, then use "teiid_mongo:EMBEDDABLE" property
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:EMBEDDABLE" 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Comments ( CommentID integer PRIMARY KEY, CustomerId integer, Comment varchar(140), FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) ) OPTIONS(UPDATABLE 'TRUE');
This creates three different collections in MongoDB.
Customer { _id: 1, FirstName: "John", LastName: "Doe" } Order { _id: 100, CustomerId: 1, OrderDate: ISODate("2000-01-01T06:00:00Z") Status: 2 Customer: { FirstName: "John", LastName: "Doe" } } Comment { _id: 12, CustomerId: 1, Comment: "This works!!!" Customer: { FirstName: "John", LastName: "Doe" } }
Here as you can see the Customer table contents are embedded along with other table’s data where they were referenced. This creates duplicated data where multiple of these embedded documents are managed automatically in the MongoDB translator.
All the SELECT, INSERT, DELETE operations that are generated against the tables with "teiid_mongo:EMBEDDABLE" property are atomic, except for UPDATES, as there can be multiple operations involved to update all the copies. Since there are no transactions in MongoDB, Data Virtualization plans to provide automatic compensating transaction framework around this in future releases TEIID-2957.
MANY-2-ONE Mapping.
This is same as ONE-2-MANY, see above to define relationships.
A parent table can have multiple "embedded" and as well as "merge" documents inside it, it not limited so either one or other. However, please note that MongoDB imposes document size is limited can not exceed 16MB.
MANY-2-MANY Mapping.
This can also mapped with combination of "teiid_mongo:MERGE" and "teiid_mongo:EMBEDDABLE" properties (partially). For example if DDL looks like
CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, OrderDate date, Status integer ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE OrderDetail ( OrderID integer, ProductID integer, PRIMARY KEY (OrderID,ProductID), FOREIGN KEY (OrderID) REFERENCES Order (OrderID), FOREIGN KEY (ProductID) REFERENCES Product (ProductID) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Products ( ProductID integer PRIMARY KEY, ProductName varchar(40) ) OPTIONS(UPDATABLE 'TRUE');
you modify the DDL like below, to have
CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, OrderDate date, Status integer ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE OrderDetail ( OrderID integer, ProductID integer, PRIMARY KEY (OrderID,ProductID), FOREIGN KEY (OrderID) REFERENCES Order (OrderID), FOREIGN KEY (ProductID) REFERENCES Product (ProductID) ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Order'); CREATE FOREIGN TABLE Products ( ProductID integer PRIMARY KEY, ProductName varchar(40) ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:EMBEDDABLE" 'TRUE');
That will produce a document like
{ _id : 10248, OrderDate : ISODate("1996-07-04T05:00:00Z"), Status : 5 OrderDetails : [ { _id : { OrderID : 10248, ProductID : 11 Products : { ProductID: 11 ProductName: "Hammer" } } }, { _id : { OrderID : 10248, ProductID : 14 Products : { ProductID: 14 ProductName: "Screw Driver" } } } ] } Products { { ProductID: 11 ProductName: "Hammer" } { ProductID: 14 ProductName: "Screw Driver" } }
Limitations
- Nested embedding of documents is limited due to capabilities of handling nested arrays is limited in the MongoDB. Nesting of "EMBEDDABLE" property with multiple levels is OK, however more than two levels with MERGE is not recommended. Also, you need to be caution about not exceeding the document size of 16 MB for single row, so deep nesting is not recommended.
- JOINS between related tables, MUST use either the "EMBEDDABLE" or "MERGE" properties, otherwise the query will result in error. In order for Data Virtualization to correctly plan and work with JOINS, in the case that any two tables are NOT embedded in each other, use allow-joins=false property on the Foreign Key that represents the relation. For example:
CREATE FOREIGN TABLE Customer ( CustomerId integer PRIMARY KEY, FirstName varchar(25), LastName varchar(25) ) OPTIONS(UPDATABLE 'TRUE'); CREATE FOREIGN TABLE Order ( OrderID integer PRIMARY KEY, CustomerId integer, OrderDate date, Status integer, FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) OPTIONS (allow-join 'FALSE') ) OPTIONS(UPDATABLE 'TRUE');
with the example above, Data Virtualization will create two collections, however when user issues query such as
SELECT OrderID, LastName FROM Order JOIN Customer ON Order.CustomerId = Customer.CustomerId;
instead of resulting in error, the JOIN processing will happen in the Data Virtualization engine, without the above property it will result in an error.
When you use above properties and carefully design the MongoDB document structure, Data Virtualization translator can intelligently collate data based on their co-location and take advantage of it while querying.
Geo Spatial functions
MongoDB translator enables you to use geo spatial query operators in the "WHERE" clause, when the data is stored in the GeoJSon format in the MongoDB Document. The following functions are available:
CREATE FOREIGN FUNCTION geoIntersects (columnRef string, type string, coordinates double[][]) RETURNS boolean; CREATE FOREIGN FUNCTION geoWithin (ccolumnRef string, type string, coordinates double[][]) RETURNS boolean; CREATE FOREIGN FUNCTION near (ccolumnRef string, coordinates double[], maxdistance integer) RETURNS boolean; CREATE FOREIGN FUNCTION nearSphere (ccolumnRef string, coordinates double[], maxdistance integer) RETURNS boolean; CREATE FOREIGN FUNCTION geoPolygonIntersects (ref string, north double, east double, west double, south double) RETURNS boolean; CREATE FOREIGN FUNCTION geoPolygonWithin (ref string, north double, east double, west double, south double) RETURNS boolean;
a sample query looks like
SELECT loc FROM maps where mongo.geoWithin(loc, 'LineString', ((cast(1.0 as double), cast(2.0 as double)), (cast(1.0 as double), cast(2.0 as double))))
Same functions using built-in Geometry type (the versions of the functions in the preceding list will be deprecated and removed in future versions)
CREATE FOREIGN FUNCTION geoIntersects (columnRef string, geo geometry) RETURNS boolean; CREATE FOREIGN FUNCTION geoWithin (ccolumnRef string, geo geometry) RETURNS boolean; CREATE FOREIGN FUNCTION near (ccolumnRef string, geo geometry, maxdistance integer) RETURNS boolean; CREATE FOREIGN FUNCTION nearSphere (ccolumnRef string, geo geometry, maxdistance integer) RETURNS boolean; CREATE FOREIGN FUNCTION geoPolygonIntersects (ref string, geo geometry) RETURNS boolean; CREATE FOREIGN FUNCTION geoPolygonWithin (ref string, geo geometry) RETURNS boolean;
a sample query looks like
SELECT loc FROM maps where mongo.geoWithin(loc, ST_GeomFromGeoJSON('{"coordinates":[[1,2],[3,4]],"type":"Polygon"}'))
There are various "st_geom.." methods are available in the Geo Spatial function library in Data Virtualization.
Capabilities
MongoDB translator is designed on top of the MongoDB aggregation framework. You must use a MongoDB version that the aggregation framework. Apart from SELECT queries, the MongoDB translator also works with INSERT, UPDATE and DELETE queries.
You can use the MongoDB translator with the following functions:
- Grouping.
- Matching.
- Sorting.
- Filtering.
- Limits.
- Working with LOBs stored in GridFS.
- Composite primary and foreign keys.
Native queries
MongoDB source procedures may be created using the teiid_rel:native-query
extension. For more information, see Parameterizable native queries in Translators. The procedure will invoke the native-query similar to a direct procedure call with the benefits that the query is predetermined and that result column types are known, rather than requiring the use of ARRAYTABLE or similar functionality.
Direct query procedure
This feature is turned off by default because of the security risk this exposes to execute any command against the source. To enable direct query procedures, set the execution property called SupportsDirectQueryProcedure
to true
. For more information, see Override the execution properties in Chapter 9, Translators.
By default the name of the procedure that executes the queries directly is called native. For information about how to change the default name, see Override the execution properties in Chapter 9, Translators.
The MongoDB translator provides a procedure to execute any ad-hoc aggregate query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array containing single blob at array location one(1). This blob contains the JSON document. XMLTABLE can be used construct tabular output for consumption by client applications.
Example MongoDB Direct Query
select x.* from TABLE(call native('city;{$match:{"city":"FREEDOM"}}')) t, xmltable('/city' PASSING JSONTOXML('city', cast(array_get(t.tuple, 1) as BLOB)) COLUMNS city string, state string) x
In the above example, a collection called "city" is looked up with filter that matches the "city" name with "FREEDOM", using "native" procedure and then using the nested tables feature the output is passed to a XMLTABLE construct, where the output from the procedure is sent to a JSONTOXML function to construct a XML then the results of that are exposed in tabular form.
The direct query MUST be in the format
"collectionName;{$pipeline instr}+"
From Data Virtualization 8.10, MongoDB translator also allows to execute Shell type java script commands like remove, drop, createIndex. For this the command needs to be in format
"$ShellCmd;collectionName;operationName;{$instr}+"
and example looks like
"$ShellCmd;MyTable;remove;{ qty: { $gt: 20 }}"
9.9. OData translator
The OData translator, known by the type name "odata" exposes the OData V2 and V3 data sources and uses the Data Virtualization web services resource adapter for making web service calls. This translator is an extension of the Web services translator.
What is OData?
The Open Data Protocol (OData) web protocol is for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub) and JSON to provide access to information from a variety of applications, services, and stores. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.
Using this specification from the OASIS group, with help from the OData4J framework, Data Virtualization maps OData entities into relational schema. Data Virtualization can read CSDL (Conceptual Schema Definition Language) from a provided OData endpoint, and convert the OData schema into a relational schema. The following table shows the mapping selections in the OData translator from a CSDL document.
OData | Mapped to relational entity |
---|---|
EntitySet | Table |
FunctionImport | Procedure |
AssociationSet | Foreign keys on the table* |
ComplexType | ignored** |
A many-to-many association will result in a link table that can not be selected from, but can be used for join purposes.
- When used in functions, an implicit table is exposed. When used to define a embedded table, all the columns will be in-lined.
All CRUD operations will be appropriately mapped to the resulting entity based on the SQL submitted to the OData translator.
- Usage
Usage of a OData source is similar to that of a JDBC translator. The metadata import is provided through the translator, once the metadata is imported from the source system and exposed in relational terms, then this source can be queried as if the EntitySets and Function Imports were local to the Data Virtualization system.
Name | Description | Default |
---|---|---|
DatabaseTimeZone | The time zone of the database. Used when fetchings date, time, or timestamp values. | The system default time zone |
SupportsOdataCount |
Enables the use of the | true |
SupportsOdataFilter |
Enables the use of the | true |
SupportsOdataOrderBy |
Enables the use of the | true |
SupportsOdataSkip |
Enables the use of the | true |
SupportsOdataTop |
Enables the use of the | true |
Name | Description | Default |
---|---|---|
schemaNamespace | Namespace of the schema to import. | null |
entityContainer | Entity Container Name to import. | default container |
Example: Importer settings to import only tables and views from NetflixCatalog
<property name="importer.schemaNamespace" value="System.Data.Objects"/> <property name="importer.entityContainer" value="NetflixCatalog"/>
The OData server that you connect to might not fully implement the entire OData specification. If the server’s OData implementation does not support a feature, set "execution properties" to turn off the corresponding capability, so that Data Virtualization will not push down invalid queries to the translator.
For example, to turn off $filter
, add the following statement to the virtual database DDL:
CREATE SERVER odata FOREIGN DATA WRAPPER "odata-override" OPTIONS ("SupportOdataFilter" 'false');
The OData translator cannot perform native or direct query execution. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.
Data Virtualization can not only consume OData-based data sources, but it can also expose any data source as an OData-based web service.
For more information about configuring an OData server, see OData support in the Client Developer’s Guide.
9.10. OData V4 translator
The OData V4 translator, known by the type name "odata4" exposes the OData Version 4 data sources and uses the Data Virtualization web services resource adapter for making web service calls. This translator is extension of Web Services Translator. The OData V4 translator is not for use with older OData V1-3 sources. Use the OData translator ("odata") for older OData sources.
What is OData
The Open Data Protocol (OData) Web protocol is for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub), and JSON to provide access to information from a variety of applications, services, and stores. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.
Using this specification from the OASIS group, with the help from the Olingo framework, Data Virtualization maps OData V4 CSDL (Conceptual Schema Definition Language) document from the OData endpoint provided and converts the OData metadata into Data Virtualization’s relational schema. The following table shows the mapping selections in the OData V4 translator from a CSDL document
Data Virtualization can not only consume OData-based data sources, but it can expose any data source as an OData based web service. For more information see OData Support in the Client Developer’s Guide.
OData | Mapped to relational entity |
---|---|
EntitySet | Table |
EntityType | Table see [1] |
ComplexType | Table see [2] |
FunctionImport | Procedure [3] |
ActionImport | Procedure [3] |
NavigationProperties | Table [4] |
[1] Only if the EntityType is exposed as the EntitySet in the Entity container. [2] Only if the complex type is used as property in the exposed EntitySet. This table will be designed as child table with foreign key [1-to-1] or [1-to-many] relationship to the parent.
[3] If the return type is EntityType or ComplexType, the procedure is designed to return a table. [4] Navigation properties are exposed as tables. The table will be created with foreign key relationship to the parent.
All CRUD operations will be appropriately mapped to the resulting entity based on the SQL submitted to the OData translator.
Usage
Usage of a OData source is similar a JDBC translator. The metadata import is supported through the translator, once the metadata is imported from source system and exposed in relational terms, then this source can be queried as if the EntitySets, Function Imports and Action Imports were local to the Data Virtualization system.
It is not recommended to define your own metadata using Data Virtualization DDL for complex services. There are several extension metadata properties required to enable proper functioning. On non-string properties, a NATIVE_TYPE
property is expected and should specify the full EDM type name - Edm.xxx
.
The below is sample VDB that can read metadata service from TripPin service on http://odata.org site.
<vdb name="trippin" version="1"> <model name="trippin"> <source name="odata4" translator-name="odata4" connection-jndi-name="java:/tripDS"/> </model> </vdb>
You can connect to the VDB deployed using Data Virtualization JDBC driver and issue SQL statements like
SELECT * FROM trippin.People; SELECT * FROM trippin.People WHERE UserName = 'russelwhyte'; SELECT * FROM trippin.People p INNER JOIN trippin.People_Friends pf ON p.UserName = pf.People_UserName; (note that People_UserName is implicitly added by Data Virtualization metadata) EXEC GetNearestAirport(lat, lon) ;
Execution properties
Sometimes default properties need to adjusted for proper execution of the translator. The following execution properties extend or limit the functionality of the translator based on the physical source capabilities.
Name | Description | Default |
---|---|---|
SupportsOdataCount | Supports $count | true |
SupportsOdataFilter | Supports $filter | true |
SupportsOdataOrderBy | Supports $orderby | true |
SupportsOdataSkip | Supports $skip | true |
SupportsOdataTop | Supports $top | true |
SupportsUpdates | Supports INSERT/UPDATE/DELETE | true |
The OData server that you connect to might not fully implement the entire OData specification. If the server’s OData implementation does not support a feature, set "execution properties" to turn off the corresponding capability, so that Data Virtualization does not push down invalid queries to the translator.
<translator name="odata-override" type="odata"> <property name="SupportsOdataFilter" value="false"/> </translator>
then use "odata-override" as the translator name on your source model.
Importer properties
The following table lists the importer properties that define the behavior of the translator during metadata import from the physical source.
Name | Description | Default |
---|---|---|
schemaNamespace | Namespace of the schema to import | null |
Example importer settings to only import tables and views from Trippin service exposed on odata.org
<property name="importer.schemaNamespace" value="Microsoft.OData.SampleService.Models.TripPin"/>
You can leave this property undefined. If the translator does not detect a configured instance of the property, it specifies the default name of the EntityContainer.
Native queries - Native or direct query execution is not supported through the OData translator. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.
9.11. OpenAPI translator
The OpenAPI translator, known by the type name "openapi" exposes OpenAPI data sources via relational concepts and uses the Data Virtualization WS resource adapter for making web service calls.
What is OpenAPI?
[OpenAPI is a simple yet powerful representation of your RESTful API. With the largest ecosystem of API tooling on the planet, thousands of developers are supporting OpenAPI in almost every modern programming language and deployment environment. With an OpenAPI-enabled API, you get interactive documentation, client SDK generation, and discoverability.
This translator is compatible with OpenAPI/Swagger v2 and OpenAPI v3.
Usage
Usage of a OpenAPI source is similar any other translator in Data Virtualization. The translator enables metadata import. The metadata is imported from source system’s metadata file and then exposed as stored procedures in Data Virtualization. The source system can be queried by executing these stored procedures in Data Virtualization system.
Although parameter order is guaranteed by the Swagger libraries, if you rely upon the native import, it is best if you call procedures using named, rather than positional parameters.
The below is sample VDB that can read metadata from Petstore reference service on http://petstore.swagger.io/ site.
<vdb name="petstore" version="1"> <model visible="true" name="m"> <property name="importer.metadataUrl" value="/swagger.json"/> <source name="s" translator-name="openapi" connection-jndi-name="java:/openapi"/> </model> </vdb>
The required resource-adapter configuration will look like
<resource-adapter id="openapi"> <module slot="main" id="org.jboss.teiid.resource-adapter.webservice"/> <transaction-support>NoTransaction</transaction-support> <connection-definitions> <connection-definition class-name="org.teiid.resource.adapter.ws.WSManagedConnectionFactory" jndi-name="java:/openapi" enabled="true" use-java-context="true" pool-name="teiid-openapi-ds"> <config-property name="EndPoint"> http://petstore.swagger.io/v2 </config-property> </connection-definition> </connection-definitions> </resource-adapter>
After you configure the preceding resource-adapter and deploy the VDB successfully, then you can connect to the VDB deployed using Data Virtualization JDBC driver and issue SQL statements such as the following:
EXEC findPetsByStatus(('sold',)) EXEC getPetById(1461159803) EXEC deletePet('', 1461159803)
Execution properties
Execution properties extend/limit the functionality of the translator based on the physical source capabilities. Sometimes default properties must be adjusted for proper execution of the translator.
Execution properties
None.
Importer properties
The following table lists the importer properties that define the behavior of the translator during the import of from the physical source.
Name | Description | Default |
---|---|---|
metadataUrl | URL from which to obtain the OpenAPI metadata. May be a local file using a file: URL. | true |
server | The server to use. Otherwise the first server listed will be used. | null |
preferredProduces | Preferred Accept MIME type header, this should be one of the OpenAPI 'produces' types; | application/json |
preferredConsumes | Preferred Content-Type MIME type header, this should be one of the OpenAPI 'consumer' types; | application/json |
Native queries - The OpenAPI translator cannot perform native or direct query execution. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.
Limitations
The OpenAPI translator does not fully implement all of the features of OpenAPI. The following limitations apply:
-
You cannot set the MIME type to
application/xml
in either theAccept
orContent-Type
headers. - File and Map properties cannot be used. As a result, any multi-part payloads are not supported.
- The translator does not process security metadata.
-
The translator does not process custom properties that start with
x-
. The translator does not work with following JSON schema keywords:
-
allOf
-
multipleOf
-
items
-
9.12. Salesforce translators
You can use the Salesforce translator to run SELECT
, DELETE
, INSERT
, UPSERT,
and UPDATE
operations against a Salesforce.com account.
salesforce
The translator, known by the type name salesforce, works with Salesforce API 37.0 and later.
Name | Description | Default |
---|---|---|
MaxBulkInsertBatchSize | Batch Size to use to insert bulk inserts. | 2048 |
SupportsGroupBy |
Enables | true |
The Salesforce translator can import metadata.
Property Name | Description | Required | Default |
---|---|---|---|
NormalizeNames | If the importer should attempt to modify the object/field names so that they can be used unquoted. | false | true |
excludeTables | A case-insensitive regular expression that when matched against a table name will exclude it from import. Applied after table names are retrieved. Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter. | false | n/a |
includeTables | A case-insensitive regular expression that when matched against a table name will be included during import. Applied after table names are retrieved from source. | false | n/a |
importStatstics | Retrieves cardinalities during import using the REST API explain plan feature. | false | false |
ModelAuditFields | Add Audit Fields To Model. This includes CreatedXXX, LastModifiedXXX, and SystemModstamp fields. | false | false |
NOTE: When both includeTables and excludeTables patterns are present during the import, the includeTables pattern matched first, then the excludePatterns will be applied.
If you need connectivity to an API version other than what is built in, you may try to use an existing connectivity pair, but in some circumstances - especially accessing a later remote api from an older Java API - this is not possible and results in what appears to be hung connections.
Extension metadata properties
Salesforce is not relational database, however Data Virtualization provides ways to map Saleforce data into relational constructs like Tables and Procedures. You can define a foreign table using DDL in Data Virtualization VDB, which maps to Salesforce’s SObject. At runtime, to interpret this table back to a SObject, Data Virtualization decorates or tags this table definition with additional metadata. For example, a table is defined as in the following example:
CREATE FOREIGN TABLE Pricebook2 ( Id string, Name string, IsActive boolean, IsStandard boolean, Description string, IsDeleted boolean) OPTIONS ( UPDATABLE 'TRUE', "teiid_sf:Supports Query" 'TRUE');
In the preceding example, the property in the OPTIONS
clause with the property "teiid_sf:Supports Query"
set to TRUE
indicates that you can run SELECT
commands against this table. The following table lists the metadata extension properties that can be used in a Salesforce schema.
Property Name | Description | Required | Default | Applies To |
---|---|---|---|---|
Supports Query |
You can run | false | true | Table |
Supports Retrieve |
You can retrieve the results of | false | true | Table |
SQL processing
Salesforce does not provide the same set of functionality as a relational database. For example, Salesforce does not support arbitrary joins between tables. However, working in combination with the Data Virtualization Query Planner, the Salesforce connector can use nearly all of the SQL syntax capabilities in Data Virtualization. The Salesforce Connector executes SQL commands by "pushing down" the command to Salesforce whenever possible, depending on the available capabilities. Data Virtualization will automatically provide additional database functionality when the Salesforce Connector does not explicitly enable use of a given SQL construct. In cases where certain SQL capabilities cannot be pushed down to Salesforce, Data Virtualization will push down the capabilities that it can, and fetch a set of data from Salesforce. Then, Data Virtualization will evaluate the additional capabilities, creating a subset of the original data set. Finally, Data Virtualization will pass the result to the client.
If you issue queries with a GROUP BY
clause, and you receive a Salesforce error that indicates that queryMore
is not supported, you can either add limits, or set the execution property SupportsGroupBy
to false
.
SELECT array_agg(Reports) FROM Supervisor where Division = 'customer support';
Neither Salesforce, nor the Salesforce Connector support the array_agg()
scalar. however, both are compatible with the CompareCriteriaEquals
query, so the connector transforms the query that it receives into this query to Salesforce.
SELECT Reports FROM Supervisor where Division = 'customer support';
The array_agg() function will be applied by the Data Virtualization Query Engine to the result set returned by the connector.
In some cases, multiple calls to the Salesforce application will be made to process the SQL that is passed to the connector.
DELETE From Case WHERE Status = 'Closed';
The API in Salesforce to delete objects can delete by object ID only. In order to accomplish this, the Salesforce connector will first execute a query to get the IDs of the correct objects, and then delete those objects. So the above DELETE command will result in the following two commands.
SELECT ID From Case WHERE Status = 'Closed'; DELETE From Case where ID IN (<result of query>);
NOTE: The Salesforce API DELETE call is not expressed in SQL, but the above is an equivalent SQL expression.
It’s useful to be aware of incompatible capabilities, in order to avoid fetching large data sets from Salesforce and making you queries as performant as possible. For information about the SQL constructs that you can push down to Salesforce, see Compatible SQL capabilities.
Selecting from multi-select picklists
A multi-select picklist is a field type in Salesforce that can contain multiple values in a single field. Query criteria operators for fields of this type in SOQL are limited to EQ, NE, includes and excludes. For the Salesforce documentation about how to select from multi-select picklists, see Querying Multi-select Picklists
Data Virtualization SQL does not support the includes or excludes operators, but the Salesforce connector provides user-defined function definitions for these operators that provide equivalent functionality for fields of type multi-select. The definition for the functions is:
boolean includes(Column column, String param) boolean excludes(Column column, String param)
For example, take a single multi-select picklist column called Status that contains all of these values.
- current
- working
- critical
For that column, all of the below are valid queries:
SELECT * FROM Issue WHERE true = includes (Status, 'current, working' ); SELECT * FROM Issue WHERE true = excludes (Status, 'current, working' ); SELECT * FROM Issue WHERE true = includes (Status, 'current;working, critical' );
EQ and NE criteria will pass to Salesforce as supplied. For example, these queries will not be modified by the connector.
SELECT * FROM Issue WHERE Status = 'current'; SELECT * FROM Issue WHERE Status = 'current;critical'; SELECT * FROM Issue WHERE Status != 'current;working';
Selecting all objects
You can use the Salesforce connector to call the queryAll
operation from the Salesforce API. The queryAll
operation is equivalent to the query operation with the exception that it returns data about all current and deletedobjects in the system.
The connector determines if it will call the query or queryAll
operation via reference to the isDeleted
property present on each Salesforce object, and modeled as a column on each table generated by the importer. By default this value is set to false
when the model is generated and thus the connector calls query. Users are free to change the value in the model to true
, changing the default behavior of the connector to be queryAll
.
The behavior is different if isDeleted
is used as a parameter in the query. If the isDeleted
column is used as a parameter in the query, and the value is true
, then the connector calls queryAll
.
select * from Contact where isDeleted = true;
If the isDeleted
column is used as a parameter in the query, and the value is false
, then the connector that performs the default behavior will call the query.
select * from Contact where isDeleted = false;
Selecting updated objects
If the option is selected when importing metadata from Salesforce, a GetUpdated procedure is generated in the model with the following structure:
GetUpdated (ObjectName IN string, StartDate IN datetime, EndDate IN datetime, LatestDateCovered OUT datetime) returns ID string
See the description of the GetUpdated operation in the Salesforce documentation for usage details.
Selecting deleted objects
If the option is selected when importing metadata from Salesforce, a GetDeleted procedure is generated in the model with the following structure:
GetDeleted (ObjectName IN string, StartDate IN datetime, EndDate IN datetime, EarliestDateAvailable OUT datetime, LatestDateCovered OUT datetime) returns ID string, DeletedDate datetime
See the description of the GetDeleted operation in the Salesforce documentation for usage details.
Relationship queries
Unlike a relational database, Salesforce does not support join operations, but it does have support for queries that include parent-to-child or child-to-parent relationships between objects. These are termed Relationship Queries. You can run Relationship Queries in the SalesForce connector through Outer Join syntax.
SELECT Account.name, Contact.Name from Contact LEFT OUTER JOIN Account on Contact.Accountid = Account.id
This query shows the correct syntax to query a SalesForce model with to produce a relationship query from child to parent. It resolves to the following query to SalesForce.
SELECT Contact.Account.Name, Contact.Name FROM Contact
select Contact.Name, Account.Name from Account Left outer Join Contact on Contact.Accountid = Account.id
This query shows the correct syntax to query a SalesForce model with to produce a relationship query from parent to child. It resolves to the following query to SalesForce.
SELECT Account.Name, (SELECT Contact.Name FROM Account.Contacts) FROM Account
See the description of the Relationship Queries operation in the SalesForce documentation for limitations.
Bulk insert queries
You can also use bulk insert statements in the SalesForce translator by using JDBC batch semantics or SELECT INTO semantics. The batch size is determined by the execution property MaxBulkInsertBatchSize, which can be overridden in the vdb file. The default value of the batch is 2048. The bulk insert feature uses the async REST based API exposed by Salesforce for execution for better performance.
Bulk selects
When querying tables with more than 10,000,000 records, or if experiencing timeouts with just result batching, Data Virtualization can issue queries to Salesforce using the bulk API. When using a bulk select, primary key (PK) chunking is enabled if it is compatible with the query.
The use of the bulk api requires a source hint in the query:
SELECT /*+ sh salesforce:'bulk' */ Name ... FROM Account
Where salesforce is the source name of the target source.
The default chunk size of 100,000 records will be used.
This feature is only supported in the Salesforce API version 28 or higher.
Compatible SQL capabilities
You ca use the following SQL capabilities with the Salesforce Connector. These SQL constructs will be pushed down to Salesforce.
- SELECT command
- INSERT Command
- UPDATE Command
- DELETE Command
- NotCriteria
- OrCriteria
- CompareCriteriaEquals
- CompareCriteriaOrdered
- IsNullCritiera
- InCriteria
- LikeCriteria - Can be used for String fields only.
- RowLimit
- Basic Aggregates
- OuterJoins with join criteria KEY
Native Queries
Salesforce procedures may optionally have native queries associated with them. For more information, see Parameterizable native queries in Translators. The operation prefix (select;, insert;, update;, delete; - see below for more) must be present in the native-query, but it will not be issued as part of the query to the source.
Example DDL for a Salesforce native procedure
CREATE FOREIGN PROCEDURE proc (arg1 integer, arg2 string) OPTIONS ("teiid_rel:native-query" 'search;SELECT ... complex SOQL ... WHERE col1 = $1 and col2 = $2') returns (col1 string, col2 string, col3 timestamp);
Direct query procedure
This feature is turned off by default because of the security risk this exposes to execute any command against the source. To enable direct query procedures, set the execution property called SupportsDirectQueryProcedure
to true
. For more information, see Override the execution properties in Chapter 9, Translators.
By default the name of the procedure that executes the queries directly is called native. For information about how to change the default name, see Override the execution properties in Chapter 9, Translators.
The Salesforce translator provides a procedure to execute any ad-hoc SOQL query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array. ARRAYTABLE can be used construct tabular output for consumption by client applications. Data Virtualization exposes this procedure with a simple query structure as follows:
Select example
SELECT x.* FROM (call sf_source.native('search;SELECT Account.Id, Account.Type, Account.Name FROM Account')) w, ARRAYTABLE(w.tuple COLUMNS "id" string , "type" string, "name" String) AS x
from the above code, the "search" keyword followed by a query statement.
The SOQL is treated as a parameterized native query so that parameter values may be inserted in the query string properly. For more information, see Parameterizable native queries in Translators. The results returned by search may contain the object Id as the first column value regardless of whether it was selected. Also queries that select columns from multiple object types will not be correct.
Delete Example
SELECT x.* FROM (call sf_source.native('delete;', 'id1', 'id2')) w, ARRAYTABLE(w.tuple COLUMNS "updatecount" integer) AS x
form the above code, the "delete;" keyword followed by the ids to delete as varargs.
Create example
SELECT x.* FROM (call sf_source.native('create;type=table;attributes=one,two,three', 'one', 2, 3.0)) w, ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x
form the above code, the "create" or "update" keyword must be followed by the following properties. Attributes must be matched positionally by the procedure variables - thus in the example attribute two will be set to 2.
Property Name | Description | Required |
---|---|---|
type | Table Name | Yes |
attributes | comma separated list of names of the columns | no |
The values for each attribute is specified as separate argument to the "native" procedure.
Update is similar to create, with one more extra property called "id", which defines identifier for the record.
Update example
SELECT x.* FROM (call sf_source.native('update;id=pk;type=table;attributes=one,two,three', 'one', 2, 3.0)) w, ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x
By default the name of the procedure that executes the queries directly is called native, however you can add set an override execution property in the DDL file to change it.
9.13. Rest translator
The Rest translator, known by the type name rest, exposes stored procedures for calling REST services. Results from this translator will commonly be used with the TEXTTABLE, JSONTABLE, or XMLTABLE table functions to use CSV, JSON, or XML formated data.
Execution properties
There are no rest importer settings, but it can provide metadata for VDBs.
Usage
The rest translator exposes low level procedures for accessing web services.
InvokeHTTP procedure
invokeHttp
can return the byte contents of an HTTP(S) call.
Procedure invokeHttp(action in STRING, request in OBJECT, endpoint in STRING, stream in BOOLEAN, contentType out STRING, headers in CLOB) returns BLOB
Action indicates the HTTP method (GET, POST, etc.), which defaults to POST.
A null value for endpoint will use the default value. The default endpoint is specified in the rest source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.
Since multiple parameters are not required to have values, it is often more clear to call the invokeHttp procedure with named parameter syntax.
call invokeHttp(action=>'GET')
The request can be one of SQLXML, STRING, BLOB, or CLOB. The request will be sent as the POST payload in byte form. For STRING/CLOB values this will default to the UTF-8 encoding. To control the byte encoding, see the to_bytes function.
The optional headers parameter can be used to specify the request header values as a JSON value. The JSON value should be a JSON object with primitive or list of primitive values.
call invokeHttp(... headers=>jsonObject('application/json' as "Content-Type", jsonArray('gzip', 'deflate') as "Accept-Encoding"))
Recommendations for setting headers parameter:
-
Content-Type
might be necessary if the HTTP POST/PUT method is invoked. - Accept is necessary if you want to control return Media Type.
You cannot use native queries or direct query execution procedures with the web services translator.
Streaming considerations
If the stream parameter is set to true
, then the resulting LOB value may only be used a single time. If stream is null
or false
, then the engine may need to save a copy of the result for repeated use. Care must be used as some operations, such as casting or XMLPARSE
might perform validation which results in the stream being consumed.
9.14. Web services translator
The Web Services translator, known by the type name soap or ws, exposes stored procedures for calling web/SOAP services. Results from this translator will commonly be used with the TEXTTABLE or XMLTABLE table functions to use CSV or XML formated data.
Execution properties
Name | Description | When Used | Default |
---|---|---|---|
DefaultBinding | The binding that should be used if one is not specified. Can be one of HTTP, SOAP11, or SOAP12. | invoke* | SOAP12 |
DefaultServiceMode | The default service mode. For SOAP, MESSAGE mode indicates that the request will contain the entire SOAP envelope. and not just the contents of the SOAP body. Can be one of MESSAGE or PAYLOAD | invoke* or WSDL call | PAYLOAD |
XMLParamName | Used with the HTTP binding (typically with the GET method) to indicate that the request document should be part of the query string. | invoke* | null - unused |
Setting the proper binding value on the translator is recommended as it removes the need for callers to pass an explicit value. If your service is actually uses SOAP11, but the binding used SOAP12 you will receive execution failures.
There are no importer settings, but it can provide metadata for VDBs. If the connection is configured to point at a specific WSDL, the translator will import all SOAP operations under the specified service and port as procedures.
Importer properties
When specifying the importer property, it must be prefixed with "importer.". Example: importer.tableTypes
Name | Description | Default |
---|---|---|
importWSDL | Import the metadata from the WSDL URL configured in resource-adapter. | true |
Usage
The translator exposes low level procedures for accessing web services.
Invoke procedure
Invoke allows for multiple binding, or protocol modes, including HTTP, SOAP11, and SOAP12.
Procedure invoke(binding in STRING, action in STRING, request in XML, endpoint in STRING, stream in BOOLEAN) returns XML
The binding may be one of null (to use the default) HTTP, SOAP11, or SOAP12. Action with a SOAP binding indicates the SOAPAction value. Action with a HTTP binding indicates the HTTP method (GET, POST, etc.), which defaults to POST.
A null value for the binding or endpoint will use the default value. The default endpoint is specified in the source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.
Since multiple parameters are not required to have values, it is often more clear to call the invoke procedure with named parameter syntax.
call invoke(binding=>'HTTP', action=>'GET')
The request XML should be a valid XML document or root element.
InvokeHTTP procedure
invokeHttp
can return the byte contents of an HTTP(S) call.
Procedure invokeHttp(action in STRING, request in OBJECT, endpoint in STRING, stream in BOOLEAN, contentType out STRING, headers in CLOB) returns BLOB
Action indicates the HTTP method (GET, POST, etc.), which defaults to POST.
A null value for endpoint will use the default value. The default endpoint is specified in the source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.
Since multiple parameters are not required to have values, it is often more clear to call the invokeHttp procedure with named parameter syntax.
call invokeHttp(action=>'GET')
The request can be one of SQLXML, STRING, BLOB, or CLOB. The request will be sent as the POST payload in byte form. For STRING/CLOB values this will default to the UTF-8 encoding. To control the byte encoding, see the to_bytes function.
The optional headers parameter can be used to specify the request header values as a JSON value. The JSON value should be a JSON object with primitive or list of primitive values.
call invokeHttp(... headers=>jsonObject('application/json' as "Content-Type", jsonArray('gzip', 'deflate') as "Accept-Encoding"))
Recommendations for setting headers parameter:
-
Content-Type
might be necessary if the HTTP POST/PUT method is invoked. - Accept is necessary if you want to control return Media Type.
WSDL based procedures
The procedures above give you anonymous way to execute any web service methods by supplying an endpoint, with this mechanism you can alter the endpoint defined in WSDL with a different endpoint. However, if you have access to the WSDL, then you can configure the WSDL URL in the web-service resource-adapter’s connection configuration, Web Service translator can parse the WSDL and provide the methods under configured port as pre-built procedures as its metadata. If you are using the default native metadata import, you will see the procedures in your web service’s source model.
You cannot use native queries or direct query execution procedures with the web services translator.
Streaming considerations
If the stream parameter is set to true
, then the resulting LOB value may only be used a single time. If stream is null
or false
, then the engine may need to save a copy of the result for repeated use. Care must be used as some operations, such as casting or XMLPARSE
might perform validation which results in the stream being consumed.