Chapter 3. SQL compatibility
Data Virtualization provides nearly all of the functionality of SQL-92 DML. SQL-99 and later features are constantly being added based upon community need. The following does not attempt to cover SQL exhaustively, but rather highlights how SQL is used within Data Virtualization. For details about the exact form of SQL that Data Virtualization accepts, see the BNF for SQL grammar.
3.1. Identifiers
SQL commands contain references to tables and columns. These references are in the form of identifiers, which uniquely identify the tables and columns in the context of the command. All queries are processed in the context of a virtual database, or VDB. Because information can be federated across multiple sources, tables and columns must be scoped in some manner to avoid conflicts. This scoping is provided by schemas, which contain the information for each data source or set of views.
Fully-qualified table and column names are of the following form, where the separate `parts' of the identifier are delimited by periods.
- TABLE: <schema_name>.<table_spec>
- COLUMN: <schema_name>.<table_spec>.<column_name>
Syntax rules
-
Identifiers can consist of alphanumeric characters, or the underscore (
_
) character, and must begin with an alphabetic character. Any Unicode character may be used in an identifier. -
Identifiers in double quotes can have any contents. The double quote character can be used if is escaped with an additional double quote; for example,
"some "" id"
- Because different data sources organize tables in different ways, with some prepending catalog, schema, or user information, Data Virtualization allows table specification to be a dot-delimited construct.
When a table specification contains a dot resolving will allow for the match of a partial name against any number of the end segments in the name. e.g. a table with the fully-qualified name vdbname."sourceschema.sourcetable"
would match the partial name sourcetable
.
-
Columns, column aliases, and schemas cannot contain a dot (
.
) character. - Identifiers, even when quoted, are not case-sensitive in Data Virtualization.
Some examples of valid, fully-qualified table identifiers are:
- MySchema.Portfolios
- "MySchema.Portfolios"
- MySchema.MyCatalog.dbo.Authors
Some examples of valid fully-qualified column identifiers are:
- MySchema.Portfolios.portfolioID
- "MySchema.Portfolios"."portfolioID"
- MySchema.MyCatalog.dbo.Authors.lastName
Fully-qualified identifiers can always be used in SQL commands. Partially- or unqualified forms can also be used, as long as the resulting names are unambiguous in the context of the command. Different forms of qualification can be mixed in the same query.
If you use an alias containing a period (.
) character, it is a known issue that the alias name will be treated the same as a qualified name and may conflict with fully qualified object names.
Reserved words
Reserved words in Data Virtualization include the standard SQL 2003 Foundation, SQL/MED, and SQL/XML reserved words, as well as Data Virtualization specific words such as BIGINTEGER, BIGDECIMAL, or MAKEDEP. For more information about reserved words, see the Reserved Keywords and Reserved Keywords For Future Use sections in BNF for SQL grammar.
3.2. Operator precedence
Data Virtualization parses and evaluates operators with higher precedence before those with lower precedence. Operators with equal precedence are left-associative (left-to-right). The following table lists operator precedence from high to low:
Operator | Description |
---|---|
| array element reference |
| positive/negative value expression |
| multiplication/division |
| addition/subtraction |
| concat |
criteria | For information, see Criteria. |
3.3. Expressions
Identifiers, literals, and functions can be combined into expressions. Expressions can be used in a query with nearly any keyword, including SELECT, FROM (if specifying join criteria), WHERE, GROUP BY, HAVING, or ORDER BY.
You can use following types of expressions in Data Virtualization:
3.3.1. Column Identifiers
Column identifiers are used to specify the output columns in SELECT statements, the columns and their values for INSERT and UPDATE statements, and criteria used in WHERE and FROM clauses. They are also used in GROUP BY, HAVING, and ORDER BY clauses. The syntax for column identifiers was defined in the Identifiers section above.
3.3.2. Literals
Literal values represent fixed values. These can be any of the 'standard' data types. For information about data types, see Data types.
Syntax rules
- Integer values will be assigned an integral data type big enough to hold the value (integer, long, or biginteger).
- Floating point values will always be parsed as a double.
- The keyword 'null' is used to represent an absent or unknown value and is inherently untyped. In many cases, a null literal value will be assigned an implied type based on context. For example, in the function '5 + null', the null value will be assigned the type 'integer' to match the type of the value '5'. A null literal used in the SELECT clause of a query with no implied context will be assigned to type 'string'.
Some examples of simple literal values are:
'abc'
Example: Escaped single tick
'isn"t true'
5
Example: Scientific notation
-37.75e01
Example: exact numeric type BigDecimal
100.0
true
false
Example: Unicode character
'\u0027'
Example: Binary
X'0F0A'
Date/Time literals can use either JDBC Escaped literal syntax:
Example: Date literal
{d'...'}
Example: Time literal
{t'...'}
Example: Timestamp literal
{ts'...'}
Or the ANSI keyword syntax:
Example: Date literal
DATE '...'
Example: Time literal
TIME '...'
Example: Timestamp literal
TIMESTAMP '...'
Either way, the string literal value portion of the expression is expected to follow the defined format - "yyyy-MM-dd" for date, "hh:mm:ss" for time, and "yyyy-MM-dd[ hh:mm:ss[.fff…]]" for timestamp.
Aggregate functions
Aggregate functions take sets of values from a group produced by an explicit or implicit GROUP BY and return a single scalar value computed from the group.
You can use the following aggregate functions in Data Virtualization:
- COUNT(*)
- Count the number of values (including nulls and duplicates) in a group. Returns an integer - an exception will be thrown if a larger count is computed.
- COUNT(x)
- Count the number of values (excluding nulls) in a group. Returns an integer - an exception will be thrown if a larger count is computed.
- COUNT_BIG(*)
- Count the number of values (including nulls and duplicates) in a group. Returns a long - an exception will be thrown if a larger count is computed.
- COUNT_BIG(x)
- Count the number of values (excluding nulls) in a group. Returns a long - an exception will be thrown if a larger count is computed.
- SUM(x)
- Sum of the values (excluding nulls) in a group.
- AVG(x)
- Average of the values (excluding nulls) in a group.
- MIN(x)
- Minimum value in a group (excluding null).
- MAX(x)
- Maximum value in a group (excluding null).
- ANY(x)/SOME(x)
- Returns TRUE if any value in the group is TRUE (excluding null).
- EVERY(x)
- Returns TRUE if every value in the group is TRUE (excluding null).
- VAR_POP(x)
- Biased variance (excluding null) logically equals(sum(x^2) - sum(x)^2/count(x))/count(x); returns a double; null if count = 0.
- VAR_SAMP(x)
- Sample variance (excluding null) logically equals(sum(x^2) - sum(x)^2/count(x))/(count(x) - 1); returns a double; null if count < 2.
- STDDEV_POP(x)
- Standard deviation (excluding null) logically equals SQRT(VAR_POP(x)).
- STDDEV_SAMP(x)
- Sample standard deviation (excluding null) logically equals SQRT(VAR_SAMP(x)).
- TEXTAGG(expression [as name], … [DELIMITER char] [QUOTE char | NO QUOTE] [HEADER] [ENCODING id] [ORDER BY …])
- CSV text aggregation of all expressions in each row of a group. When DELIMITER is not specified, by default comma(,) is used as delimiter. All non-null values will be quoted. Double quotes(") is the default quote character. Use QUOTE to specify a different value, or NO QUOTE for no value quoting. If HEADER is specified, the result contains the header row as the first line - the header line will be present even if there are no rows in a group. This aggregation returns a blob.
TEXTAGG(col1, col2 as name DELIMITER '|' HEADER ORDER BY col1)
- XMLAGG(xml_expr [ORDER BY …]) – XML concatenation of all XML expressions in a group (excluding null). The ORDER BY clause cannot reference alias names or use positional ordering.
- JSONARRAY_AGG(x [ORDER BY …]) – creates a JSON array result as a Clob including null value. The ORDER BY clause cannot reference alias names or use positional ordering. For more information, see JSONARRAY function.
Example: Integer value expression
jsonArray_Agg(col1 order by col1 nulls first)
could return
[null,null,1,2,3]
- STRING_AGG(x, delim) – creates a lob results from the concatenation of x using the delimiter delim. If either argument is null, no value is concatenated. Both arguments are expected to be character (string/clob) or binary (varbinary, blob), and the result will be CLOB or BLOB respectively. DISTINCT and ORDER BY are allowed in STRING_AGG.
Example: String aggregate expression
string_agg(col1, ',' ORDER BY col1 ASC)
could return
'a,b,c'
-
LIST_AGG(x [, delim]) WITHIN GROUP (ORDER BY …) – a form of STRING_AGG that uses the same syntax as Oracle. Here
x
can be any type that can be converted to a string. Thedelim
value, if specified, must be a literal, and theORDER BY
value is required. This is only a parsing alias for an equivalentstring_agg
expression.
Example: List aggregate expression
listagg(col1, ',') WITHIN GROUP (ORDER BY col1 ASC)
could return
'a,b,c'
- ARRAY_AGG(x [ORDER BY …]) – Creates an array with a base type that matches the expression x. The ORDER BY clause cannot reference alias names or use positional ordering.
- agg([DISTINCT|ALL] arg … [ORDER BY …]) – A user defined aggregate function.
Syntax rules
- Some aggregate functions may contain a keyword 'DISTINCT' before the expression, indicating that duplicate expression values should be ignored. DISTINCT is not allowed in COUNT(*) and is not meaningful in MIN or MAX (result would be unchanged), so it can be used in COUNT, SUM, and AVG.
- Aggregate functions cannot be used in FROM, GROUP BY, or WHERE clauses without an intervening query expression.
- Aggregate functions cannot be nested within another aggregate function without an intervening query expression.
- Aggregate functions may be nested inside other functions.
- Any aggregate function may take an optional FILTER clause of the form
FILTER ( WHERE condition )
The condition may be any boolean value expression that does not contain a subquery or a correlated variable. The filter will logically be evaluated for each row prior to the grouping operation. If false the aggregate function will not accumulate a value for the given row.
For more information on aggregates, see the sections on GROUP BY or HAVING.
3.3.3. Window functions
Data Virtualization provides ANSI SQL 2003 window functions. A window function allows an aggregate function to be applied to a subset of the result set, without the need for a GROUP BY
clause. A window function is similar to an aggregate function, but requires the use of an OVER
clause or window specification.
Usage:
aggregate [FILTER (WHERE ...)] OVER ( [partition] [ORDER BY ...] [frame] ) | FIRST_VALUE(val) OVER ( [partition] [ORDER BY ...] [frame] ) | LAST_VALUE(val) OVER ( [partition] [ORDER BY ...] [frame] ) | analytical OVER ( [partition] [ORDER BY ...] ) partition := PARTITION BY expression [, expression]* frame := range_or_rows extent range_or_rows := RANGE | ROWS extent := frameBound | BETWEEN frameBound AND frameBound frameBound := UNBOUNDED PRECEDING | UNBOUNDED FOLLOWING | n PRECEDING | n FOLLOWING | CURRENT ROW
In the preceding syntax, aggregate
can refer to any aggregate function. Keywords exist for the following analytical functions ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST. There are also the FIRST_VALUE, LAST_VALUE, LEAD, LAG, NTH_VALUE, and NTILE analytical functions. For more information, see Analytical functions definitions.
Syntax rules
- Window functions can only appear in the SELECT and ORDER BY clauses of a query expression.
- Window functions cannot be nested in one another.
- Partitioning and order by expressions cannot contain subqueries or outer references.
- An aggregate ORDER BY clause cannot be used when windowed.
- The window specification ORDER BY clause cannot reference alias names or use positional ordering.
- Windowed aggregates may not use DISTINCT if the window specification is ordered.
- Analytical value functions may not use DISTINCT and require the use of an ordering in the window specification.
- RANGE or ROWS requires the ORDER BY clause to be specified. The default frame if not specified is RANGE UNBOUNDED PRECEDING. If no end is specified the default is CURRENT ROW. No combination of start and end is allowed such that the end is before the start - for example UNBOUNDED FOLLOWING is not allow as a start nor is UNBOUNDED PRECEDING allowed as an end.
- RANGE cannot be used n PRECEDING or n FOLLOWING
Analytical function definitions
- Ranking functions
- RANK() – Assigns a number to each unique ordering value within each partition starting at 1, such that the next rank is equal to the count of prior rows.
- DENSE_RANK() – Assigns a number to each unique ordering value within each partition starting at 1, such that the next rank is sequential.
- PERCENT_RANK() – Computed as (RANK - 1) / ( RC - 1) where RC is the total row count of the partition.
CUME_DIST() – Computed as the PR / RC where PR is the rank of the row including peers and RC is the total row count of the partition.
By default all values are integers - an exception will be thrown if a larger value is needed. Use the system org.teiid.longRanks to have RANK, DENSE_RANK, and ROW_NUMBER return long values instead.
- Value functions
- FIRST_VALUE(val) – Return the first value in the window frame with the given ordering.
- LAST_VALUE(val) – Return the last observed value in the window frame with the given ordering.
- LEAD(val [, offset [, default]]) - Access the ordered value in the window that is offset rows ahead of the current row. If there is no such row, then the default value will be returned. If not specified the offset is 1 and the default is null.
- LAG(val [, offset [, default]]) - Access the ordered value in the window that is offset rows behind of the current row. If there is no such row, then the default value will be returned. If not specified the offset is 1 and the default is null.
- NTH_VALUE(val, n) - Returns the nth val in window frame. The index must be greater than 0. If no such value exists, then null is returned.
- Row value functions
-
ROW_NUMBER() – Sequentially assigns a number to each row in a partition starting at
1
. -
NTILE(n) – Divides the partition into n tiles that differ in size by at most
1
. Larger tiles will be created sequentially starting at the first.n
must be greater than0
.
-
ROW_NUMBER() – Sequentially assigns a number to each row in a partition starting at
Processing
Window functions are logically processed just before creating the output from the SELECT clause. Window functions can use nested aggregates if a GROUP BY clause is present. There is no guaranteed effect on the output ordering from the presence of window functions. The SELECT statement must have an ORDER BY clause to have a predictable ordering.
An ORDER BY in the OVER clause follows the same rules pushdown and processing rules as a top level ORDER BY. In general this means you should specify NULLS FIRST/LAST as null handling may differ between engine and pushdown processing. Also see the system properties controlling sort behavior if you different default behavior.
Data Virtualization processes all window functions with the same window specification together. In general, a full pass over the row values coming into the SELECT clause is required for each unique window specification. For each window specification the values are grouped according to the PARTITION BY clause. If no PARTITION BY clause is specified, then the entire input is treated as a single partition.
The frame for the output value is determined based upon the definition of the analytical function or the ROWS/RANGE
clause. The default frame is RANGE UNBOUNDED PRECEDING
, which also implies the default end bound of CURRENT ROW
. RANGE
computes over a row and its peers together. ROWS
computes over every row. Most analytical functions, such as ROW_NUMBER
, have an implicit RANGE/ROWS
- which is why a different one cannot be specified. For example, ROW_NUMBER() OVER (order)`
can be expressed instead as count(*) OVER (order ROWS UNBOUNDED PRECEDING AND CURRENT ROW)
. Thus it assigns a different value to every row regardless of the number of peers.
Example: Windowed results
SELECT name, salary, max(salary) over (partition by name) as max_sal, rank() over (order by salary) as rank, dense_rank() over (order by salary) as dense_rank, row_number() over (order by salary) as row_num FROM employees
name | salary | max_sal | rank | dense_rank | row_num |
---|---|---|---|---|---|
John | 100000 | 100000 | 2 | 2 | 2 |
Henry | 50000 | 50000 | 5 | 4 | 5 |
John | 60000 | 100000 | 3 | 3 | 3 |
Suzie | 60000 | 150000 | 3 | 3 | 4 |
Suzie | 150000 | 150000 | 1 | 1 | 1 |
3.3.4. Case and searched case
In Data Virtualization, to include conditional logic in a scalar expression, you can use the following two forms of the CASE expression:
-
CASE <expr> ( WHEN <expr> THEN <expr>)+ [ELSE expr] END
-
CASE ( WHEN <criteria> THEN <expr>)+ [ELSE expr] END
Each form allows for an output based on conditional logic. The first form starts with an initial expression and evaluates WHEN expressions until the values match, and outputs the THEN expression. If no WHEN is matched, the ELSE expression is output. If no WHEN is matched and no ELSE is specified, a null literal value is output. The second form (the searched case expression) searches the WHEN clauses, which specify an arbitrary criteria to evaluate. If any criteria evaluates to true, the THEN expression is evaluated and output. If no WHEN is true, the ELSE is evaluated or NULL is output if none exists.
Example case statements
SELECT CASE columnA WHEN '10' THEN 'ten' WHEN '20' THEN 'twenty' END AS myExample SELECT CASE WHEN columnA = '10' THEN 'ten' WHEN columnA = '20' THEN 'twenty' END AS myExample
3.3.5. Scalar subqueries
Subqueries can be used to produce a single scalar value in the SELECT, WHERE, or HAVING clauses only. A scalar subquery must have a single column in the SELECT clause and should return either 0 or 1 row. If no rows are returned, null will be returned as the scalar subquery value. For information about other types of subqueries, see Subqueries.
3.3.6. Parameter references
Parameters are specified using a ?
symbol. You can use parameters only with PreparedStatement
or CallableStatements
in JDBC. Each parameter is linked to a value specified by 1-based index in the JDBC API.
3.3.7. Arrays
Array values may be constructed using parentheses around an expression list with an optional trailing comma, or with an explicit ARRAY constructor.
Example: Empty arrays
() (,) ARRAY[]
Example: Single element array
(expr,) ARRAY[expr]
A trailing comma is required for the parser to recognize a single element expression as an array with parentheses, rather than a simple nested expression.
Example: General array syntax
(expr, expr ... [,]) ARRAY[expr, ...]
If all of the elements in the array have the same type, the array will have a matching base type. If the element types differ the array base type will be object.
An array element reference takes the form of:
array_expr[index_expr]
index_expr
must resolve to an integer value. This syntax is effectively the same as the array_get
system function and expects 1-based indexing.
3.4. Criteria
Criteria can be any of the following items:
- Predicates that evaluate to true or false.
- Logical criteria that combine criteria (AND, OR, NOT).
- A value expression of type Boolean.
Usage
criteria AND|OR criteria
NOT criteria
(criteria)
expression (=|<>|!=|<|>|<=|>=) (expression|((ANY|ALL|SOME) subquery|(array_expression)))
expression IS [NOT] DISTINCT FROM expression
IS DISTINCT FROM
considers null values to be equivalent and never produces an UNKNOWN value.
Because the optimizer is not tuned to handle IS DISTINCT FROM
, if you use it in a join predicate that is not pushed down, the resulting plan does not perform as well a regular comparison.
expression [NOT] IS NULL
expression [NOT] IN (expression [,expression]*)|subquery
expression [NOT] LIKE pattern [ESCAPE char]
LIKE
matches the string expression against the given string pattern. The pattern may contain %
to match any number of characters, and _
to match any single character. The escape character can be used to escape the match characters %
and _
.
expression [NOT] SIMILAR TO pattern [ESCAPE char]
SIMILAR TO
is a cross between LIKE and standard regular expression syntax. %
and _
are still used, rather than .*
and .
, respectively.
Data Virtualization does not exhaustively validate SIMILAR TO
pattern values. Instead, the pattern is converted to an equivalent regular expression. Do not rely on general regular expression features when using SIMILAR TO
. If additional features are needed, use LIKE_REGEX
. Avoid the use of non-literal patterns, because Data Virtualization has a limited ability to process SQL pushdown predicates.
expression [NOT] LIKE_REGEX pattern
You can use LIKE_REGEX
with standard regular expression syntax for matching. This differs from SIMILAR TO
and LIKE
in that the escape character is no longer used. \
is already the standard escape mechanism in regular expressions, and %`
and _
have no special meaning. The runtime engine uses the JRE implementation of regular expressions. For more information, see the java.util.regex.Pattern class.
Data Virtualization does not exhaustively validate LIKE_REGEX
pattern values. It is possible to use JRE-only regular expression features that are not specified by the SQL specification. Additionally, not all sources can use the same regular expression flavor or extensions. In pushdown situations, be careful to ensure that the pattern that you use has the same meaning in Data Virtualization, and across all applicable sources.
EXISTS (subquery)
expression [NOT] BETWEEN minExpression AND maxExpression
Data Virtualization converts BETWEEN
into the equivalent form expression >= minExpression AND expression ⇐ maxExpression
.
expression
Where expression
has type Boolean.
Syntax rules
- The precedence ordering from lowest to highest is comparison, NOT, AND, OR.
- Criteria nested by parenthesis will be logically evaluated prior to evaluating the parent criteria.
Some examples of valid criteria are:
-
(balance > 2500.0)
-
100*(50 - x)/(25 - y) > z
-
concat(areaCode,concat('-',phone)) LIKE '314%1'
Null values represent an unknown value. Comparison with a null value will evaluate to unknown
, which can never be true even if not
is used.
Criteria precedence
Data Virtualization parses and evaluates conditions with higher precedence before those with lower precedence. Conditions with equal precedence are left-associative. The following table lists condition precedence from high to low:
Condition | Description |
---|---|
SQL operators | See Expressions |
EXISTS, LIKE, SIMILAR TO, LIKE_REGEX, BETWEEN, IN, IS NULL, IS DISTINCT, <, ⇐, >, >=, =, <> | Comparison |
NOT | Negation |
AND | Conjunction |
OR | Disjunction |
To prevent lookaheads, the parser does not accept all possible criteria sequences. For example, a = b is null
is not accepted, because by the left-associative parsing we first recognize a =
, then look for a common value expression. b is null
is not a valid common value expression. Thus, nesting must be used, for example, (a = b) is null
. For more information about parsing rules, see BNF for SQL grammar.
3.5. Scalar functions
Data Virtualization provides an extensive set of built-in scalar functions. For more information, see DML commands and Data types. In addition, Data Virtualization provides the capability for user-defined functions or UDFs. For information about adding UDFs, see User-defined functions in the Translator Development Guide. After you add UDFs, you can call them in the same way that you call other functions.
3.5.1. Numeric functions
Numeric functions return numeric values (integer, long, float, double, biginteger, bigdecimal). They generally take numeric values as inputs, though some take strings.
Function | Definition | Datatype constraint |
---|---|---|
+ - * / | Standard numeric operators | x in {integer, long, float, double, biginteger, bigdecimal}, return type is same as x [a] |
ABS(x) | Absolute value of x | See standard numeric operators above |
ACOS(x) | Arc cosine of x | x in {double, bigdecimal}, return type is double |
ASIN(x) | Arc sine of x | x in {double, bigdecimal}, return type is double |
ATAN(x) | Arc tangent of x | x in {double, bigdecimal}, return type is double |
ATAN2(x,y) | Arc tangent of x and y | x, y in {double, bigdecimal}, return type is double |
CEILING(x) | Ceiling of x | x in {double, float}, return type is double |
COS(x) | Cosine of x | x in {double, bigdecimal}, return type is double |
COT(x) | Cotangent of x | x in {double, bigdecimal}, return type is double |
DEGREES(x) | Convert x degrees to radians | x in {double, bigdecimal}, return type is double |
EXP(x) | e^x | x in {double, float}, return type is double |
FLOOR(x) | Floor of x | x in {double, float}, return type is double |
FORMATBIGDECIMAL(x, y) | Formats x using format y | x is bigdecimal, y is string, returns string |
FORMATBIGINTEGER(x, y) | Formats x using format y | x is biginteger, y is string, returns string |
FORMATDOUBLE(x, y) | Formats x using format y | x is double, y is string, returns string |
FORMATFLOAT(x, y) | Formats x using format y | x is float, y is string, returns string |
FORMATINTEGER(x, y) | Formats x using format y | x is integer, y is string, returns string |
FORMATLONG(x, y) | Formats x using format y | x is long, y is string, returns string |
LOG(x) | Natural log of x (base e) | x in {double, float}, return type is double |
LOG10(x) | Log of x (base 10) | x in {double, float}, return type is double |
MOD(x, y) | Modulus (remainder of x / y) | x in {integer, long, float, double, biginteger, bigdecimal}, return type is same as x |
PARSEBIGDECIMAL(x, y) | Parses x using format y | x, y are strings, returns bigdecimal |
PARSEBIGINTEGER(x, y) | Parses x using format y | x, y are strings, returns biginteger |
PARSEDOUBLE(x, y) | Parses x using format y | x, y are strings, returns double |
PARSEFLOAT(x, y) | Parses x using format y | x, y are strings, returns float |
PARSEINTEGER(x, y) | Parses x using format y | x, y are strings, returns integer |
PARSELONG(x, y) | Parses x using format y | x, y are strings, returns long |
PI() | Value of Pi | return is double |
POWER(x,y) | x to the y power | x in {double, bigdecimal, biginteger}, return is the same type as x |
RADIANS(x) | Convert x radians to degrees | x in {double, bigdecimal}, return type is double |
RAND() | Returns a random number, using generator established so far in the query or initializing with system clock if necessary. | Returns double. |
RAND(x) | Returns a random number, using new generator seeded with x. This should typically be called in an initialization query. It will only effect the random values returned by the Data Virtualization RAND function and not the values from RAND functions evaluated by sources. | x is integer, returns double. |
ROUND(x,y) | Round x to y places; negative values of y indicate places to the left of the decimal point | x in {integer, float, double, bigdecimal} y is integer, return is same type as x. |
SIGN(x) | 1 if x > 0, 0 if x = 0, -1 if x < 0 | x in {integer, long, float, double, biginteger, bigdecimal}, return type is integer |
SIN(x) | Sine value of x | x in {double, bigdecimal}, return type is double |
SQRT(x) | Square root of x | x in {long, double, bigdecimal}, return type is double |
TAN(x) | Tangent of x | x in {double, bigdecimal}, return type is double |
BITAND(x, y) | Bitwise AND of x and y | x, y in {integer}, return type is integer |
BITOR(x, y) | Bitwise OR of x and y | x, y in {integer}, return type is integer |
BITXOR(x, y) | Bitwise XOR of x and y | x, y in {integer}, return type is integer |
BITNOT(x) | Bitwise NOT of x | x in {integer}, return type is integer |
[a] The precision and scale of non-bigdecimal arithmetic function functions results matches that of Java. The results of bigdecimal operations match Java, except for division, which uses a preferred scale of max(16, dividend.scale + divisor.precision + 1), which then has trailing zeros removed by setting the scale to max(dividend.scale, normalized scale).
Parsing numeric datatypes from strings
Data Virtualization offers a set of functions you can use to parse numbers from strings. For each string, you need to provide the formatting of the string. These functions use the convention established by the java.text.DecimalFormat class to define the formats you can use with these functions. You can learn more about how this class defines numeric string formats by visiting the Sun Java Web site at the following URL for Sun Java.
For example, you could use these function calls, with the formatting string that adheres to the java.text.DecimalFormat convention, to parse strings and return the datatype you need:
Input String | Function Call to Format String | Output Value | Output Datatype |
---|---|---|---|
'$25.30' | parseDouble(cost, '$,0.00;($,0.00)') | 25.3 | double |
'25%' | parseFloat(percent, ',#0%') | 25 | float |
'2,534.1' | parseFloat(total, ',0.;-,0.') | 2534.1 | float |
'1.234E3' | parseLong(amt, '0.###E0') | 1234 | long |
'1,234,567' | parseInteger(total, ',0;-,0') | 1234567 | integer |
Formatting numeric datatypes as strings
Data Virtualization offers a set of functions you can use to convert numeric datatypes into strings. For each string, you need to provide the formatting. These functions use the convention established within the java.text.DecimalFormat class to define the formats you can use with these functions. You can learn more about how this class defines numeric string formats by visiting the Sun Java Web site at the following URL for Sun Java .
For example, you could use these function calls, with the formatting string that adheres to the java.text.DecimalFormat convention, to format the numeric datatypes into strings:
Input Value | Input Datatype | Function Call to Format String | Output String |
---|---|---|---|
25.3 | double | formatDouble(cost, '$,0.00;($,0.00)') | '$25.30' |
25 | float | formatFloat(percent, ',#0%') | '25%' |
2534.1 | float | formatFloat(total, ',0.;-,0.') | '2,534.1' |
1234 | long | formatLong(amt, '0.###E0') | '1.234E3' |
1234567 | integer | formatInteger(total, ',0;-,0') | '1,234,567' |
3.5.2. String functions
String functions generally take strings as inputs and return strings as outputs.
Unless specified, all of the arguments and return types in the following table are strings and all indexes are 1-based. The 0 index is considered to be before the start of the string.
Function | Definition | Datatype constraint |
---|---|---|
x || y | Concatenation operator | x,y in {string, clob}, return type is string or character large object (CLOB). |
ASCII(x) |
Provide ASCII value of the left most character[1] in x. The empty string will as input will return | return type is integer |
CHR(x) CHAR(x) |
Provide the character[1] for ASCII value | x in {integer} [1] For the engine’s implementations of the ASCII and CHR functions, characters are limited to UCS2 values only. For pushdown there is little consistency among sources for character values beyond character code 255. |
CONCAT(x, y) | Concatenates x and y with ANSI semantics. If x and/or y is null, returns null. | x, y in {string} |
CONCAT2(x, y) | Concatenates x and y with non-ANSI null semantics. If x and y is null, returns null. If only x or y is null, returns the other value. | x, y in {string} |
ENDSWITH(x, y) | Checks if y ends with x. If x or y is null, returns null. | x, y in {string}, returns boolean |
INITCAP(x) | Make first letter of each word in string x capital and all others lowercase. | x in {string} |
INSERT(str1, start, length, str2) | Insert string2 into string1 | str1 in {string}, start in {integer}, length in {integer}, str2 in {string} |
LCASE(x) | Lowercase of x | x in {string} |
LEFT(x, y) | Get left y characters of x | x in {string}, y in {integer}, return string |
LENGTH(x) CHAR_LENGTH(x) CHARACTER_LENGTH(x) | Length of x | return type is integer |
LOCATE(x, y) POSITION(x IN y) | Find position of x in y starting at beginning of y. | x in {string}, y in {string}, return integer |
LOCATE(x, y, z) | Find position of x in y starting at z. | x in {string}, y in {string}, z in {integer}, return integer |
LPAD(x, y) | Pad input string x with spaces on the left to the length of y. | x in {string}, y in {integer}, return string |
LPAD(x, y, z) | Pad input string x on the left to the length of y using character z. | x in {string}, y in {string}, z in {character}, return string |
LTRIM(x) | Left trim x of blank chars. | x in {string}, return string |
QUERYSTRING(path [, expr [AS name] …]) |
Returns a properly encoded query string appended to the given path. Null valued expressions are omitted, nd a null path is treated as ". Names are optional for column reference expressions. For example, | path, expr in {string}. name is an identifier. |
REPEAT(str1,instances) | Repeat string1 a specified number of times | str1 in {string}, instances in {integer} return string. |
RIGHT(x, y) | Get right y characters of x | x in {string}, y in {integer}, return string |
RPAD(input string x, pad length y) | Pad input string x with spaces on the right to the length of y | x in {string}, y in {integer}, return string |
RPAD(x, y, z) | Pad input string x on the right to the length of y using character z | x in {string}, y in {string}, z in {character}, return string |
RTRIM(x) | Right trim x of blank chars | x is string, return string |
SPACE(x) | Repeat the space character x number of times | x is integer, return string |
SUBSTRING(x, y) SUBSTRING(x FROM y) | [b] Get substring from x, from position y to the end of x | y in {integer} |
SUBSTRING(x, y, z) SUBSTRING(x FROM y FOR z) | [b] Get substring from x from position y with length z | y, z in {integer} |
TRANSLATE(x, y, z) | Translate string x by replacing each character in y with the character in z at the same position. | x in {string} |
TRIM([[LEADING|TRAILING|BOTH] [x] FROM] y) | Trim the leading, trailing, or both ends of a string y of character x. If LEADING/TRAILING/BOTH is not specified, BOTH is used. If no trim character x is specified, then the blank space ’ is used. | x in {character}, y in {string} |
UCASE(x) | Uppercase of x | x in {string} |
UNESCAPE(x) | Unescaped version of x. Possible escape sequences are \b - backspace, \t - tab, \n - line feed, \f - form feed, \r - carriage return. \uXXXX, where X is a hex value, can be used to specify any unicode character. \XXX, where X is an octal digit, can be used to specify an octal byte value. If any other character appears after an escape character, that character will appear in the output and the escape character will be ignored. | x in {string} |
[a] Non-ASCII range characters or integers used in these functions may produce different results or exceptions depending on where the function is evaluated (Data Virtualization vs. source). Data Virtualization’s uses Java default int to char and char to int conversions, which operates over UTF16 values.
[b] The substring function depending upon the source does not have consistent behavior with respect to negative from/length arguments nor out of bounds from/length arguments. The default Data Virtualization behavior is:
- Return a null value when the from value is out of bounds or the length is less than 0
- A zero from index is effective the same as 1.
- A negative from index is first counted from the end of the string.
Some sources, however, can return an empty string instead of null
, and some sources are not compatible with negative indexing.
TO_CHARS
Return a CLOB from the binary large object (BLOB) with the given encoding.
TO_CHARS(x, encoding [, wellformed])
BASE64, HEX, UTF-8-BOM and the built-in Java Charset names are valid values for the encoding [b]. x is a BLOB, encoding is a string, wellformed is a boolean, and returns a CLOB. The two argument form defaults to wellformed=true. If wellformed is false, the conversion function will immediately validate the result such that an unmappable character or malformed input will raise an exception.
TO_BYTES
Return a BLOB from the CLOB with the given encoding.
TO_BYTES(x, encoding [, wellformed])
BASE64, HEX, UTF-8-BOM and the builtin Java Charset names are valid values for the encoding [b]. x in a CLOB, encoding is a string, wellformed is a boolean and returns a BLOB. The two argument form defaults to wellformed=true. If wellformed is false, the conversion function will immediately validate the result such that an unmappable character or malformed input will raise an exception. If wellformed is true, then unmappable characters will be replaced by the default replacement character for the character set. Binary formats, such as BASE64 and HEX, will be checked for correctness regardless of the wellformed parameter.
[b] For more information about Charset names, see the Charset docs.
REPLACE
Replace all occurrences of a given string with another.
REPLACE(x, y, z)
Replace all occurrences of y with z in x. x, y, z are strings and the return value is a string.
REGEXP_REPLACE
Replace one or all occurrences of a given pattern with another string.
REGEXP_REPLACE(str, pattern, sub [, flags])
Replace one or more occurrences of pattern with sub in str. All arguments are strings and the return value is a string.
The pattern parameter is expected to be a valid Java regular expression
The flags argument can be any concatenation of any of the valid flags with the following meanings:
Flag | Name | Meaning |
---|---|---|
g | Global | Replace all occurrences, not just the first. |
m | Multi-line | Match over multiple lines. |
i | Case insensitive | Match without case sensitivity. |
Usage:
The following will return "xxbye Wxx" using the global and case insensitive options.
Example regexp_replace
regexp_replace('Goodbye World', '[g-o].', 'x', 'gi')
3.5.3. Date and time functions
Date and time functions return or operate on dates, times, or timestamps.
Date and time functions use the convention established within the java.text.SimpleDateFormat class to define the formats you can use with these functions. You can learn more about how this class defines formats by visiting the Javadocs for SimpleDateFormat.
Function | Definition | Datatype constraint |
---|---|---|
CURDATE() CURRENT_DATE[()] | Return current date - will return the same value for all invocations in the user command. | returns date. |
CURTIME() | Return current time - will return the same value for all invocations in the user command. See also CURRENT_TIME. | returns time |
NOW() | Return current timestamp (date and time with millisecond precision) - will return the same value for all invocations in the user command or procedure instruction. See also CURRENT_TIMESTAMP. | returns timestamp |
CURRENT_TIME[(precision)] | Return current time - will return the same value for all invocations in the user command. The Data Virtualization time type does not track fractional seconds, so the precision argument is effectively ignored. Without a precision is the same as CURTIME(). | returns time |
CURRENT_TIMESTAMP[(precision)] | Return current timestamp (date and time with millisecond precision) - will return the same value for all invocations with the same precision in the user command or procedure instruction. Without a precision is the same as NOW(). Since the current timestamp has only millisecond precision by default setting the precision to greater than 3 will have no effect. | returns timestamp |
DAYNAME(x) | Return name of day in the default locale | x in {date, timestamp}, returns string |
DAYOFMONTH(x) | Return day of month | x in {date, timestamp}, returns integer |
DAYOFWEEK(x) | Return day of week (Sunday=1, Saturday=7) | x in {date, timestamp}, returns integer |
DAYOFYEAR(x) | Return day number in year | x in {date, timestamp}, returns integer |
EPOCH(x) | Return seconds since the unix epoch with microsecond precision | x in {date, timestamp}, returns double |
EXTRACT(YEAR|MONTH|DAY |HOUR|MINUTE|SECOND|QUARTER|EPOCH FROM x) | Return the given field value from the date value x. Produces the same result as the associated YEAR, MONTH, DAYOFMONTH, HOUR, MINUTE, SECOND, QUARTER, EPOCH functions functions. The SQL specification also allows for TIMEZONE_HOUR and TIMEZONE_MINUTE as extraction targets. In Data Virtualization all date values are in the timezone of the server. | x in {date, time, timestamp}, epoch returns double, the others return integer |
FORMATDATE(x, y) | Format date x using format y. | x is date, y is string, returns string |
FORMATTIME(x, y) | Format time x using format y. | x is time, y is string, returns string |
FORMATTIMESTAMP(x, y) | Format timestamp x using format y. | x is timestamp, y is string, returns string |
FROM_MILLIS (millis) | Return the Timestamp value for the given milliseconds. | long UTC timestamp in milliseconds |
FROM_UNIXTIME (unix_timestamp) | Return the Unix timestamp as a String value with the default format of yyyy/mm/dd hh:mm:ss. | long Unix timestamp (in seconds) |
HOUR(x) | Return hour (in military 24-hour format). | x in {time, timestamp}, returns integer |
MINUTE(x) | Return minute. | x in {time, timestamp}, returns integer |
MODIFYTIMEZONE (timestamp, startTimeZone, endTimeZone) | Returns a timestamp based upon the incoming timestamp adjusted for the differential between the start and end time zones. If the server is in GMT-6, then modifytimezone({ts '2006-01-10 04:00:00.0'},'GMT-7', 'GMT-8') will return the timestamp {ts '2006-01-10 05:00:00.0'} as read in GMT-6. The value has been adjusted 1 hour ahead to compensate for the difference between GMT-7 and GMT-8. | startTimeZone and endTimeZone are strings, returns a timestamp |
MODIFYTIMEZONE (timestamp, endTimeZone) | Return a timestamp in the same manner as modifytimezone(timestamp, startTimeZone, endTimeZone), but will assume that the startTimeZone is the same as the server process. | Timestamp is a timestamp; endTimeZone is a string, returns a timestamp |
MONTH(x) | Return month. | x in {date, timestamp}, returns integer |
MONTHNAME(x) | Return name of month in the default locale. | x in {date, timestamp}, returns string |
PARSEDATE(x, y) | Parse date from x using format y. | x, y in {string}, returns date |
PARSETIME(x, y) | Parse time from x using format y. | x, y in {string}, returns time |
PARSETIMESTAMP(x,y) | Parse timestamp from x using format y. | x, y in {string}, returns timestamp |
QUARTER(x) | Return quarter. | x in {date, timestamp}, returns integer |
SECOND(x) | Return seconds. | x in {time, timestamp}, returns integer |
TIMESTAMPCREATE(date, time) | Create a timestamp from a date and time. | date in {date}, time in {time}, returns timestamp |
TO_MILLIS (timestamp) | Return the UTC timestamp in milliseconds. | timestamp value |
UNIX_TIMESTAMP (unix_timestamp) | Return the long Unix timestamp (in seconds). | unix_timestamp String in the default format of yyyy/mm/dd hh:mm:ss |
WEEK(x) | Return week in year 1-53. For customization information, see System Properties in the Administrator’s Guide. | x in {date, timestamp}, returns integer |
YEAR(x) | Return four-digit year | x in {date, timestamp}, returns integer |
Timestampadd
Add a specified interval amount to the timestamp.
Syntax
TIMESTAMPADD(interval, count, timestamp)
Arguments
Name | Description |
---|---|
interval | A datetime interval unit, can be one of the following keywords:
|
count | A long or integer count of units to add to the timestamp. Negative values will subtract that number of units. Long values are allowed for symmetry with TIMESTAMPDIFF - but the effective range is still limited to integer values. |
timestamp | A datetime expression. |
Example
SELECT TIMESTAMPADD(SQL_TSI_MONTH, 12,'2016-10-10') SELECT TIMESTAMPADD(SQL_TSI_SECOND, 12,'2016-10-10 23:59:59')
Timestampdiff
Calculates the number of date part intervals crossed between the two timestamps return a long value.
Syntax
TIMESTAMPDIFF(interval, startTime, endTime)
Arguments
Name | Description |
---|---|
interval | A datetime interval unit, the same as keywords used by Timestampadd. |
startTime | A datetime expression. |
endTime | A datetime expression. |
Example
SELECT TIMESTAMPDIFF(SQL_TSI_MONTH,'2000-01-02','2016-10-10') SELECT TIMESTAMPDIFF(SQL_TSI_SECOND,'2000-01-02 00:00:00','2016-10-10 23:59:59') SELECT TIMESTAMPDIFF(SQL_TSI_FRAC_SECOND,'2000-01-02 00:00:00.0','2016-10-10 23:59:59.999999')
If (endTime > startTime), a non-negative number will be returned. If (endTime < startTime), a non-positive number will be returned. The date part difference difference is counted regardless of how close the timestamps are. For example, '2000-01-02 00:00:00.0' is still considered 1 hour ahead of '2000-01-01 23:59:59.999999'.
Compatibility issues
- In SQL, Timestampdiff typically returns an integer. However the Data Virtualization implementation returns a long. You might receive an exception if you expect a value out of the integer range from a pushed down timestampdiff.
- The implementation of timestamp diff in Teiid 8.2 and earlier versions returned values based on the number of whole canonical interval approximations (365 days in a year, 91 days in a quarter, 30 days in a month, etc.) crossed. For example the difference in months between 2013-03-24 and 2013-04-01 was 0, but based upon the date parts crossed is 1. For information about backwards compatibility, see System Properties in the Adminstrator’s Guide.
Parsing date datatypes from strings
Data Virtualization does not implicitly convert strings that contain dates presented in different formats, such as '19970101' and '31/1/1996' to date-related datatypes. You can, however, use the parseDate, parseTime, and parseTimestamp functions, described in the next section, to explicitly convert strings with a different format to the appropriate datatype. These functions use the convention established within the java.text.SimpleDateFormat class to define the formats you can use with these functions. For more information about how this class defines date and time string formats, see Javadocs for SimpleDateFormat. Note that the format strings are specific to your Java default locale.
For example, you could use these function calls, with the formatting string that adheres to the java.text.SimpleDateFormat convention, to parse strings and return the datatype you need:
String | Function call to parse string |
---|---|
'1997010' | parseDate(myDateString, 'yyyyMMdd') |
'31/1/1996' | parseDate(myDateString, 'dd''/''MM''/''yyyy') |
'22:08:56 CST' | parseTime (myTime, 'HH:mm:ss z') |
'03.24.2003 at 06:14:32' | parseTimestamp(myTimestamp, 'MM.dd.yyyy''at''hh:mm:ss') |
Specifying time zones
Time zones can be specified in several formats. Common abbreviations such as EST for "Eastern standard time" are allowed but discouraged, as they can be ambiguous. Unambiguous time zones are defined in the form continent or ocean/largest city. For example, America/New_York, America/Buenos_Aires, or Europe/London. sAdditionally, you can specify a custom time zone by GMT offset: GMT[+/-]HH:MM.
For example: GMT-05:00
3.5.4. Type conversion functions
Within your queries, you can convert between datatypes using the CONVERT or CAST keyword. For more information, see Type conversions
Function | Definition |
---|---|
CONVERT(x, type) | Convert x to type, where type is a Data Virtualization Base Type |
CAST(x AS type) | Convert x to type, where type is a Data Virtualization Base Type |
These functions are identical other than syntax; CAST is the standard SQL syntax, CONVERT is the standard JDBC/ODBC syntax.
Options that are specified on the type, such as length, precision, scale, etc., are effectively ignored - the runtime is simply converting from one object type to another.
3.5.5. Choice functions
Choice functions provide a way to select from two values based on some characteristic of one of the values.
Function | Definition | Datatype constraint |
---|---|---|
COALESCE(x,y+) | Returns the first non-null parameter. | x and all y’s can be any compatible types. |
IFNULL(x,y) | If x is null, return y; else return x. | x, y, and the return type must be the same type but can be any type. |
NVL(x,y) | If x is null, return y; else return x. | x, y, and the return type must be the same type but can be any type. |
NULLIF(param1, param2) | Equivalent to case when (param1 = param2) then null else param1. | param1 and param2 must be compatable comparable types. |
IFNULL and NVL are aliases of each other. They are the same function.
3.5.6. Decode functions
Decode functions allow you to have the Data Virtualization server examine the contents of a column in a result set and alter, or decode, the value so that your application can better use the results.
Function | Definition | Datatype constraint |
---|---|---|
DECODESTRING(x, y [, z]) |
Decode column | all string |
DECODEINTEGER(x, y [, z]) |
Decode column | all string parameters, return integer |
Within each function call, you include the following arguments:
-
x
is the input value for the decode operation. This will generally be a column name. -
y
is the literal string that contains a delimited set of input values and output values. -
z
is an optional parameter on these methods that allows you to specify what delimiter the string specified in y uses.
For example, your application might query a table called PARTS
that contains a column called IS_IN_STOCK
, which contains a Boolean value that you need to change into an integer for your application to process. In this case, you can use the DECODEINTEGER
function to change the Boolean values to integers:
SELECT DECODEINTEGER(IS_IN_STOCK, 'false, 0, true, 1') FROM PartsSupplier.PARTS;
When the Data Virtualization system encounters the value false
in the result set, it replaces the value with 0.
If, instead of using integers, your application requires string values, you can use the DECODESTRING
function to return the string values you need:
SELECT DECODESTRING(IS_IN_STOCK, 'false, no, true, yes, null') FROM PartsSupplier.PARTS;
In addition to two input/output value pairs, this sample query provides a value to use if the column does not contain any of the preceding input values. If the row in the IS_IN_STOCK column does not contain true or false, the Data Virtualization server inserts a null into the result set.
When you use these DECODE functions, you can provide as many input/output value pairs if you want within the string. By default, the Data Virtualization system expects a comma delimiter, but you can add a third parameter to the function call to specify a different delimiter:
SELECT DECODESTRING(IS_IN_STOCK, 'false:no:true:yes:null',':') FROM PartsSupplier.PARTS;
You can use keyword null
in the DECODE string as either an input value or an output value to represent a null value. However, if you need to use the literal string null
as an input or output value (which means the word null appears in the column and not a null value) you can put the word in quotes: "null"
.
SELECT DECODESTRING( IS_IN_STOCK, 'null,no,"null",no,nil,no,false,no,true,yes' ) FROM PartsSupplier.PARTS;
If the DECODE function does not find a matching output value in the column and you have not specified a default value, the DECODE function will return the original value the Data Virtualization server found in that column.
3.5.7. Lookup function
The Lookup function provides a way to speed up access to values from a reference table. The Lookup function automatically caches all key and return column pairs declared in the function for the referenced table. Subsequent lookups against the same table using the same key and return columns will use the cached values. This caching accelerates response time to queries that use lookup tables, also known in business terminology as code or reference tables.
LOOKUP(codeTable, returnColumn, keyColumn, keyValue)
In the lookup table codeTable, find the row where keyColumn has the value keyValue and return the associated returnColumn value or null, if no matching keyValue is found. codeTable must be a string literal that is the fully-qualified name of the target table. returnColumn and keyColumn must also be string literals and match corresponding column names in the codeTable. The keyValue can be any expression that must match the datatype of the keyColumn. The return datatype matches that of returnColumn.
Country code lookup
lookup('ISOCountryCodes', 'CountryCode', 'CountryName', 'United States')
An ISOCountryCodes table is used to translate a country name to an ISO country code. One column, CountryName, represents the keyColumn. A second column, CountryCode, represents the returnColumn, containing the ISO code of the country. Hence, the usage of the lookup function here will provide a CountryName, shown above as `United States', and expect a CountryCode value in response.
When you call this function for any combination of codeTable, returnColumn, and keyColumn for the first time, the Data Virtualization System caches the result. The Data Virtualization System uses this cache for all queries, in all sessions, that later access this lookup table. You should generally not use the lookup function for data that is subject to updates or may be session/user specific, including row-based security and column masking effects. For more information about caching in the Lookup function, see the Caching Guide .
The keyColumn is expected to contain unique values for its corresponding codeTable. If the keyColumn contains duplicate values, an exception will be thrown.
3.5.8. System functions
System functions provide access to information in the Data Virtualization system from within a query.
COMMANDPAYLOAD
Retrieve a string from the command payload or null if no command payload was specified. The command payload is set by the TeiidStatement.setPayload
method on the Data Virtualization JDBC API extensions on a per-query basis.
COMMANDPAYLOAD([key])
If the key parameter is provided, the command payload object is cast to a java.util.Properties object, and the corresponding property value for the key is returned. If the key is not specified, the return value is the command payload object toString value.
key, return value are strings
ENV
Retrieve a system property. This function was misnamed and is included for legacy compatibility. See ENV_VAR and SYS_PROP for more appropriately named functions.
ENV(key)
call using ENV('KEY'), which returns value as string. Ex: ENV('PATH'). If a value is not found with the key passed in, a lower cased version of the key is tried as well. This function is treated as deterministic, even though it is possible to set system properties at runtime.
ENV_VAR
Retrieve an environment variable.
ENV_VAR(key)
call using ENV_VAR('KEY'), which returns value as string. Ex: ENV_VAR('USER'). The behavior of this function is platform dependent with respect to case-sensitivity. This function is treated as deterministic, even though it is possible for environment variables to change at runtime.
SYS_PROP
Retrieve an system property.
SYS_PROP(key)
call using SYS_PROP('KEY'), which returns value as string. Ex: SYS_PROP('USER'). This function is treated as deterministic, even though it is possible for system properties to change at runtime.
NODE_ID
Retrieve the node id - typically the System property value for "jboss.node.name" which will not be set for Data Virtualization embedded.
NODE_ID()
return value is string.
SESSION_ID
Retrieve the string form of the current session id.
SESSION_ID()
return value is string.
USER
Retrieve the name of the user executing the query.
USER([includeSecurityDomain])
includeSecurityDomain is a boolean. return value is string.
If includeSecurityDomain is omitted or true, then the user name will be returned with @security-domain appended.
CURRENT_DATABASE
Retrieve the catalog name of the database. The VDB name is always the catalog name.
CURRENT_DATABASE()
return value is string.
TEIID_SESSION_GET
Retrieve the session variable.
TEIID_SESSION_GET(name)
name is a string and the return value is an object.
A null name will return a null value. Typically you will use the a get wrapped in a CAST to convert to the desired type.
TEIID_SESSION_SET
Set the session variable.
TEIID_SESSION_SET(name, value)
name is a string, value is an object, and the return value is an object.
The previous value for the key or null will be returned. A set has no effect on the current transaction and is not affected by commit/rollback.
GENERATED_KEY
Get a column value from the generated keys of the last insert statement of this session returning a generated key.
Typically this function will only be used within the scope of procedure to determine a generated key value from an insert. Not all inserts provide generated keys, because not all sources return generated keys.
GENERATED_KEY()
The return value is long.
Returns the first column of the last generated key as a long value. Null is returned if there is no such generated key.
GENERATED_KEY(column_name)`
column_name
is a string. The return value is of type object.
A more general form of GENERATED_KEY
that can be used if there are more than one generated column or a type other than long. Null is returned if there is no such generated key nor matching key column.
3.5.9. XML functions
XML functions provide functionality for working with XML data. For more information, see JSONTOXML in JSON functions.
Sample data for examples
Examples provided with XML functions use the following table structure
TABLE Customer ( CustomerId integer PRIMARY KEY, CustomerName varchar(25), ContactName varchar(25) Address varchar(50), City varchar(25), PostalCode varchar(25), Country varchar(25), );
with Data
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
87 | Wartian Herkku | Pirkko Koskitalo | Torikatu 38 | Oulu | 90110 | Finland |
88 | Wellington Importadora | Paula Parente | Rua do Mercado, 12 | Resende | 08737-363 | Brazil |
89 | White Clover Markets | Karl Jablonski | 305 - 14th Ave. S. Suite 3B | Seattle | 98128 | USA |
XMLCAST
Cast to or from XML.
XMLCAST(expression AS type)
Expression or type must be XML. The return value will be typed as type
. This is the same functionality that XMLTABLE
uses to convert values to the desired runtime type, except that XMLCAST
does not work with array type targets.
XMLCOMMENT
Returns an XML comment.
XMLCOMMENT(comment)
Comment is a string. Return value is XML.
XMLCONCAT
Returns an XML with the concatenation of the given XML types.
XMLCONCAT(content [, content]*)
Content is XML. Return value is XML.
If a value is null, it will be ignored. If all values are null, null is returned.
Concatenate two or more XML fragments
SELECT XMLCONCAT( XMLELEMENT("name", CustomerName), XMLPARSE(CONTENT '<a>b</a>' WELLFORMED) ) FROM Customer c WHERE c.CustomerID = 87; ========================================================== <name>Wartian Herkku</name><a>b</a>
XMLELEMENT
Returns an XML element with the given name and content.
XMLELEMENT([NAME] name [, <NSP>] [, <ATTR>][, content]*) ATTR:=XMLATTRIBUTES(exp [AS name] [, exp [AS name]]*) NSP:=XMLNAMESPACES((uri AS prefix | DEFAULT uri | NO DEFAULT))+
If the content value is of a type other than XML, it will be escaped when added to the parent element. Null content values are ignored. Whitespace in XML or the string values of the content is preserved, but no whitespace is added between content values.
XMLNAMESPACES is used provide namespace information. NO DEFAULT is equivalent to defining the default namespace to the null uri - xmlns="". Only one DEFAULT or NO DEFAULT namespace item may be specified. The namespace prefixes xmlns and xml are reserved.
If a attribute name is not supplied, the expression must be a column reference, in which case the attribute name will be the column name. Null attribute values are ignored.
Name, prefix are identifiers. uri is a string literal. content can be any type. Return value is XML. The return value is valid for use in places where a document is expected.
Simple example
SELECT XMLELEMENT("name", CustomerName) FROM Customer c WHERE c.CustomerID = 87; ========================================================== <name>Wartian Herkku</name>
Multiple columns
SELECT XMLELEMENT("customer", XMLELEMENT("name", c.CustomerName), XMLELEMENT("contact", c.ContactName)) FROM Customer c WHERE c.CustomerID = 87; ========================================================== <customer><name>Wartian Herkku</name><contact>Pirkko Koskitalo</contact></customer>
Columns as attributes
SELECT XMLELEMENT("customer", XMLELEMENT("name", c.CustomerName, XMLATTRIBUTES( "contact" as c.ContactName, "id" as c.CustomerID ) ) ) FROM Customer c WHERE c.CustomerID = 87; ========================================================== <customer><name contact="Pirkko Koskitalo" id="87">Wartian Herkku</name></customer>
XMLFOREST
Returns an concatenation of XML elements for each content item.
XMLFOREST(content [AS name] [, <NSP>] [, content [AS name]]*)
For the definition of NSP - XMLNAMESPACES, see See XMLELEMENT in XML functions.
Name is an identifier. Content can be any type. Return value is XML.
If a name is not supplied for a content item, the expression must be a column reference, in which case the element name will be a partially escaped version of the column name.
You can use the XMLFOREST to simplify the declaration of multiple XMLELEMENTS. The XMLFOREST function allows you to process multiple columns at once.
Example
SELECT XMLELEMENT("customer", XMLFOREST( c.CustomerName AS "name", c.ContactName AS "contact" )) FROM Customer c WHERE c.CustomerID = 87; ========================================================== <customer><name>Wartian Herkku</name><contact>Pirkko Koskitalo</contact></customer>
XMLAGG
XMLAGG is an aggregate function, that takes a collection of XML elements and returns an aggregated XML document.
XMLAGG(xml)
From above example in XMLElement, each row in the Customer table table will generate row of XML if there are multiple rows matching the criteria. That will generate a valid XML, but it will not be well formed, because it lacks the root element. XMLAGG can used to correct that
Example
SELECT XMLELEMENT("customers", XMLAGG( XMLELEMENT("customer", XMLFOREST( c.CustomerName AS "name", c.ContactName AS "contact" ))) FROM Customer c ========================================================== <customers> <customer><name>Wartian Herkku</name><contact>Pirkko Koskitalo</contact></customer> <customer><name>Wellington Importadora</name><contact>Paula Parente</contact></customer> <customer><name>White Clover Markets</name><contact>Karl Jablonski</contact></customer> </customers>
XMLPARSE
Returns an XML type representation of the string value expression.
XMLPARSE((DOCUMENT|CONTENT) expr [WELLFORMED])
expr in {string, clob, blob, varbinary}. Return value is XML.
If DOCUMENT is specified then the expression must have a single root element and may or may not contain an XML declaration.
If WELLFORMED is specified then validation is skipped; this is especially useful for CLOB and BLOB known to already be valid.
SELECT XMLPARSE(CONTENT '<customer><name>Wartian Herkku</name><contact>Pirkko Koskitalo</contact></customer>' WELLFORMED); Will return a SQLXML with contents =============================================================== <customer><name>Wartian Herkku</name><contact>Pirkko Koskitalo</contact></customer>
XMLPI
Returns an XML processing instruction.
XMLPI([NAME] name [, content])
Name is an identifier. Content is a string. Return value is XML.
XMLQUERY
Returns the XML result from evaluating the given xquery.
XMLQUERY([<NSP>] xquery [<PASSING>] [(NULL|EMPTY) ON EMPTY]] PASSING:=PASSING exp [AS name] [, exp [AS name]]*
For the definition of NSP - XMLNAMESPACES, see XMLELEMENT in XML functions.
Namespaces may also be directly declared in the xquery prolog.
The optional PASSING clause is used to provide the context item, which does not have a name, and named global variable values. If the xquery uses a context item and none is provided, then an exception will be raised. Only one context item may be specified and should be an XML type. All non-context non-XML passing values will be converted to an appropriate XML type. Null will be returned if the context item evaluates to null.
The ON EMPTY clause is used to specify the result when the evaluted sequence is empty. EMPTY ON EMPTY, the default, returns an empty XML result. NULL ON EMPTY returns a null result.
xquery in string. Return value is XML.
XMLQUERY is part of the SQL/XML 2006 specification.
For more information, see XMLTABLE in FROM clause.
See also XQuery optimization.
XMLEXISTS
Returns true if a non-empty sequence would be returned by evaluating the given xquery.
XMLEXISTS([<NSP>] xquery [<PASSING>]] PASSING:=PASSING exp [AS name] [, exp [AS name]]*
For the definition of NSP - XMLNAMESPACES, see XMLELEMENT in XML functions.
Namespaces can also be directly declared in the xquery prolog.
The optional PASSING clause is used to provide the context item, which does not have a name, and named global variable values. If the xquery uses a context item and none is provided, then an exception will be raised. Only one context item may be specified and should be an XML type. All non-context non-XML passing values will be converted to an appropriate XML type. Null/Unknown will be returned if the context item evaluates to null.
xquery in string. Return value is boolean.
XMLEXISTS is part of the SQL/XML 2006 specification.
See also XQuery optimization.
XMLSERIALIZE
Returns a character type representation of the XML expression.
XMLSERIALIZE([(DOCUMENT|CONTENT)] xml [AS datatype] [ENCODING enc] [VERSION ver] [(INCLUDING|EXCLUDING) XMLDECLARATION])
Return value matches datatype. If no datatype is specified, then clob will be assumed.
The type may be character (string, varchar, clob) or binary (blob, varbinar). CONTENT is the default. If DOCUMENT is specified and the XML is not a valid document or fragment, then an exception is raised.
The encoding enc is specified as an identifier. A character serialization may not specify an encoding. The version ver is specified as a string literal. If a particular XMLDECLARATION is not specified, then the result will have a declaration only if performing a non UTF-8/UTF-16, or non version 1.0 document serialization or the underlying XML has an declaration. If CONTENT is being serialized, then the declaration will be omitted if the value is not a document or element.
See the following example that produces a BLOB of XML in UTF-16 including the appropriate byte order mark of FE FF and XML declaration.
Sample Binary Serialization
XMLSERIALIZE(DOCUMENT value AS BLOB ENCODING "UTF-16" INCLUDING XMLDECLARATION)
XMLTEXT
Returns XML text.
XMLTEXT(text)
text is a string. Return value is XML.
XSLTRANSFORM
Applies an XSL stylesheet to the given document.
XSLTRANSFORM(doc, xsl)
Doc, XSL in {string, clob, xml}. Return value is a clob.
If either argument is null, the result is null.
XPATHVALUE
Applies the XPATH expression to the document and returns a string value for the first matching result. For more control over the results and XQuery, use the XMLQUERY function. For more information, see XMLQUERY in XML functions.
XPATHVALUE(doc, xpath)
Doc in {string, clob, blob, xml}. xpath is string. Return value is a string.
Matching a non-text node will still produce a string result, which includes all descendant text nodes. If a single element is matched that is marked with xsi:nil, then null will be returned.
When the input document utilizes namespaces, it is sometimes necessary to specify XPATH that ignores namespaces:
Sample XML for xpathValue Ignoring Namespaces
<?xml version="1.0" ?> <ns1:return xmlns:ns1="http://com.test.ws/exampleWebService">Hello<x> World</x></return>
Function:
Sample xpathValue Ignoring Namespaces
xpathValue(value, '/*[local-name()="return"]')
Results in Hello World
Example: Generating hierarchical XML from flat data structure
With following table and its contents
Table { x string, y integer }
data like ['a', 1], ['a', 2], ['b', 3], ['b', 4], if you want generate a XML that looks like
<root> <x> a <y>1</y> <y>2</y> </x> <x> b <y>3</y> <y>4</y> </x> </root>
use the SQL statement in Data Virtualization as below
select xmlelement(name "root", xmlagg(p)) from (select xmlelement(name "x", x, xmlagg(xmlelement(name "y", y)) as p from tbl group by x)) as v
For more examples, see http://oracle-base.com/articles/misc/sqlxml-sqlx-generating-xml-content-using-sql.php
3.5.10. JSON functions
JSON functions provide functionality for working with JSON (JavaScript Object Notation) data.
Sample data for examples
Examples provided with XML functions use the following table structure:
TABLE Customer ( CustomerId integer PRIMARY KEY, CustomerName varchar(25), ContactName varchar(25) Address varchar(50), City varchar(25), PostalCode varchar(25), Country varchar(25), );
with Data
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
87 | Wartian Herkku | Pirkko Koskitalo | Torikatu 38 | Oulu | 90110 | Finland |
88 | Wellington Importadora | Paula Parente | Rua do Mercado, 12 | Resende | 08737-363 | Brazil |
89 | White Clover Markets | Karl Jablonski | 305 - 14th Ave. S. Suite 3B | Seattle | 98128 | USA |
JSONARRAY
Returns a JSON array.
JSONARRAY(value...)
value
is any object that can be converted to a JSON value. For more information, see JSON functions. Return value is JSON.
Null values will be included in the result as null literals.
mixed value example
jsonArray('a"b', 1, null, false, {d'2010-11-21'})
Would return
["a\"b",1,null,false,"2010-11-21"]
Using JSONARRAY on a Table
SELECT JSONARRAY(CustomerId, CustomerName) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== [88,"Wellington Importadora"] [89,"White Clover Markets"]
JSONOBJECT
Returns a JSON object.
JSONARRAY(value [as name] ...)
value
is any object that can be converted to a JSON value. For more information, see JSON functions. Return value is JSON.
Null values will be included in the result as null literals.
If a name is not supplied and the expression is a column reference, the column name will be used otherwise exprN will be used where N is the 1-based index of the value in the JSONARRAY expression.
mixed value example
jsonObject('a"b' as val, 1, null as "null")
Would return
{"val":"a\"b","expr2":1,"null":null}
Using JSONOBJECT on a Table
SELECT JSONOBJECT(CustomerId, CustomerName) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== {"CustomerId":88, "CustomerName":"Wellington Importadora"} {"CustomerId":89, "CustomerName":"White Clover Markets"}
Another example
SELECT JSONOBJECT(JSONOBJECT(CustomerId, CustomerName) as Customer) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== {"Customer":{"CustomerId":88, "CustomerName":"Wellington Importadora"}} {"Customer":{"CustomerId":89, "CustomerName":"White Clover Markets"}}
Another example
SELECT JSONOBJECT(JSONARRAY(CustomerId, CustomerName) as Customer) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== {"Customer":[88, "Wellington Importadora"]} {"Customer":[89, "White Clover Markets"]}
JSONPARSE
Validates and returns a JSON result.
JSONPARSE(value, wellformed)
value
is blob with an appropriate JSON binary encoding (UTF-8, UTF-16, or UTF-32) or a clob. wellformed is a boolean indicating that validation should be skipped. Return value is JSON.
A null for either input will return null.
JSON parse of a simple literal value
jsonParse('{"Customer":{"CustomerId":88, "CustomerName":"Wellington Importadora"}}', true)
JSONARRAY_AGG
creates a JSON array result as a Clob including null value. This is similar to JSONARRAY but aggregates its contents into single object
SELECT JSONARRAY_AGG(JSONOBJECT(CustomerId, CustomerName)) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== [{"CustomerId":88, "CustomerName":"Wellington Importadora"}, {"CustomerId":89, "CustomerName":"White Clover Markets"}]
You can also wrap array as
SELECT JSONOBJECT(JSONARRAY_AGG(JSONOBJECT(CustomerId as id, CustomerName as name)) as Customer) FROM Customer c WHERE c.CustomerID >= 88; ========================================================== {"Customer":[{"id":89,"name":"Wellington Importadora"},{"id":100,"name":"White Clover Markets"}]}
Conversion to JSON
A straight-forward, specification-compliant conversion is used for converting values into their appropriate JSON document form.
- Null values are included as the null literal.
- Values parsed as JSON or returned from a JSON construction function (JSONPARSE, JSONARRAY, JSONARRAY_AGG) will be directly appended into a JSON result.
- Boolean values are included as true/false literals.
- Numeric values are included as their default string conversion - in some circumstances if not a number or +-infinity results are allowed, invalid JSON may be obtained.
- String values are included in their escaped/quoted form.
- Binary values are not implicitly convertable to JSON values and require a specific prior to inclusion in JSON.
- All other values will be included as their string conversion in the appropriate escaped/quoted form.
JSONTOXML
Returns an XML document from JSON.
JSONTOXML(rootElementName, json)
rootElementName
is a string, json
is in {clob, blob}. Return value is XML.
The appropriate UTF encoding (8, 16LE. 16BE, 32LE, 32BE) will be detected for JSON blobs. If another encoding is used, see the TO_CHARS function in String functions.
The result is always a well-formed XML document.
The mapping to XML uses the following rules:
- The current element name is initially the rootElementName, and becomes the object value name as the JSON structure is traversed.
- All element names must be valid XML 1.1 names. Invalid names are fully escaped according to the SQLXML specification.
- Each object or primitive value will be enclosed in an element with the current name.
- Unless an array value is the root, it will not be enclosed in an additional element.
- Null values will be represented by an empty element with the attribute xsi:nil="true"
- Boolean and numerical value elements will have the attribute xsi:type set to boolean and decimal respectively.
JSON:
Sample JSON to XML for jsonToXml(’person’, x)
{"firstName" : "John" , "children" : [ "Randy", "Judy" ]}
XML:
Sample JSON to XML for jsonToXml(’person’, x)
<?xml version="1.0" ?> <person> <firstName>John</firstName> <children>Randy</children> <children>Judy<children> </person>
JSON:
Sample JSON to XML for jsonToXml('person', x) with a root array
[{"firstName" : "George" }, { "firstName" : "Jerry" }]
XML (Notice there is an extra "person" wrapping element to keep the XML well-formed):
Sample JSON to XML for jsonToXml(’person’, x) with a root array
<?xml version="1.0" ?> <person> <person> <firstName>George</firstName> </person> <person> <firstName>Jerry</firstName> </person> </person>
JSON:
Sample JSON to XML for jsonToXml(’root’, x) with an invalid name
{"/invalid" : "abc" }
XML:
Sample JSON to XML for jsonToXml(’root’, x) with an invalid name
<?xml version="1.0" ?> <root> <_x002F_invalid>abc</_x002F_invalid> </root>
prior releases defaulted incorrectly to using uXXXX escaping rather than xXXXX. If you need to rely on that behavior see the org.teiid.useXMLxEscape system property.
JsonPath
Processing of JsonPath expressions is provided by Jayway JsonPath. Please note that it uses 0-based indexing, rather than 1-based indexing. Be sure that you are familiar with the expected returns for various path expressions. For example, if a row JsonPath expression is expected to provide an array, make sure that it’s the array that you want, and not an array that would be returned automatically by an indefinite path expression.
If you encounter a situation where path names use reserved characters, such as '.', then you must use the bracketed JsonPath notation as that allows for any key, e.g. $['.key'].
For more information, see JSONTABLE.
JSONPATHVALUE
Extracts a single JSON value as a string.
JSONPATHVALUE(value, path [, nullLeafOnMissing])
value
is a clob JSON document, path
is a JsonPath string, and nullLeafOnMissing
is a Boolean. Return value is a string value of the resulting JSON.
If nullLeafOnMissing
is false (the default), then a path that evaluates to a leaf that is missing will throw an exception. If nullLeafOnMissing
is true, then a null value will be returned.
If the value is an array produced by an indefinite path expression, then only the first value will be returned.
jsonPathValue('{"key":"value"}' '$.missing', true)
Would return
null
jsonPathValue('[{"key":"value1"}, {"key":"value2"}]' '$..key')
Would return
value1
JSONQUERY
Evaluate a JsonPath expression against a JSON document and return the JSON result.
JSONQUERY(value, path [, nullLeafOnMissing])
value
is a clob JSON document, path
is a JsonPath string, and nullLeafOnMissing
is a Boolean. Return value is a JSON value.
If nullLeafOnMissing
is false (the default), then a path that evaluates to a leaf that is missing will throw an exception. If nullLeafOnMissing
is true, then a null value will be returned.
jsonPathValue('[{"key":"value1"}, {"key":"value2"}]' '$..key')
Would return
["value1","value2"]
3.5.11. Security functions
Security functions provide the ability to interact with the security system or to hash/encrypt values.
HASROLE
Whether the current caller has the Data Virtualization data role roleName
.
hasRole([roleType,] roleName)
roleName must be a string, the return type is Boolean.
The two argument form is provided for backwards compatibility. roleType
is a string and must be `data'.
Role names are case-sensitive and only match Data Virtualization Data roles. Foreign/JAAS roles/groups names are not valid for this function, unless there is corresponding data role with the same name.
MD5
Computes the MD5 hash of the value.
MD5(value)
value
must be a string or varbinary, the return type is varbinary. String values are first converted to their UTF-8 byte representation.
SHA1
Computes the SHA-1 hash of the value.
SHA1(value)
value
must be a string or varbinary, the return type is varbinary. String values are first converted to their UTF-8 byte representation.
SHA2_256
Computes the SHA-2 256 bit hash of the value.
SHA2_256(value)
value
must be a string or varbinary, the return type is varbinary. String values are first converted to their UTF-8 byte representation.
SHA2_512
Computes the SHA-2 512 bit hash of the value.
SHA2_512(value)
value
must be a string or varbinary, the return type is varbinary. String values are first converted to their UTF-8 byte representation.
AES_ENCRYPT
aes_encrypt(data, key)
AES_ENCRYPT()
allow encryption of data using the official AES (Advanced Encryption Standard) algorithm, 16 bytes(128 bit) key length, and AES/CBC/PKCS5Padding cipher algorithm with an explicit initialization vector.
The AES_ENCRYPT()
will return a BinaryType encrypted data. The argument data
is the BinaryType data to encrypt, and the argument key
is a BinaryType used in encryption.
AES_DECRYPT
aes_decrypt(data, key)
AES_DECRYPT()
allow decryption of data using the official AES (Advanced Encryption Standard) algorithm, 16 bytes(128 bit) key length, and AES/CBC/PKCS5Padding cipher algorithm expecting an explicit initialization vector.
The AES_DECRYPT()
will return a BinaryType decrypted data. The argument data
is the BinaryType data to decrypt, and the argument key
is a BinaryType used in decryption.
3.5.12. Spatial functions
Spatial functions provide functionality for working with geospatial data. Data Virtualization relies on the JTS Topology Suite to provide partial compatibility with the OpenGIS Simple Features Specification For SQL Revision 1.1. For more information about particular functions, see the Open GIS specification or the PostGIS manual.
Most Geometry capabilities is limited to two dimensions due to the WKB and WKT formats.
There might be minor differences between Data Virtualization and pushdown results that will need to be further refined.
ST_GeomFromText
Returns a geometry from a Clob in WKT format.
ST_GeomFromText(text [, srid])
text
is a CLOB, srid
is an optional integer that represents a spatial reference identifier (SRID). Return value is a geometry.
ST_GeogFromText
Returns a geography from a Clob in (E)WKT format.
ST_GeogFromText(text)
text
is a CLOB, srid
is an optional integer. Return value is a geography.
ST_GeomFromWKB/ST_GeomFromBinary
Returns a geometry from a BLOB in WKB format.
ST_GeomFromWKB(bin [, srid])
bin
is a BLOB, srid
is an optional integer. Return value is a geometry.
ST_GeomFromEWKB
Returns a geometry from a BLOB in EWKB format.
ST_GeomFromEWKB(bin)
bin
is a BLOB. Return value is a geometry. This version of the translator works with two dimensions only.
ST_GeogFromWKB
Returns a geography from a BLOB in (E)WKB format.
ST_GeomFromEWKB(bin)
bin
is a BLOB. Return value is a geography. This version of the translator works with two dimensions only.
ST_GeomFromEWKT
Returns a geometry from a character large object (CLOB) in EWKT format.
ST_GeomFromEWKT(text)
text
is a CLOB. Return value is a geometry. This version of the translator works with two dimensions only.
ST_GeomFromGeoJSON
Returns a geometry from a CLOB in GeoJSON format.
ST_GeomFromGeoJson(`text` [, srid])
text
is a CLOB, srid
is an optional integer. Return value is a geometry.
ST_GeomFromGML
Returns a geometry from a CLOB in GML2 format.
ST_GeomFromGML(text [, srid])
text
is a CLOB, srid
is an optional integer. Return value is a geometry.
ST_AsText
ST_AsText(geom)
geom
is a geometry. Return value is CLOB in WKT format.
ST_AsBinary
ST_AsBinary(geo)
geo is a geometry or geography. Return value is a binary large object (BLOB) in WKB format.
ST_AsEWKB
ST_AsEWKB(geom)
geom
is a geometry. Return value is BLOB in EWKB format.
ST_AsGeoJSON
ST_AsGeoJSON(geom)
geom
is a geometry. Return value is a CLOB with the GeoJSON value.
ST_AsGML
ST_AsGML(geom)
geom
is a geometry. Return value is a CLOB with the GML2 value.
ST_AsEWKT
ST_AsEWKT(geo)
geo
is a geometry or geography. Return value is a CLOB with the EWKT value. The EWKT value is the WKT value with the SRID prefix.
ST_AsKML
ST_AsKML(geom)
geom
is a geometry. Return value is a CLOB with the KML value. The KML value is effectively a simplified GML value and projected into SRID 4326.
&&
Returns true if the bounding boxes of geom1
and geom2
intersect.
geom1 && geom2
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Contains
Returns true if geom1
contains geom2
.
ST_Contains(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Crosses
Returns true if the geometries cross.
ST_Crosses(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Disjoint
Returns true if the geometries are disjoint.
ST_Disjoint(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Distance
Returns the distance between two geometries.
ST_Distance(geo1, geo2)
geo1
, geo2
are both geometries or geographies. Return value is a double. The geography variant must be pushed down for evaluation.
ST_DWithin
Returns true if the geometries are within a given distance of one another.
ST_DWithin(geom1, geom2, dist)
geom1
, geom2
are geometries. dist
is a double. Return value is a Boolean.
ST_Equals
Returns true if the two geometries are spatially equal. The points and order can differ, but neither geometry lies outside of the other.
ST_Equals(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Intersects
Returns true if the geometries intersect.
ST_Intersects(geo1, geo2)
geo1
, geo2
are both geometries or geographies. Return value is a Boolean. The geography variant must be pushed down for evaluation.
ST_OrderingEquals
Returns true if geom1
and geom2
have the same structure and the same ordering of points.
ST_OrderingEquals(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Overlaps
Returns true if the geometries overlap.
ST_Overlaps(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Relate
Test or return the intersection of geom1 and geom2.
ST_Relate(geom1, geom2, pattern)
geom1
, geom2
are geometries. pattern
is a nine character DE-9IM pattern string. Return value is a Boolean.
ST_Relate(geom1, geom2)
geom1
, geom2
are geometries. Return value is the nine character DE-9IM intersection string.
ST_Touches
Returns true if the geometries touch.
ST_Touches(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Within
Returns true if geom1
is completely inside geom2
.
ST_Within(geom1, geom2)
geom1
, geom2
are geometries. Return value is a Boolean.
ST_Area
Returns the area of geom.
ST_Area(geom)
geom
is a geometry. Return value is a double.
ST_CoordDim
Returns the coordinate dimensions of geom.
ST_CoordDim(geom)
geom
is a geometry. Return value is an integer between 0 and 3.
ST_Dimension
Returns the dimension of geom.
ST_Dimension(geom)
geom
is a geometry. Return value is an integer between 0 and 3.
ST_EndPoint
Returns the end Point of the LineString geom. Returns null if geom
is not a LineString.
ST_EndPoint(geom)
geom
is a geometry. Return value is a geometry.
ST_ExteriorRing
Returns the exterior ring or shell LineString of the polygon geom. Returns null if geom
is not a polygon.
ST_ExteriorRing(geom)
geom
is a geometry. Return value is a geometry.
ST_GeometryN
Returns the nth geometry at the given 1-based index in geom. Returns null if a geometry at the given index does not exist. Non-collection types return themselves at the first index.
ST_GeometryN(geom, index)
geom
is a geometry. index is an integer. Return value is a geometry.
ST_GeometryType
Returns the type name of geom
as ST_name. Where name will be LineString, Polygon, Point etc.
ST_GeometryType(geom)
geom
is a geometry. Return value is a string.
ST_HasArc
Tests if the geometry has a circular string. Reports false
, because the translator does not work with curved geometry types.
ST_HasArc(geom)
geom
is a geometry. Return value is a geometry.
ST_InteriorRingN
Returns the nth interior ring LinearString geometry at the given 1-based index in geom. Returns null if a geometry at the given index does not exist, or if geom
is not a polygon.
ST_InteriorRingN(geom, index)
geom
is a geometry. index is an integer. Return value is a geometry.
ST_IsClosed
Returns true if LineString geom
is closed. Returns false if geom
is not a LineString
ST_IsClosed(geom)
geom
is a geometry. Return value is a Boolean.
ST_IsEmpty
Returns true if the set of points is empty.
ST_IsEmpty(geom)
geom
is a geometry. Return value is a Boolean.
ST_IsRing
Returns true if the LineString geom
is a ring. Returns false if geom
is not a LineString.
ST_IsRing(geom)
geom
is a geometry. Return value is a Boolean.
ST_IsSimple
Returns true if the geom
is simple.
ST_IsSimple(geom)
geom
is a geometry. Return value is a Boolean.
ST_IsValid
Returns true
if the geom
is valid.
ST_IsValid(geom)
geom
is a geometry. Return value is a Boolean.
ST_Length
Returns the length of a (Multi)LineString, otherwise returns 0.
ST_Length(geo)
geo
is a geometry or a geography. Return value is a double. The geography variant must be pushed down for evaluation.
ST_NumGeometries
Returns the number of geometries in geom
. Will return 1 if not a geometry collection.
ST_NumGeometries(geom)
geom
is a geometry. Return value is an integer.
ST_NumInteriorRings
Returns the number of interior rings in the polygon geometry. Returns null if geom
is not a polygon.
ST_NumInteriorRings(geom)
geom
is a geometry. Return value is an integer.
ST_NunPoints
Returns the number of points in geom
.
ST_NunPoints(geom)
geom
is a geometry. Return value is an integer.
ST_PointOnSurface
Returns a point that is guaranteed to be on the surface of geom.
ST_PointOnSurface(geom)
geom
is a geometry. Return value is a point geometry.
ST_Perimeter
Returns the perimeter of the (Multi)Polygon geom. Will return 0 if geom
is not a (Multi)Polygon
ST_Perimeter(geom)
geom
is a geometry. Return value is a double.
ST_PointN
Returns the nth point at the given 1-based index in geom. Returns null if a point at the given index does not exist or if geom
is not a LineString.
ST_PointN(geom, index)
geom
is a geometry. index is an integer. Return value is a geometry.
ST_SRID
Returns the SRID for the geometry.
ST_SRID(geo)
geo
is a geometry or geography. Return value is an integer. A 0 value rather than null will be returned for an unknown SRID on a non-null geometry.
ST_SetSRID
Set the SRID for the given geometry.
ST_SetSRID(geo, srid)
geo
is a geometry or geography. srid
is an integer. Return value is the same as the value of geo
. Only the SRID metadata of is modified. No transformation is performed.
ST_StartPoint
Returns the start Point of the LineString geom. Returns null if geom
is not a LineString.
ST_StartPoint(geom)
geom
is a geometry. Return value is a geometry.
ST_X
Returns the X ordinate value, or null if the point is empty. Throws an exception if the geometry is not a point.
ST_X(geom)
geom
is a geometry. Return value is a double.
ST_Y
Returns the Y ordinate value, or null if the point is empty. Throws an exception if the geometry is not a point.
ST_Y(geom)
geom
is a geometry. Return value is a double.
ST_Z
Returns the Z ordinate value, or null if the point is empty. Throws an exception if the geometry is not a point. Typically returns null
because the translator does not work with more than two dimensions.
ST_Z(geom)
geom
is a geometry. Return value is a double.
ST_Boundary
Computes the boundary of the given geometry.
ST_Boundary(geom)
geom
is a geometry. Return value is a geometry.
ST_Buffer
Computes the geometry that has points within the given distance of geom
.
ST_Buffer(geom, distance)
geom
is a geometry. distance
is a double. Return value is a geometry.
ST_Centroid
Computes the geometric center point of geom.
ST_Centroid(geom)
geom
is a geometry. Return value is a geometry.
ST_ConvexHull
Return the smallest convex polygon that contains all of the points in geometry.
ST_ConvexHull(geom)
geom
is a geometry. Return value is a geometry.
ST_CurveToLine
Converts a CircularString/CurvedPolygon to a LineString/Polygon. Not currently implemented in Data Virtualization.
ST_CurveToLine(geom)
geom
is a geometry. Return value is a geometry.
ST_Difference
Computes the closure of the point set of the points contained in geom1
that are not in geom2
.
ST_Difference(geom1, geom2)
geom1
, geom2
are geometries. Return value is a geometry.
ST_Envelope
Computes the 2D bounding box of the given geometry.
ST_Envelope(geom)
geom
is a geometry. Return value is a geometry.
ST_Force_2D
Removes the z coordinate value if present.
ST_Force_2D(geom)
geom
is a geometry. Return value is a geometry.
ST_Intersection
Computes the point set intersection of the points contained in geom1
and in geom2
.
ST_Intersection(geom1, geom2)
geom1
, geom2
are geometries. Return value is a geometry.
ST_Simplify
Simplifies a geometry using the Douglas-Peucker algorithm, but may oversimplify to an invalid or empty geometry.
ST_Simplify(geom, distanceTolerance)
geom
is a geometry. distanceTolerance
is a double. Return value is a geometry.
ST_SimplifyPreserveTopology
Simplifies a geometry using the Douglas-Peucker algorithm. Will always return a valid geometry.
ST_SimplifyPreserveTopology(geom, distanceTolerance)
geom
is a geometry. distanceTolerance
is a double. Return value is a geometry.
ST_SnapToGrid
Snaps all points in the geometry to grid of given size.
ST_SnapToGrid(geom, size)
geom
is a geometry. size is a double. Return value is a geometry.
ST_SymDifference
Return the part of geom1 that does not intersect with geom2 and vice versa.
ST_SymDifference(geom1, geom2)
geom1
, geom2
are geometry. Return value is a geometry.
ST_Transform
Transforms the geometry value from one coordinate system to another.
ST_Transform(geom, srid)
geom
is a geometry. srid
is an integer. Return value is a geometry. The srid
value and the SRID of the geometry value must exist in the SPATIAL_REF_SYS view.
ST_Union
Return a geometry that represents the point set containing all of geom1
and geom2
.
ST_Union(geom1, geom2)
geom1
, geom2
are geometries. Return value is a geometry.
ST_Extent
Computes the 2D bounding box around all of the geometry values. All values should have the same SRID.
ST_Extent(geom)
geom
is a geometry. Return value is a geometry.
ST_Point
Retuns the Point for the given coordinates.
ST_Point(x, y)
x and y are doubles. Return value is a Point geometry.
ST_Polygon
Returns the Polygon with the given shell and SRID.
ST_Polygon(geom, srid)
geom
is a linear ring geometry and srid
is an integer. Return value is a Polygon geometry.
3.5.13. Miscellaneous functions
Documents additional functions and those contributed by other projects.
array_get
Returns the object value at a given array index.
array_get(array, index)
array
is the object type, index
must be an integer, and the return type is an object.
1-based indexing is used. The actual array value should be a java.sql.Array or java array type. A null is returned if either argument is null, or if the index is out of bounds.
array_length
Returns the length for a given array.
array_length(array)
array
is the object type, and the return type is integer.
The actual array value should be a java.sql.Array or java array type. An exception is thrown if the array value is the wrong type.
uuid
Returns a universally unique identifier.
uuid()
The return type is string.
Generates a type 4 (pseudo randomly generated) UUID using a cryptographically strong random number generator. The format is XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX where each X is a hex digit.
Data quality functions
Data Quality functions are contributed by the ODDQ Project. The functions are prefixed with osdq.
, but can be called without the prefix.
osdq.random
Returns the randomized string. For example, jboss teiid
may randomize to jtids soibe
.
random(sourceValue)
The sourceValue
is the string to be randomized.
osdq.digit
Returns digit characters of the string. For example, a1 b2 c3 d4
becomes 1234
.
digit(sourceValue)
The sourceValue
is the string from which you want to extract digit characters.
osdq.whitespaceIndex
Returns the index of the first whitespace. For example, jboss teiid
will return 5
.
whitespaceIndex(sourceValue)
The sourceValue is the string from which you want to find the whitespace index.
osdq.validCreditCard
Check whether a credit card number is valid. Returns true
if it matches credit card logic and checksum.
validCreditCard(cc)
cc
is the credit card number string to check.
osdq.validSSN
Check whether a social security number (SSN) is valid. Returns true
if it matches SSN logic.
validSSN(ssn)
ssn
is the social security number string to check.
osdq.validPhone
Check whether a phone number is valid. Returns true
if the number matches phone logic. Numbers must contain more than 8, but less than 12 characters, and cannot start with 000
.
validPhone(phone)
`phone
is the phone number string need to check.
osdq.validEmail
Check whether an email address is valid. Returns true
if valid.
validEmail(email)
email
is the email address string to check.
osdq.cosineDistance
Returns the float distance between two strings based on the Cosine Similarity algorithm.
cosineDistance(a, b)
a
and b
are strings for which you want to calculate the distance.
osdq.jaccardDistance
Returns the float distance between two strings, based on the Jaccard similarity algorithm.
jaccardDistance(a, b)
The a
and b
are strings for which you want to calculate the distance.
osdq.jaroWinklerDistance
Returns the float distance between two strings based on the Jaro-Winkler algorithm.
jaroWinklerDistance(a, b)
The a
and b
are strings for which you want to calculate the distance.
osdq.levenshteinDistance
Returns the float distance between two strings based on the Levenshtein algorithm.
levenshteinDistance(a, b)
The a
and b
are strings for which you want to calculate the distance.
osdq.intersectionFuzzy
Returns the set of unique elements from the first set with cosine distance less than the specified value to every member of the second set.
intersectionFuzzy(a, b)
a
and b
are string arrays. c
is a float representing the distance, such that 0.0 or less will match any and > 1.0 will match exact.
osdq.minusFuzzy
Returns the set of unique elements from the first set with cosine distance less than the specified value to every member of the second set.
minusFuzzy(a, b, c)
a
and b
are string arrays. c
is a float representing the distance, such that 0.0 or less will match any and > 1.0 will match exact.
osdq.unionFuzzy
Returns the set of unique elements that contains members from the first set and members of the second set that have a cosine distance less than the specified value to every member of the first set.
unionFuzzy(a, b, c)
a
and b
are string arrays. c
is a float representing the distance, such that 0.0 or less will match any and > 1.0 will match exact.
3.5.14. Nondeterministic function handling
Data Virtualization categorizes functions by varying degrees of determinism. When a function is evaluated and to what extent the result can be cached are based upon its determinism level.
- Deterministic
- The function always returns the same result for the given inputs. Deterministic functions are evaluated by the engine as soon as all input values are known, which may occur as soon as the rewrite phase. Some functions, such as the lookup function, are not truly deterministic, but are treated as such for performance. All functions that are not categorized according to the remaining items in this list are considered deterministic.
- User Deterministic
-
The function returns the same result for the given inputs for the same user. This includes the
hasRole
anduser
functions. User deterministic functions are evaluated by the engine as soon as all input values are known, which may occur as soon as the rewrite phase. If a user deterministic function is evaluated during the creation of a prepared processing plan, then the resulting plan will be cached only for the user. - Session Deterministic
-
The function returns the same result for the given inputs under the same user session. This category includes the
env
function. Session deterministic functions are evaluated by the engine as soon as all input values are known, which may occur as soon as the rewrite phase. If a session deterministic function is evaluated during the creation of a prepared processing plan, then the resulting plan will be cached only for the user’s session. - Command Deterministic
-
The result of function evaluation is only deterministic within the scope of the user command. This category include the
curdate
,curtime
,now
, andcommandpayload
functions. Command deterministic functions are delayed in evaluation until processing to ensure that even prepared plans utilizing these functions will be executed with relevant values. Command deterministic function evaluation will occur prior to pushdown. However, multiple occurrences of the same command deterministic time function are not guaranteed to evaluate to the same value. - Nondeterministic
-
The result of function evaluation is fully nondeterministic. This category includes the
rand
function and UDFs marked asnondeterministic
. Nondeterministic functions are delayed in evaluation until processing with a preference for pushdown. If the function is not pushed down, then it may be evaluated for every row in it’s execution context (for example, if the function is used in the select clause).
Uncorrelated subqueries will be treated as deterministic regardless of the functions used within them.
3.6. DML commands
You can use SQL in Data Virtualization to issue queries and define view transformations. For more information about how SQL is used in virtual procedures and update procedures, see Procedure language. Nearly all these features follow standard SQL syntax and functionality, so you can use any SQL reference for more information.
There are 4 basic commands for manipulating data in SQL, corresponding to the create, read, update, and delete (CRUD) operations: INSERT, SELECT, UPDATE, and DELETE. A MERGE statement acts as a combination of INSERT and UPDATE.
You can also execute procedures by using the EXECUTE command, procedural relational command. For more information, see Procedural relational command, or Anonymous procedure block.
3.6.1. Set operations
You can use the SQL UNION
, UNION ALL
, INTERSECT
, and EXCEPT
set operations in Data Virtualization to combine the results of query expressions.
Usage:
queryExpression (UNION|INTERSECT|EXCEPT) [ALL] queryExpression [ORDER BY...]
Syntax Rules:
- The output columns will be named by the output columns of the first set operation branch.
-
Each
SELECT
must have the same number of output columns and compatible data types for each relative column. Data type conversion is performed if data types are inconsistent and implicit conversions exist. -
If
UNION
,INTERSECT
, orEXCEPT
is specified withoutall
, then the output columns must be comparable types. -
You cannot use the SQL
INTERSECT ALL
orEXCEPT ALL
operators.
3.6.2. SELECT command
The SELECT command is used to retrieve records for any number of relations.
A SELECT command can contain the following clauses:
- WITH …
- SELECT …
- FROM …
- WHERE …
- GROUP BY …
- HAVING …
- ORDER BY …
- (LIMIT …) | ([OFFSET …] [FETCH …])
- OPTION …
Except for the OPTION clause, all of the preceding clauses are defined by the SQL specification. The specification also specifies the order in which these clauses are logically processed. Processing occurs in stages, with each stage passing a set of rows to the following stage. The processing model is logical, and does not represent the way that a database engine performs the processing, but it is a useful model for understanding how SQL works. The SELECT command processes clauses in the following stages:
- Stage 1: WITH clause
- Gathers all rows from all with items in the order listed. Subsequent WITH items and the main query can reference a WITH item as if it were a table.
- Stage 2: FROM clause
- Gathers all rows from all tables involved in the query and logically joins them with a Cartesian product to produce a single large table with all columns from all tables. Joins and join criteria are then applied to filter rows that do not match the join structure.
- Stage 3: WHERE clause
- Applies a criteria to every output row from the FROM stage, further reducing the number of rows.
- Stage 4: GROUP BY clause
- Groups sets of rows with matching values in the GROUP BY columns.
- Stage 5: HAVING clause
- Applies criteria to each group of rows. Criteria can only be applied to columns that will have constant values within a group (those in the grouping columns or aggregate functions applied across the group).
- Stage 6: SELECT clause
- Specifies the column expressions that should be returned from the query. Expressions are evaluated, including aggregate functions that are based on the groups of rows, which will no longer exist after this point. The output columns are named using either column aliases or an implicit name determined by the engine. If SELECT DISTINCT is specified, duplicate removal is performed on the rows being returned from the SELECT stage.
- Stage 7: ORDER BY clause
- Sorts the rows returned from the SELECT stage as desired. Supports sorting on multiple columns in specified order, ascending or descending. The output columns will be identical to those columns returned from the SELECT stage and will have the same name.
- Stage 8: LIMIT clause
- Returns only the specified rows (with skip and limit values).
The preceding model helps to understand how SQL works. For example, given that the SELECT clause assigns aliases to columns, it makes sense that the subsequent ORDER BY clause must use those aliases to reference columns. Without knowledge of the processing model, this can be somewhat confusing. Seen in light of the model, it is clear that the ORDER BY stage is the only stage occurring after the SELECT stage, which is where the columns are named. Because the WHERE clause is processed before the SELECT, the columns have not yet been named and the aliases are not yet known.
The explicit table syntax TABLE x
may be used as a shortcut for SELECT * FROM x
.
3.6.3. VALUES command
The VALUES command is used to construct a simple table.
Example syntax
VALUES (value,...)
VALUES (value,...), (valueX,...) ...
A VALUES command with a single value set is equivalent to SELECT value, …
. A VALUES command with multiple values sets is equivalent to a UNION ALL of simple SELECTs, for example SELECT value, …. UNION ALL SELECT valueX, …
.
3.6.4. Update commands
Update commands report integer update counts. Update commands can report a maximum integer value of (2^31 -1). If you update a greater number of rows, the commands report the maximum integer value.
3.6.4.1. INSERT command
The INSERT command is used to add a record to a table.
Example syntax
INSERT INTO table (column,...) VALUES (value,...)
INSERT INTO table (column,...) query
3.6.4.2. UPDATE command
The UPDATE command is used to modify records in a table. The operation results in 1 or more records being updated, or in no records being updated if none match the criteria.
Example syntax
UPDATE table [[AS] alias] SET (column=value,...) [WHERE criteria]
3.6.4.3. DELETE command
The DELETE command is used to remove records from a table. The operation results in 1 or more records being deleted, or in no records being deleted if none match the criteria.
Example syntax
DELETE FROM table [[AS] alias] [WHERE criteria]
3.6.4.4. UPSERT (MERGE) command
The UPSERT
(or MERGE
) command is used to add or update records. The non-ANSI version of UPSERT
that is implemented in Data Virtualization is a modified INSERT statement that requires that the target table has a primary key, and that the target columns cover the primary key. Before it performs an INSERT
, the UPSERT
operation checks whether a row exists, and if it does, UPSERT
updates the current row rather than inserting a new one.
Example syntax
UPSERT INTO table [[AS] alias] (column,...) VALUES (value,...)
UPSERT INTO table (column,...) query
If an UPSERT
statement is not pushed to the source, it is broken down into the respective INSERT and UPDATE operations. The target database system must support extended architecture (XA) to guarantee transaction atomicity.
3.6.4.5. EXECUTE command
The EXECUTE command is used to execute a procedure, such as a virtual procedure or a stored procedure. Procedures can have zero or more scalar input parameters. The return value from a procedure is a result set, or the set of inout/out/return scalars.
You can use the following short forms of the EXECUTE command:
- EXEC
- CALL
Example syntax
EXECUTE proc()
CALL proc(value, ...)
Named parameter syntax
EXECUTE proc(name1=>value1,name4=>param4, ...)
Syntax rules
- The default order of parameter specification is the same as how they are defined in the procedure definition.
- You can specify the parameters in any order by name. Parameters that have default values, or that are nullable in the metadata, can be omitted from the named parameter call, and will have the appropriate value passed at runtime.
- Positional parameters that have default values or that are nullable in the metadata, can be omitted from the end of the parameter list and will have the appropriate value passed at runtime.
- If the procedure does not return a result set, the values from the RETURN, OUT, and IN_OUT parameters are returned as a single row when used as an inline view query.
- A VARIADIC parameter may be repeated 0 or more times as the last positional argument.
3.6.4.6. Procedural relational command
Procedural relational commands use the syntax of a SELECT to emulate an EXEC. In a procedural relational command, a procedure group name is used in a FROM clause in place of a table. That procedure is executed in place of a normal table access if all of the necessary input values can be found in criteria against the procedure. Each combination of input values that is found in the criteria results in the execution of the procedure.
Example syntax
select * from proc
select output_param1, output_param2 from proc where input_param1 = 'x'
select output_param1, output_param2 from proc, table where input_param1 = table.col1 and input_param2 = table.col2
Syntax rules
The procedure as a table projects the same columns as an EXEC with the addition of the input parameters. For procedures that do not return a result set, IN_OUT columns are projected as two columns:
- One to represents the output value.
- One with the name {column name}_IN that represents the input of the parameter.
-
Input values are passed via criteria. Values can be passed by
=
,is null
, or asin
predicates. Disjuncts are not allowed. It is also not possible to pass the value of a non-comparable column through an equality predicate. - The procedure view automatically has an access pattern on its IN and IN_OUT parameters. The access pattern allows the procedure view to be planned correctly as a dependent join when necessary, or to fail when sufficient criteria cannot be found.
- Procedures that contain duplicate names between the parameters (IN, IN_OUT, OUT, RETURN) and the result set columns cannot be used in a procedural relational command.
- If there is already a table or view with the same name as the procedure, then it cannot be invoked via procedural relational syntax.
- Default values for IN or IN_OUT parameters are not used if there is no criteria present for a given input. Default values are only valid for named procedure syntax. For more information, see EXECUTE.
The preceding issues do not apply when you use a nested table reference. For more information, see Nested table reference in FROM clause.
Multiple execution
The use of in
or join criteria can result in a procedure being executed multiple times.
3.6.4.7. Anonymous procedure block
You can execute a procedure language block as a user command. This can be an advantage in situations in which a virtual procedure does not exist, but a set of processes can be carried out on the server side. For more information about the language for defining virtual procedures, see Procedure language.
Example syntax
begin insert into pm1.g1 (e1, e2) select ?, ?; select rowcount; end;
Syntax rules
-
You can use
in
parameters with prepared/callable statement parameters, as shown in the preceding example, which uses?
parameter. -
You cannot use
out
parameters in an anonymous procedure block. As a workaround, you can use session variables as needed. - Anonymous procedure blocks do not return data as output parameters.
- A single result is returned if any of the statements returns a result set. All returnable result sets must have a matching number of columns and types. To indicate that a statement is not intended to provide a result set, use the WITHOUT RETURN clause.
3.6.5. Subqueries
A subquery is a SQL query embedded within another SQL query. The query containing the subquery is the outer query.
Subquery types:
- Scalar subquery - a subquery that returns only a single column with a single value. Scalar subqueries are a type of expression and can be used where single valued expressions are expected.
- Correlated subquery - a subquery that contains a column reference to from the outer query.
- Uncorrelated subquery - a subquery that contains no references to the outer sub-query.
Inline views
Subqueries in the FROM clause of the outer query (also known as "inline views") can return any number of rows and columns. This type of subquery must always be given an alias. An inline view is nearly identical to a traditional view. See also WITH Clause.
Example subquery in FROM clause (inline view)
SELECT a FROM (SELECT Y.b, Y.c FROM Y WHERE Y.d = '3') AS X WHERE a = X.c AND b = X.b
Subqueries can appear anywhere where an expression or criteria is expected.
You can use subqueries in quantified criteria, the EXISTS
predicate, the IN
predicate, and as Scalar subqueries.
Example subquery in WHERE using EXISTS
SELECT a FROM X WHERE EXISTS (SELECT 1 FROM Y WHERE c=X.a)
Example quantified comparison subqueries
SELECT a FROM X WHERE a >= ANY (SELECT b FROM Y WHERE c=3) SELECT a FROM X WHERE a < SOME (SELECT b FROM Y WHERE c=4) SELECT a FROM X WHERE a = ALL (SELECT b FROM Y WHERE c=2)
Example IN subquery
SELECT a FROM X WHERE a IN (SELECT b FROM Y WHERE c=3)
See also Subquery Optimization.
3.6.6. WITH clause
Data Virtualization provides access to common table expressions via the WITH
clause. You can reference WITH
clause items as tables in subsequent WITH clause items, and in the main query. You can think of the WITH
clause as providing query-scoped temporary tables.
Usage
WITH name [(column, ...)] AS [/*+ no_inline|materialize */] (query expression) ...
Syntax rules
- All of the projected column names must be unique. If they are not unique, then the column name list must be provided.
- If the columns of the WITH clause item are declared, then they must match the number of columns projected by the query expression.
- Each WITH clause item must have a unique name.
-
The optional
no_inline
hint indicates to the optimizer that the query expression should not be substituted as an inline view where referenced. It is possible with no_inline for multiple evaluations of the common table as needed by source queries. -
The optional
materialize
hint requires that the common table be created as a temporary table in Data Virtualization. This forces a single evaluation of the common table.
The WITH clause is also subject to optimization and its entries might not be processed if they are not needed in the subsequent query.
Common tables are aggressively inlined to enhance the possibility of pushdown. If a common table is only referenced a single time in the main query, it is likely to be inlined. In some situations, such as when you use a common table to prevent n-many-processing of a non-pushdown, correlated subquery, you might need to include the no_inline
or materialize
hint.
Examples
WITH n (x) AS (select col from tbl) select x from n, n as n1
WITH n (x) AS /*+ no_inline */ (select col from tbl) select x from n, n as n1
Recursive common table expressions
A recursive common table expression is a special form of a common table expression that is allowed to refer to itself to build the full common table result in a recursive or iterative fashion.
Usage
WITH name [(column, ...)] AS (anchor query expression UNION [ALL] recursive query expression) ...
The recursive query expression is allowed to refer to the common table by name. The anchor query expression is executed first during processing. Results are added to the common table and are referenced for the execution of the recursive query expression. The process is repeated against the new results until there are no more intermediate results.
Non-terminating, recursive common table expressions can lead to excessive processing.
By default, to prevent runaway processing of a recursive common table expression, processing is limited to 10000 iterations. Recursive common table expressions that are pushed down are not subject to this limit, but could be subject to other source-specific limits. You can modify the limit by setting the session variable teiid.maxRecursion
to a larger integer value. After the limit is exceeded, an exception is thrown.
The following example fails, because the recursion limit is reached before processing completes.
SELECT teiid_session_set('teiid.maxRecursion', 25); WITH n (x) AS (values('a') UNION select chr(ascii(x)+1) from n where x < 'z') select * from n
3.6.7. SELECT clause
SQL queries that start with the SELECT
keyword and are often referred to as SELECT statements. YOu can use most of the standard SQL query constructs in Data Virtualization.
Usage
SELECT [DISTINCT|ALL] ((expression [[AS] name])|(group identifier.STAR))*|STAR ...
Syntax Rules
- Aliased expressions are only used as the output column names and in the ORDER BY clause. They cannot be used in other clauses of the query.
- DISTINCT may only be specified if the SELECT symbols are comparable.
3.6.8. FROM clause
The FROM clause specifies the target tables for SELECT, UPDATE, and DELETE statements.
Example Syntax:
- FROM table [[AS] alias]
- FROM table1 [INNER|LEFT OUTER|RIGHT OUTER|FULL OUTER] JOIN table2 ON join-criteria
- FROM table1 CROSS JOIN table2
- FROM (subquery) [AS] alias
- FROM TABLE(subquery) [AS] alias. For more information, see Nested tables
- FROM table1 JOIN /*+ MAKEDEP */ table2 ON join-criteria
- FROM table1 JOIN /*+ MAKENOTDEP */ table2 ON join-criteria
- FROM /*+ MAKEIND */ table1 JOIN table2 ON join-criteria
- FROM /*+ NO_UNNEST */ vw1 JOIN table2 ON join-criteria
- FROM table1 left outer join /*+ optional */ table2 ON join-criteria. For more information, see Optional join in Federated optimizations.
- FROM TEXTTABLE… For more information, see TEXTTABLE.
- FROM XMLTABLE… For more information, see XMLTABLE.
- FROM ARRAYTABLE… For more information, see ARRAYTABLE.
- FROM OBJECTTABLE… For more information, see OBJECTTABLE.
- FROM JSONTABLE… For more information, see JSONTABLE.
- FROM SELECT… For more information, see Inline views in Subqueries.
From clause hints
From clause hints are typically specified in a comment block preceding the affected clause. MAKEDEP and MAKENOTDEP may also appear after in non-comment form after the affected clause. If multiple hints apply to that clause, the hints should be placed in the same comment block.
Example hint
FROM /*+ MAKEDEP PRESERVE */ (tbl1 inner join tbl2 inner join tbl3 on tbl2.col1 = tbl3.col1 on tbl1.col1 = tbl2.col1), tbl3 WHERE tbl1.col1 = tbl2.col1
Dependent join hints
MAKEIND
, MAKEDEP
, and MAKENOTDEP
are hints that you can use to control dependent join behavior. Use them only in situations where the optimizer does not choose the most optimal plan based upon query structure, metadata, and costing information. The hints can appear in a comment that follows the FROM
keyword. The hints can be specified against any FROM
clause, not just a named table.
- MAKEIND
- Indicates that the clause should be the independent (feeder) side of a dependent join.
- MAKEDEP
- Indicates that the clause should be the dependent (filtered) side of a join.
- MAKENOTDEP
- Prevents the clause from being the dependent (filtered) side of a join.
You can use the following optional MAX
and JOIN
arguments with MAKEDEP
and MAKEIND
:
- MAKEDEP(JOIN)
- Indicates that the entire join should be pushed.
- MAKEDEP(NO JOIN)
- Indicates that the entire join should not be pushed.
- MAKEDEP(MAX:val)
- Indicates that the dependent join should only be performed if there are less than the maximum number of values from the independent side.
Other hints
NO_UNNEST can be specified against a subquery FROM clause or view to instruct the planner to not to merge the nested SQL in the surrounding query. This process is known as view flattening. This hint only applies to Data Virtualization planning and is not passed to source queries. NO_UNNEST can appear in a comment that follows the FROM keyword.
The PRESERVE hint can be used against an ANSI join tree to preserve the structure of the join, rather than allowing the Data Virtualization optimizer to reorder the join. This is similar in function to the Oracle ORDERED or MySQL STRAIGHT_JOIN hints.
Example PRESERVE hint
FROM /*+ PRESERVE */ (tbl1 inner join tbl2 inner join tbl3 on tbl2.col1 = tbl3.col1 on tbl1.col1 = tbl2.col1)
3.6.8.1. Nested tables
Nested tables can appear in a FROM
clause with the TABLE
keyword. They are an alternative to using a view with normal join semantics. The columns projected from a command contained in a nested table can be used in join criteria, WHERE clauses, and other contexts where you can use FROM clause projected columns.
A nested table can have correlated references to preceding FROM
clause column references as long as INNER
and LEFT OUTER
joins are used. This is especially useful in cases where then nested expression is a procedure or function call.
Valid nested table example
select * from t1, TABLE(call proc(t1.x)) t2
Invalid nested table example
The following nested table example is invalid, because t1
appears after the nested table in the FROM clause:
select * from TABLE(call proc(t1.x)) t2, t1
Using a correlated nested table can result in multiple executions of the table expression — one for each correlated row.
3.6.8.2. XMLTABLE
The XMLTABLE function uses XQuery to produce tabular output. The XMLTABLE function is implicitly a nested table and it can be used within FROM clauses. XMLTABLE is part of the SQL/XML 2006 specification.
Usage
XMLTABLE([<NSP>,] xquery-expression [<PASSING>] [COLUMNS <COLUMN>, ... ]) AS name
COLUMN := name (FOR ORDINALITY | (datatype [DEFAULT expression] [PATH string]))
For the definition of NSP - XMLNAMESPACES, see XMLELEMENT in XML functions. For the definition of PASSING, see XMLQUERY in XML functions.
See also XQuery optimization.
Parameters
- The optional XMLNAMESPACES clause specifies the namepaces that you can use in the XQuery and COLUMN path expressions.
- The xquery-expression must be a valid XQuery. Each sequence item returned by the xquery is used to create a row of values as defined by the COLUMNS clause.
If COLUMNS is not specified, that is equivalent to a COLUMNS clause that returns the entire item as an XML value, as in the following example:
"COLUMNS OBJECT_VALUE XML PATH '."'
- FOR ORDINALITY columns are typed as integers and return 1-based item numbers as their value.
- Each non-ordinality column specifies a type, and optionally specifies a PATH and a DEFAULT expression.
- If PATH is not specified, then the path is the same as the column name.
Syntax Rules
- You can specify only 1 FOR ORDINALITY column.
- Columns names must not contain duplicates.
-
You can use binary large object (BLOB) datatypes, but there is built-in compatibility only for
xs:hexBinary
values. For xs:base64Binary, use a workaround of a PATH that uses the following explicit value constructor:xs:base64Binary(<path>)
. - The column expression must evaluate to a single value if a non-array type is expected.
- If an array type is specified, then an array is returned, unless there are no elements in the sequence, in which case a null value is returned.
-
An empty element is not a valid null value, because its value is effectively an empty string. Use the
xsi:nil
attribute to specify a null value for an element.
XMLTABLE examples
- Use of PASSING, returns 1 row [1]
select * from xmltable('/a' PASSING xmlparse(document '<a id="1"/>') COLUMNS id integer PATH '@id') x
- As a nested table
select x.* from t, xmltable('/x/y' PASSING t.doc COLUMNS first string, second FOR ORDINALITY) x
- Invalid multi-value
select * from xmltable('/a' PASSING xmlparse(document '<a><b id="1"/><b id="2"/></a>') COLUMNS id integer PATH 'b/@id') x
- Array multi-value
select * from xmltable('/a' PASSING xmlparse(document '<a><b id="1"/><b id="2"/></a>') COLUMNS id integer[] PATH 'b/@id') x
- Nil element
select * from xmltable('/a' PASSING xmlparse(document '<a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><b xsi:nil="true"/></a>') COLUMNS id integer PATH 'b') x
In the preceding example, an exception would be thrown if the nil
attribute (xsi:nil="true"
) were not specified, converting b
to an integer value.
3.6.8.3. ARRAYTABLE
The ARRAYTABLE function processes an array input to produce tabular output. The function itself defines what columns it projects. The ARRAYTABLE function is implicitly a nested table and can be used within FROM clauses.
Usage
ARRAYTABLE([ROW|ROWS] expression COLUMNS <COLUMN>, ...) AS name COLUMN := name datatype
Parameters
- expression
- The array to process, which should be a java.sql.Array or java array value.
- ROW|ROWS
- If ROW (the default) is specified, then only a single row is produced from the given array (typically a one dimensional array). If ROWS is specified, then multiple rows are produced. A multidimensional array is expected, and one row is produced for every non-null element of the outer array.
If the expression is null, no rows are produced.
Syntax rules
- Columns names cannot contain duplicates.
Array table examples
- As a nested table:
select x.* from (call source.invokeMDX('some query')) r, arraytable(r.tuple COLUMNS first string, second bigdecimal) x
ARRAYTABLE is effectively a shortcut for using the array_get
function in a nested table.
For example, the following ARRAYTABLE function:
ARRAYTABLE(val COLUMNS col1 string, col2 integer) AS X
is the same as the following statement which uses an array_get
function:
TABLE(SELECT cast(array_get(val, 1) AS string) AS col1, cast(array_get(val, 2) AS integer) AS col2) AS X
3.6.8.4. OBJECTTABLE
The OBJECTTABLE function processes an object input to produce tabular output. The function itself defines what columns it projects. The OBJECTTABLE function is implicitly a nested table and can be used within FROM clauses.
Usage
OBJECTTABLE([LANGUAGE lang] rowScript [PASSING val AS name ...] COLUMNS colName colType colScript [DEFAULT defaultExpr] ...) AS id
Parameters
- lang
- An optional string literal that is the case sensitive language name of the scripts to be processed. The script engine must be available via a JSR-223 ScriptEngineManager lookup.
If a LANGUAGE is not specified, the default 'teiid_script' is used. name:: An identifier that binds the val
expression value into the script context. rowScript:: A string literal that specifies the script to create the row values. For each non-null item that the Iterator produces, the columns are evaluated. colName/colType:: ID/data type of the column, which can optionally be defaulted with the DEFAULT clause expression defaultExpr
. colScript:: A string literal that specifies the script that evaluates to the column value.
Syntax rules
- Columns names cannot contain duplicates.
-
Data Virtualization places several special variables in the script execution context. The CommandContext is available as
teiid_context
. Additionally thecolScripts
may accessteiid_row
andteiid_row_number
.teiid_row
is the current row object produced by the row script.teiid_row_number
is the current 1-based row number. -
rowScript
is evaluated to an Iterator. If the results is already an Iterator, it is used directly. If the evaluation result is an Iteratable, Array, or Array type, then an Iterator is obtained. Any other Object will be treated as an Iterator of a single item. In all cases null row values are skipped.
Although there are no restrictions on naming PASSING variables, it is best to choose names that you can reference as identifiers in the target language.
OBJECTTABLE examples
- Accessing special variables:
SELECT x.* FROM OBJECTTABLE('teiid_context' COLUMNS "user" string 'teiid_row.userName', row_number integer 'teiid_row_number') AS x
The result would be a row with two columns containing the user name and 1 respectively.
Languages other than teiid_script generally permit unrestricted access to Java functionality. As a result, by default, their use is restricted. You can override the restrictions by declaring allowable languages by name in the allowed-languages
property. To use OBJECTTABLE, even from within view definitions that are not normally subject to permission checks, you must define the allowed-languages
property. You must also set language access rights for user accounts to enable users to process OBJECTTABLE functions.
- For more information about about teiid_script, see the next section.
- For more information about enabling the use of languages other than teiid_script, see allowed-languages in Virtual database properties.
- For more information about setting user account permission, see User query permissions in Permissions.
teiid_script
teiid_script is a simple scripting expression language that allows access to passing and special variables, and to non-void 0-argument methods on objects and indexed values on arrays/lists. A teiid_script expression begins by referencing the passing or special variable. Then, any number of .
accessors can be chained to evaluate the expression to a different value. Methods may be accessed by their property names, for example, foo rather than getFoo. If the object includes both a getFoo()
and foo()
method, then the accessor foo
references foo ()
, and getFoo
should be used to call the getter. An array or list index is accessed using a 1-based positive integral value, using the same .
accessor syntax. The same logic as the system function array_get
is used. That is, if the index is out of bounds, null
is returned, rather than an exception.
teiid_script is effectively dynamically typed as typing is performed at runtime. If an accessor does not exist on the object, or if the method is not accessible, then an exception is raised. If any point in the accessor chain evaluates to a null value, then null will be returned.
teiid_script examples
- To get the VDB description string:
teiid_context.session.vdb.description
- To get the first character of the VDB description string:
teiid_context.session.vdb.description.toCharArray.1
3.6.8.5. TEXTTABLE
The TEXTTABLE function processes character input to produce tabular output. It provides both fixed and delimited file format parsing. The function itself defines what columns it projects. The TEXTTABLE function is implicitly a nested table and can be used within FROM clauses.
Usage
TEXTTABLE(expression [SELECTOR string] COLUMNS <COLUMN>, ... [NO ROW DELIMITER | ROW DELIMITER char] [DELIMITER char] [(QUOTE|ESCAPE) char] [HEADER [integer]] [SKIP integer] [NO TRIM]) AS name
Where <COLUMN>
COLUMN := name (FOR ORDINALITY | ([HEADER string] datatype [WIDTH integer [NO TRIM]] [SELECTOR string integer]))
Parameters
- expression
- The text content to process, which should be convertible to a character large object (CLOB).
- SELECTOR
Used with files containing multiple types of rows (example: order header, detail, summary). A TEXTTABLE SELECTOR specifies which lines to include in the output. Matching lines must begin with the selector string. The selector in column delimited files must be followed by the column delimiter.
If a TEXTTABLE SELECTOR is specified, a SELECTOR may also be specified for column values. A column SELECTOR argument will select the nearest preceding text line with the given SELECTOR prefix, and select the value at the given 1-based integer position (which includes the selector itself). If no such text line or position with a given line exists, a null value will be produced. A column SELECTOR is not valid with fixed width parsing.
- NO ROW DELIMITER
- Specifies that fixed parsing should not assume the presence of newline row delimiters.
- ROW DELIMITER
- Sets the row delimiter / newline to an alternate character. Defaults to the new-line character - with built-in handling for treating carriage return newline as a single character. If ROW DELIMITER is specified, carriage return is given no special treatment.
- DELIMITER
-
Sets the field delimiter character to use. Defaults to
,
. - QUOTE
-
Sets the quote, or qualifier, character used to wrap field values. Defaults to
"
. - ESCAPE
-
Sets the escape character to use if no quoting character is in use. This is used in situations where the delimiter or new line characters are escaped with a preceding character, e.g.
\
. - HEADER
- Specifies the text line number (counting every new line) on which the column names occur. If the HEADER option for a column is specified, then that will be used as the expected header name. All lines prior to the header will be skipped. If HEADER is specified, then the header line will be used to determine the TEXTTABLE column position by case-insensitive name matching. This is especially useful in situations where only a subset of the columns are needed. If the HEADER value is not specified, it defaults to 1. If HEADER is not specified, then columns are expected to match positionally with the text contents.
- SKIP
- Specifies the number of text lines (counting every new line) to skip before parsing the contents. HEADER can be specified with SKIP.
- FOR ORDINALITY
- Column that is typed as integer and returns a 1-based item number as its value.
- WIDTH
- Indicates the fixed-width length of a column in characters, not bytes. With the default ROW DELIMITER, a CR NL sequence counts as a single character.
- NO TRIM
- When specified on a TEXTTABLE, it affects all column and header values. When NO TRIM is specified on a column, the fixed or unqualified text value is not trimmed of leading and trailing white space.
Syntax Rules
- If width is specified for one column it must be specified for all columns and be a non-negative integer.
- If width is specified, then fixed width parsing is used, and ESCAPE, QUOTE, column SELECTOR, nor HEADER should not be specified.
- If width is not specified, then NO ROW DELIMITER cannot be used.
- Columns names must not contain duplicates.
- The characters specified for QUOTE, DELIMITER, and ROW DELIMITER must all be different.
TEXTTABLE examples
- Use of the HEADER parameter, returns 1 row ['b']:
SELECT * FROM TEXTTABLE(UNESCAPE('col1,col2,col3\na,b,c') COLUMNS col2 string HEADER) x
- Use of fixed width, returns 2 rows ['a', 'b', 'c'], ['d', 'e', 'f']:
SELECT * FROM TEXTTABLE(UNESCAPE('abc\ndef') COLUMNS col1 string width 1, col2 string width 1, col3 string width 1) x
- Use of fixed width without a row delimiter, returns 3 rows ['a'], ['b'], ['c']:
SELECT * FROM TEXTTABLE('abc' COLUMNS col1 string width 1 NO ROW DELIMITER) x
- Use of ESCAPE parameter, returns 1 row ['a,', 'b']:
SELECT * FROM TEXTTABLE('a:,,b' COLUMNS col1 string, col2 string ESCAPE ':') x
- As a nested table:
SELECT x.* FROM t, TEXTTABLE(t.clobcolumn COLUMNS first string, second date SKIP 1) x
- Use of SELECTORs, returns 2 rows ['c', 'd', 'b'], ['c', 'f', 'b']:
SELECT * FROM TEXTTABLE('a,b\nc,d\nc,f' SELECTOR 'c' COLUMNS col1 string, col2 string col3 string SELECTOR 'a' 2) x
3.6.8.6. JSONTABLE
The JSONTABLE function uses JsonPath to produce tabular output. The JSONTABLE function is implicitly a nested table and can be used within FROM clauses.
Usage
JSONTABLE(value, path [, nullLeafOnMissing] COLUMNS <COLUMN>, ... ) AS name
COLUMN := name (FOR ORDINALITY | (datatype [PATH string]))
See also JsonPath
Parameters
- value
- A clob containing a valid JSON document.
- nullLeafOnMissing
If false (the default), then a path that evaluates to a leaf that is missing will throw an exception. If nullLeafOnMissing is true, then a null value will be returned.
- PATH
- String should be a valid JsonPath. If an array value is returned, then each non-null element will be used to generate a row. Otherwise a single non-null item will be used to create a single row.
- FOR ORDINALITY
Column typed as integer. Returns a 1-based item number as its value.
- Each non-ordinality column specifies a type and optionally a PATH.
- If PATH is not specified, then the path will be generated from the column name: @['name'], which will look for an object key value matching name. If PATH is specified, it must begin with @, which means that the path will be processed relative the the current row context item.
Syntax Rules
- Columns names must not contain duplicates.
- You cannot use array types with the JSONTABLE function.
JSONTABLE examples
Use of passing, returns 1 row [1]:
select * from jsontable('{"a": {"id":1}}}', '$.a' COLUMNS id integer) x
As a nested table:
select x.* from t, jsontable(t.doc, '$.x.y' COLUMNS first string, second FOR ORDINALITY) x
With more complicated paths:
select x.* from jsontable('[{"firstName": "John", "lastName": "Wayne", "children": []}, {"firstName": "John", "lastName": "Adams", "children":["Sue","Bob"]}]', '$.*' COLUMNS familyName string path '@.lastName', children integer path '@.children.length()' ) x
Differences with XMLTABLE
Processing of JSON to tabular results was previously recommended through the use of XMLTABLE with JSONTOXML. For most tasks, JSONTABLE provides a simpler syntax. However, there are some differences to consider:
- JSONTABLE parses the JSON completely, the processes it. XMLTABLE uses streaming processing to reduce the memory overhead.
- JsonPath is not as powerful as XQuery. There are a lot of functions and operations available in XQuery/XPath that are not available in JsonPath.
- JsonPath does not allow for parent references in the column paths. There is no ability to reference the root or any part of the parent hierarchy (.. in XPath).
3.6.9. WHERE clause
The WHERE clause defines the criteria to limit the records affected by SELECT, UPDATE, and DELETE statements.
The general form of the WHERE is:
- WHERE Criteria
3.6.10. GROUP BY clause
The GROUP BY clause denotes that rows should be grouped according to the specified expression values. One row is returned for each group, after optionally filtering those aggregate rows based on a HAVING clause.
The general form of the GROUP BY is:
GROUP BY expression [,expression]*
GROUP BY ROLLUP(expression [,expression]*)
Syntax Rules
- Column references in the group by cannot be made to alias names in the SELECT clause.
- Expressions used in the group by must appear in the select clause.
- Column references and expressions in the SELECT/HAVING/ORDER BY clauses that are not used in the group by clause must appear in aggregate functions.
- If an aggregate function is used in the SELECT clause and no GROUP BY is specified, an implicit GROUP BY will be performed with the entire result set as a single group. In this case, every column in the SELECT must be an aggregate function as no other column value will be fixed across the entire group.
- The GROUP BY columns must be of a comparable type.
Rollups
Just like normal grouping, ROLLUP processing logically occurs before the HAVING clause is processed. A ROLLUP of expressions will produce the same output as a regular grouping with the addition of aggregate values computed at higher aggregation levels. For N expressions in the ROLLUP, aggregates will be provided over (), (expr1), (expr1, expr2), etc. up to (expr1, … exprN-1), with the other grouping expressions in the output as null values. The following example uses a normal aggregation query:
SELECT country, city, sum(amount) from sales group by country, city
The query returns the following data:
country | city | sum(amount) |
---|---|---|
US | St. Louis | 10000 |
US | Raleigh | 150000 |
US | Denver | 20000 |
UK | Birmingham | 50000 |
UK | London | 75000 |
In contrast, the following example uses a rollup query:
Data returned from a rollup query
SELECT country, city, sum(amount) from sales group by rollup(country, city)
would return:
country | city | sum(amount) |
---|---|---|
US | St. Louis | 10000 |
US | Raleigh | 150000 |
US | Denver | 20000 |
US | <null> | 180000 |
UK | Birmingham | 50000 |
UK | London | 75000 |
UK | <null> | 125000 |
<null> | <null> | 305000 |
Not all sources are compatible with ROLLUPs, and compared to normal aggregate processing, some optimizations might be inhibited by the use of a ROLLUP.
The use of ROLLUPs in Data Virtualization is currently limited in comparison to the SQL specification.
3.6.11. HAVING Clause
The HAVING
clause operates exactly as a WHERE
clause, although it operates on the output of a GROUP BY
. You can use the same syntax with the HAVING
clause as with the WHERE
clause.
Syntax Rules
-
Expressions used in the GROUP BY clause must contain either an aggregate function (
COUNT
,AVG
,SUM
,MIN
,MAX
), or be one of the grouping expressions.
3.6.12. ORDER BY clause
The ORDER BY clause specifies how records are sorted. The options are ASC (ascending) or DESC (descending).
Usage
ORDER BY expression [ASC|DESC] [NULLS (FIRST|LAST)], ...
Syntax rules
- Sort columns can be specified positionally by a 1-based positional integer, by SELECT clause alias name, by SELECT clause expression, or by an unrelated expression.
- Column references can appear in the SELECT clause as the expression for an aliased column, or can reference columns from tables in the FROM clause. If the column reference is not in the SELECT clause, the query cannot be a set operation, specify SELECT DISTINCT, or contain a GROUP BY clause.
- Unrelated expressions, expressions not appearing as an aliased expression in the select clause, are allowed in the ORDER BY clause of a non-set QUERY. The columns referenced in the expression must come from the from clause table references. The column references cannot be to alias names or positional.
- The ORDER BY columns must be of a comparable type.
- If an ORDER BY is used in an inline view or view definition without a LIMIT clause, it is removed by the Data Virtualization optimizer.
- If NULLS FIRST/LAST is specified, then nulls are guaranteed to be sorted either first or last. If the null ordering is not specified, then results will typically be sorted with nulls as low values, which is the default internal sorting behavior for Data Virtualization. However, not all sources return results with nulls sorted as low values by default, and Data Virtualization might return results with different null orderings.
The use of positional ordering is no longer supported by the ANSI SQL standard and is a deprecated feature in Data Virtualization. It is best to use alias names in the ORDER BY clause.
3.6.13. LIMIT clause
The LIMIT clause specifies a limit on the number of records returned from the SELECT command. YOu can specify an optional offset (the number of rows to skip). The LIMIT clause can also be specified using the SQL 2008 OFFSET/FETCH FIRST clauses. If an ORDER BY is also specified, it will be applied before the OFFSET/LIMIT are applied. If an ORDER BY is not specified there is generally no guarantee what subset of rows will be returned.
Usage
LIMIT [offset,] limit
LIMIT limit OFFSET offset
[OFFSET offset ROW|ROWS] [FETCH FIRST|NEXT [limit] ROW|ROWS ONLY]
Syntax rules
-
The LIMIT/OFFSET expressions must be a non-negative integer or a parameter reference (
?
). An offset of0
is ignored. A limit of0
returns no rows. - The terms FIRST/NEXT are interchangeable as well as ROW/ROWS.
-
The LIMIT clause can take an optional preceding NON_STRICT hint to indicate that push operations should not be inhibited, even if the results will not be consistent with the logical application of the limit. The hint is only needed on unordered limits, for example,
"SELECT * FROM VW /*+ NON_STRICT */ LIMIT 2"
.
LIMIT clause examples
-
LIMIT 100
returns the first 100 records (rows 1-100). -
LIMIT 500, 100
skips 500 records and returns the next 100 records(rows 501-600). -
OFFSET 500 ROWS
skips 500 records. -
OFFSET 500 ROWS FETCH NEXT 100 ROWS ONLY
skips 500 records and returns the next 100 records (rows 501-600). -
FETCH FIRST ROW ONLY
returns only the first record.
3.6.14. INTO clause
Usage of the INTO Clause for inserting into a table has been been deprecated. An INSERT with a query command should be used instead. For information about using INSERT, see INSERT command.
When the into clause is specified with a SELECT, the results of the query are inserted into the specified table. This is often used to insert records into a temporary table. The INTO clause immediately precedes the FROM clause.
Usage
INTO table FROM ...
Syntax rules
-
The
INTO
clause is logically applied last in processing, after theORDER BY
andLIMIT
clauses. Data Virtualization’s support for
SELECT INTO
is similar to Microsoft SQL Server. The target of theINTO
clause is a table where the result of theSELECT
command will be inserted.For example, the following statement:
SELECT col1, col2 INTO targetTable FROM sourceTable
inserts
col1
andcol2
from thesourceTable
into thetargetTable
.You cannot combine SELECT INTO with a UNION query.
That is, you cannot select the results from a
sourceTable UNION
query for insertion into atargetTable
.
3.6.15. OPTION clause
The OPTION keyword denotes options that a user can pass in with a command. These options are specific to Data Virtualization and are not covered by any SQL specification.
Usage
OPTION option (, option)*
Supported options
MAKEDEP table (,table)*
- Specifies source tables that should be made dependent in the join.
MAKEIND table (,table)*
- Specifies source tables that should be made independent in the join.
MAKENOTDEP table (,table)*
- Prevents a dependent join from being used.
NOCACHE [table (,table)*]
- Prevents cache from being used for all tables or for the given tables.
Examples
OPTION MAKEDEP table1
OPTION NOCACHE
All tables specified in the OPTION clause should be fully qualified. However, the table name can match either the fully qualified name or an alias name.
The MAKEDEP and MAKEIND hints can take optional arguments to control the dependent join. The extended hint form is:
MAKEDEP tbl([max:val] [[no] join])
-
tbl(JOIN)
means that the entire join should be pushed. -
tbl(NO JOIN)
means that the entire join should not be pushed. -
tbl(MAX:val)
means that the dependent join should only be performed if there are less than the maximum number of values from the independent side.
Data Virtualization does not accept PLANONLY, DEBUG, and SHOWPLAN arguments in the OPTION clause. For information about how to perform the functions formerly provided by these options, see the Client Developer’s Guide.
MAKEDEP and MAKENOTDEP hints can take table names in the form of @view1.view2…table
. For example, with an inline view "SELECT * FROM (SELECT * FROM tbl1, tbl2 WHERE tbl1.c1 = tbl2.c2) AS v1 OPTION MAKEDEP @v1.tbl1"
the hint will now be understood as applying under the v1 view.
3.7. DDL Commands
Data Virtualization is compatible with a subset of the DDL commands for creating or dropping temporary tables and manipulating procedure and view definitions at runtime. It is not currently possible to arbitrarily drop or create non-temporary metadata entries. For information about the DDL statements that you can use to define schemas in a virtual database, see DDL metadata.
3.7.1. Temporary Tables
You can create and use temporary (temp) tables in Data Virtualization. Temporary tables are created dynamically, but are treated as any other physical table.
3.7.1.1. Local temporary tables
Local temporary tables can be defined implicitly by referencing them in a INSERT statement or explicitly with a CREATE TABLE statement. Implicitly created temp tables must have a name that starts with #
.
Data Virtualization interprets local to mean that a temporary table is scoped to the session or block of the virtual procedure that creates it. This interpretation differs from the SQL specification and from the interpretation that other database vendors implement. After exiting a block or at the termination of a session, the table is dropped. Session tables and other temporary tables that a calling procedures creates are not visible to called procedures. If a temporary table of the same name is created in a called procedure, then a new instance is created.
Creation syntax
You can create local temporary tables explicitly or implicitly.
- Explicit creation syntax
Local temporary tables can be defined explicitly with a CREATE TABLE statement, as in the following example:name: value
CREATE LOCAL TEMPORARY TABLE name (column type [NOT NULL], ... [PRIMARY KEY (column, ...)]) [ON COMMIT PRESERVE ROWS]
- Use the SERIAL data type to specify a NOT NULL and auto-incrementing INTEGER column. The starting value of a SERIAL column is 1.
- Implicit creation syntax
Local temporary tables can be defined implicitly by referencing them in an INSERT statement.
INSERT INTO #name (column, ...) VALUES (value, ...) INSERT INTO #name [(column, ...)] select c1, c2 from t
NoteIf
#name
does not exist, it is defined using the given column names and types from the value expressions.INSERT INTO #name (column, ...) VALUES (value, ...) INSERT INTO #name [(column, ...)] select c1, c2 from t
NoteIf
#name
does not exist, it is defined using the target column names, and the types from the query-derived columns. If target columns are not supplied, the column names will match the derived column names from the query.
Drop syntax
DROP TABLE name
+ In the following example, a series of statements loads a temporary table with data from 2 sources, manually inserts a record, and then uses the temporary table in a SELECT query.
Example: Local temporary tables
CREATE LOCAL TEMPORARY TABLE TEMP (a integer, b integer, c integer); SELECT * INTO temp FROM Src1; SELECT * INTO temp FROM Src2; INSERT INTO temp VALUES (1,2,3); SELECT a,b,c FROM Src3, temp WHERE Src3.a = temp.b;
For more information about using local temporary tables, see Virtual procedures.
3.7.1.2. Global temporary tables
Global temporary tables are created from the metadata that you supply to Data Virtualization at deployment time. Unlike local temporary tables, you cannot create global temporary tables at runtime. Your global temporary tables share a common definition through a schema entry. However, a new instance of the temporary table is created in each session. The table is then dropped when the session ends. There is no explicit drop support. A common use for a global temporary table is to pass results into and out of procedures.
Creation syntax
CREATE GLOBAL TEMPORARY TABLE name (column type [NOT NULL], ... [PRIMARY KEY (column, ...)]) OPTIONS (UPDATABLE 'true')
If you use the SERIAL data type, then each session’s instance of the global temporary table will have its own sequence.
You must explicitly specify UPDATABLE if you want to update the temporary table.
For information about syntax options, see the CREATE TABLE
DDL statements in DDL metadata for schema objects.
3.7.1.3. Common features of global and local temporary tables
Global and local temporary tables share some common features.
Primary key usage
- All key columns must be comparable.
- If you use a primary key, it creates a clustered index that enables search improvements for SQL comparison operators, and the IN, LIKE, and ORDER BY operators.
-
You can use
Null
as a primary key value, but there must be only one row that has an all-null key.
Transactions
-
There is a
READ_UNCOMMITED
transaction isolation level. There are no locking mechanisms available to enable higher isolation levels, and the result of a rollback may be inconsistent across multiple transactions. If concurrent transactions are not associated with the same local temporary table or session, then the transaction isolation level is effectively serializable. If you want full consistency with local temporary tables, then only use a connection with one transaction at a time. This mode of operation is ensured by connection pooling that tracks connections by transaction.
Limitations
-
With the
CREATE TABLE
syntax, you can specify only basic table definition (column name, type, and nullable information), and an optional primary key. For global temporary tables, additional metadata in the CREATE statement is effectively ignored when creating the temporary table instance. However, the metadata might still be used by planning similar to any other table entry. -
You can use
ON COMMIT PRESERVE ROWS
. You cannot use otherON COMMIT
actions. - The cannot use "drop behavior" options in the DROP statement.
- Temporary tables are not fail-over safe.
- Non-inlined LOB values (XML, CLOB, BLOB, JSON, geometry) are tracked by reference rather than by value in a temporary table. If you insert LOB values from external sources in your temporary table, they might become unreadable when the associated statement or connection is closed.
3.7.1.4. Foreign temporary tables
Unlike a local or global temporary table, a foreign temporary table is a reference to an actual source table that is created at runtime, rather than during the metadata load.
A foreign temporary table requires explicit creation syntax:
CREATE FOREIGN TEMPORARY TABLE name ... ON schema
Where the table creation body syntax is the same as a standard CREATE FOREIGN TABLE DDL statement. For more information, see DDL metadata. In general, usage of DDL OPTION clauses might be required to properly access the source table, including setting the name in the source, updatability, native types, and so forth.
The schema name must specify an existing schema/model in the VDB. The table will be accessed as if it is on that source. However within Data Virtualization the temporary table will still be scoped the same as a non-foreign temporary table. This means that the foreign temporary table will not belong to a Data Virtualization schema, and will be scoped to the session or procedure block where it is created.
The DROP syntax for a foreign temporary table is the same as for a non-foreign temporary table.
Neither a CREATE nor a corresponding DROP of a foreign temporary table issues a pushdown command. Rather, this mechanism exposes a source table for use within Data Virtualization on a temporary basis.
There are two usage scenarios for a FOREIGN TEMPORARY TABLE. The first is to dynamically access additional tables on the source. The other is to replace the usage of a Data Virtualization local temporary table for performance reasons. The usage pattern for the latter case would look like:
//- create the source table source.native("CREATE GLOBAL TEMPORARY TABLE name IF NOT EXISTS ... ON COMMIT DELETE ROWS"); //- bring the table into Data Virtualization CREATE FOREIGN TEMPORARY TABLE name ... OPTIONS (UPDATABLE true) //- use the table ... //- forget the table DROP TABLE name
Note the usage of the native procedure to pass source-specific CREATE DDL to the source. Data Virtualization does not currently attempt to pushdown a source creation of a temporary table based on the CREATE statement. Some other mechanism, such as the native procedure shown above, must be used to first create the table. Also note the table is explicitly marked as updatable, since DDL defined tables are not updatable by default.
The source’s handling of temporary tables must also be understood to make this work as intended. Sources that use the same GLOBAL table definition for all sessions while scoping the data to be session-specific (such as Oracle) or sources that use session-scoped temporary tables (such as PostgreSQL) will work if accessed under a transaction. A transaction is necessary for the following reasons:
- The source on commit behavior (most likely DELETE ROWS or DROP) will ensure clean-up. Keep in mind that a Data Virtualization drop does not issue a source command and is not guaranteed to occur (in some exception cases, loss of database connectivity, hard shutdown, and so forth).
- The source pool when using track connections by transaction will ensure that multiple uses of that source by Data Virtualization will use the same connection/session and thus the same temporary table and data.
You cannot use the ON COMMIT
clause with Data Virtualization. As a result, for local temporary tables, the ON COMMIT
behavior for source tables is likely to be different from the default PRESERVE ROWS
.
3.7.2. Alter view
Usage
ALTER VIEW name AS queryExpression
Syntax rules
- The alter query expression can be prefixed with a cache hint for materialized view definitions. The hint takes effect the next time that the materialized view table loads.
3.7.3. Alter procedure
Usage
ALTER PROCEDURE name AS block
Syntax rules
-
The ALTER block should not include
CREATE VIRTUAL PROCEDURE
. - You can prefix the ALTER block with a cache hint for cached procedures.
3.7.4. Alter trigger
Usage
ALTER TRIGGER ON name INSTEAD OF INSERT|UPDATE|DELETE (AS FOR EACH ROW block) | (ENABLED|DISABLED)
Syntax rules
-
The target
name
must be an updatable view. - Triggers are not true schema objects. They are scoped only to their view and have no name.
- Update procedures must already exist for the given trigger event. For more information, see Triggers.
3.8. Procedures
You can use a procedure language in Data Virtualization to call foreign procedures and define virtual procedures and triggers.
3.8.1. Procedure language
You can use a procedural language in Data Virtualization to define virtual procedures. These are similar to stored procedures in relational database management systems. You can use this language to define the transformation logic for decomposing INSERT, UPDATE, and DELETE commands against views. These are known as update procedures. For more information, see Virtual procedures and update procedures (Triggers).
3.8.1.1. Command statement
A command statement executes a DML command, DDL command, or dynamic SQL against one or more data sources. For more information, see DML commands and DDL commands.
Usage
command [(WITH|WITHOUT) RETURN];
Example command statements
SELECT * FROM MySchema.MyTable WHERE ColA > 100 WITHOUT RETURN; INSERT INTO MySchema.MyTable (ColA,ColB) VALUES (50, 'hi');
Syntax rules
-
EXECUTE command statements may access IN/OUT, OUT, and RETURN parameters. To access the return value the statement will have the form
var = EXEC proc…
. To access OUT or IN/OUT values named parameter syntax must be used. For example,EXEC proc(in_param⇒'1', out_param⇒var)
will assign the value of the out parameter to the variable var. It is expected that the datatype of a parameter is implicitly convertible to the data type of the variable. For more information about EXECUTE command statements, see EXECUTE command. - The RETURN clause determines if the result of the command is returnable from the procedure. WITH RETURN is the default. If the command does not return a result set, or the procedure does not return a result set, the RETURN clause is ignored. If WITH RETURN is specified, the result set of the command must match the expected result set of the procedure. Only the last successfully executed statement executed WITH RETURN will be returned as the procedure result set. If there are no returnable result sets and the procedure declares that a result set will be returned, then an empty result set is returned.
The INTO clause is used only for inserting into a table. `SELECT … INTO table … is functionally equivalent to `INSERT INTO table SELECT …
If you need to assign variables, you can use one of the following methods:
- Use an assignment statement with a scalar subquery
DECLARE string var = (SELECT col ...);
- Use a temporary table
INSERT INTO #temp SELECT col1, col2 ...; DECLARE string VARIABLES.RESULT = (SELECT x FROM #temp);
- Use an array
DECLARE string[] var = (SELECT (col1, col2) ...); DECLARE string col1val = var[1];
3.8.1.2. Dynamic SQL command
Dynamic SQL allows for the execution of an arbitrary SQL command in a virtual procedure. Dynamic SQL is useful in situations where the exact command form is not known prior to execution.
Usage
EXECUTE IMMEDIATE <sql expression> AS <variable> <type> [, <variable> <type>]* [INTO <variable>] [USING <variable>=<expression> [,<variable>=<expression>]*] [UPDATE <literal>]
Syntax rules
- The SQL expression must be a CLOB or string value of less than 262144 characters.
-
The
AS
clause is used to define the projected symbols names and types returned by the executed SQL string. TheAS
clause symbols will be matched positionally with the symbols returned by the executed SQL string. Non-convertible types or too few columns returned by the executed SQL string will result in an error. -
The
INTO
clause will project the dynamic SQL into the specified temp table. With theINTO
clause specified, the dynamic command will actually execute a statement that behaves like an INSERT with a QUERY EXPRESSION. If the dynamic SQL command creates a temporary table with theINTO
clause, then theAS
clause is required to define the table’s metadata. -
The
USING
clause allows the dynamic SQL string to contain variable references that are bound at runtime to specified values. This allows for some independence of the SQL string from the surrounding procedure variable names and input names. In the dynamic commandUSING
clause, each variable is specified by short name only. However, in the dynamic SQL theUSING
variable must be fully qualified toDVAR
. TheUSING
clause is only for values that will be used in the dynamic SQL as valid expressions. It is not possible to use theUSING
clause to replace table names, keywords, and so forth. This makes using symbols equivalent in power to normal bind (?) expressions in prepared statements. TheUSING
clause helps reduce the amount of string manipulation needed. If a reference is made to a USING symbol in the SQL string that is not bound to a value in theUSING
clause, an exception will occur. -
The
UPDATE
clause is used to specify the updating model count. Accepted values are (0,1,*). 0 is the default value if the clause is not specified. For more information, see Updating model count.
Example: Dynamic SQL
... /* Typically complex criteria would be formed based upon inputs to the procedure. In this simple example the criteria is references the using clause to isolate the SQL string from referencing a value from the procedure directly */ DECLARE string criteria = 'Customer.Accounts.Last = DVARS.LastName'; /* Now we create the desired SQL string */ DECLARE string sql_string = 'SELECT ID, First || " " || Last AS Name, Birthdate FROM Customer.Accounts WHERE ' || criteria; /* The execution of the SQL string will create the #temp table with the columns (ID, Name, Birthdate). Note that we also have the USING clause to bind a value to LastName, which is referenced in the criteria. */ EXECUTE IMMEDIATE sql_string AS ID integer, Name string, Birthdate date INTO #temp USING LastName='some name'; /* The temp table can now be used with the values from the Dynamic SQL */ loop on (SELCT ID from #temp) as myCursor ...
Here is an example showing a more complex approach to building criteria for the dynamic SQL string. In short, the virtual procedure AccountAccess.GetAccounts
has the inputs ID
, LastName
, and bday
. If a value is specified for ID
it will be the only value used in the dynamic SQL criteria. Otherwise, if a value is specified for LastName
the procedure will detect if the value is a search string. If bday
is specified in addition to LastName
, it will be used to form compound criteria with LastName
.
Example: Dynamic SQL with USING clause and dynamically built criteria string
... DECLARE string crit = null; IF (AccountAccess.GetAccounts.ID IS NOT NULL) crit = '(Customer.Accounts.ID = DVARS.ID)'; ELSE IF (AccountAccess.GetAccounts.LastName IS NOT NULL) BEGIN IF (AccountAccess.GetAccounts.LastName == '%') ERROR "Last name cannot be %"; ELSE IF (LOCATE('%', AccountAccess.GetAccounts.LastName) < 0) crit = '(Customer.Accounts.Last = DVARS.LastName)'; ELSE crit = '(Customer.Accounts.Last LIKE DVARS.LastName)'; IF (AccountAccess.GetAccounts.bday IS NOT NULL) crit = '(' || crit || ' and (Customer.Accounts.Birthdate = DVARS.BirthDay))'; END ELSE ERROR "ID or LastName must be specified."; EXECUTE IMMEDIATE 'SELECT ID, First || " " || Last AS Name, Birthdate FROM Customer.Accounts WHERE ' || crit USING ID=AccountAccess.GetAccounts.ID, LastName=AccountAccess.GetAccounts.LastName, BirthDay=AccountAccess.GetAccounts.Bday; ...
Dynamic SQL limitations and workarounds
The use of the dynamic SQL command results in an assignment statement that requires the use of a temporary table.
Example assignment
EXECUTE IMMEDIATE <expression> AS x string INTO #temp; DECLARE string VARIABLES.RESULT = (SELECT x FROM #temp);
The construction of appropriate criteria will be cumbersome if parts of the criteria are not present. For example if criteria
were already NULL, then the following example results in criteria
remaining NULL.
Example: Dangerous NULL handling
... criteria = '(' || criteria || ' and (Customer.Accounts.Birthdate = DVARS.BirthDay))';
It is best to ensure that the criteria is not NULL prior its usage. If this is not possible, a you can specify a default, as shown in the following example.
Example: NULL handling
... criteria = '(' || nvl(criteria, '(1 = 1)') || ' and (Customer.Accounts.Birthdate = DVARS.BirthDay))';
If the dynamic SQL is an UPDATE
, DELETE
, or INSERT
command, the rowcount of the statement can be obtained from the rowcount variable.
Example: AS and INTO clauses
/* Execute an update */ EXECUTE IMMEDIATE <expression>;
3.8.1.3. Declaration statement
A declaration statement declares a variable and its type. After you declare a variable, you can use it in that block within the procedure and any sub-blocks. A variable is initialized to null by default, but can also be assigned the value of an expression as part of the declaration statement.
Usage
DECLARE <type> [VARIABLES.]<name> [= <expression>];
Example syntax
declare integer x; declare string VARIABLES.myvar = 'value';
Syntax rules
- You cannot redeclare a variable with a duplicate name in a sub-block.
- The VARIABLES group is always implied even if it is not specified.
- The assignment value follows the same rules as for an Assignment statement.
- In addition to the standard types, you may specify EXCEPTION if declaring an exception variable.
3.8.1.4. Assignment statement
An assignment statement assigns a value to a variable by evaluating an expression.
Usage
<variable reference> = <expression>;
Example syntax
myString = 'Thank you'; VARIABLES.x = (SELECT Column1 FROM MySchema.MyTable);
Valid variables for assignment include any in-scope variable that has been declared with a declaration statement, or the procedure in_out
and out
parameters. In_out
and out
parameters can be accessed by their fully qualified names.
Example: Out parameter
CREATE VIRTUAL PROCEDURE proc (OUT STRING x, INOUT STRING y) AS BEGIN proc.x = 'some value ' || proc.y; y = 'some new value'; END
3.8.1.5. Special variables
VARIABLES.ROWCOUNT
integer variable will contain the numbers of rows affected by the last INSERT, UPDATE, or DELETE command statement executed. Inserts that are processed by dynamic SQL with an into
clause will also update the ROWCOUNT
.
Sample usage
... UPDATE FOO SET X = 1 WHERE Y = 2; DECLARE INTEGER UPDATED = VARIABLES.ROWCOUNT; ...
Non-update command statements (WITH
or WITHOUT RETURN
) will reset the ROWCOUNT
to 0.
To ensure you are getting the appropriate ROWCOUNT
value, save the ROWCOUNT
to a variable immediately after the command statement.
3.8.1.6. Compound statement
A compound statement or block logically groups a series of statements. Temporary tables and variables that are created in a compound statement are local only to that block, and are destroyed when exiting the block.
Usage
[label :] BEGIN [[NOT] ATOMIC] statement* [EXCEPTION ex statement* ] END
When a block is expected by an IF
, LOOP
, WHILE
, and so forth, a single statement is also accepted by the parser. Even though the block BEGIN
or END
are not expected, the statement will execute as if wrapped in a BEGIN
or END
pair.
Syntax rules
-
If
NOT ATOMIC
or noATOMIC
clause is specified, the block will be executed non-atomically. -
If the
ATOMIC
clause is specified, the block must execute atomically. If a transaction is already associated with the thread, no additional action will be taken; savepoints or sub-transactions are not currently used. If the higher level transaction is used, and the block does not complete — regardless of the presence of exception handling — the transaction will be marked as rollback only. Otherwise, a transaction will be associated with the execution of the block. Upon successful completion of the block the transaction will be committed. - The label must not be the same as any label that is used in statements that contain this one.
- Variable assignments and the implicit result cursor are unaffected by rollbacks. If a block does not complete successfully, its assignments will still take affect.
Exception handling
If an EXCEPTION
clause is used within a compound statement, any processing exception emitted from statements will be caught with the flow of execution transferring to EXCEPTION
statements. Any block-level transaction started by this block will commit if the exception handler successfully completes. If another exception, or the original exception, is emitted from the exception handler, the transaction will rollback. Any temporary tables or variables specific to the BLOCK will not be available to the exception handler statements.
Only processing exceptions, which are typically caused by errors originating at the sources or with function execution, are caught. A low-level internal Data Virtualization error or Java RuntimeException
will not be caught.
To aid in the processing of a caught exception, the EXCEPTION
clause specifies a group name that exposes the significant fields of the exception. The following table shows the variables that an exception group contains:
Variable | Type | Description |
---|---|---|
STATE | string | The SQL State |
ERRORCODE | integer | The error or vendor code. In the case of Data Virtualization internal exceptions this will be the integer suffix of the TEIIDxxxx code. |
TEIIDCODE | string | The full Data Virtualization event code. Typically TEIIDxxxx. |
EXCEPTION | object |
The exception being caught, will be an instance of |
CHAIN | object | The chained exception or cause of the current exception. |
Data Virtualization does not yet fully comply with the ANSI SQL specification on SQL State usage. For Data Virtualization errors without an underlying SQLException cause, it is best to use the Data Virtualization code.
The exception group name might not be the same as any higher level exception group or loop cursor name.
Example exception group handling
BEGIN DECLARE EXCEPTION e = SQLEXCEPTION 'this is bad' SQLSTATE 'xxxxx'; RAISE variables.e; EXCEPTION e IF (e.state = 'xxxxx') //in this trivial example, we'll always hit this branch and just log the exception RAISE SQLWARNING e.exception; ELSE RAISE e.exception; END
3.8.1.7. IF statement
An IF statement evaluates a condition and executes either one of two statements depending on the result. You can nest IF statements to create complex branching logic. A dependent ELSE statement will execute its statement only if the IF statement evaluates to false
.
Usage
IF (criteria) block [ELSE block] END
Example IF statement
IF ( var1 = 'North America') BEGIN ...statement... END ELSE BEGIN ...statement... END
The criteria can be any valid Boolean expression or an IS DISTINCT FROM
predicate referencing row values. The IS DISTINCT FROM
extension uses the following syntax:
rowVal IS [NOT] DISTINCT FROM rowValOther
Where rowVal
and rowValOther
are references to row value group. This would typically be used in instead of update triggers on views to quickly determine if the row values are changing:
Example: IS DISTINCT FROM IF statement
IF ( "new" IS DISTINCT FROM "old") BEGIN ...statement... END
IS DISTINCT FROM
considers null values equivalent and never produces an UNKNOWN value.
Null values should be considered in the criteria of an IF statement. IS NULL
criteria can be used to detect the presence of a null value.
3.8.1.8. Loop Statement
A LOOP statement is an iterative control construct that is used to cursor through a result set.
Usage
[label :] LOOP ON <select statement> AS <cursorname> statement
Syntax rules
- The label must not be the same as any label that is used in statements that contain this one.
3.8.1.9. While statement
A WHILE
statement is an iterative control construct that is used to execute a statement repeatedly whenever a specified condition is met.
Usage
[label :] WHILE <criteria> statement
Syntax rules
- The label must not be the same as any label that is used in statements that contain this one.
3.8.1.10. Continue statement
A CONTINUE
statement is used inside a LOOP
or WHILE
construct to continue with the next loop by skipping over the rest of the statements in the loop. It must be used inside a LOOP
or WHILE
statement.
Usage
CONTINUE [label];
Syntax rules
-
If the label is specified, it must exist on a containing
LOOP
orWHILE
statement. -
If no label is specified, the statement will affect the closest containing
LOOP
orWHILE
statement.
3.8.1.11. Break statement
A BREAK
statement is used inside a LOOP
or WHILE
construct to break from the loop. It must be used inside a LOOP
or WHILE
statement.
Usage
BREAK [label];
Syntax rules
-
If the label is specified, it must exist on a containing
LOOP
orWHILE
statement. -
If no label is specified, the statement will affect the closest containing
LOOP
orWHILE
statement.
3.8.1.12. Leave statement
A LEAVE
statement is used inside a compound, LOOP
, or WHILE
construct to leave to the specified level.
Usage
LEAVE label;
Syntax rules
-
The label must exist on a containing compound statement,
LOOP
, orWHILE
statement.
3.8.1.13. Return statement
A RETURN
statement gracefully exits the procedure and optionally returns a value.
Usage
RETURN [expression];
Syntax rules
- If an expression is specified, the procedure must have a return parameter and the value must be implicitly convertible to the expected type.
-
Even if the procedure has a return parameter, it is not required to specify a return value in a
RETURN
statement. A return parameter can be set through an assignment or it can be left as null.
Sample usage
CREATE VIRTUAL FUNCTION times_two(val integer) RETURNS integer AS BEGIN RETURN val*2; END
3.8.1.14. Error statement
An ERROR
statement declares that the procedure has entered an error state and should abort. This statement will also roll back the current transaction, if one exists. Any valid expression can be specified after the ERROR
keyword.
Usage
ERROR message;
Example: Error statement
ERROR 'Invalid input value: ' || nvl(Acct.GetBalance.AcctID, 'null');
An ERROR
statement is equivalent to:
RAISE SQLEXCEPTION message;
3.8.1.15. Raise statement
A RAISE
statement is used to raise an exception or warning. When raising an exception, this statement will also roll back the current transaction, if one exists.
Usage
RAISE [SQLWARNING] exception;
Where exception may be a variable reference to an exception or an exception expression.
Syntax rules
-
If
SQLWARNING
is specified, the exception will be sent to the client as a warning and the procedure will continue to execute. - A null warning will be ignored. A null non-warning exception will still cause an exception to be raised.
Example raise statement
RAISE SQLWARNING SQLEXCEPTION 'invalid' SQLSTATE '05000';
3.8.1.16. Exception expression
An exception expression creates an exception that can be raised or used as a warning.
Usage
SQLEXCEPTION message [SQLSTATE state [, code]] CHAIN exception
Syntax rules
- Any of the values may be null.
-
message
andstate
are string expressions that specify the exception message and SQL state. Data Virtualization does not fully comply with the ANSI SQL specification on SQL state usage, but you are allowed to set any SQL state you choose. -
code
is an integer expression that specifies the vendor code. -
exception
must be a variable reference to an exception or an exception expression, and will be chained to the resulting exception as its parent.
3.8.2. Virtual procedures
Virtual procedures are defined using the Data Virtualization procedural language. For more information, see Procedure language.
A virtual procedure has zero or more INPUT, INOUT, or OUT parameters, an optional RETURN parameter, and an optional result set. Virtual procedures can execute queries and other SQL commands, define temporary tables, add data to temporary tables, walk through result sets, use loops, and use conditional logic.
Virtual procedure definition
For more information, see Create procedure/function in DDL metadata for schema objects.
Note that the optional result parameter is always considered the first parameter.
Within the body of the procedure, you can use any valid statement. For more information avbout procedure language statements, see Procedure language.
There is no explicit cursoring or value returning statement. Instead, the last unnamed command statement executed in the procedure that returns a result set will be returned as the result. The output of that statement must match the expected result set and parameters of the procedure.
Virtual procedure parameters
Virtual procedures can take zero or more IN
or INOUT
parameters, and can have any number of OUT
parameters and an optional RETURN
parameter. Each input has the following information that is used during runtime processing:
- Name
- The name of the input parameter.
- Datatype
- The design-time type of the input parameter.
- Default value
- The default value if the input parameter is not specified.
- Nullable
-
NO_NULLS
,NULLABLE
,NULLABLE_UNKNOWN
; parameter is optional if nullable, and is not required to be listed when using named parameter syntax.
You reference a parameter in a virtual procedure by using its fully-qualified name (or less if unambiguous). For example, MySchema.MyProc.Param1
.
Example: Referencing an input parameter and assigning an Out parameter for GetBalance
procedure
BEGIN MySchema.GetBalance.RetVal = UPPER(MySchema.GetBalance.AcctID); SELECT Balance FROM MySchema.Accts WHERE MySchema.Accts.AccountID = MySchema.GetBalance.AcctID; END
If an INOUT
parameter is not assigned any value in a procedure, it will retain the value it was assigned for input. Any OUT/RETURN
parameter that is not assigned a value will retain the default NULL value. The INOUT/OUT/RETURN
output values are validated against the NOT NULL
metadata of the parameter.
Example virtual procedures
The following example represents a loop that walks through a cursored table and uses CONTINUE
and BREAK
.
Virtual procedure using LOOP, CONTINUE, BREAK
BEGIN DECLARE double total; DECLARE integer transactions; LOOP ON (SELECT amt, type FROM CashTxnTable) AS txncursor BEGIN IF(txncursor.type <> 'Sale') BEGIN CONTINUE; END ELSE BEGIN total = (total + txncursor.amt); transactions = (transactions + 1); IF(transactions = 100) BEGIN BREAK; END END END SELECT total, (total / transactions) AS avg_transaction; END
The following example uses conditional logic to determine which of two SELECT statements to execute.
Virtual procedure with conditional SELECT
BEGIN DECLARE string VARIABLES.SORTDIRECTION; VARIABLES.SORTDIRECTION = PartsVirtual.OrderedQtyProc.SORTMODE; IF ( ucase(VARIABLES.SORTDIRECTION) = 'ASC' ) BEGIN SELECT * FROM PartsVirtual.SupplierInfo WHERE QUANTITY > PartsVirtual.OrderedQtyProc.QTYIN ORDER BY PartsVirtual.SupplierInfo.PART_ID; END ELSE BEGIN SELECT * FROM PartsVirtual.SupplierInfo WHERE QUANTITY > PartsVirtual.OrderedQtyProc.QTYIN ORDER BY PartsVirtual.SupplierInfo.PART_ID DESC; END END
Executing virtual procedures
You execute procedures using the SQL EXECUTE
command. For more information, see Execute command in DML commands.
If the procedure has defined inputs, you specify those in a sequential list, or using name=value syntax. You must use the name of the input parameter, scoped by the full procedure name if the parameter name is ambiguous in the context of other columns or variables in the procedure.
A virtual procedure call returns a result set like any SELECT
, so you can use this in many places you can use a SELECT
. Typically you’ll use the following syntax:
SELECT * FROM (EXEC ...) AS x
Virtual procedure limitations
A virtual procedure can return only one result set. If you need to pass in a result set, or pass out multiple result sets, then consider using global temporary tables instead.
3.8.3. Triggers
View triggers
Views are abstractions above physical sources. They typically union or join information from multiple tables, often from multiple data sources or other views. Data Virtualization can perform update operations against views. Update commands that you run against a view (INSERT
, UPDATE
, or DELETE
) require logic to define how the tables and views integrated by the view are affected by each type of command. This transformation logic, also referred to as a trigger, is invoked when an update command is issued against a view. Update procedures define the logic for how the update command that you run against a view is decomposed into the individual commands to be executed against the underlying physical sources. Similar to virtual procedures, update procedures have the ability to execute queries or other commands, define temporary tables, add data to temporary tables, walk through result sets, use loops, and use conditional logic. For more inmformation about virtual procedures, see Virtual procedures.
You can use INSTEAD OF
triggers on views in a way that is similar to the way that you might use them with traditional databases. You can have only one FOR EACH ROW
procedure for each INSERT
, UPDATE
, or DELETE
operation against a view.
Usage
CREATE TRIGGER ON view_name INSTEAD OF INSERT|UPDATE|DELETE AS FOR EACH ROW ...
Update procedure processing
- The user application submits the SQL command.
- The command detects the view that it is executed against.
-
The correct procedure is chosen depending upon the command type (
INSERT
,UPDATE
, orDELETE
). - The procedure is executed. The procedure might contain SQL commands of its own. Commands in the procedure can be different in type from the command that is received from the calling application.
- Commands, as described in the procedure, are issued to the individual physical data sources or other views.
- A value representing the number of rows changed is returned to the calling application.
Source triggers
Data Virtualization can use AFTER
triggers on source tables. AFTER
triggers are called by events from a change data capture (CDC) system.
Usage:
CREATE TRIGGER ON source_table AFTER INSERT|UPDATE|DELETE AS FOR EACH ROW ...
FOR EACH ROW triggers
Only the FOR EACH ROW
construct serves as a trigger handler. A FOR EACH ROW
trigger procedure will evaluate its block for each row of the view/source affected by the UPDATE
statement. For UPDATE
and DELETE
statements, this will be every row that passes the WHERE
condition. For INSERT
statements there will be one new row for each set of values from the VALUES
or query expression. For a view, the rows updated is reported as this number, regardless of the affect of the underlying procedure logic.
Usage
FOR EACH ROW BEGIN ATOMIC ... END
The BEGIN
and END
keywords are used to denote block boundaries. Within the body of the procedure, any valid statement may be used.
The use of the ATOMIC
keyword is currently optional for backward compatibility, but unlike a normal block, the default for INSTEAD OF
triggers is atomic.
Special variables for update procedures
You can use a number of special variables when defining your update procedure.
- NEW variables
Every attribute in the view/table whose
UPDATE
andINSERT
transformations you are defining has an equivalent variable namedNEW.<column_name>
.When an INSERT or an UPDATE command is executed against the view, or the event is received, these variables are initialized to the values in the
INSERT VALUES
clause or theUPDATE SET
clause respectively.In an
UPDATE
procedure, the default value of these variables, if they are not set by the command, is the old value. In an INSERT procedure, the default value of these variables is the default value of the virtual table attributes. SeeCHANGING
variables, later in this list for distinguishing defaults from passed values.- OLD variables
Every attribute on the view/table whose
UPDATE
andDELETE
transformations you are defining has an equivalent variable namedOLD.<column_name>
.When a
DELETE
orUPDATE
command is executed against the view, or the event is received, these variables are initialized to the current values of the row being deleted or updated respectively.- CHANGING variables
Every attribute on the view/table whose
UPDATE
andINSERT
transformations you are defining has an equivalent variable namedCHANGING.<column_name>
.When an
INSERT
or anUPDATE
command is executed against the view, or an the event is received, these variables are initialized totrue
orfalse
depending on whether theINPUT
variable was set by the command. ACHANGING
variable is commonly used to differentiate between a default insert value and one that is specified in the user query.For example, for a view with columns A, B, C:
If User Executes… Then… INSERT INTO VT (A, B) VALUES (0, 1)
CHANGING.A = true, CHANGING.B = true, CHANGING.C = false
UPDATE VT SET C = 2
CHANGING.A = false, CHANGING.B = false, CHANGING.C = true
- Key variables
To return generated keys from an
INSERT
trigger, a KEY group is made available that can be assigned the value to be returned. Typically this requires using thegenerated_key
system function. However, not all inserts provide generated keys, because not all sources return generated keys.create view v1 (i integer, k integer not null auto_increment primary key) OPTIONS (UPDATABLE true) as select x, y from tbl; create trigger on v1 instead of insert as for each row begin atomic -- ... some logic insert into tbl (x) values (new.i); key.k = cast(generated_key('y') as integer); end;
Example update procedures
For example, for a view with columns A, B, C:
Sample DELETE procedure
FOR EACH ROW BEGIN DELETE FROM X WHERE Y = OLD.A; DELETE FROM Z WHERE Y = OLD.A; // cascade the delete END
Sample UPDATE procedure
FOR EACH ROW BEGIN IF (CHANGING.B) BEGIN UPDATE Z SET Y = NEW.B WHERE Y = OLD.B; END END
Other usages
FOR EACH ROW
update procedures in a view can also be used to emulate BEFORE/AFTER
each row triggers while still retaining the ability to perform an inherent update. This BEFORE/AFTER
trigger behavior with an inherent update can be achieved by creating an additional updatable view over the target view with update procedures of the form:
CREATE TRIGGER ON outerVW INSTEAD OF INSERT AS FOR EACH ROW BEGIN ATOMIC --before row logic ... --default insert/update/delete against the target view INSERT INTO VW (c1, c2, c3) VALUES (NEW.c1, NEW.c2, NEW.c3); --after row logic ... END
3.9. Comments
You can add multi-line SQL comments in Data Virtualization by enclosing text with /* */
.
/* comment comment comment... */
You can also add single line comments:
SELECT ... -- comment
You can also nest comments.
3.10. Explain statements
You can use an EXPLAIN statement to obtain a query plan. Using EXPLAIN statements to obtain a query execution plan is a native function of the SQL language, and it is the preferred mechanism to use over pg/ODBC transport. If you are using a Teiid JDBC client, you can also use SET/SHOW statements. For more information about SET and SHOW statements, see the Client Developer’s Guide.
Usage
EXPLAIN [(explainOption [, ...])] statement explainOption := ANALYZE [TRUE|FALSE] | FORMAT {TEXT|YAML|XML}
If no options are specified, by default the plan is provided in text format without executing the query.
If you specify ANALYZE
or ANALYZE TRUE
, then the statement is executed, unless the client has set the NOEXEC
option. The resulting plan will include runtime node statistics from the fully executed statement. All side effects, including updates, will still occur. You might need to use a transaction to rollback any unwanted side effects.
While this is superficially the same syntax as PostgreSQL, the plan provided in the various formats is the same that has been provided by Teiid in prior versions.
For more information about how to interpret results, see Query plans.
Example
EXPLAIN (analyze) select * from really_complicated_view
Returns a text-formatted plan from an actual run of the given statement.