Chapter 32. The XPath Language
Abstract
When processing XML messages, the XPath language enables you to select part of a message, by specifying an XPath expression that acts on the message’s Document Object Model (DOM). You can also define XPath predicates to test the contents of an element or an attribute.
32.1. Java DSL
Basic expressions
You can use xpath("Expression")
to evaluate an XPath expression on the current exchange (where the XPath expression is applied to the body of the current In message). The result of the xpath()
expression is an XML node (or node set, if more than one node matches).
For example, to extract the contents of the /person/name
element from the current In message body and use it to set a header named user
, you could define a route like the following:
from("queue:foo")
    .setHeader("user", xpath("/person/name/text()"))
    .to("direct:tie");
Instead of specifying xpath()
as an argument to setHeader()
, you can use the fluent builder xpath()
command — for example:
from("queue:foo")
    .setHeader("user").xpath("/person/name/text()")
    .to("direct:tie");
If you want to convert the result to a specific type, specify the result type as the second argument of xpath()
. For example, to specify explicitly that the result type is String
:
xpath("/person/name/text()", String.class)
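To see what the expression itself selects, independently of any route, the following minimal sketch evaluates the same location path using the JDK's standard javax.xml.xpath API rather than the Apache Camel xpath() builder. The class name, method name, and sample message body are assumptions made for this illustration only.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class PersonNameExample {

    // Evaluates an XPath 1.0 expression against an XML string and
    // returns the string value of the result.
    static String evalString(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample body, similar to what a message on queue:foo might carry.
        String body = "<person><name>Jane</name><city>London</city></person>";
        System.out.println(evalString("/person/name/text()", body)); // prints "Jane"
    }
}
```

In a route, the equivalent string would become the value of the user header.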
Namespaces
Typically, XML elements belong to a schema, which is identified by a namespace URI. When processing documents like this, it is necessary to associate namespace URIs with prefixes, so that you can identify element names unambiguously in your XPath expressions. Apache Camel provides the helper class, org.apache.camel.builder.xml.Namespaces
, which enables you to define associations between namespaces and prefixes.
For example, to associate the prefix, cust
, with the namespace, http://acme.com/customer/record
, and then extract the contents of the element, /cust:person/cust:name
, you could define a route like the following:
import org.apache.camel.builder.xml.Namespaces;
...
Namespaces ns = new Namespaces("cust", "http://acme.com/customer/record");
from("queue:foo")
.setHeader("user", xpath("/cust:person/cust:name/text()", ns))
.to("direct:tie");
Where you make the namespace definitions available to the xpath()
expression builder by passing the Namespaces
object, ns
, as an additional argument. If you need to define multiple namespaces, use the Namespaces.add()
method, as follows:
import org.apache.camel.builder.xml.Namespaces;
...
Namespaces ns = new Namespaces("cust", "http://acme.com/customer/record");
ns.add("inv", "http://acme.com/invoice");
ns.add("xsi", "http://www.w3.org/2001/XMLSchema-instance");
If you need to specify the result type and define namespaces, you can use the three-argument form of xpath()
, as follows:
xpath("/person/name/text()", String.class, ns)
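The key point about prefixes is that matching is done on the namespace URI, not on the prefix text. The sketch below illustrates this with the JDK's javax.xml.xpath API and a hand-written NamespaceContext, which plays roughly the role that the Namespaces helper plays in Apache Camel. The class name, the method name, and the sample document (which deliberately uses the prefix c instead of cust) are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class NamespaceLookupExample {

    // Evaluates an expression whose prefixes are resolved through prefixToUri.
    static String evalWithNamespaces(String expression, String xml,
                                     Map<String, String> prefixToUri) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for namespace-qualified matching
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                }
                @Override
                public String getPrefix(String uri) {
                    return null; // reverse lookup is not needed for evaluation
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<c:person xmlns:c='http://acme.com/customer/record'>"
                + "<c:name>Jane</c:name></c:person>";
        Map<String, String> ns =
                Collections.singletonMap("cust", "http://acme.com/customer/record");
        // The document says "c", the expression says "cust"; both map to the same URI.
        System.out.println(evalWithNamespaces("/cust:person/cust:name/text()", xml, ns)); // prints "Jane"
    }
}
```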
Auditing namespaces
One of the most frequent problems that can occur when using XPath expressions is that there is a mismatch between the namespaces appearing in the incoming messages and the namespaces used in the XPath expression. To help you troubleshoot this kind of problem, the XPath language supports an option to dump all of the namespaces from all of the incoming messages into the system log.
To enable namespace logging at the INFO
log level, enable the logNamespaces
option in the Java DSL, as follows:
xpath("/foo:person/@id", String.class).logNamespaces()
Alternatively, you could configure your logging system to enable TRACE
level logging on the org.apache.camel.builder.xml.XPathBuilder
logger.
When namespace logging is enabled, you will see log messages like the following for each processed message:
2012-01-16 13:23:45,878 [stSaxonWithFlag] INFO XPathBuilder - Namespaces discovered in message: {xmlns:a=[http://apache.org/camel], DEFAULT=[http://apache.org/default], xmlns:b=[http://apache.org/camelA, http://apache.org/camelB]}
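If you want a similar summary outside of a running route (for example, in a unit test that inspects a captured message), a comparable map can be built by walking the DOM. The following is only a sketch using the JDK's DOM API: it is not how Apache Camel implements logNamespaces, and the class name, method names, and sample document are assumptions.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical diagnostic helper; not Camel's logNamespaces implementation.
final class NamespaceAudit {

    // Returns every namespace declaration in the document, keyed by "DEFAULT"
    // or "xmlns:prefix", together with all URIs seen for that key.
    static Map<String, Set<String>> discover(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, Set<String>> found = new TreeMap<>();
            collect(doc.getDocumentElement(), found);
            return found;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Records the xmlns declarations on this element, then recurses into child elements.
    private static void collect(Element element, Map<String, Set<String>> found) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            String name = attribute.getNodeName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                String key = name.equals("xmlns") ? "DEFAULT" : name;
                found.computeIfAbsent(key, k -> new TreeSet<>())
                        .add(attribute.getNodeValue());
            }
        }
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, found);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<a:root xmlns:a='http://apache.org/camel' xmlns='http://apache.org/default'>"
                + "<b:x xmlns:b='http://apache.org/camelA'/>"
                + "<b:y xmlns:b='http://apache.org/camelB'/></a:root>";
        System.out.println(discover(xml));
    }
}
```

A prefix that maps to more than one URI, such as xmlns:b in this sample, is a common source of expressions that match some messages but not others.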
32.2. XML DSL
Basic expressions
To evaluate an XPath expression in the XML DSL, put the XPath expression inside an xpath
element. The XPath expression is applied to the body of the current In message and returns an XML node (or node set). Typically, the returned XML node is automatically converted to a string.
For example, to extract the contents of the /person/name
element from the current In message body and use it to set a header named user
, you could define a route like the following:
<beans ...>
  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="queue:foo"/>
      <setHeader headerName="user">
        <xpath>/person/name/text()</xpath>
      </setHeader>
      <to uri="direct:tie"/>
    </route>
  </camelContext>
</beans>
If you want to convert the result to a specific type, specify the result type by setting the resultType
attribute to a Java type name (normally the fully-qualified type name). For example, to specify explicitly that the result type is java.lang.String
(for this type, you can omit the java.lang.
prefix):
<xpath resultType="String">/person/name/text()</xpath>
Namespaces
When processing documents whose elements belong to one or more XML schemas, it is typically necessary to associate namespace URIs with prefixes, so that you can identify element names unambiguously in your XPath expressions. It is possible to use the standard XML mechanism for associating prefixes with namespace URIs. That is, you can set an attribute like this: xmlns:Prefix="NamespaceURI"
.
For example, to associate the prefix, cust
, with the namespace, http://acme.com/customer/record
, and then extract the contents of the element, /cust:person/cust:name
, you could define a route like the following:
<beans ...>
<camelContext xmlns="http://camel.apache.org/schema/spring"
xmlns:cust="http://acme.com/customer/record" >
<route>
<from uri="queue:foo"/>
<setHeader headerName="user">
<xpath>/cust:person/cust:name/text()</xpath>
</setHeader>
<to uri="direct:tie"/>
</route>
</camelContext>
</beans>
Auditing namespaces
One of the most frequent problems that can occur when using XPath expressions is that there is a mismatch between the namespaces appearing in the incoming messages and the namespaces used in the XPath expression. To help you troubleshoot this kind of problem, the XPath language supports an option to dump all of the namespaces from all of the incoming messages into the system log.
To enable namespace logging at the INFO
log level, enable the logNamespaces
option in the XML DSL, as follows:
<xpath logNamespaces="true" resultType="String">/foo:person/@id</xpath>
Alternatively, you could configure your logging system to enable TRACE
level logging on the org.apache.camel.builder.xml.XPathBuilder
logger.
When namespace logging is enabled, you will see log messages like the following for each processed message:
2012-01-16 13:23:45,878 [stSaxonWithFlag] INFO XPathBuilder - Namespaces discovered in message: {xmlns:a=[http://apache.org/camel], DEFAULT=[http://apache.org/default], xmlns:b=[http://apache.org/camelA, http://apache.org/camelB]}
32.3. XPath Injection
Parameter binding annotation
When using Apache Camel bean integration to invoke a method on a Java bean, you can use the @XPath
annotation to extract a value from the exchange and bind it to a method parameter.
For example, consider the following route fragment, which invokes the credit
method on an AccountService
object:
from("queue:payments")
    .beanRef("accountService","credit")
    ...
The credit
method uses parameter binding annotations to extract relevant data from the message body and inject it into its parameters, as follows:
public class AccountService {
    ...
    public void credit(
        @XPath("/transaction/transfer/receiver/text()") String name,
        @XPath("/transaction/transfer/amount/text()") String amount
    ) {
        ...
    }
    ...
}
For more information, see Bean Integration in the Apache Camel Development Guide on the customer portal.
Namespaces
Table 32.1, “Predefined Namespaces for @XPath” shows the namespaces that are predefined for XPath. You can use these namespace prefixes in the XPath
expression that appears in the @XPath
annotation.
Namespace URI | Prefix |
---|---|
| |
|
Custom namespaces
You can use the @NamespacePrefix
annotation to define custom XML namespaces. Invoke the @NamespacePrefix
annotation to initialize the namespaces
argument of the @XPath
annotation. The namespaces defined by @NamespacePrefix
can then be used in the @XPath
annotation’s expression value.
For example, to associate the prefix, ex
, with the custom namespace, http://fusesource.com/examples
, invoke the @XPath
annotation as follows:
public class AccountService {
    ...
    public void credit(
        @XPath(
            value = "/ex:transaction/ex:transfer/ex:receiver/text()",
            namespaces = @NamespacePrefix(
                prefix = "ex",
                uri = "http://fusesource.com/examples"
            )
        ) String name,
        @XPath(
            value = "/ex:transaction/ex:transfer/ex:amount/text()",
            namespaces = @NamespacePrefix(
                prefix = "ex",
                uri = "http://fusesource.com/examples"
            )
        ) String amount
    ) {
        ...
    }
    ...
}
32.4. XPath Builder
Overview
The org.apache.camel.builder.xml.XPathBuilder
class enables you to evaluate XPath expressions independently of an exchange. That is, if you have an XML fragment from any source, you can use XPathBuilder
to evaluate an XPath expression on the XML fragment.
Matching expressions
Use the matches()
method to check whether one or more XML nodes can be found that match the given XPath expression. The basic syntax for matching an XPath expression using XPathBuilder
is as follows:
boolean matches = XPathBuilder
    .xpath("Expression")
    .matches(CamelContext, "XMLString");
Where the given expression, Expression, is evaluated against the XML fragment, XMLString, and the result is true if at least one node matches the expression. For example, the following call returns true
, because the XPath expression finds a match in the xyz
attribute.
boolean matches = XPathBuilder
    .xpath("/foo/bar/@xyz")
    .matches(getContext(), "<foo><bar xyz='cheese'/></foo>");
Evaluating expressions
Use the evaluate()
method to return the contents of the first node that matches the given XPath expression. The basic syntax for evaluating an XPath expression using XPathBuilder
is as follows:
String nodeValue = XPathBuilder
    .xpath("Expression")
    .evaluate(CamelContext, "XMLString");
You can also specify the result type by passing the required type as the second argument to evaluate()
— for example:
String name = XPathBuilder
    .xpath("foo/bar")
    .evaluate(context, "<foo><bar>cheese</bar></foo>", String.class);

Integer number = XPathBuilder
    .xpath("foo/bar")
    .evaluate(context, "<foo><bar>123</bar></foo>", Integer.class);

Boolean bool = XPathBuilder
    .xpath("foo/bar")
    .evaluate(context, "<foo><bar>true</bar></foo>", Boolean.class);
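For comparison, the sketch below performs a roughly analogous typed evaluation with the JDK's javax.xml.xpath API alone. Because XPath 1.0 converts a non-empty node set to the boolean true regardless of its text, this sketch first takes the string value of the node and then converts it in Java. That is an assumption about how to approximate the result shown above, not a description of Apache Camel's type converters. All class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class TypedEvaluateExample {

    // Returns the string value of the first node matched by the expression.
    static String asString(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Converts the matched text to an int in Java.
    static int asInt(String expression, String xml) {
        return Integer.parseInt(asString(expression, xml).trim());
    }

    // Converts the matched text ("true" or "false") to a boolean in Java.
    static boolean asBoolean(String expression, String xml) {
        return Boolean.parseBoolean(asString(expression, xml).trim());
    }

    public static void main(String[] args) {
        System.out.println(asString("foo/bar", "<foo><bar>cheese</bar></foo>")); // prints "cheese"
        System.out.println(asInt("foo/bar", "<foo><bar>123</bar></foo>"));       // prints 123
        System.out.println(asBoolean("foo/bar", "<foo><bar>true</bar></foo>"));  // prints true
    }
}
```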
32.5. Enabling Saxon
Prerequisites
A prerequisite for using the Saxon parser is that you add a dependency on the camel-saxon
artifact (either adding this dependency to your Maven POM, if you use Maven, or adding the camel-saxon-2.23.2.fuse-7_10_0-00018-redhat-00001.jar
file to your classpath, otherwise).
Using the Saxon parser in Java DSL
In Java DSL, the simplest way to enable the Saxon parser is to call the saxon()
fluent builder method. For example, you could invoke the Saxon parser as shown in the following example:
// Java
// create a builder to evaluate the xpath using saxon
XPathBuilder builder = XPathBuilder.xpath("tokenize(/foo/bar, '_')[2]").saxon();

// evaluate as a String result
String result = builder.evaluate(context, "<foo><bar>abc_def_ghi</bar></foo>");
Using the Saxon parser in XML DSL
In XML DSL, the simplest way to enable the Saxon parser is to set the saxon
attribute to true in the xpath
element. For example, you could invoke the Saxon parser as shown in the following example:
<xpath saxon="true" resultType="java.lang.String">current-dateTime()</xpath>
Programming with Saxon
If you want to use the Saxon XML parser in your application code, you can create an instance of the Saxon transformer factory explicitly using the following code:
// Java
import javax.xml.transform.TransformerFactory;
import net.sf.saxon.TransformerFactoryImpl;
...
TransformerFactory saxonFactory = new net.sf.saxon.TransformerFactoryImpl();
On the other hand, if you prefer to use the generic JAXP API to create a transformer factory instance, you must first set the javax.xml.transform.TransformerFactory
property in the ESBInstall/etc/system.properties
file, as follows:
javax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl
You can then instantiate the Saxon factory using the generic JAXP API, as follows:
// Java
import javax.xml.transform.TransformerFactory;
...
TransformerFactory factory = TransformerFactory.newInstance();
If your application depends on any third-party libraries that use Saxon, it might be necessary to use the second, generic approach.
The Saxon library must be installed in the container as the OSGi bundle, net.sf.saxon/saxon9he
(normally installed by default). In versions of Fuse ESB prior to 7.1, it is not possible to load Saxon using the generic JAXP API.
32.6. Expressions
Result type
By default, an XPath expression returns a list of one or more XML nodes, of org.w3c.dom.NodeList
type. You can use the type converter mechanism to convert the result to a different type, however. In the Java DSL, you can specify the result type in the second argument of the xpath()
command. For example, to return the result of an XPath expression as a String
:
xpath("/person/name/text()", String.class)
In the XML DSL, you can specify the result type in the resultType
attribute, as follows:
<xpath resultType="java.lang.String">/person/name/text()</xpath>
Patterns in location paths
You can use the following patterns in XPath location paths:
/people/person
- The basic location path specifies the nested location of a particular element. That is, the preceding location path would match the person element in the following XML fragment:
  <people>
    <person>...</person>
  </people>
  Note that this basic pattern can match multiple nodes (for example, if there is more than one person element inside the people element).
/name/text()
- If you just want to access the text inside the element, append /text() to the location path. Otherwise, the node includes the element’s start and end tags (and these tags would be included when you convert the node to a string).
/person/telephone/@isDayTime
- To select the value of an attribute, AttributeName, use the syntax @AttributeName. For example, the preceding location path returns true when applied to the following XML fragment:
  <person>
    <telephone isDayTime="true">1234567890</telephone>
  </person>
*
- A wildcard that matches all elements in the specified scope. For example, /people/person/* matches all the child elements of person.
@*
- A wildcard that matches all attributes of the matched elements. For example, /person/name/@* matches all attributes of every matched name element.
//
- Match the location path at every nesting level. For example, the //name pattern matches every name element in the following XML fragment:
  <invoice>
    <person>
      <name .../>
    </person>
  </invoice>
  <person>
    <name .../>
  </person>
  <name .../>
..
- Selects the parent of the current context node. Not normally useful in the Apache Camel XPath language, because the current context node is the document root, which has no parent.
node()
- Match any kind of node.
text()
- Match a text node.
comment()
- Match a comment node.
processing-instruction()
- Match a processing-instruction node.
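The patterns above can be tried out directly with the JDK's javax.xml.xpath API. The following sketch counts the nodes matched by several of them. The class name, method names, and sample document are assumptions made for this illustration, and the evaluation is done by the JDK rather than by Apache Camel.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class LocationPathExample {

    // Returns how many nodes the location path selects.
    static int count(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Returns the string value of the first selected node.
    static String value(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people>"
                + "<person><name>Ann</name><telephone isDayTime='true'>1234567890</telephone></person>"
                + "<person><name>Bob</name></person>"
                + "</people>";
        System.out.println(count("/people/person", xml));                       // prints 2
        System.out.println(count("/people/person/*", xml));                     // prints 3
        System.out.println(count("//name", xml));                               // prints 2
        System.out.println(value("/people/person/telephone/@isDayTime", xml));  // prints "true"
    }
}
```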
Predicate filters
You can filter the set of nodes matching a location path by appending a predicate in square brackets, [Predicate]
. For example, you can select the Nth node from the list of matches by appending [N]
to a location path. The following expression selects the first matching person
element:
/people/person[1]
The following expression selects the second-last person
element:
/people/person[last()-1]
You can test the value of attributes in order to select elements with particular attribute values. The following expression selects the name
elements, whose surname
attribute is either Strachan or Davies:
/person/name[@surname="Strachan" or @surname="Davies"]
You can combine predicate expressions using any of the conjunctions and, or, not(), and you can compare expressions using the comparators =, !=, >, >=, <, <= (in practice, when the expression is embedded in an XML file, the less-than symbol must be replaced by the &lt; entity). You can also use XPath functions in the predicate filter.
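The predicate filters described above can be checked against a small document with the JDK's javax.xml.xpath API, as in the following sketch. The class name, method names, and sample data (three person elements with placeholder names A, B, and C) are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class PredicateFilterExample {

    // Returns the string value of the first node selected by the expression.
    static String value(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Returns how many nodes the expression selects.
    static int count(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people>"
                + "<person><name surname='Strachan'>A</name></person>"
                + "<person><name surname='Davies'>B</name></person>"
                + "<person><name surname='Smith'>C</name></person>"
                + "</people>";
        System.out.println(value("/people/person[1]/name", xml));        // prints "A"
        System.out.println(value("/people/person[last()-1]/name", xml)); // prints "B"
        System.out.println(count(
                "/people/person/name[@surname='Strachan' or @surname='Davies']", xml)); // prints 2
    }
}
```

Note that XPath positions start at 1, so [1] selects the first match and [last()-1] selects the second-last.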
Axes
When you consider the structure of an XML document, the root element contains a sequence of children, and some of those child elements contain further children, and so on. Looked at in this way, where nested elements are linked together by the child-of relationship, the whole XML document has the structure of a tree. Now, if you choose a particular node in this element tree (call it the context node), you might want to refer to different parts of the tree relative to the chosen node. For example, you might want to refer to the children of the context node, to the parent of the context node, or to all of the nodes that share the same parent as the context node (sibling nodes).
An XPath axis is used to specify the scope of a node match, restricting the search to a particular part of the node tree, relative to the current context node. The axis is attached as a prefix to the node name that you want to match, using the syntax, AxisType::MatchingNode
. For example, you can use the child::
axis to search the children of the current context node, as follows:
/invoice/items/child::item
The context node of child::item
is the items
element that is selected by the path, /invoice/items
. The child::
axis restricts the search to the children of the context node, items
, so that child::item
matches the children of items
that are named item
. As a matter of fact, the child::
axis is the default axis, so the preceding example can be written equivalently as:
/invoice/items/item
There are several other axes (13 in all), some of which you have already seen in abbreviated form: @
is an abbreviation of attribute::
, and //
is an abbreviation of /descendant-or-self::node()/
. The full list of axes is as follows (for details consult the reference below):
- ancestor
- ancestor-or-self
- attribute
- child
- descendant
- descendant-or-self
- following
- following-sibling
- namespace
- parent
- preceding
- preceding-sibling
- self
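To get a feel for how a few of these axes behave, the following sketch evaluates several axis expressions against a small invoice document using the JDK's javax.xml.xpath API. The class name, method names, and sample document are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class AxisExample {

    // Returns how many nodes the expression selects.
    static int count(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Returns the string value of the expression result.
    static String value(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<invoice><items>"
                + "<item>a</item><item>b</item><item>c</item>"
                + "</items></invoice>";
        // child:: is the default axis, so both paths select the same three nodes.
        System.out.println(count("/invoice/items/child::item", xml)); // prints 3
        System.out.println(count("/invoice/items/item", xml));        // prints 3
        // Siblings that follow the first item.
        System.out.println(count("/invoice/items/item[1]/following-sibling::item", xml)); // prints 2
        // On a reverse axis, position 1 is the nearest node.
        System.out.println(value("/invoice/items/item[3]/preceding-sibling::item[1]", xml)); // prints "b"
        // Name of the parent element of the first item.
        System.out.println(value("name(/invoice/items/item[1]/parent::*)", xml)); // prints "items"
    }
}
```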
Functions
XPath provides a small set of standard functions, which can be useful when evaluating predicates. For example, to select the last matching node from a node set, you can use the last() function, which returns the index of the last node in a node set, as follows:
/people/person[last()]
Where the preceding example selects the last person
element in a sequence (in document order).
For full details of all the functions that XPath provides, consult the reference below.
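As a brief illustration, the sketch below exercises a few standard XPath 1.0 functions (last(), count(), contains(), and string-length()) using the JDK's javax.xml.xpath API. The class name, method names, and sample document are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class FunctionExample {

    // Evaluates the expression and returns its result as an XPath number.
    static double number(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Double) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Evaluates the expression and returns its result as a string.
    static String text(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people>"
                + "<person><name>Ann</name></person>"
                + "<person><name>Bob</name></person>"
                + "<person><name>Cy</name></person>"
                + "</people>";
        System.out.println(text("/people/person[last()]/name", xml));              // prints "Cy"
        System.out.println(number("count(/people/person)", xml));                  // prints 3.0
        System.out.println(text("/people/person[contains(name, 'o')]/name", xml)); // prints "Bob"
        System.out.println(number("string-length(/people/person[1]/name)", xml));  // prints 3.0
    }
}
```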
Reference
For full details of the XPath grammar, see the XML Path Language, Version 1.0 specification.
32.7. Predicates
Basic predicates
You can use xpath
in the Java DSL or the XML DSL in a context where a predicate is expected — for example, as the argument to a filter()
processor or as the argument to a when()
clause.
For example, the following route filters incoming messages, allowing a message to pass, only if the /person/city
element contains the value, London
:
from("direct:tie")
    .filter().xpath("/person/city = 'London'").to("file:target/messages/uk");
The following route evaluates the XPath predicate in a when()
clause:
from("direct:tie")
    .choice()
        .when(xpath("/person/city = 'London'")).to("file:target/messages/uk")
        .otherwise().to("file:target/messages/others");
XPath predicate operators
The XPath language supports the standard XPath predicate operators, as shown in Table 32.2, “Operators for the XPath Language”.
Operator | Description |
---|---|
= | Equals. |
!= | Not equal to. |
> | Greater than. |
>= | Greater than or equal to. |
< | Less than. |
<= | Less than or equal to. |
and | Combine two predicates with logical and. |
or | Combine two predicates with logical inclusive or. |
not() | Negate predicate argument. |
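The following sketch shows how predicates built from these operators evaluate to true or false. It uses the JDK's javax.xml.xpath API with a boolean result type, which is only an approximation of how a filter() or when() clause would decide in a route. The class name, method name, and sample message are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class PredicateExample {

    // Evaluates the expression as an XPath boolean against the given XML.
    static boolean matches(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Boolean) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<person><city>London</city><age>30</age></person>";
        System.out.println(matches("/person/city = 'London'", xml));                       // prints true
        System.out.println(matches("/person/city != 'London'", xml));                      // prints false
        System.out.println(matches("/person/age >= 18 and /person/city = 'London'", xml)); // prints true
        System.out.println(matches("not(/person/city = 'Paris')", xml));                   // prints true
    }
}
```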
32.8. Using Variables and Functions
Evaluating variables in a route
When evaluating XPath expressions inside a route, you can use XPath variables to access the contents of the current exchange, as well as O/S environment variables and Java system properties. The syntax to access a variable value is $VarName
or $Prefix:VarName
, if the variable is accessed through an XML namespace.
For example, you can access the In message’s body as $in:body
and the In message’s header value as $in:HeaderName
. O/S environment variables can be accessed as $env:EnvVar
and Java system properties can be accessed as $system:SysVar
.
In the following example, the first route extracts the value of the /person/city
element and inserts it into the city
header. The second route filters exchanges using the XPath expression, $in:city = 'London'
, where the $in:city
variable is replaced by the value of the city
header.
from("file:src/data?noop=true")
    .setHeader("city").xpath("/person/city/text()")
    .to("direct:tie");

from("direct:tie")
    .filter().xpath("$in:city = 'London'").to("file:target/messages/uk");
Evaluating functions in a route
In addition to the standard XPath functions, the XPath language defines additional functions. These additional functions (which are listed in Table 32.4, “XPath Custom Functions”) can be used to access the underlying exchange, to evaluate a simple expression or to look up a property in the Apache Camel property placeholder component.
For example, the following example uses the in:header()
function and the in:body()
function to access a header and the body from the underlying exchange:
from("direct:start").choice()
    .when().xpath("in:header('foo') = 'bar'").to("mock:x")
    .when().xpath("in:body() = '<two/>'").to("mock:y")
    .otherwise().to("mock:z");
Notice the similarity between these functions and the corresponding $in:HeaderName
or $in:body
variables. The functions have a slightly different syntax, however: in:header('HeaderName')
instead of $in:HeaderName
; and in:body()
instead of $in:body
.
Evaluating variables in XPathBuilder
You can also use variables in expressions that are evaluated using the XPathBuilder
class. In this case, you cannot use variables such as $in:body
or $in:HeaderName
, because there is no exchange object to evaluate against. But you can use variables that are defined inline using the variable(Name, Value)
fluent builder method.
For example, the following XPathBuilder construction evaluates the $test
variable, which is defined to have the value, London
:
String var = XPathBuilder.xpath("$test")
    .variable("test", "London")
    .evaluate(getContext(), "<name>foo</name>");
Note that variables defined in this way are automatically entered into the global namespace (for example, the variable, $test
, uses no prefix).
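The same idea of binding an unprefixed variable can be sketched with the JDK's javax.xml.xpath API, where an XPathVariableResolver supplies the values. This is an analogy to the variable() builder method rather than a description of how Apache Camel resolves variables. The class name, method name, and map-based lookup are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Apache Camel.
final class VariableExample {

    // Evaluates the expression, resolving $Name references from the given map.
    static String evalWithVariables(String expression, String xml, Map<String, ?> variables) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Look up each variable by its local name (no namespace prefix).
        xpath.setXPathVariableResolver(name -> variables.get(name.getLocalPart()));
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> vars = Collections.singletonMap("test", "London");
        System.out.println(evalWithVariables("$test", "<name>foo</name>", vars)); // prints "London"
        System.out.println(
                evalWithVariables("concat($test, '-', /name)", "<name>foo</name>", vars)); // prints "London-foo"
    }
}
```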
32.9. Variable Namespaces
Table of namespaces
Table 32.3, “XPath Variable Namespaces” shows the namespace URIs that are associated with the various namespace prefixes.
Namespace URI | Prefix | Description |
---|---|---|
None | Default namespace (associated with variables that have no namespace prefix). | |
| Used to reference header or body of the current exchange’s In message. | |
| Used to reference header or body of the current exchange’s Out message. | |
| Used to reference some custom functions. | |
| Used to reference O/S environment variables. | |
| Used to reference Java system properties. | |
Undefined | Used to reference exchange properties. You must define your own prefix for this namespace. |
32.10. Function Reference
Table of custom functions
Table 32.4, “XPath Custom Functions” shows the custom functions that you can use in Apache Camel XPath expressions. These functions can be used in addition to the standard XPath functions.
Function | Description |
---|---|
| Returns the In message body. |
| Returns the In message header with name, HeaderName. |
| Returns the Out message body. |
| Returns the Out message header with name, HeaderName. |
| Looks up a property with the key, PropKey. |
| Evaluates the specified simple expression, SimpleExp. |