Overview
XPath functions
XPath functions specific to Lavoisier
Recommendations for writing efficient XPath expressions
Any XPath expression evaluated by an adaptor MUST be set in the @eval attribute (or @match in the case of processor adaptors) of the <parameter> element rather than in its text() node; otherwise it is treated as a constant value instead of an XPath expression.
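The fragment below is a minimal sketch of this rule. It reuses the StringConnector and its "content" parameter from the UTC-now view shown later in this document; the two-connector layout and the results described in the comments are assumptions for illustration, not captured Lavoisier output.

```xml
<!-- Sketch only: two alternative connectors, shown side by side for comparison. -->

<!-- Expression set in @eval: expected to be evaluated as XPath,
     so the content would be the string: Hello world -->
<connector type="StringConnector">
  <parameter name="content" eval="concat('Hello', ' ', 'world')"/>
</connector>

<!-- Same characters set in the text() node: expected to be kept as a constant,
     so the content would be the literal text: concat('Hello', ' ', 'world') -->
<connector type="StringConnector">
  <parameter name="content">concat('Hello', ' ', 'world')</parameter>
</connector>
```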
This enables Lavoisier to hide the complexity induced by the two evaluation contexts (one per view invocation and one per selected XML event). It also enables Lavoisier to optimize the execution of the XPath expression by splitting it into several sub-expressions, which are then distributed between these two contexts. In particular, function view() needs to be invoked only once per view invocation, while relative paths must be evaluated for each selected node.
The @match attribute always expects an absolute path, and the default path is /*. The supported values of a parameter depend on the type declared for that parameter:
ALLOWED CONTENT | absolute XPath parameter: @match='/...' | absolute XPath parameter: @match='eval()' | relative XPath parameter: @eval | relative XPath parameter: text() | XPath expression parameter: @eval | XPath expression parameter: text() | non-XPath parameter: @eval | non-XPath parameter: text() |
---|---|---|---|---|---|---|---|---|
/absolute | OK | / | / | / | / | / | / | / |
./relative | / | / | OK | / | OK | / | / | / |
(expr with ./relative) | / | / | / | / | OK | / | / | / |
(expr w/o ./relative) | / | OK | / | / | OK | / | OK | / |
constant | / | / | / | / | / | OK | / | OK |
The @match attribute of <processor> selects nodes from the input stream of XML events.
In order to save memory and CPU usage, Lavoisier implements its own XPath engine, which can evaluate absolute XPath expressions without building a large data structure in the most common use cases.
Evaluating XPath on a stream of XML events rather than on a data structure makes it possible to process large amounts of data, but it also implies some constraints.
When you are processing XML events, you cannot see the events that have not been read yet. As a consequence, accessing a future event requires that you force the Lavoisier XPath engine to read these events before it processes the selected node. You can do this by adding a predicate that refers to a future event, either on the selected node or on one of its ancestors, as shown in the following example:
<view name="xpath_record">
  <connector type="XMLConnector">
    <parameter name="content"><![CDATA[
      <root>
        <child><leaf id="?"/><id>one</id></child>
        <child><leaf id="?"/><id>two</id></child>
      </root>
    ]]></parameter>
  </connector>
  <processors>
    <processor match="/root/child[*]/leaf/@id" type="ReplaceProcessor">
      <parameter name="nodes" eval="parent::leaf/following-sibling::id/text()"/>
    </processor>
  </processors>
</view>
<XPath xmlns="http://www.w3.org/TR/xpath">
  <element depth="3" localName="leaf">
    <predicate>self::leaf/parent::child/parent::root[not(parent::*)]</predicate>
    <attribute localName="id"></attribute>
  </element>
</XPath>

<XPath xmlns="http://www.w3.org/TR/xpath">
  <tree nodes="self::child[child::*]/child::leaf/attribute::id" depth="2" localName="child">
    <predicate>self::child/parent::root[not(parent::*)]</predicate>
  </tree>
</XPath>
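To make the role of the predicate explicit, here is a minimal sketch of the same processor written with and without [*]. It reuses only the elements of the xpath_record view; the behavior described in the comments is inferred from the explanation in this section and is an assumption, not output captured from Lavoisier.

```xml
<!-- Without a predicate: events are expected to be handled one by one.
     When leaf/@id is selected, the following <id> sibling has not been
     read yet, so the relative path may find nothing to copy. -->
<processor match="/root/child/leaf/@id" type="ReplaceProcessor">
  <parameter name="nodes" eval="parent::leaf/following-sibling::id/text()"/>
</processor>

<!-- With [*] on the <child> ancestor: the engine is expected to read each
     <child> entirely before processing leaf/@id, so the <id> sibling is
     available to the relative path. -->
<processor match="/root/child[*]/leaf/@id" type="ReplaceProcessor">
  <parameter name="nodes" eval="parent::leaf/following-sibling::id/text()"/>
</processor>
```

Placing the predicate on the nearest ancestor that contains the future event (here <child> rather than <root>) presumably keeps the buffered portion of the stream as small as possible.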
Although Lavoisier tries to offer a homogeneous configuration language by supporting the same query language (XPath) in all contexts, not every function is available in every context.
ALLOWED FUNCTIONS | @eval | @match | <pre-renderers> | XSLTConnector |
---|---|---|---|---|
Core functions | OK | OK in predicates | OK | OK |
EXSLT functions | OK | OK in predicates | subset on server, not available on browser | subset |
Lavoisier functions | OK | OK in predicates | / | / |
Custom functions | OK | OK in predicates | / | / |
<view name="UTC-now" xmlns:date="http://exslt.org/dates-and-times">
  <argument name="format">yyyy-MM-dd HH:mm:ss z</argument>
  <variable name="decalage" eval="date:format-date(date:date-time(),'X')"/>
  <connector type="StringConnector">
    <parameter name="content" eval="date:format-date(
      date:add(
        date:date-time(),
        concat(choose(starts-with($decalage,'+'),'-',''),'PT',substring($decalage,2),'H')
      ),
      $format)"/>
  </connector>
  <serializer type="EncapsulateSerializer"/>
</view>
view('UTC-now/HH:mm:ss')
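The UTC-now view calls functions inside @eval. For @match, the table above allows functions only inside predicates. The fragment below is a hedged sketch of that case: it reuses the ReplaceProcessor and the sample data of the xpath_record view, and the predicate itself is a hypothetical addition for illustration.

```xml
<!-- Sketch only: a core XPath function used inside a predicate of @match.
     With the xpath_record sample data, the predicate is expected to keep
     only the <child> whose <id> starts with 't' (the one containing "two"). -->
<processor match="/root/child[starts-with(id, 't')]/leaf/@id" type="ReplaceProcessor">
  <parameter name="nodes" eval="parent::leaf/following-sibling::id/text()"/>
</processor>
```

Because this predicate reads the <id> element, which comes after <leaf> in the stream, it also plays the same look-ahead role as the [*] predicate of the xpath_record view.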
Since Lavoisier does not (yet) have an optimizer to modify its execution plan, the way you write your XPath expression may have a strong impact on performance. If you run into a performance issue, you may have to rewrite your XPath expression to make its evaluation more efficient.
Then you can reduce memory usage of your XPath expression by trying to:
You can also reduce CPU usage of your XPath expression by trying to:
The example below shows 7 different XPath expressions that give the same result but have different execution plans and therefore different efficiencies. Each expression is followed by its execution plan. Fortunately, the simplest XPath expression is often the most efficient one as well.
//*[local-name()='element']/son[@id='1']/parent::*/@*[starts-with(parent::*/@foo,'bar') and local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <tree nodes="/child::*[local-name() = 'element']/child::son[attribute::id = '1']/parent::*/attribute::*[starts-with(parent::*/attribute::foo,'bar') and local-name() = 'attr' and . > 3.0]" depth="1" localName="*"></tree>
</XPath>
/*/*/*[local-name()='element' and son/@id='1']/@*[starts-with(parent::*/@foo,'bar') and local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <tree nodes="self::*[local-name() = 'element' and child::son/attribute::id = '1']/attribute::*[starts-with(parent::*/attribute::foo,'bar') and local-name() = 'attr' and . > 3.0]" depth="3" localName="*">
    <predicate>self::*/parent::*/parent::*[not(parent::*)]</predicate>
  </tree>
</XPath>
//*[local-name()='element'][son/@id='1']/@*[starts-with(parent::*/@foo,'bar') and local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <tree nodes="self::*[local-name() = 'element'][child::son/attribute::id = '1']/attribute::*[starts-with(parent::*/attribute::foo,'bar') and local-name() = 'attr' and . > 3.0]" depth="1" localName="*">
    <predicate>self::*[not(parent::*)]</predicate>
    <predicate>local-name() = 'element'</predicate>
  </tree>
</XPath>
//*[local-name()='element']/@*[starts-with(parent::*/@foo,'bar') and local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <element depth="1" localName="*">
    <predicate>self::*[not(parent::*)]</predicate>
    <predicate>local-name() = 'element'</predicate>
    <attribute localName="*">
      <predicate>starts-with(parent::*/attribute::foo,'bar') and local-name() = 'attr' and . > 3.0</predicate>
    </attribute>
  </element>
</XPath>
/*/*/*[local-name()='element']/@*[starts-with(parent::*/@foo,'bar') and local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <element depth="3" localName="*">
    <predicate>self::*/parent::*/parent::*[not(parent::*)]</predicate>
    <predicate>local-name() = 'element'</predicate>
    <attribute localName="*">
      <predicate>starts-with(parent::*/attribute::foo,'bar') and local-name() = 'attr' and . > 3.0</predicate>
    </attribute>
  </element>
</XPath>
/*/*/*[local-name()='element' and starts-with(@foo,'bar')]/@*[local-name()='attr' and .>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <element depth="3" localName="*">
    <predicate>self::*/parent::*/parent::*[not(parent::*)]</predicate>
    <predicate>local-name() = 'element' and starts-with(attribute::foo,'bar')</predicate>
    <attribute localName="*">
      <predicate>local-name() = 'attr' and . > 3.0</predicate>
    </attribute>
  </element>
</XPath>
/*/*/element[starts-with(@foo,'bar')]/@attr[.>3]
<XPath xmlns="http://www.w3.org/TR/xpath">
  <element depth="3" localName="element">
    <predicate>self::element/parent::*/parent::*[not(parent::*)]</predicate>
    <predicate>starts-with(attribute::foo,'bar')</predicate>
    <attribute localName="attr">
      <predicate>. > 3.0</predicate>
    </attribute>
  </element>
</XPath>