Main concepts
Rules attributes
Shortcut notations
The XML Template Language of Lavoisier transforms an input XML data stream into a new output XML data stream by means of declarative rules rather than imperative instructions. This makes your application easier to maintain. It also improves performance, because Lavoisier can process the data stream on the fly without building large in-memory data structures.
An XML Template is itself written in XML, following the syntax described below.
An XML Template is a tree of rules that follows the hierarchy of the nodes in the input XML data stream. You can chain several templates under the <processors> tag.
The core of the XML template language is composed of a few keywords:
The 13 rule types are obtained by combining these keywords:
| | create | ignore | keep/update |
|---|---|---|---|
| element | <element-create> <element-create-as-parent> | <element-ignore> | <element> |
| attribute | <attribute-create> | <attribute-ignore> | <attribute> |
| text | <text-create> | <text-ignore> | <text> |
| comment | <comment-create> | <comment-ignore> | <comment> |
As in XML itself, the <element-*> rules (except <element-create>) can contain child rules, whereas the other rules (<attribute-*>, <text-*>, <comment-*>) cannot.
In each rule, the predicate and the value of created or modified nodes are obtained by evaluating an expression written in the XPath language.
XPath lets you define a path into an XML document, much as you would describe a path in a file system:
    cd /usr/local/lib
    cd ../include
    ls .
The main differences with file system paths are:
These few differences from file system paths make XPath a far more powerful language than you may think!
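To give a feel for what this means in practice, here is a small sketch on a hypothetical document (the names `library`, `book`, `lang` and `title` are made up for illustration). The expressions listed in the comment are standard XPath 1.0, and each one goes beyond what a plain file system path can express.

```xml
<!--
  XPath expressions evaluated against the document below:

  /library/book                    selects both <book> elements (one step can match several nodes)
  /library/book[@lang='en']/title  selects only the <title> of the English book (predicate)
  /library/book/@lang              selects the two lang attributes
  //title                          selects every <title>, at any depth
  count(/library/book)             evaluates to the number 2
-->
<library>
  <book lang="en"><title>First</title></book>
  <book lang="fr"><title>Second</title></book>
</library>
```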
The relative position of a rule is the same as the relative position of its matching nodes. Consequently, the context of a rule is its matching node, and any XPath expression defined in a rule is relative to this matching node (unless it starts with the character '/', of course).
As a consequence, accessing the child nodes of the parent of the context node requires first navigating up to this parent element (..). This also applies to the nodes being created:
    <element in="root">
      <attribute in="anAttributeOfRoot">../text() + 1</attribute>
      <attribute-create out="anotherAttributeOfRoot">../text() + 2</attribute-create>
    </element>
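As a worked illustration, the same template is annotated below with a hypothetical input. This is a sketch of the behaviour expected from the description above, not output captured from Lavoisier. It assumes standard XPath 1.0 arithmetic and number-to-string conversion.

```xml
<!--
  Hypothetical input:
    <root anAttributeOfRoot="0">10</root>

  Expected result under the assumptions above:
    anAttributeOfRoot      is set to 10 + 1 = 11
    anotherAttributeOfRoot is created with the value 10 + 2 = 12
-->
<element in="root">
  <!-- Context is the attribute node, so ".." is <root> and "../text()" is its text, 10 -->
  <attribute in="anAttributeOfRoot">../text() + 1</attribute>
  <!-- Same reasoning for a created attribute: ".." is the <root> element that will own it -->
  <attribute-create out="anotherAttributeOfRoot">../text() + 2</attribute-create>
</element>
```

Writing `text() + 1` without the leading `../` would look for a text node under the attribute itself, which does not exist.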
Relative paths are of course not restricted to the parent axis. Any XPath axis can be used, although some of them (in particular the "preceding" and "following" axes) may significantly increase the size of the data structure that Lavoisier needs in order to execute the rule:
    <element in="root">
      <element in="node" out="hasDescendantXXX" if="descendant::XXX"/>
    </element>
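The sketch below shows what this rule would be expected to do on a hypothetical input. It assumes that nodes without a matching rule, including the children of a renamed element, are copied to the output unchanged.

```xml
<!--
  Hypothetical input:
    <root>
      <node><a><XXX/></a></node>
      <node><b/></node>
    </root>

  Expected output under the assumption above:
    <root>
      <hasDescendantXXX><a><XXX/></a></hasDescendantXXX>
      <node><b/></node>
    </root>
-->
<element in="root">
  <!-- Only the first <node> has an <XXX> somewhere below it, so only that one is renamed -->
  <element in="node" out="hasDescendantXXX" if="descendant::XXX"/>
</element>
```

Note that the condition can only be decided once the descendants of `node` have been read, which is why such axes can increase memory use.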
This context is kept unchanged within the current template. Although the processing is done on the fly, the result of executing a rule (i.e. the modified context) only becomes visible after the template, for example in the next template or adaptor.
In other words, a rule does not modify the context of the other rules of the same template:
    <element in="root">
      <attribute in="anAttributeOfRoot">'newValue'</attribute>
      <element in="node">
        <!-- will take the old value of attribute @anAttributeOfRoot
             rather than its new value 'newValue' -->
        <attribute in="anAttributeOfNode">ancestor::root/@anAttributeOfRoot</attribute>
      </element>
    </element>
As a consequence, writing a rule that gets data from node(s) removed within the same template makes sense, and it will work:
    <element in="root">
      <element-ignore in="node"/>
      <element-create>new_element('new', ../node/@anAttributeOfNode)</element-create>
    </element>
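Conversely, when a rule should see a value written by another rule, one option suggested by the description above is to split the work into two chained templates, since a template only sees the output of the previous one. The sketch below assumes that chaining simply means listing the templates one after the other under `<processors>`. That layout is an assumption, not something this section spells out.

```xml
<processors>
  <!-- Template 1: overwrite the attribute of root -->
  <element in="root">
    <attribute in="anAttributeOfRoot">'newValue'</attribute>
  </element>
  <!-- Template 2: runs on the output of template 1, so the XPath below
       is expected to read 'newValue' rather than the original value -->
  <element in="root">
    <element in="node">
      <attribute in="anAttributeOfNode">ancestor::root/@anAttributeOfRoot</attribute>
    </element>
  </element>
</processors>
```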
The order of the rules may affect which rule is selected for a given node. When several rules match the current node, the first one is chosen. If the rules are mutually exclusive (i.e. at most one rule can match any given node), then the order does not matter.
    <element in="root">
      <element in="node" out="hasChild" if="*"/>
      <element in="node" out="isLeaf" if="not(*)"/>
    </element>

    <element in="root">
      <element in="node" out="isLeaf" if="not(*)"/>
      <element in="node" out="hasChild" if="*"/>
    </element>
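By contrast, here is a sketch of two rules that are not mutually exclusive, so their order does matter. The output name `other` and the sample input are illustrative, and the expected output assumes that unmatched children are copied unchanged.

```xml
<!--
  Hypothetical input:
    <root><node><child/></node><node/></root>

  With the rules in the order below, the first <node> matches both rules and
  takes the first one (hasChild); the second <node> only matches the catch-all:
    <root><hasChild><child/></hasChild><other/></root>

  If the two rules were swapped, the catch-all would come first and win for
  every <node>, so "hasChild" would never be produced.
-->
<element in="root">
  <element in="node" out="hasChild" if="*"/>
  <element in="node" out="other"/>
</element>
```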
The order of the rules does not impact the order in which they are executed. Indeed, the rules are executed in the order of the matching nodes in the input XML stream.
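A minimal sketch of this point, using hypothetical element names: the rule for `second` is written first, yet the output is expected to follow the order of the input stream.

```xml
<!--
  Hypothetical input:
    <root><first/><second/></root>

  Rules fire as their matching nodes arrive in the stream, so the expected
  output keeps the input order, whatever the order of the rules:
    <root><a/><b/></root>
-->
<element in="root">
  <element in="second" out="b"/>
  <element in="first" out="a"/>
</element>
```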
An additional keyword allows for setting a variable to be used within the template: <set>
Supported text/attribute nodes are:
As we would expect, the scope of this variable is the subtree under the node for which the variable has been set.
These attributes contain an XPath expression:
All the rules except <element> and <element-ignore> take an XPath expression as their text node.
Note that to set a literal value in a field that expects an XPath expression, the value must be enclosed in single (') or double (") quotes:
    <element in="root">
      <attribute in="myAttribute">'this is the new value'</attribute>
    </element>
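The difference between a quoted literal and an unquoted expression can be sketched as follows. The attribute names `label`, `title` and `copy` and the child element `status` are hypothetical.

```xml
<element in="root">
  <!-- Quoted: the XPath string literal 'status' -->
  <attribute in="label">'status'</attribute>
  <!-- Double quotes delimit a string literal in the same way -->
  <attribute in="title">"status"</attribute>
  <!-- Unquoted: an XPath path, here the <status> child of root (the parent of the attribute) -->
  <attribute in="copy">../status</attribute>
</element>
```

Forgetting the quotes does not raise an obvious error: the value is simply evaluated as a path, which typically selects nothing and yields an empty result.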
The following attributes do not contain an XPath expression:
    <processors xmlns:db="http://docbook.org/ns/docbook">
      <element in="db:article"/>
    </processors>

    <processors xmlns:db="http://docbook.org/ns/docbook"
                xmlns:x="http://www.w3.org/1999/xhtml">
      <element in="db:article" out="x:html"/>
    </processors>

    <element in="root">
      <element-create as="first-child">new_element('first-child')</element-create>
      <element-create as="preceding-sibling">new_element('preceding-sibling')</element-create>
      <element in="node"/>
      <element-create as="following-sibling">new_element('following-sibling')</element-create>
      <element-create-as-parent out="parent">
        <element in="ex-last-node"/>
      </element-create-as-parent>
      <element-create as="last-child">new_element('last-child')</element-create>
    </element>

    <root>
      <first-child/>
      <ex-first-node/>
      <preceding-sibling/>
      <node/>
      <following-sibling/>
      <parent>
        <ex-last-node/>
      </parent>
      <last-child/>
    </root>

    <element in="person" if="@birthday='true'" recursive="true">
      <attribute in="age">. + 1</attribute>
    </element>

    <element in="person">
      <element in="person" out="descendant" flat="true" recursive="true"/>
    </element>

    <element in="root">
      <element in="node" future="true">
        <attribute in="anAttributeOfNode">eval($generatedXPath)</attribute>
      </element>
    </element>

    <element in="root">
      <element in="node">
        <element in="leaf"/>
      </element>
    </element>

    <element-ignore in="root">
      <element-ignore in="node">
        <element-ignore in="leaf"/>
      </element-ignore>
    </element-ignore>