XInclude and XML Compare
Introduction
XInclude is a W3C standard for including external XML documents (or parts of documents) in the content of another XML document, using include statements to mark where the external content should be added. Because the include statements are processed during parsing, by the time the XML reaches the XML Compare comparator input stage the document should be indistinguishable from one written as a single source. XML Compare ships with Apache Xerces as part of the distribution, and XInclude processing is switched off by default for document parsing. This sample shows how to enable XInclude processing for your documents and explains how to get it working with DXP pipelines using both the command line processor and the PipelinedComparator API.
Enabling XInclude on a DXP Pipeline
When using the Apache Xerces parser included in the XML Compare distribution, enabling XInclude in a DXP pipeline is simply a case of setting a parser feature. There are also two more parser features that can be used to configure the XInclude processing stage.
If you are not using Xerces as the parser during a comparison, please refer to your parser's documentation for information on enabling XInclude.
The parser feature used to enable XInclude is: http://apache.org/xml/features/xinclude
This should be set to true in the DXP file as follows:
<parserFeatures>
<feature name="http://apache.org/xml/features/xinclude" literalValue="true"/>
</parserFeatures>
If you wish to use a parameter to specify whether or not to enable XInclude, you must define a boolean parameter and refer to it. See Guide to DXP for more details.
<pipelineParameters>
<booleanParameter name="enable-xinclude" defaultValue="true"/>
...
</pipelineParameters>
...
<parserFeatures>
<feature name="http://apache.org/xml/features/xinclude" parameterRef="enable-xinclude"/>
</parserFeatures>
Configuration features
Apache Xerces also uses the following features for configuring XInclude (both are set to true by default):
http://apache.org/xml/features/xinclude/fixup-base-uris - whether or not to add xml:base attributes to the included XML (see the XInclude spec for more details)
http://apache.org/xml/features/xinclude/fixup-language - whether or not to add xml:lang attributes to the included XML (see the XInclude spec for more details)
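The effect of the base URI fixup can be seen outside XML Compare with a small JAXP program. The following is only a sketch: it uses the JDK's DocumentBuilderFactory rather than a DXP pipeline, and it assumes the underlying parser is Xerces-derived (as the JDK's built-in parser is) so that it recognises the feature URI. The class name, method name and file names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XIncludeFixupDemo {

    static final String FIXUP_BASE_URIS =
            "http://apache.org/xml/features/xinclude/fixup-base-uris";

    /** Returns true if the included root element carries an xml:base attribute. */
    static boolean includedElementHasXmlBase(boolean fixupBaseUris) {
        try {
            // Hypothetical inputs written to a temporary directory.
            Path dir = Files.createTempDirectory("xinclude-fixup");
            Path included = dir.resolve("chapter.xml");
            Path main = dir.resolve("main.xml");
            Files.write(included,
                    "<chapter>text</chapter>".getBytes(StandardCharsets.UTF_8));
            Files.write(main,
                    ("<doc xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                    + "<xi:include href=\"chapter.xml\"/></doc>")
                    .getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);  // XInclude needs namespace awareness
            factory.setXIncludeAware(true);   // standard JAXP switch for XInclude
            // Assumption: the parser behind JAXP recognises this Xerces feature.
            factory.setFeature(FIXUP_BASE_URIS, fixupBaseUris);

            // Parsing from a File sets the systemId, so the relative href resolves.
            Document doc = factory.newDocumentBuilder().parse(main.toFile());
            Element chapter = (Element) doc.getElementsByTagName("chapter").item(0);
            boolean hasBase = chapter.hasAttribute("xml:base");

            Files.delete(included);
            Files.delete(main);
            Files.delete(dir);
            return hasBase;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fixup on:  xml:base present = "
                + includedElementHasXmlBase(true));
        System.out.println("fixup off: xml:base present = "
                + includedElementHasXmlBase(false));
    }
}
```

With the fixup left on, the added xml:base attribute becomes part of the parsed document, so it may appear as a difference when one input uses includes and the other is a single source. Switching the feature off is one way to avoid that, at the cost of losing the base URI information.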
Running the pipeline using the command line tool
Once XInclude has been enabled in the DXP file, the pipeline can be run using the command line tool with no other configuration. Because inputs are opened from files on disk, relative hrefs in the XInclude statements are resolved relative to the input file. To run the pipeline defined in the sample directory (using XInclude), run the following command from the sample directory, replacing x.y.z with the major.minor.patch version number of your release (for example, command-10.0.0.jar):
java -jar ../../command-x.y.z.jar compare xinclude-demo inputs/input1-full.xml inputs/input2-includes.xml result.xml enable-xinclude=true
Running the pipeline using the PipelinedComparator API
A DXP file can be used to generate a pre-configured PipelinedComparator instance using the DXPConfiguration class. Once the PipelinedComparator has been created, the compare method can be run using a variety of different input Object types. When using XInclude, it is important to ensure that the input type used is one that either automatically sets the systemId for the input or allows a systemId to be specified. The systemId is essential for resolving relative hrefs specified in the XInclude statements.
The sample code on Bitbucket (XincludeWithPipelinedComparator.java) shows how using a StreamSource input without a systemId causes the XInclude to fail and output its fallback, because the relative href is resolved against the current working directory rather than the location of the input. Setting a systemId on each StreamSource allows the relative href to be resolved correctly, so the desired inclusion takes place.
This is a slightly contrived example as it would make more sense to use File inputs directly in this instance (which automatically set the systemId) but it was written this way to illustrate the potential problems.
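The role of the systemId can also be illustrated without the XML Compare API. The sketch below is not the Bitbucket sample and does not use PipelinedComparator. It parses the same in-memory document twice with the JDK's JAXP parser, using an InputSource (whose setSystemId plays the same role as the one on StreamSource), once without and once with a systemId. The class, method and file names are hypothetical, and it assumes a Xerces-derived parser that resolves a relative href against the working directory when no systemId is available.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XIncludeSystemIdDemo {

    /** Parses in-memory XML with XInclude enabled; systemId may be null. */
    static String parseText(byte[] xml, String systemId) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // XInclude needs namespace awareness
        factory.setXIncludeAware(true);    // standard JAXP switch for XInclude
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        if (systemId != null) {
            source.setSystemId(systemId);  // base URI for relative hrefs
        }
        Document doc = factory.newDocumentBuilder().parse(source);
        return doc.getDocumentElement().getTextContent().trim();
    }

    /** Returns { text parsed without a systemId, text parsed with one }. */
    static String[] demo() {
        try {
            Path dir = Files.createTempDirectory("xinclude-demo");
            // Random file name, so it should not exist in the working directory.
            Path included = Files.createTempFile(dir, "chapter", ".xml");
            Files.write(included,
                    "<chapter>INCLUDED</chapter>".getBytes(StandardCharsets.UTF_8));
            byte[] main = ("<doc xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                    + "<xi:include href=\"" + included.getFileName() + "\">"
                    + "<xi:fallback>FALLBACK</xi:fallback></xi:include></doc>")
                    .getBytes(StandardCharsets.UTF_8);

            String without = parseText(main, null);
            // main.xml need not exist: the systemId only supplies the base URI.
            String with = parseText(main,
                    dir.resolve("main.xml").toUri().toString());

            Files.delete(included);
            Files.delete(dir);
            return new String[] { without, with };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("without systemId: " + result[0]);
        System.out.println("with systemId:    " + result[1]);
    }
}
```

Under these assumptions the first parse should fall back, because the href is looked up in the working directory, while the second should pick up the included content. The parser may also print a warning about the missing resource during the first parse.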
Running the sample code
The sample resources and a description of how to run them can be found at: https://bitbucket.org/deltaxml/using-xinclude.
The resources should be checked out, cloned, or downloaded and unzipped into the samples directory of the XML Compare release. They should be located two levels below the top-level release directory, for example DeltaXML-XML-Compare-10_0_0_j/samples/using-xinclude.
Reference Material
http://www.w3.org/TR/xinclude/ - the W3C definition for the XInclude standard
http://xerces.apache.org/xerces2-j/features.html - Parser features available on Apache Xerces