Xerces Support
Introduction
For many years, the XML parser Apache Xerces has been distributed with all of our products. While not required, it has been our recommended XML parser. From the XML Compare 8.2 release, some new features require Xerces as the parser. As of XML Compare version 14.0, the minimum required version is 2.11.0 (previously 2.9.0). This is the version that is distributed with our product releases. This document outlines which functions require Xerces and explains how to use a different parser if you wish to do so.
Dependencies
Since XML Compare 14.2, the distributed version of Xerces (2.12.2) has required an additional dependency: xml-apis.jar
We ship the corresponding version of xml-apis.jar with XML Compare, and if you want to use a different Xerces version you should always use the correct xml-apis.jar from that Xerces-J release (if it is necessary).
In previous releases, where Xerces-J version 2.9.0 was provided, this file was not required, and it was preferable to make use of the JAXP interfaces provided by the Java JDK/JRE. However, changes to the way in which Xerces is shipped mean this jar file is now required at runtime (but not at compile time, where the JDK is sufficient).
New Features
DCP - Document Comparator Pipeline
DCP is the DocumentComparator's equivalent of the PipelinedComparator's DXP, although it is defined as an XML Schema rather than as a DTD. Processing of the schema-defined XML files makes use of some validation features that are only available in Xerces 2.9.0 and above.
Whitespace Handling
XML Compare versions 8.2 and above include improvements to the handling of whitespace. A requirement for this improved functionality was to add the detection of 'ignorable whitespace' to the existing com.deltaxml.pipe.filters.LexicalPreservation code. This detection requires access to methods available only in Xerces 2.9.0 and above.
Configuration changes
Configuration Properties
The com.deltaxml.config.JavaPlatform.useSAXParserFactory property is now set to 'false' by default. This means that when a SAXParser is created inside XML Compare, it is done using an explicit class name that loads Xerces.
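The difference between the two lookup styles can be illustrated with plain JAXP. The following is a minimal sketch, not XML Compare's actual code: the class and method names are hypothetical, and it assumes that the explicit class name refers to the factory shipped in xercesImpl.jar (org.apache.xerces.jaxp.SAXParserFactoryImpl). How XML Compare names the class internally is not stated in this document.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical illustration class; not part of XML Compare.
class ParserLookupSketch {
    // Assumption: the SAXParserFactory implementation shipped in xercesImpl.jar.
    static final String XERCES_FACTORY = "org.apache.xerces.jaxp.SAXParserFactoryImpl";

    // Standard JAXP lookup: the implementation is chosen from a system property,
    // a service-provider entry on the classpath, or the JDK default.
    static String viaFactoryLookup() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Explicit lookup: request one named implementation and nothing else.
    // Returns null when that class cannot be loaded or instantiated.
    static String viaExplicitClassName(String factoryClassName) {
        try {
            return SAXParserFactory.newInstance(factoryClassName, null).getClass().getName();
        } catch (FactoryConfigurationError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("JAXP lookup:    " + viaFactoryLookup());
        String explicit = viaExplicitClassName(XERCES_FACTORY);
        System.out.println("Explicit class: "
                + (explicit != null ? explicit : "Xerces not found on the classpath"));
    }
}
```

With the explicit style, the result does not depend on what else is on the classpath, which is presumably why it is the default here.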
Classpath
While the jar files for the command-line tool and the GUI have always contained a classpath containing Xerces, deltaxml.jar did not. This has now been updated to include xercesImpl.jar on paths including deltaxml.jar.
What can I do without Xerces?
While use of the com.deltaxml.cores9api.PipelinedComparatorS9 and com.deltaxml.cores9api.DocumentComparator classes now requires Xerces, it is still possible to use some of our legacy classes without requiring Xerces.
The com.deltaxml.core.PipelinedComparator can still be used without Xerces as the parser as long as LexicalPreservation functionality is not used. If it is, a ClassNotFoundException will be thrown with details about the required Xerces classes. To configure the use of a different parser, please see the 'Using a different parser' section below.
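To find out ahead of time whether Xerces is visible to your application, you can probe the classpath before invoking a comparator. This is a minimal sketch under stated assumptions: the class XercesAvailability and its method are hypothetical, and org.apache.xerces.parsers.SAXParser is used as a representative Xerces-J class. The exact classes that XML Compare requires are not listed in this document.

```java
// Hypothetical helper; not part of XML Compare.
class XercesAvailability {
    // Assumption: a representative class from xercesImpl.jar.
    static final String XERCES_SAX_PARSER = "org.apache.xerces.parsers.SAXParser";

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized, so the check has no side effects.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, XercesAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Xerces on classpath: " + isClassAvailable(XERCES_SAX_PARSER));
    }
}
```

A false result suggests that features depending on Xerces, such as LexicalPreservation, are likely to fail with the ClassNotFoundException described above.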
Using a different parser
Three changes must be made in order to use a different parser:
Update the configuration properties to use the SAXParserFactory
Remove or rename the existing xercesImpl.jar
Optionally add a different parser to the classpath
1. The com.deltaxml.config.JavaPlatform.useSAXParserFactory property must be set to true. See Configuration Properties for more information on how to do this.
2. The existing xercesImpl.jar will still be loaded from the deltaxml.jar classpath. In order to stop this from occurring, either move or rename the xercesImpl.jar in the XML Compare release directory.
3. This stage is optional. Completing the steps above will cause the JVM's internal parser to be used (this is not recommended). To replace it with a different parser, make the relevant jar files available on the classpath. If the jar files do not advertise themselves as implementing a SAXParserFactory, you will need to set up the JVM property appropriately.
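After completing these steps, it can help to confirm which parser the JVM actually selects. The sketch below assumes that the JVM property referred to in step 3 is the standard JAXP system property javax.xml.parsers.SAXParserFactory. The class SaxFactoryProperty is hypothetical and only reports what the standard JAXP lookup returns; it does not inspect XML Compare itself.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class; not part of XML Compare.
class SaxFactoryProperty {
    // Standard JAXP system property naming the SAXParserFactory implementation.
    static final String JAXP_PROPERTY = "javax.xml.parsers.SAXParserFactory";

    // Name of the factory class that the JAXP lookup currently resolves to.
    static String activeFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Optionally take the factory class name from the command line;
        // passing -Djavax.xml.parsers.SAXParserFactory=... to the JVM has the same effect.
        if (args.length > 0) {
            System.setProperty(JAXP_PROPERTY, args[0]);
        }
        System.out.println(JAXP_PROPERTY + " = " + System.getProperty(JAXP_PROPERTY));
        System.out.println("Factory in use: " + activeFactoryClass());
    }
}
```

If the property is unset and no parser jar advertises a factory, the reported class should be the JDK's internal implementation, which matches the "not recommended" case described in step 3.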