Xerces Support

1. Introduction

For many years, Apache Xerces has been distributed with all of our products as the recommended, though not required, XML parser. From the XML Compare 8.2 release, some new features require Xerces as the parser. As of XML Compare 9.0, the minimum required version is 2.11.0 (previously 2.9.0), which is the version distributed with our product releases. This document outlines which functions require Xerces and explains how to use a different parser if you wish to do so.

2. Dependencies

Since XML Compare 9.0, the distributed version of Xerces (2.11.0) has required an additional dependency: xml-apis.jar.

We ship the corresponding version of xml-apis.jar with XML Compare. If you want to use a different Xerces version, always use the xml-apis.jar from that same Xerces-J release, if that release needs one.

In previous releases, where Xerces-J version 2.9.0 was provided, this file was not required, and it was preferable to use the JAXP interfaces provided by the Java JDK/JRE. However, changes to the way in which Xerces is shipped mean that this jar file is now required at runtime (but not at compile time, where the JDK is sufficient).
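
To confirm which Xerces-J version is actually visible at runtime, a small check along the following lines may help. It is a minimal sketch, not part of XML Compare: the class name XercesVersionCheck and its helper methods are hypothetical, and it assumes that Xerces-J exposes a static org.apache.xerces.impl.Version.getVersion() method returning a string such as "Xerces-J 2.11.0". Reflection is used so the sketch still runs when xercesImpl.jar is absent.

```java
import java.lang.reflect.Method;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reports the Xerces-J version found on the classpath, if any. */
class XercesVersionCheck {

    /** Returns the Xerces version string, or empty when xercesImpl.jar is not loadable. */
    static Optional<String> xercesVersion() {
        try {
            // Assumption: this class and its static getVersion() ship in xercesImpl.jar.
            Class<?> version = Class.forName("org.apache.xerces.impl.Version");
            Method getVersion = version.getMethod("getVersion");
            Object value = getVersion.invoke(null);
            return Optional.ofNullable(value).map(Object::toString);
        } catch (ReflectiveOperationException | LinkageError e) {
            return Optional.empty();
        }
    }

    /** True when the first "major.minor" pair in the string is at least the given version. */
    static boolean isAtLeast(String versionString, int major, int minor) {
        Matcher m = Pattern.compile("(\\d+)\\.(\\d+)").matcher(versionString);
        if (!m.find()) {
            return false;
        }
        int foundMajor = Integer.parseInt(m.group(1));
        int foundMinor = Integer.parseInt(m.group(2));
        return foundMajor > major || (foundMajor == major && foundMinor >= minor);
    }

    public static void main(String[] args) {
        Optional<String> version = xercesVersion();
        if (version.isPresent()) {
            System.out.println(version.get() + " meets 2.11 minimum: "
                    + isAtLeast(version.get(), 2, 11));
        } else {
            System.out.println("Xerces is not on the classpath");
        }
    }
}
```

Run it with the same classpath as your application so that it reports the jar files your application would load.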

3. New Features

3.1. DCP - Document Comparator Pipeline

DCP is the DocumentComparator's equivalent of the PipelinedComparator's DXP, although it is defined as an XML Schema rather than as a DTD. Processing of the schema-defined XML files makes use of some validation features that are only available in Xerces 2.9.0 and above.

3.2. Whitespace Handling

XML Compare versions 8.2 and above include improvements to the handling of whitespace. A requirement for this improved functionality was to add the detection of 'ignorable whitespace' to the existing com.deltaxml.pipe.filters.LexicalPreservation code. This detection requires access to methods available only in Xerces 2.9.0 and above.
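
For readers unfamiliar with the term, the sketch below shows how a SAX parser can distinguish ignorable whitespace from ordinary character data when a DTD declares element-only content. It uses only the standard SAX callback and whichever parser JAXP selects; it is not the Xerces-specific detection used by LexicalPreservation, and the class name IgnorableWhitespaceDemo is hypothetical. How the whitespace is split between the two callbacks depends on the parser in use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: counts text delivered via the two SAX text callbacks. */
class IgnorableWhitespaceDemo {

    // Element <a> is declared with element-only content, so the whitespace
    // around <b> is a candidate for being reported as ignorable.
    static final String DOC =
            "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b (#PCDATA)>]>\n"
            + "<a>\n  <b>x</b>\n</a>";

    /** Returns {ignorable character count, ordinary character count}. */
    static int[] countText(String xml) throws Exception {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1] += length;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countText(DOC);
        System.out.println("ignorable whitespace characters: " + counts[0]);
        System.out.println("ordinary characters: " + counts[1]);
    }
}
```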

4. Configuration changes

4.1. Configuration Properties

The com.deltaxml.config.JavaPlatform.useSAXParserFactory property is now set to 'false' by default. This means that when a SAXParser is created inside XML Compare, it is done using an explicit class name that loads Xerces.
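
The difference between the two settings can be illustrated with the standard JAXP API. This is a sketch only: the class ParserSelection and its createFactory method are hypothetical, and XML Compare's internal code may instantiate Xerces differently. It assumes the Xerces-J factory class is org.apache.xerces.jaxp.SAXParserFactoryImpl.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

/** Hypothetical illustration of factory lookup versus an explicit class name. */
class ParserSelection {

    // Assumption: the JAXP factory class provided by xercesImpl.jar.
    static final String XERCES_FACTORY = "org.apache.xerces.jaxp.SAXParserFactoryImpl";

    /**
     * With the flag set, defer to the JAXP lookup mechanism; otherwise name
     * the Xerces factory explicitly, which fails if Xerces is not available.
     */
    static SAXParserFactory createFactory(boolean useSAXParserFactory) {
        if (useSAXParserFactory) {
            return SAXParserFactory.newInstance();
        }
        return SAXParserFactory.newInstance(
                XERCES_FACTORY, ParserSelection.class.getClassLoader());
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {false, true}) {
            try {
                System.out.println("useSAXParserFactory=" + flag + " -> "
                        + createFactory(flag).getClass().getName());
            } catch (FactoryConfigurationError e) {
                System.out.println("useSAXParserFactory=" + flag
                        + " -> Xerces could not be loaded: " + e.getMessage());
            }
        }
    }
}
```

Naming the class explicitly makes the choice of parser independent of whatever else is on the classpath, at the cost of failing outright when Xerces is missing.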

In the .NET API, the explicit instantiation of a Xerces class is the only option available.

4.2. Classpath

While the jar files for the command-line tool and the GUI have always declared a classpath that includes Xerces, deltaxml.jar did not. Its classpath has now been updated to include xercesImpl.jar.

5. What can I do without Xerces?

While use of the com.deltaxml.cores9api.PipelinedComparatorS9 and com.deltaxml.cores9api.DocumentComparator classes now requires Xerces, it is still possible to use some of our legacy classes without requiring Xerces.

The com.deltaxml.core.PipelinedComparator can still be used without Xerces as the parser, as long as LexicalPreservation functionality is not used. If it is used, a ClassNotFoundException will be thrown with details about the required Xerces classes. To configure the use of a different parser, please see the 'Using a different parser' section below.

6. Using a different parser

Using a different parser involves three steps, the last of which is optional:

  1. Update the configuration properties to use the SAXParserFactory
  2. Remove or rename the existing xercesImpl.jar
  3. Optionally add a different parser to the classpath


1. The com.deltaxml.config.JavaPlatform.useSAXParserFactory property must be set to true. See Configuration Properties for more information on how to do this.

2. The existing xercesImpl.jar will still be loaded from the deltaxml.jar classpath. In order to stop this from occurring, either move or rename the xercesImpl.jar in the XML Compare release directory.

3. This stage is optional. Completing the steps above will cause the JVM's internal parser to be used (this is not recommended). To replace it with a different parser, make the relevant jar files available on the classpath. If the jar files do not advertise themselves as implementing a SAXParserFactory, you will need to set the standard JAXP system property, javax.xml.parsers.SAXParserFactory, to the name of the parser's factory class.
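
After completing these steps, it can be useful to verify which parser JAXP actually selects. The following minimal sketch prints the javax.xml.parsers.SAXParserFactory system property and the factory class in use; the class name JaxpFactoryProperty is hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;

/** Hypothetical check: shows which SAXParserFactory the JAXP lookup resolves to. */
class JaxpFactoryProperty {

    // Standard JAXP system property naming the factory implementation class.
    static final String PROPERTY = "javax.xml.parsers.SAXParserFactory";

    /** Class name of the factory chosen by the JAXP lookup mechanism. */
    static String activeFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        String requested = System.getProperty(PROPERTY);
        System.out.println(PROPERTY + " = "
                + (requested == null ? "(not set)" : requested));
        System.out.println("factory in use: " + activeFactoryClass());
    }
}
```

To select a parser explicitly, pass the property on the command line, for example -Djavax.xml.parsers.SAXParserFactory=com.example.MySAXParserFactory, where com.example.MySAXParserFactory is a placeholder for the factory class of the parser you have added to the classpath.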
