
Ignoring Changes

Introduction

This document describes the concepts behind ignoring changes. For the resources associated with this sample, see the Bitbucket repo, here.

The XML files that you are comparing may contain data that you expect to change. You may wish to ignore these changes. From release 5.1 of XML Compare, XSLT filters are provided to allow you to ignore selected changes. This makes it easy to generate some forms of output from the delta file. 

What does "ignore" really mean?

First, we need to ask the question: What is meant by "ignore"?

Consider this very simple example of attribute change:

Input 1:

XML
<x y='1'/> 

Input 2:

XML
<x y='2'/>

Ignore could mean:

  1. remove it completely from the result: <x/>

  2. prefer the 'A' or 'old' value: <x y='1'/>

  3. prefer the 'B' or 'new' value: <x y='2'/>

  4. take the average of any values with numerical/time data types: <x y='1.5'/>

  5. put in a difference marker: <x y='changed'/>

  6. find some way to represent them both: <x y='1|2'/> 

All of these approaches are possible using an output filter. However, this document concentrates on a generic approach and describes the filters, included in XML Compare since release 5.1, that implement the first three strategies above.

Example data

This document discusses how you might ignore changes using two sets of input data: one data-centric and one document-centric. Two practical solutions are presented, one for each input data set, each using a different comparator and a different method of customising the comparison:

  • Pipelined Comparator (DXP) - Uses a filter pipeline defined by an XML file called a 'DXP' to customise the comparison.

  • Document Comparator - Uses Java API calls to customise a pre-existing pipeline with a number of extension points. The Document Comparator provides a solution tailored to comparing structured documents.

Pipelined Comparator

Imagine comparing the following two inputs, with the intention of ignoring the change made to the lastUpdated attribute:

Example 1.1: a small address book as an XML file (documentA.xml in the sample on Bitbucket)

XML
<addressBook>
  <person lastUpdated="01012008">
    <log/>
    <name>Joe Blogs</name>
    <telephone>01234 567890</telephone>
    <email>joe@blogs.com</email>
  </person>
</addressBook>

Example 2.1: an updated version of the address book (documentB.xml in the sample on Bitbucket)

XML
<addressBook>
  <person lastUpdated="01022008">
    <log>
      <lastLoggedIn>01032008</lastLoggedIn>
    </log>
    <name>Joe Blogs</name>
    <telephone>01235 467890</telephone>
    <email>joe@blogs.co.uk</email>
  </person>
</addressBook>

XML Compare will produce the following delta:

XML
<addressBook xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             xmlns:dxx="http://www.deltaxml.com/ns/xml-namespaced-attribute"
             xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
             deltaxml:deltaV2="A!=B"
             deltaxml:version="2.0"
             deltaxml:content-type="full-context">
   <person deltaxml:deltaV2="A!=B">
      <deltaxml:attributes deltaxml:deltaV2="A!=B">
         <dxa:lastUpdated deltaxml:deltaV2="A!=B">
            <deltaxml:attributeValue deltaxml:deltaV2="A">01012008</deltaxml:attributeValue>
            <deltaxml:attributeValue deltaxml:deltaV2="B">01022008</deltaxml:attributeValue>
         </dxa:lastUpdated>
      </deltaxml:attributes>
      <log deltaxml:deltaV2="A!=B">
         <lastLoggedIn deltaxml:deltaV2="B">01032008</lastLoggedIn>
      </log>
      <name deltaxml:deltaV2="A=B">Joe Blogs</name>
      <telephone deltaxml:deltaV2="A!=B">
         <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">01234 567890</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">01235 467890</deltaxml:text>
         </deltaxml:textGroup>
      </telephone>
      <email deltaxml:deltaV2="A!=B">
         <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">joe@blogs.com</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">joe@blogs.co.uk</deltaxml:text>
         </deltaxml:textGroup>
      </email>
   </person>
</addressBook>

This shows the changes represented in our deltaV2 format. While this may look overly complicated for such a simple change, it makes the job of processing it considerably easier. A side-effect of representing attribute changes as elements is the addition of the dxa namespace. The namespace of a non-qualified attribute is not that of the document but an anonymous one, so this anonymous namespace needs to be represented in the delta.

Document Comparator

Imagine comparing the following two inputs, with the intention of ignoring the change made to the revision attribute of the author, and also the date elements:

Example 1.2: the author information from a DocBook file (document/documentA.xml in the sample on Bitbucket)

XML
<article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0">
  <info>
    <title>Ignore Changes Sample</title>
      <author revision="1.0">
        <personname>Joe Bloggs</personname>
        <address>
          <phone>+44 200 1234 567</phone> 
          <email>joe@blogs.com</email>
        </address>
        <personblurb><info></info><para></para></personblurb>
      </author>
  </info>
  <sect1>
    <title>Ignore Changes</title>
    <para><date>20141229</date>The input document for the ignore changes sample.</para>
  </sect1>
</article>

Example 2.2: an updated version of the author information with changed telephone numbers and updated dates (document/documentB.xml in the sample on Bitbucket)

XML
<article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0">
  <info>
    <title>Ignore Changes Sample</title>
      <author revision="1.1">
        <personname>Joe Bloggs</personname>
        <address>
          <phone>+44 200 1235 890</phone> 
          <email>joe@blogs.co.uk</email>
        </address>
        <personblurb><info><date>01032008</date></info><para></para></personblurb>
      </author>
  </info>
  <sect1>
    <title>Ignore Changes</title>
    <para><date>20150105</date>The input document for the ignore changes sample.</para>
  </sect1>
</article>

XML Compare will produce the following delta:

XML
<article xmlns="http://docbook.org/ns/docbook"
  xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1" deltaxml:deltaV2="A!=B"
  deltaxml:word-by-word="false" version="5.0" deltaxml:version="2.1"
  deltaxml:content-type="full-context">
  <preserve:xmldecl xmlns:preserve="http://www.deltaxml.com/ns/preserve" deltaxml:ignore-changes="B"
    deltaxml:deltaV2="A=B" xml-version="1.0" encoding="UTF-8"/>
  <info deltaxml:deltaV2="A!=B">
    <title deltaxml:deltaV2="A=B">Ignore Changes Sample</title>
    <author deltaxml:deltaV2="A!=B">
      <deltaxml:attributes deltaxml:deltaV2="A!=B">
        <dxa:revision xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
          deltaxml:deltaV2="A!=B">
          <deltaxml:attributeValue deltaxml:deltaV2="A">1.0</deltaxml:attributeValue>
          <deltaxml:attributeValue deltaxml:deltaV2="B">1.1</deltaxml:attributeValue>
        </dxa:revision>
      </deltaxml:attributes>
      <personname deltaxml:deltaV2="A=B">Joe Bloggs</personname>
      <address deltaxml:deltaV2="A!=B">
      <phone deltaxml:deltaV2="A!=B">
        <deltaxml:textGroup deltaxml:deltaV2="A!=B">
          <deltaxml:text deltaxml:deltaV2="A">+44 200 1234 567</deltaxml:text>
          <deltaxml:text deltaxml:deltaV2="B">+44 200 1235 890</deltaxml:text>
        </deltaxml:textGroup>
        </phone>
        <email deltaxml:deltaV2="A!=B">
          <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">joe@blogs.com</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">joe@blogs.co.uk</deltaxml:text>
          </deltaxml:textGroup>
          </email></address>
      <personblurb deltaxml:deltaV2="A!=B">
        <info deltaxml:deltaV2="A!=B">
          <date deltaxml:deltaV2="B">01032008</date>
        </info>
        <para deltaxml:deltaV2="A=B"/>
      </personblurb>
    </author>
  </info>
  <sect1 deltaxml:deltaV2="A!=B">
    <title deltaxml:deltaV2="A=B">Ignore Changes</title>
    <para deltaxml:deltaV2="A!=B">
      <date deltaxml:deltaV2="A!=B">
        <deltaxml:textGroup deltaxml:deltaV2="A!=B">
          <deltaxml:text deltaxml:deltaV2="A">20141229</deltaxml:text>
          <deltaxml:text deltaxml:deltaV2="B">20150105</deltaxml:text>
        </deltaxml:textGroup>
      </date>The input document for the ignore changes sample.</para>
  </sect1>
</article>

These are the changes represented in our deltaV2 format. While this may look overly complicated for such a simple change, it makes the job of processing it a lot easier. A side-effect of representing attribute changes as elements is the addition of the dxa namespace. The namespace of a non-qualified attribute is not that of the document but an anonymous one, so this anonymous namespace needs to be represented in the delta. The implication is that when promoting such an element back to an attribute, we need to make sure that the attribute is placed in the correct namespace.
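To illustrate both the namespace point above and strategy 6 from the list at the start of this document ("find some way to represent them both"), here is a minimal sketch of a custom output filter. It is not one of the filters supplied with XML Compare. It assumes the delta structure shown above (a deltaxml:attributes element containing dxa:* children, each with deltaxml:attributeValue children) and handles only non-namespaced attributes. Any other children of deltaxml:attributes are simply dropped by this sketch, and the remaining deltaxml:deltaV2 markup is left in place.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An element whose attribute changes are recorded in deltaxml:attributes. -->
  <xsl:template match="*[deltaxml:attributes]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Promote each dxa:* element to an attribute. The unprefixed name
           puts the attribute in no namespace. The A and B values, where
           both exist, are joined with '|'. -->
      <xsl:for-each select="deltaxml:attributes/dxa:*">
        <xsl:attribute name="{local-name()}"
                       select="string-join(deltaxml:attributeValue, '|')"/>
      </xsl:for-each>
      <!-- Attributes must be written before any child nodes, so the
           deltaxml:attributes element is consumed above and skipped here. -->
      <xsl:apply-templates select="node() except deltaxml:attributes"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the address book delta from the Pipelined Comparator example, this should leave the person element carrying lastUpdated="01012008|01022008" in place of the deltaxml:attributes child.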

Marking data that needs to be ignored

Next we need to mark the data to be ignored. This is achieved by placing the deltaxml:ignore-changes attribute on one of the following:

  • to ignore an attribute change: on the appropriate child of deltaxml:attributes which is representing the attribute you wish to ignore,

  • to ignore a sub-tree change: on the top most node in the sub-tree with a deltaxml:deltaV2 attribute,

  • to ignore a text change: on the deltaxml:textGroup.

By placing the deltaxml:ignore-changes='B,A' attribute, you instruct the apply-ignore-changes XSLT filter to change the delta of the modification to unchanged and to copy the new (B) version. If there is no new version (i.e. in the case of a deletion), the old (A) version is used. This behaviour can be controlled by using a different value for the deltaxml:ignore-changes attribute; the legal values are shown below:

deltaxml:ignore-changes Value    Description
"B,A" or "true"                  Default. Copy new value if it exists, otherwise copy old value.
"A,B"                            Copy old value if it exists, otherwise copy new value.
"A"                              Copy old value if it exists, otherwise don't output.
"B"                              Copy new value if it exists, otherwise don't output.
""                               Don't copy under any circumstances (but process the subtree if present).

The ignore-changes attribute can be added using an XSLT stylesheet.
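The third case in the list above, a text change, is marked on the deltaxml:textGroup element. The following is a minimal sketch rather than part of the Bitbucket sample. It assumes the address book delta shown earlier and chooses, purely for illustration, to prefer the old telephone number.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Mark the changed text of telephone elements.
       'A,B' means: use the old value if it exists, otherwise the new one. -->
  <xsl:template match="telephone/deltaxml:textGroup">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'A,B'"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

According to the table of values above, the apply-ignore-changes filter should then keep 01234 567890 as the telephone number and treat the element as unchanged.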

Note that if you want to ignore specific changes to comments or processing instructions, you will need to change the lexical preservation settings on the Comparator. See the Preserving Processing Instructions and Comments sample for more information.

Pipelined Comparator

An example for ignoring changes to the lastUpdated attribute and lastLoggedIn element is included below.

Example 3.1: an XSLT stylesheet to mark parts of the address book to be ignored (mark-ignore-changes.xsl in the sample on Bitbucket)

XML
<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">
  
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="deltaxml:attributes/dxa:lastUpdated">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'B,A'"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="log/lastLoggedIn[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'B,A'"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

Document Comparator

An example for ignoring changes to the revision attribute and date elements is included below.

Example 3.2: an XSLT stylesheet to mark parts of the DocBook document to be ignored (document/mark-ignore-changes.xsl in the sample on Bitbucket)

XML
<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
                xmlns:docbook="http://docbook.org/ns/docbook"
  >
  
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="deltaxml:attributes/dxa:revision">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'true'"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="docbook:personblurb/docbook:info[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'true'"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="docbook:para/docbook:date[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="''"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

After the delta has been marked with the changes that should be ignored, using a filter similar to the one above, running apply-ignore-changes.xsl and then propagate-ignore-changes.xsl will process the delta, ignoring the marked data. The filter dx2-extract-version-moded.xsl is imported by apply-ignore-changes.xsl. All of these filters are supplied with versions of XML Compare 5.1 and later.

The examples used in this document are available for your own experimentation in the Ignoring Changes repo on Bitbucket (suitable for versions 5.1 and above). The sample shows how to ignore both element and attribute change and provides two examples - one using the Pipelined Comparator and one using the Document Comparator - of how to construct the pipeline of appropriate output filters described here.

Running the sample code

For the resources associated with this sample, see the Bitbucket repo, here.

Download the sample resources into the samples directory of the XML Compare release. The resources should be located two levels below the top-level release directory that contains the jar files. For example: DeltaXML-XML-Compare-10_0_0_j/samples/IgnoreChanges

Full instructions for running the sample are given in the README.md file, which is displayed under the source in Bitbucket.

Ignore processing in further detail

This section provides some rules and further details about how ignore-changes processing works, and in particular how the apply-ignore-changes.xsl filter works.

Every element in the post-comparison XML tree has an 'effective' deltaxml:deltaV2 attribute which (a) specifies which of the inputs it was present in and (b) whether or not the elements were identical, if present in both inputs. The word effective is used because if you are in an unchanged, added or deleted sub-tree the deltaV2 attribute may only be on an ancestor element.

An element may also have an ancestor with an ignore-changes attribute; the closest such ancestor is used when determining whether the element is included in the result.
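When debugging a marked-up delta, it can help to make these effective values visible. The following is a minimal sketch of such a diagnostic filter. It is not supplied with XML Compare, and the ex namespace and the attribute names ex:effective-delta and ex:effective-ignore are hypothetical. It also assumes, based on the examples in this document, that an attribute on the element itself counts as the closest one.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
                xmlns:ex="http://example.com/ns/ignore-sketch">

  <!-- Copy text, comments and processing instructions unchanged. -->
  <xsl:template match="text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

  <!-- Annotate every element with its effective delta and ignore settings. -->
  <xsl:template match="*">
    <!-- ancestor-or-self is a reverse axis, so [1] selects the nearest match. -->
    <xsl:variable name="delta"
        select="ancestor-or-self::*[@deltaxml:deltaV2][1]/@deltaxml:deltaV2"/>
    <xsl:variable name="ignore"
        select="ancestor-or-self::*[@deltaxml:ignore-changes][1]/@deltaxml:ignore-changes"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:if test="exists($delta)">
        <xsl:attribute name="ex:effective-delta" select="$delta"/>
      </xsl:if>
      <!-- exists() is used because an empty ignore-changes value is legal. -->
      <xsl:if test="exists($ignore)">
        <xsl:attribute name="ex:effective-ignore" select="$ignore"/>
      </xsl:if>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The output is for inspection only and is not intended as input to the apply-ignore-changes.xsl filter.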

Like most filters, some data flows through unaffected. In this case, if an element does not have an ancestor ignore-changes attribute it is copied to the result as-is.

When it does have an ancestor ignore-changes attribute, the following table specifies whether that element appears in the result:

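delta / ignore-changes   ''    A     B     A,B   B,A/true
A                        -     yes   -     yes   yes
B                        -     -     yes   yes   yes
A=B                      -     yes   yes   yes   yes
A!=B                     -     yes   yes   yes   yes

(yes: the element appears in the result; -: it does not)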

The only difference in behaviour between A,B and B,A occurs at the leaves of the XML tree (i.e. for changed text and attributes). When there are two possible text values in a textGroup, or two possible attribute values, the choice between these settings determines which of the two values is used in the result.
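The inclusion rule can be sketched as an XSLT 2.0 function. This is an illustration only, not the code of apply-ignore-changes.xsl. It reads each ignore-changes value from the table of legal values as a list of acceptable versions and includes an element when its effective delta mentions at least one of them. The ex namespace and the function name ex:included are hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="http://example.com/ns/ignore-sketch"
                exclude-result-prefixes="xs ex">

  <xsl:output method="text"/>

  <!-- True if an element with the given effective delta appears in the
       result under the given ignore-changes value. -->
  <xsl:function name="ex:included" as="xs:boolean">
    <xsl:param name="delta" as="xs:string"/>
    <xsl:param name="ignore" as="xs:string"/>
    <!-- 'true' is a synonym for 'B,A'. Tokenizing the empty string gives an
         empty sequence, so '' never includes anything. -->
    <xsl:variable name="versions" as="xs:string*"
        select="tokenize(if ($ignore = 'true') then 'B,A' else $ignore, ',')"/>
    <xsl:sequence select="some $v in $versions satisfies contains($delta, $v)"/>
  </xsl:function>

  <!-- Print one line per combination. Any XML document can be the input. -->
  <xsl:template match="/">
    <xsl:for-each select="('A', 'B', 'A=B', 'A!=B')">
      <xsl:variable name="delta" select="."/>
      <xsl:for-each select="('', 'A', 'B', 'A,B', 'B,A', 'true')">
        <xsl:value-of select="concat($delta, ' with [', ., '] : ',
                                     ex:included($delta, .), '&#10;')"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The function only decides whether an element is present. It does not model the choice between the old and new value at the leaves, which is where A,B and B,A differ.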

Ignore changes and attributes

There are some issues related to the closest ancestor rule outlined above when considering attributes. Attributes need to be attached to their parent element. If the ignore-changes settings specify that an element is not included, neither are any of its attributes, irrespective of their own ignore-changes settings. Here is an example:

XML
<x deltaxml:deltaV2='A!=B' deltaxml:ignore-changes=''>
  <deltaxml:attributes deltaxml:deltaV2='A!=B'>
    <dxa:y deltaxml:deltaV2='A!=B' deltaxml:ignore-changes='B'>
      <deltaxml:attributeValue deltaxml:deltaV2="A">12</deltaxml:attributeValue>
      <deltaxml:attributeValue deltaxml:deltaV2="B">24</deltaxml:attributeValue>
    </dxa:y>
  </deltaxml:attributes>
</x>

Normally we would expect y='24' to appear in the result if we look solely at the attribute and its local ignore-changes and deltaV2 attributes. However, the ignore-changes setting on the element x means that the attribute has lost its associated parent element and therefore cannot appear in the result.

Ignore changes and element removal

It is possible to use ignore changes at the element level as well as for simple attribute and text data. This is used for merging as discussed below and can also be used to remove elements from the result.  Here are two examples, firstly removing a child element:

XML
<x deltaxml:ignore-changes="true" deltaxml:deltaV2="A!=B">
  <y deltaxml:deltaV2="A">
     <z deltaxml:ignore-changes='B'/>
  </y>
</x>

In the above example the ignore-changes setting prevents the z element from appearing in the result. Note that as well as occurring at the bottom of a hierarchy, this can also occur within a hierarchy. Here is another example:

XML
<chapter deltaxml:deltaV2="A!=B">
  <section deltaxml:deltaV2="A" deltaxml:ignore-changes='B'>
    <pagebreak deltaxml:ignore-changes='A'/>
  </section>
</chapter>

The ignore-changes settings preclude the section from appearing in the result, but the same is not true for the pagebreak element, which is effectively promoted in the result of the filter:

XML
<chapter deltaxml:deltaV2="A!=B">
  <pagebreak deltaxml:deltaV2='A=B'/>
</chapter>

How to merge two documents using deltaxml:ignore-changes

This section has been moved to Creating a Merged Document.
