Word by word

Introduction

When no configuration file is specified, XML Data Compare treats each block of text as a single node. A change to a single word therefore marks the whole block as changed, which makes the result a poor guide to what has actually been edited. A much better approach is to compare the text on a word-by-word basis.
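The difference between the two granularities can be sketched in a few lines of Python. This is only an illustration built on the standard library's difflib. It is not how XML Data Compare is implemented, and the function names, the "same" label and the sample sentences are assumptions made up for the example.

```python
from difflib import SequenceMatcher


def whole_text_diff(a, b):
    """Treat each text block as a single node: any change flags the whole block."""
    if a == b:
        return [("same", a)]
    return [("A", a), ("B", b)]


def word_by_word_diff(a, b):
    """Split both texts into words and report only the words that differ."""
    a_words, b_words = a.split(), b.split()
    matcher = SequenceMatcher(None, a_words, b_words, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run of words, common to both inputs.
            result.append(("same", " ".join(a_words[i1:i2])))
        else:
            # Words only in the first input are labelled "A",
            # words only in the second input are labelled "B".
            if i1 < i2:
                result.append(("A", " ".join(a_words[i1:i2])))
            if j1 < j2:
                result.append(("B", " ".join(b_words[j1:j2])))
    return result


# Hypothetical sample text, differing in a single word.
text_a = "Prefers contact by phone in the morning"
text_b = "Prefers contact by email in the morning"

for label, text in whole_text_diff(text_a, text_b):
    print("whole text  ", label, text)
for label, text in word_by_word_diff(text_a, text_b):
    print("word by word", label, text)
```

With whole-text comparison both sentences are reported in full, whereas the word-by-word version isolates the one word that differs and keeps the surrounding text as common content.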

Defining word by word

By default, when a configuration file is specified, the word-by-word option is switched on, so single-word changes are shown in the delta. Any part of a paragraph that is unchanged is shown in the text group, and text that differs is shown against either "A" or "B".

To switch word-by-word off for the whole comparison, so that the whole contents of a changed element are shown against either "A" or "B", set the default to false as shown in config-wbw-false.xml.

  <dcf:defaults>
    <dcf:word-by-word on="false"/>
  </dcf:defaults>

Alternatively, you can switch word-by-word off by default and then switch it back on for particular elements, as in config-wbw-specific.xml. When the sample is run with this configuration file, changed <notes> elements are shown with their whole contents against either "A" or "B", whilst the <extra> elements show the normal default behaviour of single-word changes.

  <!-- For defining the user defaults for a feature. A feature defined inside location will override this. -->
  <dcf:defaults>
    <dcf:word-by-word on="false"/>
  </dcf:defaults>

  <dcf:location name="Switch wbw on in extra elements only" xpath="/addressList/person/extra">
    <dcf:word-by-word on="true"/>
  </dcf:location>

A sample demonstrating word-by-word is available to download from Bitbucket.
