Text Splitting

Introduction

When no configuration file is specified, XML Data Compare treats each block of text as a single node. A change to just a few words is then reported as a change to the whole block, which makes it hard to see what has actually changed. A much better approach is to compare the text using the text-splitting setting.
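
As an illustration, consider the following pair of inputs, which differ by a single word. The element names follow the /addressList/person structure used by the sample; the text content is hypothetical and is not taken from the sample files. Treated as a single text node, the whole of <notes> would be reported as changed, even though only "Mondays" and "Tuesdays" differ.

```xml
<!-- Input A (hypothetical content, for illustration only) -->
<addressList>
  <person>
    <notes>Prefers to be contacted by email on Mondays.</notes>
  </person>
</addressList>
```

```xml
<!-- Input B (hypothetical content): one word differs from input A -->
<addressList>
  <person>
    <notes>Prefers to be contacted by email on Tuesdays.</notes>
  </person>
</addressList>
```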

Defining text splitting

By default, when a configuration file is specified, the text-splitting option is switched on, so single-word changes are shown in the delta. Any part of a paragraph that is the same in both inputs is shown in the text group, and text that differs is shown against either "A" or "B".

If you wish to switch text-splitting off and see the whole contents of changed elements against either "A" or "B", you can set the default behaviour of text-splitting to false for the whole comparison, as shown in config-ts-false.xml.

  <dcf:defaults>
    <dcf:text-splitting enabled="false"/>
  </dcf:defaults>

Alternatively, you can switch text-splitting off by default and then switch it on for a particular element, as in config-ts-specific.xml. When the sample is run with this configuration file, changed <notes> elements are shown with their whole contents against either "A" or "B", whilst the <extra> elements show the normal default behaviour of single-word changes.

  <!-- For defining the user defaults for a feature. A feature defined inside location will override this. -->
  <dcf:defaults>
    <dcf:text-splitting enabled="false"/>
  </dcf:defaults>

  <dcf:location name="Switch wbw on in extra elements only" xpath="/addressList/person/extra">
    <dcf:text-splitting enabled="true"/>
  </dcf:location>
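
To see which rule applies where, the following hypothetical input fragment is annotated against the configuration above. The element names come from the XPath in the configuration and from the surrounding description; the text content is invented for illustration and the comments are only a reading aid.

```xml
<addressList>
  <person>
    <!-- Not matched by any dcf:location, so the dcf:defaults setting applies:
         text-splitting is off and a change is reported for the whole text. -->
    <notes>Met at the conference in Leeds.</notes>
    <!-- Matched by xpath="/addressList/person/extra":
         text-splitting is on and changes are reported word by word. -->
    <extra>Owns two dogs and a cat.</extra>
  </person>
</addressList>
```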

A sample demonstrating text-splitting is available to download from Bitbucket.
