Release Documentation

Reference

1. Introduction

This Reference is part of the documentation for the DeltaXML DITA Compare product and supplements the ReadMe and User Guide documents. The ReadMe provides the essential information for getting started, including how to run and configure the comparator, and licensing terms and conditions. The User Guide provides a high-level introduction to DITA Compare's main features, whilst this Reference covers DITA Compare more thoroughly and in greater detail.

Note: Most of the discussion within this Reference concerns the topic level comparison, thus this is the default context. Those aspects of the discussion that relate to map level comparison are clearly identified.

The DITA Compare product highlights changes between two versions of a DITA document, such as minor changes between editing sessions or changes between customer releases. Here, the 'document' can be a topic or a collection of topics as specified by a map (and its submaps), or a single map file.

2. Types of Comparison

2.1. Topic Comparison

At the topic level a detailed comparison between the two topics results in a new DITA topic. In the default mode of comparison the output format is set to DITA Markup, which highlights changes using DITA's rev or status attributes. There are also some XML editor specific 'track change' output formats, as discussed in Output Formats.

2.2. Mapfile Comparison

A Mapfile Comparison is a 'flat' comparison of two DITA map files. No ditamap references to sub-maps are followed and any topics referenced are not compared. The result therefore provides information about changes to the map files alone.

2.3. Map Topicset Comparison

This comparison begins at the map level. The comparator aligns the topics between the maps, compares the aligned topics, and returns a map, or maps, that contain the compared, added, and deleted topics. Each topicref in the output map has a status attribute set to one of unchanged, changed, new, or deleted, depending on whether the underpinning topic (i.e. the referent) has no reportable modifications, has some reportable modification, has been inserted, or has been deleted respectively. Here, a reportable modification is one that can be marked up in the chosen output format's change markup scheme.

Note: Setting an Output Format for a Map Topicset Comparison only affects markup in the referenced topics; the markup within the map files is always in the form described above.
Note: The content of a DITA map file is not compared. Instead, the map is used to specify the collection of topics that are of interest, and it is these topics that are compared. The results of the topic comparisons are then used to modify existing maps and/or create new maps.
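The status assignment described in this section can be sketched in Python as follows. This is only an illustration of the four cases, not the comparator's implementation; the function name topicref_status and its boolean inputs are assumptions made for the example.

```python
def topicref_status(in_a: bool, in_b: bool, reportable_change: bool = False) -> str:
    """Return the status value for a topicref, given where its referent appears.

    in_a / in_b say whether the topic is present in the first ('A') and
    second ('B') input; reportable_change only matters when it is in both.
    """
    if in_a and in_b:
        return "changed" if reportable_change else "unchanged"
    if in_b:
        return "new"        # inserted: present only in the second input
    if in_a:
        return "deleted"    # present only in the first input
    raise ValueError("topic appears in neither input")


# Hypothetical topic names, used only to exercise the four cases.
statuses = {
    "intro.dita": topicref_status(True, True),
    "usage.dita": topicref_status(True, True, reportable_change=True),
    "notices.dita": topicref_status(True, False),
    "faq.dita": topicref_status(False, True),
}
print(statuses)
```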

3. Output Formats

The DeltaXML DITA Compare product can represent the differences between two inputs using a variety of output formats.

Each output format has its own limitations on what types of change are supported and where changes are allowed to take place. For example, DITA's own change markup language does not support changes in attributes.

The DeltaXML DITA Compare product currently provides the following output formats (products ordered alphabetically):
DITA Markup
This output format marks differences using DITA's own change markup scheme.
Arbortext Tracked Changes
This output format marks differences using the Arbortext Change Tracking Markup Specification.
FrameMaker Tracked Changes
This output format marks differences using the FrameMaker Tracked Changes format, which is supported by the FrameMaker Editor.
oXygen Tracked Changes
This output format marks differences using the oXygen Tracked Changes format, which is supported by the oXygen Editor and Author products.
XMetaL Tracked Changes
This output format marks differences using the XMetaL Tracked Changes format, which is supported by the XMetaL Editor.

The 'default' output format is 'DITA Markup', which identifies change using DITA's own change markup scheme as discussed in DITA Markup.

The remaining output formats represent the differences between two inputs using tracked change formats. Here, the intention is to enable the differences to be viewed, accepted, and rejected in an editor or word-processor that supports the given tracked changes format.

The remainder of this section discusses the output formats in more detail.

3.1. DITA Markup

When using DITA Markup, the output of a comparison is itself a DITA document whose changes are marked up using a DITA-specific scheme.

The DITA Markup output format uses rev and status attributes to identify change. These attributes can be added to many DITA elements. The rev attribute can take any value and the status attribute can take one of the following values: 'changed', 'new', 'deleted', 'unchanged'. These values can be used to highlight changes between two given versions of a document.

Each of these attributes is optionally added to elements that have changed (as detailed below). The rev attribute values default to 'deltaxml-delete' for elements that existed in the original document but are no longer present, and 'deltaxml-add' for elements that exist in the new version of the document but were not present in the original. These values can be changed using the 'old-revision' and 'new-revision' parameters.

The status values are used on elements as follows:
  • when an element existed in the original document but is no longer present, a status="deleted" attribute is added to that element,

  • when an element exists in the new version of the document but was not present in the original a status="new" attribute is added to that element,

  • for certain elements, attribute changes cause a status="changed" attribute to be added to the element
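The attribute marking listed above can be sketched in Python with the standard library's ElementTree. This is a hedged illustration only: the helper mark_element is hypothetical, its keyword arguments merely mirror the 'old-revision' and 'new-revision' parameters, and leaving rev unset for 'changed' elements is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET


def mark_element(element, change, old_revision="deltaxml-delete",
                 new_revision="deltaxml-add"):
    """Annotate an element in place; `change` is 'deleted', 'new' or 'changed'."""
    element.set("status", change)
    if change == "deleted":
        element.set("rev", old_revision)   # element only in the original
    elif change == "new":
        element.set("rev", new_revision)   # element only in the new version
    return element


removed = mark_element(ET.fromstring("<p>Old paragraph</p>"), "deleted")
added = mark_element(ET.fromstring("<p>New paragraph</p>"), "new")
print(ET.tostring(removed, encoding="unicode"))
print(ET.tostring(added, encoding="unicode"))
```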

Changes to text content are handled slightly differently. Since it is not possible to add attributes to text, it must first be wrapped in an element that can have the attribute added. The phrase (<ph>) element is intended for this kind of purpose. Text that was in the original document but is no longer present is wrapped with <ph status="deleted">...</ph>, and text that is only included in the new version of the document is wrapped with <ph status="new">...</ph>.

It is possible to change the element used to mark phrases. This should be used if the inputs are specialized DITA and the <ph> element has been specialized into a new element type. The element name to use can be set using the 'phrase-element-name' parameter.
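As a rough illustration of the phrase wrapping, the following Python sketch builds the markup as a string. The function wrap_text_change is hypothetical; its phrase_element_name argument simply stands in for the 'phrase-element-name' parameter.

```python
from xml.sax.saxutils import escape


def wrap_text_change(text, status, phrase_element_name="ph"):
    """Wrap changed text in a phrase element carrying a status attribute."""
    if status not in ("new", "deleted"):
        raise ValueError("status must be 'new' or 'deleted'")
    name = phrase_element_name
    return f'<{name} status="{status}">{escape(text)}</{name}>'


line = ("The " + wrap_text_change("quick", "deleted")
        + wrap_text_change("fast", "new") + " fox")
print(line)
```

Passing a different phrase_element_name (for example the name of a specialized phrase element) changes only the wrapper's tag name.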

If text changes are made in a context where <ph> is not a valid element, there is the option to use textual markers to display the change. This option is controlled using the 'show-non-phrase-changes' parameter, which is set to true by default. Text changes are marked up as follows:
  • text that existed in the original document but is no longer present is wrapped like this: -[[...]]-

  • text that exists in the new version of the document but was not present in the original is wrapped like this: +[[...]]+

  • As well as marking the text in this fashion, the element that contains the text will also have a status="changed" attribute added to it

If the 'show-non-phrase-changes' parameter is set to false, then only the text from the new version of the document will be output, without any marking.
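The textual markers can be sketched as follows. The function name is hypothetical, and emitting the deleted text before the inserted text is an assumption of this sketch rather than documented behaviour.

```python
def mark_non_phrase_text(old_text, new_text, show_non_phrase_changes=True):
    """Render a text change in a context where a phrase element is not allowed."""
    if not show_non_phrase_changes:
        return new_text                       # only the new text, unmarked
    parts = []
    if old_text:
        parts.append(f"-[[{old_text}]]-")     # text only in the original
    if new_text:
        parts.append(f"+[[{new_text}]]+")     # text only in the new version
    return "".join(parts)


print(mark_non_phrase_text("version 1", "version 2"))
print(mark_non_phrase_text("version 1", "version 2", show_non_phrase_changes=False))
```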

3.2. Arbortext Tracked Changes

When using Arbortext Tracked Changes Markup the output of the comparison is an Arbortext tracked change version of a DITA document. Here DITA elements can contain Arbortext tracked change elements, and vice versa.

One consequence of using XML elements to represent tracked changes is that the resulting tracked change document does not conform to the DITA specification. In order to return an Arbortext tracked change document back to the DITA specification all changes need to be accepted or rejected (and the tracked change author information has to be removed).

Assuming that the inputs to the comparison are valid DITA documents and all the changes to the output are accepted (or rejected) as previously discussed, then the resulting document will be a valid DITA document. Note that in general it is not possible to guarantee that an arbitrary combination of 'accepted' and 'rejected' changes will result in a valid document, due to the granularity of change.

The generated tracked changes use three of the available tracked change elements:
atict:add
For inserted content;
atict:del
For deleted content;
atict:chgm
For attribute modification (outside the context of a table).

Changes within comments and CDATA Sections result in the whole of the old version of the text being marked as deleted, and the whole of the new version of the text being marked as inserted.

The Arbortext tracked change format does not support changes to processing instructions or those comments that are outside the body of the DITA document. It does, however, support both cell and row level changes within tables.

3.3. FrameMaker Tracked Changes

The FrameMaker Tracked Changes Markup output format is a valid DITA document that includes annotations to represent changes in the document.

This format employs FrameMaker's method for tracking changes, exploiting XML processing-instructions and comments to mark additions and deletions within documents.

The FrameMaker tracked change format is restricted to the Author and WYSIWYG views. These views do not support edits within XML marked as CDATA, so changes to CDATA sections are converted to normally parsed XML content. This format uses a pseudo-entity '&fm-double-hyphen;' to allow two adjacent hyphen characters to be represented within comments, which the track change format uses to contain deleted content.

As with most editors, FrameMaker has a few limitations on what types of change can be tracked for different element types; an example is the addition or deletion of table rows. For this specific example, the output format defaults to showing such changes as changes in the text content of the row cells (this behaviour is controlled by the framemaker-tcs-table-change-mode parameter). Other limitations have not been fully explored, and it is possible that some changes marked in the output format will be ignored by FrameMaker.

3.4. oXygen Tracked Changes

When using oXygen Tracked Changes Markup the output of the comparison is itself a DITA document.

This output format uses processing instructions to identify change, where deleted content is typically contained within the processing instruction and inserted content is typically sandwiched between two processing instructions, one marking the start of the insertion and the other the end. Hence, removing (or ignoring) the processing instructions has the effect of accepting all changes to the document.

Comments and CDATA Sections are handled specially, as processing instructions cannot be placed inside their content. Instead, changes are identified by a sequence of processing instructions that immediately follow the Comment or CDATA Section, which mark the location of the change by using a character counting technique. Here, deleted content is contained in the processing instructions, whereas inserted content is already in the Comment or CDATA Section text itself. This preserves the principle of being able to accept all the changes within a document by either ignoring or removing the tracked change processing instructions.
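The 'accept all by removing the processing instructions' principle can be illustrated with a small Python sketch. The processing instruction names used here (demo_delete, demo_insert_start, demo_insert_end) are invented for the example and are not the names used by the oXygen format; the sketch also removes every processing instruction except the XML declaration, which is cruder than removing only the tracked change ones.

```python
import re

# Matches any processing instruction except the XML declaration.
_PI = re.compile(r"<\?(?!xml[\s?]).*?\?>", re.DOTALL)


def accept_all_changes(xml_text):
    """Remove processing instructions, leaving inserted content in place."""
    return _PI.sub("", xml_text)


# Deleted text lives inside a PI; inserted text sits between two PIs.
tracked = ('<p>Hello <?demo_delete content="old "?>'
           '<?demo_insert_start?>new <?demo_insert_end?>world</p>')
print(accept_all_changes(tracked))
```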

The oXygen tracked change format does not support changes to attributes, processing instructions, or those comments that are outside the body of the DITA document. It does, however, support both cell and row level changes within tables.

3.5. XMetaL Tracked Changes

When using XMetaL Tracked Changes Markup the output of the comparison is itself a DITA document.

This output format uses processing instructions to identify change in a similar manner to that of oXygen tracked change format. However, it does not support changes to comments, changes within CDATA sections, or row or cell level table changes.

Changes within CDATA Sections are handled by moving the change to the CDATA Section level as a whole. Therefore any textual change within a CDATA Section results in the old version of the whole CDATA Section being marked as deleted, and the whole of the new version of the CDATA Section being marked as inserted.

There is a special XMetaL specific parameter (xmetal-tcs-table-change-mode) which controls what happens when row or cell level table changes are present. These changes can be pushed down to the cell content level, where the content of each cell within the changed region is appropriately deleted and inserted; this is the 'default' behaviour. The second option is that changes to rows or cells can be pushed up to the table level, so that the old and new versions of the table as a whole are tracked. The third option is that changes can simply be ignored (which mirrors what the XMetaL editor would do). However, selecting the ignore mode means that all changes within a table are ignored, not just those that are at the 'row' or 'cell' level. This is deliberate, as we believe that partial tracking of changes within a table would be confusing.

4. Map Processing

4.1. Mapfile Result

A Mapfile comparison is simply a comparison of two DITA map files. The result of such a comparison is a single DITA map file with markup describing the differences between the two input files. The available output formats are the same as those for a topic comparison, as described in Output Formats.

The alignment of structure in the input map files relies on keying based on the topicref element's keyref or href attributes. The keyref takes priority if both attributes exist. A topicref with a changed key is treated as an entirely different element; it will not be aligned with its original version.
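The keying rule can be sketched in Python as shown below. This is a simplified illustration, not the product's alignment algorithm: topicrefs are modelled as plain dictionaries of attributes, and the helper names are assumptions.

```python
def topicref_key(attrs):
    """Return the alignment key for a topicref: keyref wins over href."""
    return attrs.get("keyref") or attrs.get("href")


def align(a_refs, b_refs):
    """Pair topicrefs by key; unmatched ones become deletions or additions."""
    a_keys = {topicref_key(r): r for r in a_refs}
    b_keys = {topicref_key(r): r for r in b_refs}
    matched = [k for k in b_keys if k in a_keys]
    deleted = [k for k in a_keys if k not in b_keys]
    added = [k for k in b_keys if k not in a_keys]
    return matched, deleted, added


a = [{"href": "intro.dita"}, {"keyref": "notices", "href": "notices.dita"}]
b = [{"href": "intro.dita"}, {"keyref": "legal", "href": "notices.dita"}]
print(align(a, b))
```

In this example the second topicref keeps its href but changes its keyref, so it is reported as one deletion and one addition rather than as a match.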

4.2. Map Topicset Result

For a Map TopicSet comparison there are four basic cases to consider: topic inserted, topic deleted, topic changed, and topic unchanged. In addition to these cases, other complicating factors can affect how the results at the map level should be presented, such as whether a topic has:
  1. moved location e.g. position in the map or on the file system,
  2. undergone unrepresentable change e.g. an attribute change, which cannot be represented in most output formats, or
  3. been refactored e.g. split or merged.

The structure of the Map TopicSet result is referred to as the Map Result Structure. To suit different presentation needs the following Map Result Structure types are provided as options:
Topic Set

The result of the topic comparison is a map that contains a non-hierarchical (i.e. flat) set of topic references, which are marked up to indicate whether their referent topics (i.e. the topics that they point at) have been inserted, deleted, changed, or unchanged.

The default behaviour is for the result map's topic references to be in the order of their occurrence in the second input (or 'B' document map). Note, those topics that appear only in the first input (or 'A' document map) are positioned close to another neighbouring 'A' document topic that is also in the 'B' document.

Map Pair

The topic references within one of the existing maps (or a copy of it) are marked up to show how they have changed, and those topic references that appear only in the other 'remaining map' are output in a non-hierarchical map. Note that this latter map is the map that is specified by the output of the comparison operation.

The default behaviour is for the second 'B' document map (and its submaps) to be updated. Here, the 'B' document's map hierarchy is updated to indicate which topic references have been inserted, changed, or left unchanged, and the specified output map contains a set of 'deleted' topic references (i.e. those that only appear in the 'A' document).

Unified Map

This result structure attempts to combine the benefits of both the Topic Set and Map Pair. That is, in the default case, the 'B' map structure is preserved, and the 'A' only topic references are inserted in the 'B' map as close as possible to the last topic reference to match in both maps.

The Unified Map result works best when the maps being compared have a very similar structure. Specialized DITA maps may also cause problems because the element names of deleted topic references may need renaming so as to be valid if they are inserted in the 'B' map at a different level to their occurrence in the 'A' map.

The Result Structure type is set using the 'map-result-structure' parameter. The default behaviour of basing the output on the structure of the second 'B' document can be changed by setting the 'map-result-origin' parameter.

Note: Map comparisons can be performed either 'inplace' or on a 'copy' of their inputs. In the latter case:
  1. both inputs are copied to an output directory, as discussed in Section Copying Maps,
  2. an inplace comparison is performed on that copy, and
  3. a result-alias.ditamap is created in the output directory, whose topicref(s) point to the result map(s).
When copying the inputs, the 'map-copy-scope' parameter configures what is copied.

The topic references within an output map are marked up to indicate whether the content that they refer to, i.e. the referent, has changed. Here, each topic reference is marked with a status attribute to indicate whether its referent (i.e. the topic it is pointing at) is added, deleted, changed, or unchanged. Note that the setting of this attribute is independent of whether the content of the topicref element has changed. Therefore, it is possible to have an unchanged topicref element containing new, deleted, or changed topicref elements.

As the structure of the map comparison result is non-trivial it is useful to illustrate this by an example. Consider the two versions of a simple ReadMe document illustrated in Figure 1 below; here the only difference in the map structures is the removal of the notices.dita topic from the 'B' version of the document.


Figure 1. Simple map compare inputs

These inputs can be compared 'in place' using both the 'topic set' and 'map pair' outputs, as illustrated by Figure 2 and Figure 3 respectively, where the structure of the output is based on the 'B' document. Note, the backup of the 'B' document's map is being displayed in the case of the topic set, to highlight that it differs structurally from that of the resulting map.

Note:

These examples exclude the 'unified map' output structure which can be considered to be the same as that for 'topic set' (only the content of the 'B' map and any sub-map files is different).


Figure 2. Result of a topic set in place comparison

Figure 3. Result of a map pair in place comparison
Note: It is possible to change the name of the map pair result's remaining document DITA map using the 'map-pair-remaining-map-name' parameter. Further, control over the backup mechanism is provided by the 'map-clean-temp' and 'map-backup-suffix' parameters.

The example inputs can also be compared using a 'copy' of the input, as illustrated in Figure 4 and Figure 5, where the output is stored in the C:\OutDir directory. Note that the structure of the output directory is in part explained by the Section on Copying Maps. In addition to the copying of the inputs, a result-alias.ditamap map is generated, as previously discussed.


Figure 4. Result of a topic set comparison with an OutDir output directory

Figure 5. Result of a map pair comparison with an OutDir output directory

4.3. Copying Maps

What does it mean to copy a DITA Map? This section answers that question in the context of preparing inputs for comparison.

A DITA map provides a mechanism for specifying those resources that both belong to and are referenced by the map. Each of these resources is associated with a scope (e.g. local and external), which can be used to determine whether a resource should be copied along with the map. The 'map-copy-scope' parameter can be used to configure what is copied.

4.3.1. Identifying the resources to copy

DITA provides two basic means for referencing another resource:
A hypertext reference (href)
These references contain a relative or absolute URI to a resource.
A key reference
These references are defined indirectly by a string, known as a key, which is bound to a hypertext reference within its definition.
Note that DITA keys can only be defined within a map, and that it is the first definition that is used (see DITA 1.2 specification's Overview of Keys section for details on the search order).
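The 'first definition wins' rule can be sketched in Python as follows. The sketch assumes that the key definitions have already been collected in the precedence order prescribed by the DITA specification; the function name and the example keys and paths are hypothetical.

```python
def bind_keys(definitions):
    """Build a key -> href table where the first definition of a key wins.

    `definitions` is an iterable of (key, href) pairs, assumed to be already
    in the precedence order that the DITA specification prescribes.
    """
    table = {}
    for key, href in definitions:
        table.setdefault(key, href)   # later duplicates are ignored
    return table


keys = bind_keys([
    ("logo", "images/logo-v2.png"),
    ("notices", "Notices/notices.ditamap"),
    ("logo", "images/logo-old.png"),   # shadowed by the first 'logo'
])
print(keys["logo"])
```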

DITA Maps are designed to contain submaps and topics, therefore the resources that are referenced by a submap or topic are considered to be indirectly referenced by their parent map. However, with the introduction of keys within DITA 1.2, some DITA practitioners are encouraging the use of 'key' only references within topics. In this case, there is no need to scan the topics for potential resource use as these will be declared at the map level. The 'map-scan-topics-for-references' parameter can be used to prevent DITA topics from being scanned for resources.

Note: Our DITA map model currently considers all resources that are referenced within the map, its submaps, and optionally its topics to be part of the document. In particular, references within key definitions that are not bound are considered to be part of the document. When 'key' aware processing is introduced there will be an option to remove unbound key definitions from the resources associated with a DITA map.

DITA documents are designed to enable references to non-DITA resources, such as images and web-sites. Images are often given a local scope as they are considered to be part of the document. In such cases, the images (and other locally scoped resources) will be copied along with the map. Resources that have other scopes may be copied depending on the setting of the 'map-copy-scope' parameter.

Note: It is assumed that non-DITA content is either self contained (i.e. it does not refer to other resources) or that the references are 'global' (i.e. they can be obtained from a globally unique - and available - address). This may not be true in general, and other formats may be provided with resource scanning capabilities when requested.

4.3.2. Copied structure

When copying a resource we aim to maintain its relative position to the map when this is feasible. For example, if a map contains relative references to resources that are being copied, then these references remain unchanged in the copied output. However, if these relatively referenced resources are not being copied, then the references are updated to reflect their current positions. Alternatively, if a map contains resources that are being copied and cannot be relatively addressed (e.g. they are on a different host computer), then these resources are copied to new locations, and their associated references are updated appropriately.

Note: Those resources that are identified as belonging to the map, and thus are copied, are the same resources that are identified as potential resources for comparison.

In general, the output of a copied map is represented by a forest (e.g. a directory) that has a tree (e.g. another directory) for each 'host system', where the relative locations between the resources on each host system are maintained in the copied output.

Before we explore the general case, it is worth briefly presenting a simple case, where the resources referenced within the document defined by a DITA map - C:\Common\Docs\Notices\notices.ditamap - are either alongside or beneath the location of the DITA map. In this case, the copying of a map to a target location is straightforward, as illustrated by Figure 6 below, which is being copied in preparation for use as the first, 'A', input in a comparison.


Figure 6. Copy of notices.ditamap to C:\Copy
The result might look a little odd, as the copying has introduced an apparently unnecessary directory _a-0-file- and file _a-copy-alias.ditamap, which is a map that references the copied notices.ditamap. The reason for this is that in the general case the copied map might not be in the top-level directory of the copied output directory, and it is useful to have a reliable location for specifying where the copied map is. Consider the following example, where:
  • the map ReadMe.ditamap in C:\Users\Ian\Products\Docs directory is:
    <map>
       <title>Product ReadMe</title>
       <topicref href="Topics/intro.dita" />
       <topicref href="file:/C:/Common/Docs/Notices/notices.ditamap" />
    </map>
  • the map notices.ditamap in C:\Common\Docs\Notices directory is:
    <map>
       <title>Notices</title>
       <topicref href="notice1.dita" />
       <topicref href="notice2.dita" />
    </map>
As it is possible to relatively reference the notices.ditamap from the ReadMe.ditamap, the copying of the ReadMe.ditamap results in a forest with a single tree, as illustrated in Figure 7 below.

Figure 7. Copy of ReadMe.ditamap to C:\Copy
In this case, the copied tree retains the relative paths from ReadMe.ditamap to notices.ditamap in the copied output. One consequence of this is that the ReadMe.ditamap has ended up four levels of directory structure lower than the top of its copied tree. This is not a problem, as the copy alias map (_a-copy-alias.ditamap) points to the actual location of the copied map. For clarity the three result maps are:
  • the map _a-copy-alias.ditamap in C:\Copy directory is:
    <map>
       <topicref href="_a-0-file-/Users/Ian/Products/Docs/ReadMe.ditamap" />
    </map>
  • the map ReadMe.ditamap in C:\Copy\_a-0-file-\Users\Ian\Products\Docs directory is:
    <map>
       <title>Product ReadMe</title>
       <topicref href="Topics/intro.dita" />
       <topicref href="../../../../Common/Docs/Notices/notices.ditamap" />
    </map>
  • the map notices.ditamap in C:\Copy\_a-0-file-\Common\Docs\Notices directory is:
    <map>
       <title>Notices</title>
       <topicref href="notice1.dita" />
       <topicref href="notice2.dita" />
    </map>
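The relative reference that appears in the copied ReadMe.ditamap above can be reproduced with a short Python sketch. It treats the top of the copied tree (the _a-0-file- directory) as the root and uses the standard library's posixpath.relpath; this only illustrates the path arithmetic and is not a description of how the copy is implemented.

```python
import posixpath

# Locations inside the copied tree, written relative to its root (_a-0-file-).
map_dir = "/Users/Ian/Products/Docs"
target = "/Common/Docs/Notices/notices.ditamap"

# Four '..' steps climb from Docs to the tree root, then descend to the map.
relative_ref = posixpath.relpath(target, start=map_dir)
print(relative_ref)
```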

So far the examples have only shown one tree in the copied result, the directory structure under _a-0-file-. It is straightforward to construct an example that requires two trees, by a small modification to our previous example: Let the notices.ditamap file located on the C:\ drive be moved to a similar position on the D:\ drive, and have the ReadMe.ditamap updated to reflect the new location. The copy of the updated ReadMe.ditamap to C:\Copy is illustrated in Figure 8 below.


Figure 8. Copy of the updated ReadMe.ditamap to C:\Copy

In this case the directory structures under the trees are much more compact, which is arguably a better result than that produced in the previous example, because it appears to mirror the conceptual breakdown of the document. In general, a user might want to state that certain directories should be treated as if they were on a separate non-relatively addressable storage area. Such functionality can be added in the future if there is sufficient demand.
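The idea of one tree per non-relatively-addressable location can be sketched by grouping resource paths by drive. This is an assumption-laden illustration: it uses the Windows drive letter as a stand-in for the 'host system', the function name is hypothetical, and it does not attempt to reproduce the names given to the tree directories in the copied output.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_into_trees(paths):
    """Group resource paths by drive; each group stands for one copied tree."""
    trees = defaultdict(list)
    for p in paths:
        win = PureWindowsPath(p)
        trees[win.drive.upper()].append(win.as_posix())
    return dict(trees)


resources = [
    r"C:\Users\Ian\Products\Docs\ReadMe.ditamap",
    r"C:\Users\Ian\Products\Docs\Topics\intro.dita",
    r"D:\Common\Docs\Notices\notices.ditamap",
]
for drive, members in group_into_trees(resources).items():
    print(drive, members)
```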

5. How is Comparison Performed?

DeltaXML DITA Compare makes use of the fact that DITA is an XML format when performing document comparison. XML documents are machine readable documents that conform to a set of rules defined by the W3C. For more information on XML see Resources.

The document comparison is performed by another of our products, DeltaXML Core, with various pre-configured pre- and post-processing steps, referred to as a filter pipeline.

Core works by matching together elements that have the same name, and where possible, the same or similar contents. This means that a paragraph (<p> element) can only ever be compared against another paragraph and will never be compared against a note (<note> element). Understanding this is a key part of understanding how the comparison works.

When deciding which elements match best, e.g. which amongst a number of possible paragraph pairings is the best match, Core uses the words within an element. Elements that have the same or similar content are much more likely to be matched together than those that are quite different. Once this matching phase has taken place, Core will then compare the contents of the two elements it has matched, recursing in this fashion until it reaches the bottom of the XML structure.
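The two matching rules (same element name, then most similar words) can be illustrated with the following Python sketch. It is emphatically not the algorithm used by Core: it substitutes the standard library's difflib.SequenceMatcher as a simple similarity measure, and it models elements as (name, text) pairs.

```python
from difflib import SequenceMatcher


def word_similarity(a_text, b_text):
    """Ratio in [0, 1] of how alike two texts are, compared word by word."""
    return SequenceMatcher(None, a_text.split(), b_text.split()).ratio()


def best_match(element, candidates):
    """Pick the most similar candidate that has the same element name."""
    name, text = element
    same_name = [c for c in candidates if c[0] == name]
    if not same_name:
        return None
    return max(same_name, key=lambda c: word_similarity(text, c[1]))


old_para = ("p", "Run the installer and accept the licence.")
new_elements = [
    ("note", "Run the installer and accept the licence."),
    ("p", "Run the installer, then accept the licence terms."),
    ("p", "Contact support if the problem persists."),
]
print(best_match(old_para, new_elements))
```

Although the note has identical text, it is never considered because its element name differs; the first paragraph is chosen because it shares more words with the original than the second.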

5.1. Generalization

In order to compare two input files, DITA Compare must first ensure that the root elements are the same. This is a requirement of the underlying tool used to compare the documents. In order to ensure that this is the case, the inputs are generalized back to a DITA topic using the generalization mechanism in the DITA Open Toolkit.

5.2. Pre-processing

As well as generalizing the two inputs, the pre-processing stages perform many other tasks on the documents. Some of these tasks are described below, along with any parameter settings available for configuring them.

5.2.1. Removing track change markup from input documents

Some of the output formats generate tracked change markup. All supported tracked change format markup is removed before comparison. This avoids confusion between pre-existing changes and those added as part of the comparison. Note that this has the effect of accepting all the changes on both input documents before comparison begins.

5.2.2. Preserving comments and processing instructions

XML comments (text contained in <!-- --> markup) and processing instructions (special instructions marked as <?instruction_name more details ?>) in the document need to be converted into other XML markup in order to be output in the result document. This task is carried out before comparison; the resulting elements are then compared, and they are converted back into comments and processing instructions afterwards. Because there is no useful way of marking changes to comments and processing instructions, the result contains only those from the second input.

5.2.3. Whitespace preservation

For most DITA elements, whitespace is not significant (i.e. multiple spaces and newlines are effectively turned into a single space when converting to a published format such as PDF). Therefore, when using the DITA Markup output format such spaces are 'normalized' before comparison.

Tracked changes output formats are intended for use with editors, where 'roundtrip' processing is the typical behaviour. In this case we are typically not interested in whitespace change, but want to preserve the document indentation. Therefore, we 'ignore' changes in whitespace.

The whitespace-processing-mode parameter provides a means for configuring how whitespace differences are to be handled.
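The effect of normalizing whitespace before comparison can be shown with a one-line Python helper. This is a sketch of the general idea only; the function name is an assumption and it says nothing about how the individual whitespace-processing-mode settings are implemented.

```python
def normalize_whitespace(text):
    """Collapse runs of spaces, tabs and newlines to one space and trim the ends."""
    return " ".join(text.split())


a = "Install   the\n    product."
b = "Install the product."
# After normalization the two strings are identical, so no change is reported.
print(normalize_whitespace(a) == normalize_whitespace(b))
```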

5.2.4. Word by word

When comparing text, the comparison can treat text blocks as a single chunk of text and compare one chunk against another or it can treat it as a sequence of words, comparing one word against another. The word-based comparison gives more understandable results at a much finer-grained level and is the default setting. If you wish to turn it off, set the word-by-word parameter to false. See word-by-word for more details.
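The difference between the two modes can be illustrated in Python. The sketch below uses the standard library's difflib.SequenceMatcher on a list of words as a stand-in for word-by-word comparison; it is not the comparator's own algorithm and the function name is hypothetical.

```python
from difflib import SequenceMatcher


def word_changes(old_text, new_text):
    """Return (deleted_words, inserted_words) from a word-level comparison."""
    old_words, new_words = old_text.split(), new_text.split()
    deleted, inserted = [], []
    matcher = SequenceMatcher(None, old_words, new_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.extend(old_words[i1:i2])
        if tag in ("insert", "replace"):
            inserted.extend(new_words[j1:j2])
    return deleted, inserted


old = "The quick brown fox"
new = "The fast brown fox"
print(word_changes(old, new))
```

A chunk-based comparison of the same two strings would instead report the whole of the old text as deleted and the whole of the new text as inserted.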

5.3. Table processing

Table comparison is a complicated matter and part of the requirements for processing DITA tables is that the table is a valid 'CALS Table'. This is a separate standard that defines how tables should be constructed and is used as the definition for DITA tables. However, it is possible for a table to be valid according to the DITA language but semantically invalid according to the CALS table specification. Part of the input processing analyzes tables in the document, performs normalization and annotates them to inform later processing stages about their validity.

Table normalization involves the following:
  • Converting a column width (the colwidth attribute) value of * to 1*. These are semantically equivalent but would register as a difference when compared.

  • Explicitly outputting inferred column specifications (colspec elements). For example, if the first column defined is listed as column 2, then there is an inferred default entry for column 1. The input processing adds an explicit definition of such inferred colspecs.
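The two normalization steps above can be sketched with ElementTree as follows. The sketch makes several simplifying assumptions that are not taken from the product: every existing colspec carries an explicit colnum, the colspecs are the first children of the tgroup and appear in column order, and an inferred colspec is given only a colnum attribute.

```python
import xml.etree.ElementTree as ET


def normalize_tgroup(tgroup):
    """Normalize the colspec elements of a CALS tgroup in place."""
    # 1. A colwidth of '*' means the same as '1*'.
    for colspec in tgroup.findall("colspec"):
        if colspec.get("colwidth") == "*":
            colspec.set("colwidth", "1*")

    # 2. Add an explicit colspec for every column that is only inferred.
    declared = {int(c.get("colnum")) for c in tgroup.findall("colspec")
                if c.get("colnum")}
    total = int(tgroup.get("cols"))
    for colnum in range(1, total + 1):
        if colnum not in declared:
            inferred = ET.Element("colspec", {"colnum": str(colnum)})
            tgroup.insert(colnum - 1, inferred)
    return tgroup


source = ('<tgroup cols="3"><colspec colnum="2" colwidth="*"/>'
          '<colspec colnum="3" colwidth="2*"/><tbody/></tgroup>')
normalized = normalize_tgroup(ET.fromstring(source))
print(ET.tostring(normalized, encoding="unicode"))
```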

DITA Compare also includes support for comparing DITA's simple tables (as identified by the topic/simpletable class attribute). Processing for these tables is slightly different as they are a much simpler form of table than CALS tables.

See Tables for more details on how tables are compared.

5.4. Post-processing

5.4.1. Specialization

As mentioned above, all inputs are generalized back to a DITA topic to ensure that they have the same root element (a prerequisite for comparison). Once comparison has taken place, the result document is specialized back to whatever DITA type was originally passed in. This works well if both input documents are the same type (e.g. both DITA tasks). If the two inputs are different types (e.g. a task and a concept), specialization can produce an invalid result, because the result can contain two different specialization types: deleted elements take their specialization type from the first input document, and added elements take theirs from the second. There may also be changes to the class attribute itself if two different elements, once generalized, match together. In these cases it is not obvious how to specialize the result file, and doing so is quite likely to lead to an invalid result. The default behaviour of DITA Compare is therefore to leave the result as a DITA topic, with the class attributes available should you wish to edit it and then specialize. It is possible to force specialization using the force-specialization parameter. At present, this specializes the result to the type used for the second, or 'B', input. Future releases may allow the 'A' document to be chosen instead. Please be aware that, when using this option, the result file is quite likely to be invalid.
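
The decision described above can be summarised in a short Python sketch. It mirrors the documented behaviour only and is not the product's code; the function names specialized_type and result_type are hypothetical, and the root element's DITA class attribute (for example "- topic/topic task/task ") is assumed to be available as a string.

```python
def specialized_type(class_attr):
    """Return the most specialized element name from a DITA class attribute."""
    return class_attr.split()[-1].split("/")[1]


def result_type(class_a, class_b, force_specialization=False):
    """Pick the root type of the result from the 'A' and 'B' root class attributes."""
    type_a = specialized_type(class_a)
    type_b = specialized_type(class_b)
    if type_a == type_b or force_specialization:
        return type_b  # forcing always specializes to the 'B' input's type
    return "topic"     # mixed types: leave the result generalized
```

For a task compared with a concept this returns 'topic' by default and 'concept' when specialization is forced, which is the case where the result may be invalid.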

5.4.2. Content choice

XML grammars often present authors with a choice of elements to use in a particular context. Sometimes this choice will be a list of elements, any of which can be used multiple times. Other choices involve mutually exclusive sets of elements. One such example of this in DITA is the choice of steps or stepsunordered in a DITA task. If one input document contains steps and the other contains stepsunordered, the result would normally contain both, one marked as deleted and one marked as added. This means that the result file is invalid, as it is not permissible to use both of these elements. Part of the post-processing performed by DITA Compare is to detect these cases and try to resolve them. At present, the only case supported is the one mentioned here, steps vs stepsunordered. The behaviour of DITA Compare can be configured using steps-conflict-resolution. Support for more cases will be added in future releases.

5.4.3. Conflict resolution of id attributes

When comparing aligned topics it is possible to encounter conflicting 'id' definitions. Conflicting 'id' definitions can cause problems for red-line document production from the DITA-Markup output-format. Therefore, conflicting 'id' definitions are renamed to avoid such conflicts (when DITA-Markup output format is selected). The associated direct cross-referencing and reuse attributes, 'href' and 'conref', are also updated to be consistent with such 'id' attribute renaming.

Note: When performing a topic-only comparison, global DITA cross-references are not processed using the conflict resolution scheme described above. Here, a global cross-reference is of the form uri#topic-id[/element-id] and a local cross-reference is of the form #topic-id[/element-id].
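
The two cross-reference forms given in the note can be told apart with a small parser. The sketch below is illustrative only; the function name parse_href and the shape of the returned dictionary are assumptions made for this example.

```python
def parse_href(href):
    """Split an href of the form [uri]#topic-id[/element-id] into its parts."""
    uri, _, fragment = href.partition("#")
    topic_id, _, element_id = fragment.partition("/")
    return {
        "scope": "global" if uri else "local",
        "uri": uri or None,
        "topic_id": topic_id or None,
        "element_id": element_id or None,
    }
```

A reference such as #intro/fig1 is classified as local, whereas other.dita#intro is classified as global and, in a topic-only comparison, falls outside the renaming scheme described above.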

6. Product Features

The DeltaXML DITA Compare product has several special features for handling specific situations. This section focuses on tables, round-trip processing (via lexical preservation), and attribute change.

6.1. Tables

DITA tables (which use the CALS table model) are handled slightly differently from the rest of the document because displaying change, particularly structural change, in tables is more difficult. For this reason, structural table changes are shown at various levels of granularity. Our main aim in the table processing is to produce a result where the changes can be seen in as much detail as possible, but with the result document still maintaining validity against the DITA specification and the CALS table specification. Producing an invalid result document can cause problems further down the publishing pipeline.

6.1.1. Simple Structural Change

When the column definitions for the two tables have not changed, it is possible to represent changes to column or row spanning at the granularity of the affected rows. Rather than repeating the whole of the table in two tgroup elements, it is possible to repeat only individual rows, or in some cases a set of consecutive rows, in the form of the original document rows (marked with status="deleted") followed by the latest document rows (marked with status="new"). The number of rows that are repeated depends on what type of structural change has occurred. If the change involves changes to column spanning within a single row that does not overlap other rows and is itself not overlapped, it is possible to repeat only that single row. If column spanning changes occur on a row that overlaps other rows or is itself overlapped, it is necessary to group together all of the rows affected by the row spanning and repeat them together. This is also the case for any changes involving changes to row spanning.

6.1.2. Complex Structural Change

Some structural changes are too complex to represent in a single result table section (the tgroup element) and so the result document contains a table with two table sections: the first contains the table from the original document with a status="deleted" attribute on it, the second contains the table from the latest document with a status="new" attribute on it. Although it is not possible to see individual changes to rows/cells etc that occurred between the document versions, it is possible to see the two table versions and, providing the inputs were both valid, be sure that the result document is valid.

This type of result is produced when a table contains changes to row or column spanning as well as changes to the column definitions (e.g. changed column names or added/deleted columns).

6.1.3. Orderless Tables

Sometimes the order of rows within a table is insignificant. For example, consider a simple product information table, where the first column of the table contains a unique product name, the second column its 'tag line', the third column its standard price, etc. The rows in this table can be reasonably ordered in a variety of ways, such as by 'name', or by 'price'. When two versions of a document are compared that use different row ordering mechanisms, a significant number of rows are likely to be added and deleted due to them moving position. If such differences are insignificant then an orderless row comparison would be useful.

Orderless row comparison support can be provided so long as there is no row spanning within the tables being compared. In such cases, the <?dxml-orderless-rows?> processing instruction can be added within the element that directly contains the rows that are to be processed in an orderless fashion. It is important to ensure that this processing instruction is added to the relevant table in both input documents.

The orderless comparison algorithm is greatly improved through the use of unique row keys. Adding a <?dxml-key id1?> processing instruction within the element that directly contains the row sets that row's key to 'id1'. It is also possible to specify the row 'cell position' that is used for defining the default value for a row's key. For example, the <?dxml-orderless-rows cell-pos:2?> processing instruction specifies that the text content of the row's second cell (e.g. <entry> or <stentry> element) should be used as the row's key. Note that the row cell position takes no account of 'column' data (e.g. @colnum attribute); it just counts the number of cells.
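
To show why keys help, the following Python sketch pairs rows by the text of a chosen cell rather than by position. It is a simplified illustration and not the product's alignment algorithm; the function names row_key and align_orderless and the sample product data are hypothetical.

```python
import xml.etree.ElementTree as ET


def row_key(row, cell_pos=1):
    """Use the text of the row's cell_pos-th cell (counting from 1) as its key."""
    cells = list(row)
    return "".join(cells[cell_pos - 1].itertext()).strip()


def align_orderless(rows_a, rows_b, cell_pos=1):
    """Pair rows by key, ignoring position; return (matched, deleted, added) keys."""
    keys_a = [row_key(r, cell_pos) for r in rows_a]
    keys_b = [row_key(r, cell_pos) for r in rows_b]
    matched = [k for k in keys_a if k in keys_b]
    deleted = [k for k in keys_a if k not in keys_b]
    added = [k for k in keys_b if k not in keys_a]
    return matched, deleted, added


old = ET.fromstring(
    "<tbody>"
    "<row><entry>Alpha</entry><entry>10</entry></row>"
    "<row><entry>Beta</entry><entry>20</entry></row>"
    "</tbody>"
)
new = ET.fromstring(
    "<tbody>"
    "<row><entry>Beta</entry><entry>20</entry></row>"
    "<row><entry>Gamma</entry><entry>30</entry></row>"
    "<row><entry>Alpha</entry><entry>10</entry></row>"
    "</tbody>"
)
```

Here the Alpha and Beta rows are matched even though they have swapped positions, so only the Gamma row is reported as added. As in the description above, the cell position simply counts cells and takes no account of column attributes.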

6.1.4. Other Changes

Other kinds of simple structural change can be represented within a single table without needing to repeat any rows. For example, column deletion in a table that does not have any changes to column or row spanning can be represented by marking each of the deleted cells with the status="deleted" attribute.
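
The relationship between the kinds of table change described in this section and the granularity at which they are reported can be summarised as a small decision function. This is a deliberately simplified reading of the text above, not the product's logic; the function name and the returned labels are hypothetical.

```python
def table_change_granularity(spanning_changed, columns_changed):
    """Suggest the level at which a table change is represented in the result."""
    if spanning_changed and columns_changed:
        # Complex structural change: repeat the whole table section.
        return "tgroup"
    if spanning_changed:
        # Simple structural change: repeat only the affected row(s).
        return "row"
    # No spanning change (e.g. a deleted column): mark individual cells.
    return "cell"
```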

6.2. Lexical Preservation: Preserving Entities, DTDs, CDATA, PIs and Comments

It can be useful to preserve the lexical structure of the inputs when performing a comparison. This section discusses our support for lexical preservation and its limitations.

The DeltaXML DITA Compare product provides a selection of output formats with different intended use cases, as discussed in Output Formats. Some are intended for use in a publication pipeline, whereas others are intended for onward review and editing (we refer to this as 'round trip' processing). For onward editing, it is useful to provide the user with a document that is as close to the original input documents as possible. For example, it is important not to expand entity references and CDATA sections.

These lexical preservation modes can be set by the 'preservation-mode' parameter, as discussed in the Parameters Appendix. The remainder of this section provides: an overview of each lexical preservation mode (Modes); a detailed account of precisely what is preserved in each mode (Details); and a discussion of the limitations of preservation (Limitations).

6.2.1. Modes

6.2.1.1. Round trip preservation mode

When using a track-change output format, a user is likely to expect that accepting all the changes would result in the 'B' document, whereas rejecting all the changes would result in the 'A' document. The 'round trip' preservation mode is designed to achieve this as far as possible, within the limitations of standard XML parsing and XSLT 2.0 transformation technologies. However, as some data cannot be tracked using tracked change markup, it is necessary to choose either the 'A' or 'B' version of that data. By default the result document uses data in the 'B' document in preference to that in the 'A' document. Hence, accepting all changes is likely to be close to the 'B' document whereas rejecting all changes may not be as close to the 'A' document.

6.2.1.2. Document preservation mode

When marking changes using attributes, such as revision flags, the user is likely to expect full content expansion. Here entity references and CDATA sections are expanded and compared, rather than kept in their original source form. This typically enables finer grained change identification and display. It can also significantly improve the aligning of the documents before the comparison is performed. This type of processing is performed when using the 'document' preservation mode.

6.2.1.3. Document and attribute preservation mode

One issue with the document preservation mode is that all the attributes that are provided by the DTD are retained in the output. This can lead to unnecessary clutter, which both increases the size of the output and decreases its clarity for manual review and editing. The 'document and attribute' preservation mode addresses this issue by tracking which attributes have been supplied by the DTD, and removing them so long as they have not changed.

6.2.1.4. Entity reference and nested entity reference preservation modes

These are variations on the 'round trip' mode to enable expert users to know when the underpinning definitions of an entity have changed, as explained in Details.

6.2.2. Details

The table below shows the different preservation modes and their effect on how various items in the file are preserved.

Table 1. Preservation Modes
| Preservation Mode | Preserve Comments & Processing Instructions | XML Declaration & Doctype | Preserve defaulted attributes | Preserve CDATA sections & whitespace | Preserve entity references | Preserve entity references & content | Preserve nested entity references & content |
| --- | --- | --- | --- | --- | --- | --- | --- |
| document | on | on | off | off | off | n/a | n/a |
| docAndAttrib | on | on | on | off | off | n/a | n/a |
| roundTrip | on | on | on | on | on* | off | off |
| entityRef | on | on | on | on | on* | on* | off |
| nestedEntityRef | on | on | on | on | on* | on* | on* |

*It is not feasible to preserve entity references when using the DITA Markup output format.
The effects of turning these preservation items 'on' or 'off' are discussed in the following list, where the use of 'this column' in an item's description refers to the corresponding column in the above table.
Preserve Comments & Processing Instructions
Comments and Processing Instructions (PIs) in the 'B' document are preserved in the result, whereas comments and PIs in the 'A' document (that are not also in the 'B' document) do not appear in the result. The exception here is that PIs that represent oXygen tracked changes are removed prior to comparison so that they do not get confused with the changes identified by the comparator. Further, neither comments nor PIs in the internal DTD subset are currently preserved.
Preserve XML Declaration & Document Type (DTD & internal subset)
Most of the XML declaration, doctype and internal subset data is preserved (for the preservation modes that contain an 'on' in this column). A current limitation is that comments and processing instructions within an internal subset are lost. Another limitation is that the XML declaration's standalone marking is not preserved.
Preserve defaulted attributes
Default attribute values can be specified in a DTD and these are automatically put onto the elements in the document by the parser. If they are preserved as defaulted attributes (i.e. an 'on' in this column), then these default values will not appear in the result document.
Preserve CDATA sections and whitespace
CDATA (character data) sections are preserved in the result (for the preservation modes that contain an 'on' in this column). Insignificant whitespace characters are treated as normal whitespace characters, and modifications in whitespace are by default ignored in the output.
Preserve entity references
General parsed entities are preserved as entities, rather than expanded (i.e. replaced by their content), in the result document when an 'on' is in this column. This is usually what you want when you continue to edit the document. For example, consider two documents that differ in how the name of a city, London, is represented: in the first document the city is written as the string 'London', and in the second document the city is written as an entity reference '&city;' whose value is the string 'London'. In this case, modes with an 'on' in this column mark the two representations of the city London as different, because the unexpanded entity is different from the text, whereas modes with an 'off' in this column mark the two representations as the same, because the expanded entity reference is the same as the text.
Preserve entity references and content
This is intended only for expert users who understand how entities work. In roundTrip mode you will not see changes in entity references in the (unusual) situation where the definition of these entities is different in the two documents. For example, consider two documents containing the entity reference '&city;' that differ only in the value of the 'city' entity, which has changed from 'London' in one document to 'Birmingham' in the other. Both of these documents use the same '&city;' entity reference, which would be marked as unmodified as it is identical from the round trip (source document) perspective. If you need to see such changes, then use a mode with an 'on' in this column. In the result document, there can only be one entity definition and this will be either from the original ('A' document) or new ('B' document). Therefore the entities are guaranteed to be the same in the result document, and so any difference is shown by adding and removing an identical element.
Preserve nested entity references and content
This is intended only for expert users who understand the way one entity can reference another. An 'on' in this column means that subtle changes in entity reference structure are shown. The full structure of nested entities is preserved and compared and any changes are shown. This is useful in some complex cases where the overall semantics of an entity does not change, but the way in which it is defined changes. For example, consider a document that contains a reference to the entity '<!ENTITY ent "&inner1;">', where the 'inner1' entity has the value 'val'. Let a second version of the document be the same as the first, except that the inner entity reference is renamed to '&inner2;'. In this case, both the syntactic and semantic analyses will miss this change, as the syntactic analysis compares '&ent;' against itself and the semantic analysis compares the text 'val' against itself. An 'on' in this column means the comparator will detect such changes in the internal definition of an entity, and mark them using the same scheme as above: the addition and deletion of an identical entity reference.
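
The 'London' and 'Birmingham' examples above can be reproduced with a small Python sketch that compares text either as written or after entity expansion. It only illustrates the three viewpoints and is not how the comparator is implemented; the functions expand and is_changed are hypothetical, and entity definitions are modelled as plain dictionaries.

```python
import re

ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expand(text, entities, max_depth=10):
    """Replace known entity references with their content, following nesting."""
    for _ in range(max_depth):
        expanded = ENTITY_REF.sub(lambda m: entities.get(m.group(1), m.group(0)), text)
        if expanded == text:
            break
        text = expanded
    return text


def is_changed(text_a, ents_a, text_b, ents_b, mode):
    """Decide whether two text values count as different in a given mode."""
    if mode == "document":   # entity references off: compare expanded content
        return expand(text_a, ents_a) != expand(text_b, ents_b)
    if mode == "roundTrip":  # entity references on: compare the source form
        return text_a != text_b
    if mode == "entityRef":  # references and content: either may differ
        return text_a != text_b or expand(text_a, ents_a) != expand(text_b, ents_b)
    raise ValueError("unsupported mode: " + mode)
```

Comparing 'London' with '&city;' (defined as 'London') is unchanged in the 'document' view but changed in the 'roundTrip' view, whereas '&city;' redefined from 'London' to 'Birmingham' is only seen as changed in the 'entityRef' view.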

6.2.3. Limitations

There are some fundamental limitations on what changes can be shown, which reflect the nature of a given output format and XML parsing and processing technology. These fundamental limitations include:
  1. Some output formats cannot represent changes in attributes. In these cases, it is possible to configure the resultant document to contain the 'A' version, the 'B' version, the 'A' version if it exists otherwise the 'B' version, etc; see the 'modified-attribute-mode' parameter documentation for details.

  2. Many output formats - such as DITA markup and Arbortext tracked change formats - cannot represent changes in the document type and internal subset data. In these cases, it is possible to configure the resultant document to contain the 'A' version, the 'B' version, the 'A' version if it exists otherwise the 'B' version, etc; see the 'unmarked-change-mode' parameter documentation for details.

  3. Some changes in white space cannot be reproduced, as whitespace outside the root element of a document is not reported by an XML parser.

6.3. Attribute Change

Special case: Some elements are intended to provide hypertext links to other parts of the document and to external resources. In such cases, when changes in the link target attributes cannot be represented, they are propagated to the element level as discussed in the Hypertext links section.

Note: DITA's hypertext link elements, such as the xref and link elements, use the href and/or keyref attributes to define the target of the link, and use the content of the element to define the link's label.

When attribute change cannot be represented, changes in hypertext links are handled specially. In this context, changes to the href or keyref attributes are propagated up to the element level; i.e. the 'A' and 'B' versions of the link element are represented as being deleted and inserted respectively, as appropriate for the selected output-format.

The full list of 'generalized' elements that are considered to be hypertext link elements is: author, fragref, image, link, publisher, source, synnoteref and xref.

Note: Other elements, such as the topicref and data elements, also make use of the href and keyref attributes. Such elements are intended to represent more than just a hypertext link, and thus are likely to have a significant amount of content. Seeing changes within this content is important, therefore, changes in the href or keyref are not propagated to the element level in these cases.
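
The propagation rule for hypertext links can be expressed as a simple predicate. The sketch below restates the rule from this section in Python; the function name propagate_to_element and its arguments are hypothetical and do not correspond to a product API.

```python
# Generalized elements treated as hypertext links, as listed above.
LINK_ELEMENTS = {
    "author", "fragref", "image", "link",
    "publisher", "source", "synnoteref", "xref",
}
LINK_TARGET_ATTRIBUTES = {"href", "keyref"}


def propagate_to_element(element_name, changed_attributes, attribute_change_supported):
    """Return True when a link target change should be shown as delete plus insert."""
    if attribute_change_supported:
        return False  # the output format can mark the attribute change directly
    if element_name not in LINK_ELEMENTS:
        return False  # e.g. topicref and data keep their content-level changes
    return bool(LINK_TARGET_ATTRIBUTES & set(changed_attributes))
```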

7. Displaying Changes

A DITA Compare result file contains the information needed to determine the changes between two revisions of a document but in itself, it is not able to display those changes in a meaningful way. The next task is to utilise this information in the publishing step for the document so that the changes can be highlighted in formats such as HTML, PDF or even a WYSIWYG editor.

Please note that auto-numbered items such as sections or list items in a document displaying changes may not use the same numbering as in the latest version of the document. This is because deleted items are still given a number. For example, a list that originally had three numbered items but had the middle one deleted will still contain three items in the result but with the middle marked with status="deleted". When this is converted into a published document, the middle deleted item will be item 2 with the final item being item 3 (rather than 2 as it would be in the latest version of the document).
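
The numbering effect described above is easy to reproduce. The following Python sketch, using hypothetical function names and sample data, numbers a three-item list once as it appears in the comparison result and once as it appears in the latest version.

```python
def published_numbers(items):
    """Number every list item, including those marked status="deleted"."""
    return [(n, text, status) for n, (text, status) in enumerate(items, start=1)]


def latest_numbers(items):
    """Number only the items that survive in the latest version."""
    kept = [text for text, status in items if status != "deleted"]
    return list(enumerate(kept, start=1))


# A list whose middle item was deleted between the two versions.
result_items = [("first", None), ("second", "deleted"), ("third", None)]
```

In the published comparison result the last item is numbered 3, whereas in the latest version of the document it is numbered 2.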

7.1. oXygen Editor

When editing DITA files using a WYSIWYG editor, it is possible to use status attributes to highlight changes. For example, the oXygen editor from Syncro Soft SRL has an Author view which uses CSS to display a DITA document in a manner similar to a word processor. With oXygen it is possible to customize the CSS used to display the document in order to use status values to highlight change.

The samples directory samples/resources/css includes sample files that can be used with various versions of the oXygen editor for change highlighting and display in both DITA Topics and Maps. The instructions for doing this are provided in the Using CSS styling for Topics and Maps in oXygenXML editor page on the website.

7.2. DITAVAL files

A DITAVAL file is used for both conditional processing and flagging of DITA content in a publishing pipeline. Flagging is a limited form of styling based on attribute values. It is possible to make use of the rev attributes added by DeltaXML DITACompare for flagging the output via a DITAVAL file.

The samples/resources/ditaval directory contains a deltaxml.ditaval file with some flagging and changebar suggestions. Comments at the top of this file describe how it can be used.

8. Dll Library Dependencies

This product uses several third party libraries as discussed in the Licensing and Legal Notices section. The following table summarises the versions of the libraries used. Note that when the libraries have been created by an IKVM conversion of jar files, then the versions of the underpinning jar files are documented.

Table 2. Library Dependencies
| Library | IKVMed jar(s) | Jar Versions | Description |
| --- | --- | --- | --- |
| deltaxml-saxon9pe.dll | saxon9pe.jar | 9.5.1.3 | Saxon XSLT 2 Processing |
| deltaxml-saxon9he-dotnet.dll | saxon9he-dotnet.jar | 9.5.1.3 | |
| deltaxml-xercesImpl.dll | xercesImpl.jar | 2.9.0 | XML Parsing |
| | resolver.jar | 1.2 - custom patch 6 | OASIS Catalog Resolver |
| deltaxml-icu4j.dll | icu4j.jar | 49.1 | International Unicode Utilities |
| IKVM-*.dll | N/A | 7.2 | IKVM bytecode transformer |

9. Case Sensitivity of Hrefs/URIs

Some resource locations are case sensitive (e.g. a typical UNIX file system), whereas others are not (e.g. a typical Windows file system).

Incorrectly specifying the case sensitivity of a resource location can lead to confusing or hard to predict behaviour, including the misalignment of topics for comparison, not being able to locate specified resources, unexpected deletion of files, and exceptional termination of the comparison. Having said this, if the case of the URIs, as specified by their hrefs within a DITA document, is consistent with that of the underpinning resource location, then none of these issues should occur.

We assume that the URIs of two distinct resources do not differ only in their case, and that the case of the 'href' references within a document is consistent with that of the underpinning resource location; i.e. the 'href' references are specified using precisely the same case as that of the resource location that they are referencing.
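
Before running a comparison, it can be useful to check that a set of hrefs satisfies the first of these assumptions. The Python sketch below finds references that differ only in case; it is a suggested pre-check, not part of the product, and the function name case_collisions is hypothetical.

```python
from collections import defaultdict


def case_collisions(hrefs):
    """Return groups of hrefs that are distinct but differ only in letter case."""
    groups = defaultdict(set)
    for href in hrefs:
        groups[href.lower()].add(href)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

An empty result means that no two references collide when case is ignored, so the same map should behave consistently on both case-sensitive and case-insensitive file systems.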

A. Parameters Appendix

A.1. 'Automatic' parameter values

Where parameters can have the value automatic (generally used as the default value for that parameter), their actual value is calculated when the inputs are compared. This calculation is based on the values of other parameters; in the case of DITA Compare this is typically the 'output-format' parameter. Here, the idea is to set the value used for the parameter to that which is most appropriate for the given output format. When this automatic behaviour is inappropriate, the actual value of each parameter can be manually set to a specific value using the usual mechanisms. See the documentation for the individual parameters for details on what settings will be used.

A.2. Parameter Definitions

The parameter names available on the command-line tool and in the APIs are slightly different. In the command-line version, they are written as lower case words separated by a hyphen (e.g. validate-inputs). In the Java API version of the product, they can be accessed using set/get methods that use a camel-cased version of the name (e.g. setValidateInputs and getValidateInputs). In the .NET API version of the product, they are accessed as Properties, again using a camel-cased version of the name (e.g. ValidateInputs). The names used below correspond to the command-line name.
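
The naming convention can be illustrated with a small conversion helper. The sketch below derives the API-style names from a command-line name using the validate-inputs example given above; the helper functions are hypothetical, and the exact API name of any other parameter should be checked against the API documentation rather than assumed from this rule.

```python
def to_pascal_case(cli_name):
    """Turn a hyphen-separated command-line name into a capitalised word sequence."""
    return "".join(word.capitalize() for word in cli_name.split("-"))


def java_setter(cli_name):
    return "set" + to_pascal_case(cli_name)


def java_getter(cli_name):
    return "get" + to_pascal_case(cli_name)


def dotnet_property(cli_name):
    return to_pascal_case(cli_name)
```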

A.2.1. Output Parameters

These parameters enable the form of the output to be configured.

Table 3. Output configuration parameters.
Non-format specific output parameters.
| Parameter | Description |
| --- | --- |
| grouping | whether adjacent changes should be grouped |
| indent-output | whether to indent the output file |
| map-result-structure | how the result map is constructed |
| map-uncompared-topic-markup | what markup, if any, to put on an uncompared topic |
| map-topic-exception-propagation-mode | what action to perform when an exception is raised during the processing of a topic |
| output-format | what type of output is produced |
| preservation-mode | how much of the original document information to preserve |

DITA markup output format specific parameters.
| Parameter | Description |
| --- | --- |
| add-outputclass-atts | whether to include an outputclass attribute for marking changes |
| add-revision-atts | whether to add 'rev' attributes to changed elements |
| add-status-atts | whether to add 'status' attributes to changed elements |
| include-deleted | whether to include deleted content in the result |
| new-outputclass | if 'outputclass' attributes are added, this is the attribute value to use for new/added items |
| new-revision | if 'rev' attributes are added, this is the attribute value to use for new/added items |
| new-version-atts | attributes to place on added elements in the form [namespace]prefix:local-name=value; |
| old-outputclass | if 'outputclass' attributes are added, this is the attribute value to use for old/deleted items |
| old-revision | if 'rev' attributes are added, this is the attribute value to use for old/deleted items |
| old-version-atts | attributes to place on deleted elements in the form [namespace]prefix:local-name=value; |
| phrase-container-exclusions | a comma-separated list of the specialized elements where phrases are no longer permitted |
| phrase-element-name | the element name to use to wrap old/new text when it has changed |
| show-non-phrase-changes | if set to 'yes', uses -[[old-text]]- +[[new-text]]+ delimiters to show change where a phrase element is not allowed |

Tracked changes configuration parameters.
| Parameter | Description |
| --- | --- |
| framemaker-tcs-table-change-mode | how changes in tables should be tracked |
| oxygen-tcs-deleted-space-mode | how deleted spaces should be handled |
| oxygen-tcs-version | the oXygen editor version |
| tracked-changes-author | the author of the changes |
| tracked-changes-date | the time-stamp when the changes were produced |
| xmetal-tcs-table-change-mode | how changes in tables should be tracked |

DITA map configuration parameters.
| Parameter | Description |
| --- | --- |
| map-backup-suffix | the suffix used when creating a backup of a file |
| map-clean-temp | whether to clean temporary files introduced when performing the comparison |
| map-pair-remaining-map-name | the name of the remaining topic references ditamap |
| map-result-origin | which map input is being used to form the basis of the output |

Non-format specific output parameters.

The output parameters that are either used to specify a specific output type or apply to all output types.

Note that map output type specific configuration is contained in the DITA Map output group.

grouping

Specifies whether adjacent changes (insertions or deletions) should be grouped into a single insertion and/or deletion block. One benefit of this is that changes to a consecutive group of words within a sentence are gathered into one insertion and one deletion block, rather than a series of individual word swaps. This makes it easier to read and understand the changes.

Note that when either HTML or CALS table processing modes are selected, then this grouping mechanism is turned off within the context of these tables. The table processing has its own specialised grouping mechanisms.

The default value is 'false'.
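
The effect of grouping can be illustrated with a short Python sketch that merges runs of adjacent word-level changes into one deletion block followed by one insertion block. It is a simplified model of the behaviour described above, not the product's algorithm; the function name group_changes and the (kind, word) representation are assumptions made for this example.

```python
def group_changes(ops):
    """Merge adjacent delete/insert operations into single blocks.

    ops is a list of (kind, word) pairs where kind is 'equal', 'delete' or 'insert'.
    """
    grouped, deleted, inserted = [], [], []

    def flush():
        # Emit the pending run as one deletion and one insertion block.
        if deleted:
            grouped.append(("delete", " ".join(deleted)))
        if inserted:
            grouped.append(("insert", " ".join(inserted)))
        deleted.clear()
        inserted.clear()

    for kind, word in ops:
        if kind == "equal":
            flush()
            grouped.append((kind, word))
        elif kind == "delete":
            deleted.append(word)
        else:
            inserted.append(word)
    flush()
    return grouped
```

For example, the word swaps quick/slow and brown/red between 'the' and 'fox' become a single deletion of 'quick brown' and a single insertion of 'slow red'.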

indent-output

Sets whether the result should be indented.

If run with a compare method that produces a serialized result, a value of yes causes the output to be pretty printed.

This parameter can take the following values:
yes

The value representing the String value 'yes'. Output is indented.

no

The value representing the String value 'no'. Output is not indented.

The default value is 'no'.

map-result-structure

Specifies how the result map is constructed.

The available result structures are defined in terms of the result origin, which is the input that is being used to form the basis of the output. See the map-result-origin parameter documentation for more details.

This parameter can take the following values:
topic-set

The result map will contain a flat list of topic references that appear in the order that they were first encountered in the result origin.

map-pair

The result map contains two submaps: the 'updated' result origin map (and submaps); and the 'missing' topic references map.

unified-map

The result map contains the 'updated' result origin map, and also includes deleted topic references in positions as close as possible to their original location.

The default value is 'topic-set'.

map-uncompared-topic-markup

What action to perform on topics that are not compared. Some DITA aware publishing tools do not make use of revision attributes of a topicref on the referent (i.e. the topic that is being referenced). In these cases, it can be useful to mark the top-level topic element, which is being referenced by the topicref, with appropriate revision and/or status attributes. However, for output formats that are not intended for publication, such as the editor specific tracked change formats, marking up the root element of a topic file as added or deleted could be counterproductive.

This parameter can take the following values:
auto

When DITA Markup is selected as the output format, behave as if 'mark-change' had been selected, otherwise behave as if 'do-not-modify' had been selected.

mark-change

Mark the top-level topic element of uncompared topics as added or deleted, in the manner specified by the other topic level parameters.

do-not-modify

Do not update the uncompared topics.

The default value is 'auto'.

map-topic-exception-propagation-mode

What action to perform when there is an exception raised during the processing of a topic during a map file comparison.

This parameter can take the following values:
fast-fail

Propagate the exception as quickly as possible to the map comparison.

phase-fail

Propagate the exception at the end of each phase of the comparison. Here, all topics are progressed to the end of the given comparison phase before the exception is thrown.

slow-fail

Propagate the exception once other topics and map-level operations have completed their processing.

The default value is 'slow-fail'.
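
The three propagation modes differ only in when a topic-level exception is re-raised. The Python sketch below models this with a list of topics and a list of phase functions; it is an illustration of the timing described above, not the product's scheduler, and the function name run_topics and the phase functions passed to it are hypothetical.

```python
def run_topics(topics, phases, mode="slow-fail"):
    """Run each phase over every topic, raising topic errors according to mode."""
    failed = {}  # topic -> first exception raised while processing it
    for phase in phases:
        for topic in topics:
            if topic in failed:
                continue  # a failed topic takes no further part
            try:
                phase(topic)
            except Exception as exc:
                if mode == "fast-fail":
                    raise  # propagate as quickly as possible
                failed[topic] = exc
        if failed and mode == "phase-fail":
            # All topics have reached the end of this phase.
            raise next(iter(failed.values()))
    if failed:
        # slow-fail: every other topic has completed all phases.
        raise next(iter(failed.values()))
```

With 'slow-fail', the remaining topics are still fully processed before the failure is reported, which is why it is the most forgiving choice when comparing large maps.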

output-format

Specifies what type of output is produced.

This parameter can take the following values:
dita-markup

Differences are marked up using DITA's rev and status attributes.

arbortext-tcs

Differences are marked up in the Arbortext tracked change format.

oxygen-tcs

Differences are marked up in the oXygen tracked change format.

xmetal-tcs

Differences are marked up in the XMetaL tracked change format.

framemaker-tcs

Differences are marked up in the FrameMaker tracked change format.

The default value is 'dita-markup'.

preservation-mode

Sets the mode to use for preserving original data.

This mode can be used to preserve information for round trip processing.

The 'automatic' setting will have an effective setting of 'roundTrip' when the output format is a tracked change format and 'docAndAttrib' otherwise.

This parameter can take the following values:
automatic

Chooses the most appropriate setting based on the output format. If tracked changes are being produced, this will be 'roundTrip', otherwise 'docAndAttrib' will be used.

document

Preserve document typing information.

docAndAttrib

Preserve document typing and original attribute information.

roundTrip

Preserve document type, entity usage, and attribute usage data.

entityRef

Enhance round-trip processing by analysing entity contents.

nestedEntityRef

Enhance round-trip processing by analysing both entity and nested entity contents.

The default value is 'automatic'.
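The resolution of the 'automatic' setting can be sketched as follows. The helper name is hypothetical; the set of tracked change formats is taken from the output-format values listed earlier.

```python
# The tracked change output formats listed under output-format.
TRACKED_CHANGE_FORMATS = {"arbortext-tcs", "oxygen-tcs", "xmetal-tcs", "framemaker-tcs"}


def effective_preservation_mode(preservation_mode: str, output_format: str) -> str:
    """Return the effective preservation mode (hypothetical helper, not a product API)."""
    if preservation_mode != "automatic":
        return preservation_mode
    # 'automatic' depends only on whether tracked changes are being produced.
    if output_format in TRACKED_CHANGE_FORMATS:
        return "roundTrip"
    return "docAndAttrib"
```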

DITA markup output format specific parameters.

The DITA markup output format specific parameters provide control over how the 'rev' and 'status' attributes are used to represent change.

add-outputclass-atts

Specifies whether to include an outputclass attribute for marking changes.

NOTE: This option only applies to the DITA Markup format.

The default value is 'false'.

add-revision-atts

Specifies whether the result should use 'rev' attributes to mark change. When this property is true, any 'rev' attributes in the input that match the regex pattern in 'remove-rev-attribute-regex' are stripped before comparison.

The default value is 'true'.

add-status-atts

Specifies whether the result should use 'status' attributes to mark change. When this property is true, any 'status' attributes in the input are stripped before comparison.

The default value is 'true'.

include-deleted

Specifies whether to include deleted content in the result.

If set to false, the result will only include unchanged and added content.

The default value is 'true'.

new-outputclass

Specifies the string value to use for the outputclass attribute of added content.

Added elements and text (via the <ph> parent) will have this string set as the value of the outputclass attribute in the result.

The default value is 'deltaxml-add'.

new-revision

Specifies the string value to use for the rev attribute of added content.

Added elements and text (via the <ph> parent) will have this string set as the value of the rev attribute in the result.

The default value is 'deltaxml-add'.

new-version-atts

Specifies attributes to place on to added elements in the result.

The default value is ''.

old-outputclass

Specifies the string value to use for the outputclass attribute of deleted content.

Deleted elements and text (via the <ph> parent) will have this string set as the value of the outputclass attribute in the result.

The default value is 'deltaxml-delete'.

old-revision

Specifies the string value to use for the rev attribute of deleted content.

Deleted elements and text (via the <ph> parent) will have this string set as the value of the rev attribute in the result.

The default value is 'deltaxml-delete'.

old-version-atts

Specifies attributes to place on deleted elements in the result.

The default value is ''.

phrase-container-exclusions

Specifies any specializations where phrases have been removed from the content model.

These should be listed by full specialization name, e.g. topic/title, in a comma-separated list.

If an element contains any of the items in this list as part of its class attribute, text changes within that element will not be marked using phrase elements.

The default value is ''.
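The exclusion test described above can be sketched as follows. Both function names are hypothetical, and the sketch assumes that the class attribute is matched token by token after splitting on whitespace.

```python
def parse_exclusions(value: str) -> list:
    """Split the comma-separated parameter value into specialization names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def phrases_excluded(class_attribute: str, exclusions: list) -> bool:
    """Return True if text changes in this element should not use phrase markup.

    Assumption: a match means that one whitespace-separated token of the DITA
    class attribute equals one of the listed specialization names.
    """
    tokens = class_attribute.split()
    return any(name in tokens for name in exclusions)
```

For example, with the parameter set to 'topic/title', an element whose class attribute is '- topic/title mytopic/mytitle ' would be treated as a phrase container exclusion.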

phrase-element-name

Specifies the element name to use in place of <ph> in the result.

If the input documents are specialized documents that have renamed the phrase element, use this method to specify the element name that should be used instead.

N.B. It is assumed that the 'rev' attribute is allowed on the specified element.

The default value is 'ph'.

show-non-phrase-changes

Specifies whether to textually mark changes to text where <ph> elements are not allowed.

If text has changed within an element where phrase markup is not allowed, some other means must be used to process the changes. If this parameter is set to true, the changed text is wrapped in textual delimiters e.g. -[[old text ]]- +[[new text ]]+. If the value is set to false, no markup is used and only the new text is output.

The default value is 'true'.
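A minimal sketch of the two behaviours follows. The function name is hypothetical, and the exact delimiter layout is assumed from the example above rather than taken from a specification.

```python
def render_text_change(old: str, new: str, show_non_phrase_changes: bool = True) -> str:
    """Render a text change where phrase markup is not allowed (illustrative only)."""
    if not show_non_phrase_changes:
        # No markup is used: only the new text is output.
        return new
    # Assumed layout: deleted text in -[[ ]]-, then a space, then added text in +[[ ]]+.
    return f"-[[{old}]]- +[[{new}]]+"
```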

deleted-id-suffix

Customizes the suffix that is added to ids of deleted elements whose value is changed because of duplication.

The default value is '_deleted_'.

added-id-suffix

Customizes the suffix that is added to ids of added elements whose value is changed because of duplication.

The default value is '_added_'.

Tracked changes configuration parameters.

The tracked changes output format specific parameters provide a way of configuring tracked change output and editor/format specific output.

framemaker-tcs-table-change-mode

Specifies how changes in tables should be tracked.

The FrameMaker editor cannot track the addition or deletion of a row or cell within a table. Such changes can be pushed down to the cell level, pushed up to the table level (e.g. a CALS or HTML 'table' element), or ignored. Pushing the changes down to the cell content level provides the highest level of change granularity, at the cost of having to accept or reject every changed cell in the table independently.

The push up processing means that any table that contains a row or cell level update is represented by a table level add and delete.

The ignore option is similar to the way in which tracked changes within tables are handled in FrameMaker, as neither row insertion and deletion nor cell splitting and merging is tracked. However, for clarity this mode of operation ignores all changes within a table and simply produces the 'B' version of the table.

This parameter can take the following values:
down

Changes in rows and cells are pushed down to the cell content level.

up

Changes in rows and cells are pushed up to the table level.

ignore

All changes in a table are ignored.

The default value is 'down'.

oxygen-tcs-deleted-space-mode

Specifies how deleted spaces should be handled.

Prior to oXygen 14, whitespace within the deleted content of a processing instruction was sometimes not displayed correctly. A work-around was to normalise the space within a deleted region.

This parameter can take the following values:
automatic

Chooses the delete space processing mode based on the declared oxygen-tcs-version parameter.

normalize

Allows deleted text to be viewed correctly prior to oXygen 14 release.

keep

Keeps the original whitespace formatting of the deleted region.

The default value is 'automatic'.

oxygen-tcs-version

Specifies the version of oXygen editor used to display, accept and reject the tracked changes.

The format of the version is either '[0-9]+' or '[0-9]+.[0-9]+' without the enclosing string quotes, where: the first number sequence is the major number; and the optional second number sequence is the minor number.

This parameter is used to automatically set the relevant backwards compatibility options related to the oXygen tracked changes format. For example, prior to the oXygen 14.0 release, deleted whitespace needed to be normalised in order to consistently generate a reasonable result.

The default value is '11.2'.
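The version format and the oXygen 14 threshold described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the '.' in the version pattern is a literal dot.

```python
import re

# Major number, optionally followed by a literal dot and a minor number.
VERSION_PATTERN = re.compile(r"([0-9]+)(?:\.([0-9]+))?")


def parse_oxygen_version(value: str) -> tuple:
    """Parse 'major' or 'major.minor' into a (major, minor) tuple."""
    match = VERSION_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid oXygen version: {value!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def needs_normalised_deleted_space(version: str = "11.2") -> bool:
    """Return True for versions before 14.0, where deleted whitespace is normalised."""
    return parse_oxygen_version(version) < (14, 0)
```

Under these assumptions the default version of '11.2' selects the backwards compatible behaviour, whereas '14' or '14.0' does not.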

tracked-changes-author

Specifies the author name that is embedded into the generated insertion and deletion processing instruction.

The default value is 'deltaxml'.

tracked-changes-date

Specifies the time-stamp that is embedded into the generated insertion and deletion processing instruction. The default time-stamp is that of the time that the comparison is run.

The default value is 'xsl date'.

xmetal-tcs-table-change-mode

Specifies how changes in tables should be tracked.

The XMetaL editor cannot track the addition or deletion of a row or cell within a table. Such changes can be pushed down to the cell level, pushed up to the table level (e.g. a CALS or HTML 'table' element), or ignored. Pushing the changes down to the cell content level provides the highest level of change granularity, at the cost of having to accept or reject every changed cell in the table independently.

The push up processing means that any table that contains a row or cell level update is represented by a table level add and delete.

The ignore option is similar to the way in which tracked changes within tables are handled in XMetaL, as neither row insertion and deletion nor cell splitting and merging is tracked. However, for clarity this mode of operation ignores all changes within a table and simply produces the 'B' version of the table.

This parameter can take the following values:
down

Changes in rows and cells are pushed down to the cell content level.

up

Changes in rows and cells are pushed up to the table level.

ignore

All changes in a table are ignored.

The default value is 'down'.

DITA map configuration parameters.

The DITA Map configuration parameters apply to all output formats. They are used to force outcomes, control conflict resolution, control scope, and specify which document's structure is the basis of the result.

map-backup-suffix

When the map comparison needs to back up a file before it can be compared, it appends the '.' character followed by this suffix to the existing file name. Note that if the backup file already exists then an IOException will be thrown.

The default value is 'bak'.
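The naming rule can be sketched as follows. Both helpers are hypothetical and only illustrate the rule; Python's FileExistsError stands in for the IOException mentioned above.

```python
import shutil
from pathlib import Path


def backup_path(path: Path, suffix: str = "bak") -> Path:
    """Append '.' and the suffix to the existing file name."""
    return path.with_name(path.name + "." + suffix)


def back_up(path: Path, suffix: str = "bak") -> Path:
    """Copy a file to its backup name, refusing to overwrite an existing file."""
    target = backup_path(path, suffix)
    if target.exists():
        # Stand-in for the IOException described in the text.
        raise FileExistsError(f"backup file already exists: {target}")
    shutil.copy2(path, target)
    return target
```

With the default suffix, a topic file named 'intro.dita' would be backed up as 'intro.dita.bak' in the same directory.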

map-clean-temp

Specifies whether to clean temporary files introduced when performing the comparison.

During the comparison those topics that are compared are initially backed up into a file

This parameter can take the following values:
yes

Clean temporary files created during the map comparison.

no

Do not clean temporary files created during the map comparison.

auto

Clean the temporary files only when the comparison is being performed on a copy of the input.

The default value is 'auto'.

map-pair-remaining-map-name

Specifies the file name of the second map that is produced when the map-result-structure parameter is set to map-pair. In that case the comparison result contains two maps: an updated version of the existing origin map, and a map containing those topic references that were in the original comparison, but not in the existing origin map.

Note that when the map-scan-topics-for-references parameter is set to true, the topic references within a map (and its submaps) also include those references that are contained within the topics themselves.

The default value is 'dxml-remaining.ditamap'.

map-result-origin

When comparing the topics referenced in two maps, one of the maps is declared to be the result origin. This is the map that is used to generate the structure and/or order of topic references in the output.

Note that the map-copy-scope parameter is used to specify what the in-scope referents are (i.e. which resources that are pointed to by the map are considered to be part of the document that the map is defining).

This parameter can take the following values:
A

The first 'A' document map is being used as the basis of the output.

B

The second 'B' document map is being used as the basis of the output.

The default value is 'B'.

A.2.2. Comparison Parameters

These parameters control how the comparison is performed.

Table 4. Comparison configuration parameters.

Text comparison configuration parameters.
whitespace-processing-mode: how to process whitespace changes
word-by-word: whether to compare text in a more detailed way.

Table comparison configuration parameters.
cals-table-processing: whether to apply CALS table processing
html-table-processing: whether to apply html table or DITA simpletable processing
invalid-cals-table-behaviour: how to process invalid CALS tables

DITA configuration parameters.
remove-rev-attribute-regex: the regular expression that matches 'rev' attributes to be stripped
steps-conflict-resolution: how to resolve conflicts between taskbody/steps and taskbody/steps-unordered elements.
use-ids-as-keys: whether to use id attributes as keys during the alignment process.

Ignore inline structural changes parameters.
ignore-inline-formatting: whether to ignore any changes that consist only of DITA inline element changes.
inline-formatting-elements: A comma and/or space separated list of elements which will be treated as ignorable inline formatting changes. Only active when ignore-inline-formatting is set to true.
add-inline-formatting-elements: A comma and/or space separated list of elements which will be added to the set in inline-formatting-elements.
remove-inline-formatting-elements: A comma and/or space separated list of elements which will be removed from the set in inline-formatting-elements.

Element moves handling parameters.
detect-moves: Specifies whether to enable or disable move detection feature during comparison.
move-source-revision: Specifies the string value to use for the rev attribute for source of moved content.
move-target-revision: Specifies the string value to use for the rev attribute for target of moved content.

Text comparison configuration parameters.

These parameters control how text is compared and whitespace handled.

whitespace-processing-mode

Specifies how to handle whitespace changes.

When this option is set to 'show' whitespace differences are reported where possible. If the output-format is set to 'dita-markup', this will be wherever the DITA doctype allows text (as opposed to inter-element whitespace). For tracked-changes output formats, all whitespace changes can be shown.

This can lead to a significant amount of marked change throughout the document. When the parameter is set to 'ignore', whitespace differences are not shown; instead the 'B' document's whitespace is kept where possible.

Note that differences in whitespace are never ignored when the XML document explicitly states that the whitespace is important, via the xml:space attribute being set to 'preserve'.

The 'automatic' setting effectively behaves as either 'normalize' or 'ignore' depending on the values of the 'preservation-mode' and 'output-format' parameters. Here, 'normalize' is chosen when: (1) the preservation mode is set to 'automatic' and the output format is 'dita-markup'; or (2) the preservation mode is set to either 'document' or 'docAndAttrib'. In all other cases, the automatic whitespace processing mode is treated as if it were 'ignore'.

This parameter can take the following values:
show

Display the differences in whitespace where possible.

cdata

Ignore differences in whitespace, unless they occur within a CDATA section (or are explicitly preserved).

ignore

Ignore differences in whitespace that is not explicitly preserved.

normalize

Normalize whitespace in inputs before comparison.

automatic

Chooses the most appropriate mode based on other parameter settings. This is dependent on two other parameters 'output-format' and the 'preservation-mode', as discussed in the main whitespace processing mode documentation.

The default value is 'automatic'.
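The two conditions that select 'normalize' can be written out as a short sketch. The function name is hypothetical; the logic follows the description of the 'automatic' setting given above.

```python
def effective_whitespace_mode(whitespace_mode: str, preservation_mode: str, output_format: str) -> str:
    """Resolve 'automatic' to 'normalize' or 'ignore' (hypothetical helper)."""
    if whitespace_mode != "automatic":
        return whitespace_mode
    # Condition (1): automatic preservation with the DITA markup output format.
    if preservation_mode == "automatic" and output_format == "dita-markup":
        return "normalize"
    # Condition (2): the 'document' or 'docAndAttrib' preservation modes.
    if preservation_mode in ("document", "docAndAttrib"):
        return "normalize"
    # All other cases.
    return "ignore"
```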

word-by-word

Specifies whether to perform detailed text comparison or not.

If set to true, text is split into words before comparison. This allows individual word changes to be detected.

The default value is 'true'.
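The effect of splitting text into words before comparison can be illustrated with Python's standard difflib module. This is only a sketch of the general technique; it does not reproduce the product's comparison algorithm, and the function names are hypothetical.

```python
import difflib
import re


def tokenize(text: str, word_by_word: bool = True) -> list:
    """Split text into words and whitespace runs, or keep it as a single item."""
    return re.findall(r"\S+|\s+", text) if word_by_word else [text]


def diff_text(a: str, b: str, word_by_word: bool = True) -> list:
    """Return (kind, text) pairs, where kind is 'unchanged', 'deleted' or 'added'."""
    tokens_a, tokens_b = tokenize(a, word_by_word), tokenize(b, word_by_word)
    matcher = difflib.SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.append(("unchanged", "".join(tokens_a[i1:i2])))
            continue
        if i1 != i2:
            result.append(("deleted", "".join(tokens_a[i1:i2])))
        if j1 != j2:
            result.append(("added", "".join(tokens_b[j1:j2])))
    return result
```

With word_by_word set to False, any difference causes the whole text item to be reported as deleted and added, whereas the word-level comparison isolates the individual words that changed.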

Table comparison configuration parameters.

These parameters control how tables are compared.

cals-table-processing

Specifies whether to apply CALS table processing.

CALS table processing ensures that when valid (both syntactically and semantically according to the OASIS CALS table model documentation) input tables are provided the result will be a valid CALS table.

Simple changes to the table, such as changing the contents of an entry or adding a row or column, are generally represented as fine-grained changes. Because CALS entries can overlap or span multiple rows and columns, some types of change are difficult to represent at fine granularity whilst ensuring validity. In these cases changes are represented at row (i.e. groups of added/deleted rows) or even whole-table granularity.

Setting this parameter to false turns off this processing, therefore it is possible to generate an invalid table. However, if table validity is not a concern changes may be represented at finer granularity.

The default value is 'true'.

html-table-processing

Specifies whether to apply html table or DITA simpletable processing.

HTML table processing ensures that when valid input tables are provided (according to the HTML-4 or draft HTML-5 documentation) the result will be a valid HTML-4/5 table. Note that both inputs need to follow the same standard (i.e. both be HTML-4 or both be HTML-5).

Simple changes to the table, such as changing the contents of a cell or adding a row or column, are generally represented as fine-grained changes. Because HTML entries can overlap or span multiple rows and columns, some types of change are difficult to represent at fine granularity whilst ensuring validity. In these cases changes are represented at row (i.e. groups of added/deleted rows) or even whole-table granularity.

DITA Simple tables are also handled by this filter. In this case, the syntactic constraints ensure that cells cannot overlap or span either rows or columns, therefore changes are represented at a fine grained level of detail.

Setting this parameter to false turns off this processing, therefore it is possible to generate an invalid table. However, if table validity is not a concern changes may be represented at finer granularity.

The default value is 'true'.

invalid-cals-table-behaviour

In order to ensure that only valid CALS tables are passed to our specialized CALS table processing, each input table is marked either valid or invalid. This parameter declares what type of processing should be used for those tables that are marked as invalid. The 'warning-report-mode' parameter configures how recoverable errors are reported.

Three options are provided: fail, propagate up, and compare as XML. The fail option stops the comparison by throwing an appropriate exception (that includes the errors identified by the validity checker). The propagate up option ensures that changes to an invalid table (or more specifically 'tgroup') are represented at the table level. The compare as XML option essentially compares the tables as if they were well-formed XML.

Note that the results of the compare as XML option can differ from comparing the tables without CALS table processing enabled, as a small amount of CALS specific processing is applied to invalid tables in order to allow them to be compared against a similar valid table.

This parameter can take the following values:
fail

Throw an exception.

propagateUp

Propagate the changes to the table-level.

compareAsXml

Compare the table content as well-formed XML.

The default value is 'propagateUp'.

DITA configuration parameters.

The DITA configuration parameters apply to all output formats. They are used to control both conflict resolution and which topics are considered to be part of the document.

remove-rev-attribute-regex

'rev' attributes in the input could be confused with those used to represent change in the output. This parameter specifies the regular expression that matches 'rev' attributes to be stripped.

'rev' attributes that match this regex pattern are only stripped when the 'add-revision-atts' property is true.

The default value is 'deltaxml-.*'.
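The stripping step can be sketched with Python's standard ElementTree module. The function name is hypothetical, and the sketch assumes that the pattern must match the whole attribute value.

```python
import re
import xml.etree.ElementTree as ET


def strip_rev_attributes(root, pattern: str = "deltaxml-.*", add_revision_atts: bool = True) -> None:
    """Remove matching 'rev' attributes in place (illustrative only)."""
    if not add_revision_atts:
        # Nothing is stripped unless 'rev' attributes are used to mark change.
        return
    regex = re.compile(pattern)
    for element in root.iter():
        rev = element.get("rev")
        # Assumption: the regex is matched against the complete attribute value.
        if rev is not None and regex.fullmatch(rev):
            del element.attrib["rev"]
```

With the default pattern, a value such as 'deltaxml-add' left over from a previous comparison would be removed, whereas an author's own value such as 'v2' would be kept.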

steps-conflict-resolution

Specifies how to resolve conflicts between a task body's mutually exclusive steps and steps-unordered elements.

A task's body must have precisely one of either the steps or steps-unordered elements. When one input's task body has a steps element and the other a steps-unordered element, the output cannot contain both. Therefore, the comparison is configured to compare the children of the steps and steps-unordered elements, and choose the name of the parent element as follows:

This parameter can take the following values:
preferA

Use the element name from the 'A' document to contain the compared entries.

preferB

Use the element name from the 'B' document to contain the compared entries.

preferSteps

Use a steps element to contain the compared entries.

preferStepsUnordered

Use a steps-unordered element to contain the compared entries.

The default value is 'preferB'.
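The choice of parent element name can be sketched as a simple lookup. The function name is hypothetical.

```python
def resolve_steps_element(a_name: str, b_name: str, mode: str = "preferB") -> str:
    """Choose the parent element name for the compared steps (hypothetical helper)."""
    choices = {
        "preferA": a_name,
        "preferB": b_name,
        "preferSteps": "steps",
        "preferStepsUnordered": "steps-unordered",
    }
    if mode not in choices:
        raise ValueError(f"unknown steps-conflict-resolution value: {mode!r}")
    return choices[mode]
```

For example, when the 'A' task body uses steps and the 'B' task body uses steps-unordered, the default setting yields a steps-unordered element in the result.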

use-ids-as-keys

Specifies whether to use id attribute value as keys in order to provide identity during comparison.

If set to true, elements with the same name and id value at the same XML tree level will always be matched, regardless of their content.

This setting should only be used if id attribute values are consistent across versions.

The default value is 'false'.

Ignore inline structural changes parameters.

These parameters control which DITA elements are marked as inline structural changes.

ignore-inline-formatting

If set to true, any changes that consist only of the addition, deletion or modification of DITA inline markup elements are not noted as changes.

When comparing the following two versions with ignore-inline-formatting switched on, no changes are marked up.

<p>This is an important point</p>

<p>This is an <b>important</b> point</p>

The default value is 'false'.
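One way to picture this behaviour is to unwrap the inline elements on both sides before comparing. The sketch below does this with Python's standard ElementTree module; it is an illustration of the idea only, ignores attributes, and uses a hypothetical function name.

```python
import xml.etree.ElementTree as ET


def flatten(element, inline_names):
    """Return (tag, items) with inline elements unwrapped into the surrounding text.

    Items are text strings and nested (tag, items) tuples for non-inline children.
    Attributes are ignored in this simplified model.
    """
    items = []

    def add_text(text):
        if text:
            if items and isinstance(items[-1], str):
                items[-1] += text  # merge adjacent text runs
            else:
                items.append(text)

    def walk(el):
        add_text(el.text)
        for child in el:
            if child.tag in inline_names:
                walk(child)  # keep the content, drop the inline wrapper
            else:
                items.append(flatten(child, inline_names))
            add_text(child.tail)

    walk(element)
    return (element.tag, items)
```

Applied to the two paragraphs above with 'b' treated as inline formatting, both versions flatten to the same structure, so no change would be reported.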

inline-formatting-elements

Sets the comma and/or space separated list of inline formatting elements. Only active when ignore-inline-formatting is set to true.

This parameter overrides the built-in list of DITA inline elements. If the parameter ignore-inline-formatting is set to true then the addition, deletion or modification of these elements will not be treated as changes.

The default value is "apiname, b, cite, cmdname, codeph, filepath, i, lines, msgnum, msgph, parmname, pre, q, sep, sub, sup, systemoutput, term, tm, tt, u, uicontrol, userinput, var, wintitle".

add-inline-formatting-elements

A comma and/or space separated list of inline formatting elements, which will be added to the value of the inline-formatting-elements parameter.

Use this list to add elements that you wish to be ignored during change processing to the value of inline-formatting-elements.

Elements in this list are added to the set in inline-formatting-elements, before the set in remove-inline-formatting-elements has been removed.

The default value is "".

remove-inline-formatting-elements

A comma and/or space separated list of inline formatting elements, which will be removed from the value of the inline-formatting-elements parameter.

Use this list to remove from the value of inline-formatting-elements when the defaults include elements you do not wish to be ignored when considering change processing.

Elements in this list are removed from the set in inline-formatting-elements, after the set in add-inline-formatting-elements has been added.

The default value is "".
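The order in which the three lists are combined can be sketched as follows. The function names are hypothetical; the add-then-remove order follows the descriptions above.

```python
import re


def parse_element_list(value: str) -> set:
    """Split a comma and/or space separated list into a set of element names."""
    return {name for name in re.split(r"[,\s]+", value) if name}


def effective_inline_elements(base: str, add: str = "", remove: str = "") -> set:
    """Combine the three parameters: additions are applied first, then removals."""
    return (parse_element_list(base) | parse_element_list(add)) - parse_element_list(remove)
```

Because removals are applied last, an element that appears in both the add and remove lists does not end up in the effective set.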

Element moves handling parameters.

These parameters control the handling of element moves and define the values for the revision attribute.

detect-moves

If set to true, then element moves are handled if the elements have ids on them.

If this feature is set to true then it overrides the value of the parameter 'use-ids-as-keys'.

move-source-revision

Specifies the string value to use for the rev attribute for source of moved content.

move-target-revision

Specifies the string value to use for the rev attribute for target of moved content.

A.2.3. Advanced Parameters

The advanced settings parameters provide a finer level of control over the comparison, the output and error reporting.

Note that changing the default values of some of these parameters can affect the validity of the result, in the sense that the output may not conform to the DITA specification.

Table 5. Configuration parameters for advanced use.

Parameters used for validation and reporting.
validate-inputs: whether to validate the input documents if they specify a DOCTYPE
warning-report-mode: how to report recoverable errors and warnings

Parameters used for selecting what to present.
map-context-href: an optional reference to a file that stores the map context information
map-copy-scope: How to copy a DITA map and its content to an output directory.
map-scan-topics-for-references: Whether topics should be scanned when identifying a map's referenced resources.
modified-attribute-mode: how modified attributes should be included in the output
unmarked-change-mode: how to handle data that cannot contain difference markup

Parameters used to override or force specific behaviours.
force-specialization: whether or not to force result specialization when input types are different
output-encoding-declaration: the output character encoding to use in the XML declaration
topic-public-id: the publicId to use for the doctype if the result document is to remain as a Topic (due to comparing different specializations)
topic-system-id: the systemId to use for the doctype if the result document is to remain as a Topic (due to comparing different specializations)
xml-version-declaration: the version to use in the XML declaration

Parameters used for validation and reporting.

These parameters are used to specify what validation is performed and how the errors and warnings are reported.

validate-inputs

Specifies whether to validate input documents.

If set to true and the inputs include a DOCTYPE, they will be validated against it. If they are not valid, the compare method will throw an InputLoadException.

The default value is 'true'.

warning-report-mode

Sets the mode to use for reporting recoverable errors and warnings.

This parameter can take the following values:
message

Report the recoverable errors and warnings as XSL messages, which are typically visible when the comparison is run from a command-line terminal.

pis

Add the recoverable errors and warnings as processing instructions.

comments

Add the recoverable errors and warnings as comments.

markup

Add the recoverable errors and warnings as document content.

The default value is 'pis'.

Parameters used for selecting what to present.

These parameters are used to specify how unrepresentable change can be handled; when to remove information from the inputs; what information is in scope; and contextual information for referenced content (which can affect how it is displayed).

map-context-href

When set to a non-empty string, this parameter provides a reference to a file that stores the map context information.

This parameter is intended for internal use by the DITA map comparator, which uses it to provide information about the map comparison context. Current information includes the map's result origin and result structure.

The default value is ''.

map-copy-scope

Specifies how to copy a DITA map and its content to an output directory, which maintains the relative structure of the source where this is feasible.

Some maps will contain references to resources that are both marked for copying and cannot be made relative to the map's location. In general, the resources that are referenced from a map (referents), can be grouped according to their 'host' locations, such as a specified file system, web service, version control repository, and content management system (CMS). Each of these 'host' locations is represented as a 'top-level' directory of the output directory (and thus forms a 'forest' of 'tree' structures).

Note that the output directory will also contain an additional DITA map file that includes a single reference to the copied map file. This is intended to record where the copied map is within the output directory structure, and provide a convenient point for loading the result into a DITA-aware editor.

This parameter can take the following values:
local

References with local scope are copied to the output directory.

local-and-peer

References with local and peer scope are copied to the output directory.

all

All map level references (including external scope) are copied to the output directory.

The default value is 'local'.

map-scan-topics-for-references

Specifies whether topics should be scanned when identifying a map's referenced resources (referents). It is possible for a DITA topic to refer directly to other resources rather than make use of DITA's key-based reuse mechanism. When this is the case, scanning only the maps may not be sufficient to identify all the resources that are referenced. Turning this option off may significantly improve the performance of the comparison if it is known that all resources of interest are referenced directly by the maps.

Note that this setting also affects the way in which the resources are identified when copying maps to an output directory.

The default value is 'true'.

modified-attribute-mode

Specifies how modified attributes should be included in the output.

Not all output formats allow the markup of attribute changes. When this is the case, a decision needs to be made on which version of the attribute should be present in the result file. This parameter is used to define what behaviour is required.

This parameter can take the following values:
automatic

The behaviour will depend on other parameter settings, primarily the output-format. If the output format is able to show attribute changes, modified attributes will not be processed differently, otherwise 'B' mode will be chosen.

B

Output the 'B' version of modified attributes and any added ('B') attributes. Note that deleted ('A') attributes will not be output.

BA

Output the 'B' version of modified attributes. Output both inserted ('B') and deleted ('A') attributes.

A

Output the 'A' version of modified attributes and any deleted ('A') attributes. Note that added ('B') attributes will not be output.

AB

Output the 'A' version of modified attributes. Output both inserted ('B') and deleted ('A') attributes.

encode-as-attributes

Output the 'B' version of modified attributes and any added ('B') attributes but additionally show the changes encoded as attributes in the attribute-change ('ac') namespace.

The default value is 'automatic'.
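The four explicit modes 'B', 'BA', 'A' and 'AB' can be sketched as a merge of two attribute dictionaries. This is an illustration only: the function name is hypothetical, and the 'automatic' and 'encode-as-attributes' values are not modelled.

```python
def merge_attributes(a: dict, b: dict, mode: str = "B") -> dict:
    """Merge the attributes of an element's 'A' and 'B' versions (illustrative only)."""
    if mode not in ("A", "B", "AB", "BA"):
        raise ValueError(f"mode not covered by this sketch: {mode!r}")
    # The first letter selects which version of a modified attribute is kept.
    preferred, other = (a, b) if mode[0] == "A" else (b, a)
    result = dict(preferred)
    if len(mode) == 2:
        # Two-letter modes also output attributes that exist only in the other version.
        for name, value in other.items():
            result.setdefault(name, value)
    return result
```

For example, in 'BA' mode an attribute that was deleted from the 'B' version is still output, while a modified attribute takes its 'B' value.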

unmarked-change-mode

Specifies how to handle data that cannot contain difference markup.

Some differences between two XML documents cannot feasibly be displayed in a valid output document. These typically include changes to the XML declaration, doctype, internal subset and processing instructions. In these situations it is useful to specify what should be done.

Note that the internal subset can contain local element, attribute, and entity declarations, as well as processing instructions, comments, and entity-references.

This parameter can take the following values:
B

Output the 'B' version of modified items; output inserted ('B') items; do not output deleted ('A') items.

BdA

As 'B' except when processing an internal subset declaration, in which case act as 'BA'.

BA

Output the 'B' version of modified items; output both inserted ('B') and deleted ('A') items.

A

Output the 'A' version of modified items; output deleted ('A') items; do not output inserted ('B') items.

AdB

As 'A' except when processing an internal subset declaration, in which case act as 'AB'.

AB

Output the 'A' version of modified items; output both inserted ('B') and deleted ('A') items.

The default value is 'BdA'.

Parameters used to override or force specific behaviours.

These parameters are used to override detected settings or 'safe' behaviour and force particular outcomes, even if this outcome is not valid.

force-specialization

Specifies whether or not to force result specialization when the input types are different.

If this value is set to false and the input documents are different DITA types, the result document will be left in its generalized form with class attributes to indicate what the specializations are.

If this value is set to true and the input documents are different DITA types, the result document will be specialized to the DITA type of the second input. Please note that this may produce a result document that is not valid against its doctype.

The default value is 'false'.

output-encoding-declaration

Sets the output character encoding to use in the XML declaration, for example 'UTF-8', 'ISO-8859-1', 'windows-1252', or 'ascii'. Precisely which encodings are available depends on the specific Java or .NET runtime environment that you use. Note that an invalid output encoding will cause an exception to be raised.

The special value 'system' selects the 'B' document's character encoding.

The default value is 'system'.
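
Because the available encodings depend on the runtime, it can be useful to check a name before passing it. The following Java sketch uses the standard java.nio.charset.Charset API for this; it assumes, without confirmation from this Reference, that the comparator resolves encoding names in the same way as the JVM. The class and method names are hypothetical, and the special value 'system' is not a charset name, so it is not covered by this check.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

/** Checks whether the current Java runtime recognises an encoding name. */
final class EncodingCheck {

    /** Returns true when the name is syntactically legal and supported by this JVM. */
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed in charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        // The example names are those listed in the parameter description.
        for (String name : new String[] {"UTF-8", "ISO-8859-1", "windows-1252", "ascii"}) {
            System.out.println(name + " -> " + isAvailable(name));
        }
    }
}
```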

topic-public-id

Specifies the publicId to use for a Topic doctype.

A value should only be passed if you wish to override the default values for the publicId (-//OASIS//DTD DITA 1.1 Topic//EN for DITA 1.1 and -//OASIS//DTD DITA 1.2 Topic//EN for DITA 1.2).

Note that this value is only used if the following are true:

The input documents are different DITA specializations, or only one of them is specialized.
At least one of the input documents specified a doctype.
The comparison has not been configured to re-specialize the result document even if the inputs were different types.

The default value is ''.

topic-system-id

Specifies the systemId to use for a Topic doctype.

A value should only be passed if you wish to override the default values for the systemId (http://docs.oasis-open.org/dita/v1.1/OS/dtd/topic.dtd for DITA 1.1 and http://docs.oasis-open.org/dita/v1.2/os/dtd1.2/technicalContent/dtd/topic.dtd for DITA 1.2).

Note that this value is only used if the following are true:

The input documents are different DITA specializations, or only one of them is specialized.
At least one of the input documents specified a doctype.
The comparison has not been configured to re-specialize the result document even if the inputs were different types.

The default value is ''.
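
To show where the two identifiers end up, the following Java sketch builds a doctype declaration from a publicId and a systemId. It is an illustration only: it assumes a root element named topic, uses the DITA 1.2 defaults quoted in this section, and treats the empty default value as meaning 'no override'. The class and method names are hypothetical and are not part of the product API.

```java
/** Builds an illustrative Topic doctype declaration; not the product's implementation. */
final class TopicDoctype {

    // DITA 1.2 defaults as quoted in the parameter descriptions.
    static final String DEFAULT_PUBLIC_ID = "-//OASIS//DTD DITA 1.2 Topic//EN";
    static final String DEFAULT_SYSTEM_ID =
        "http://docs.oasis-open.org/dita/v1.2/os/dtd1.2/technicalContent/dtd/topic.dtd";

    /** Assumption: an empty (default) parameter value means the default identifier is used. */
    static String orDefault(String override, String fallback) {
        return (override == null || override.isEmpty()) ? fallback : override;
    }

    /** Returns the doctype declaration for the given (possibly empty) overrides. */
    static String declaration(String publicIdOverride, String systemIdOverride) {
        String publicId = orDefault(publicIdOverride, DEFAULT_PUBLIC_ID);
        String systemId = orDefault(systemIdOverride, DEFAULT_SYSTEM_ID);
        return "<!DOCTYPE topic PUBLIC \"" + publicId + "\" \"" + systemId + "\">";
    }

    public static void main(String[] args) {
        // No overrides: the declaration carries the DITA 1.2 defaults.
        System.out.println(declaration("", ""));
    }
}
```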

xml-version-declaration

Sets the version to use in the XML declaration.

The special value 'system' selects the 'B' document's XML version (as provided by the parser).

This parameter can take the following values:
1.0

Ensure the xml version is set to '1.0'.

1.1

Ensure the xml version is set to '1.1'.

system

Use the 'B' document's xml version (as provided by the parser).

The default value is 'system'.

B. Configuration Properties Appendix

Private use characters are used to encode special character entities that are difficult to preserve when performing XSLT processing.

Table 6. Configuration Property Summary Table
Property Name                         Summary Description
com.deltaxml.cmdline.forceOverwrite   whether the command line tool overwrites output files without prompting
com.deltaxml.dita.DitaCompare.debug   whether the intermediate results are produced for debugging
com.deltaxml.dita.DitaCompare.timing  whether times for processing each filter are produced to stdout
com.deltaxml.ext.catalog.files        the external catalog files to be searched
xml.catalog.files                     the catalog files to be searched
com.deltaxml.rsc.catalog.files        the fixed internal catalog files to be searched
java.property.name                    provide the initial value for an unset java property

com.deltaxml.cmdline.forceOverwrite

Specifies whether the command line tool overwrites output files without prompting.

com.deltaxml.dita.DitaCompare.debug

Specifies whether the intermediate results are produced for debugging.

When set to true, all 'non-optimised' processing stages have an intermediate result associated with them, in the form of a file named 'DitaCompare__<stage name>.xml'. Some filters in the processing pipeline are optimised by merging their processing; in these cases the result of the merged filters is produced.

This output is only intended for use when directed by DeltaXML support staff.

The default value is 'false'.

com.deltaxml.dita.DitaCompare.timing

Specifies whether times for processing each filter are produced to the standard-output/console.

When set to true, the time taken by each of the processing stages is recorded. Note that a zero duration indicates one of two situations: either the stage was too fast to register a time or, more likely, the stage has been 'optimised' (merged into another stage) and so is included in that other stage's results.

This output is only intended for use when directed by DeltaXML support staff.

The default value is 'false'.

com.deltaxml.ext.catalog.files

Specifies the external catalog files to be searched.

The tool is preconfigured to include the catalogs for supported document versions. However, this mechanism provides a means of supplementing or overriding given catalog entries with a user-defined catalog.

Note that this property can be bypassed by directly changing the default definition of the more general xml.catalog.files property.

The default value is '' (the empty string).

xml.catalog.files

Specifies the catalog files to be searched.

This string provides a mechanism for setting the catalog search path of the underlying Apache XML Commons Resolver. Note that, as this is a Java system property, it can be overridden on the command line by using the -Dxml.catalog.files=value JVM argument (or an environment setting). In general, this mechanism will only set the property if it does not already exist.

Note that changing the catalog files can change what it means for a document to be valid, and thus affect the way in which the document comparison processing should be performed. For example, it can affect the allowed order of elements. Therefore, the tool cannot guarantee that valid inputs result in valid outputs if its catalog(s) have been modified or replaced.

The default value is '${com.deltaxml.ext.catalog.files};${com.deltaxml.rsc.catalog.files}'.
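
The default value is a template that refers to two other properties. The following Java sketch shows one way such a template could be expanded into an ordered search path; it is not the tool's actual resolution code. It assumes that an unset reference expands to the empty string and that empty path entries are ignored. The class and method names, and the catalog file names in the example, are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Expands ${name} references and splits a semicolon-separated catalog path. */
final class CatalogPath {

    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value from props (assumption: unset means empty). */
    static String expand(String template, Map<String, String> props) {
        Matcher matcher = REFERENCE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = props.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    /** Splits the path on ';' and drops empty entries, preserving search order. */
    static List<String> entries(String path) {
        List<String> result = new ArrayList<>();
        for (String part : path.split(";")) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("com.deltaxml.ext.catalog.files", "my/catalog.xml");        // hypothetical
        props.put("com.deltaxml.rsc.catalog.files", "internal/catalog.xml");  // hypothetical
        String template = "${com.deltaxml.ext.catalog.files};${com.deltaxml.rsc.catalog.files}";
        System.out.println(entries(expand(template, props)));
    }
}
```

With this ordering, a user-supplied external catalog comes before the internal catalogs in the search path.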

com.deltaxml.rsc.catalog.files

Specifies the fixed internal catalog files to be searched.

This property cannot be modified, but can be referenced to enable the internal catalogs to be searched. By default, the xml.catalog.files property causes these catalogs to be searched after any external catalog supplied by the user via the com.deltaxml.ext.catalog.files property.

java.property.name

Provides the initial value for an unset Java property.

It is possible to use a deltaxmlConfig.xml file to provide arbitrary Java system properties, so long as they do not already have a definition, such as one provided by a JVM argument (-Djava.property.name=value) or the environment. For example, it is possible to specify the Apache catalog verbosity by setting the initial value of the xml.catalog.verbosity system property. This can be useful for seeing which catalog(s) are actually being loaded and used.

In general, this mechanism will only set the property if it does not already exist.
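
The 'only set if unset' rule can be sketched in a few lines of Java. This is a minimal illustration using the standard System property API, not the product's configuration loader; it only checks Java system properties and does not model environment settings. The class and method names are hypothetical, and the verbosity value used in the example is illustrative.

```java
/** Sets a Java system property only when it has no existing definition. */
final class InitialProperty {

    /** Returns true if the value was applied, false if an existing definition was kept. */
    static boolean setIfUnset(String name, String value) {
        if (System.getProperty(name) != null) {
            // For example, already defined via a -Dname=value JVM argument.
            return false;
        }
        System.setProperty(name, value);
        return true;
    }

    public static void main(String[] args) {
        // The value "4" is an illustrative verbosity level, not a documented default.
        boolean applied = setIfUnset("xml.catalog.verbosity", "4");
        System.out.println("applied=" + applied
            + ", value=" + System.getProperty("xml.catalog.verbosity"));
    }
}
```

Running the example with a -Dxml.catalog.verbosity argument on the command line leaves the command-line value in place.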

Resources

XML Resources

http://www.w3.org/XML
W3C Consortium site for all XML specifications and news
http://www.w3.org/TR/REC-xml
The formal specification for XML
http://www.w3schools.com/xml
W3 Schools introduction to XML

DeltaXML Resources

http://www.deltaxml.com/products/dita/try.html
DITA Compare interactive demo

DITA Resources

http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita
OASIS DITA Technical Committee Homepage
http://dita.xml.org/
The Official DITA Community Page