This User Guide provides a high-level description of the DITA Compare product and includes links to more detailed descriptions contained in the Reference document.
What DITA Compare does
DITA Compare provides features for comparing either:
- two different versions of a DITA topic, or
- two different sets of DITA topics referenced from two different versions of a DITA map, or
- two different DITA map files (and not referenced topics or sub-maps)
The result of a comparison is a DITA document with differences marked using one of several output formats.
DITA Compare can be downloaded in three versions: Mac, Java/Unix and .NET. The Mac and Java/Unix versions provide a Java API, whereas the .NET version provides a .NET API and, unlike the others, comes without the GUI application (which requires Java). The Mac and Java/Unix versions are builds optimised for the Mac and Unix operating systems respectively, so the minor differences between them concern how the application is installed and how the GUI application is started. The versions are summarised below:
DITA Compare provides an API designed for seamless integration with other systems. Whilst the API provides the most features for controlling and monitoring a comparison, a command-line interface is also provided for convenience.
A GUI application is also provided (except in the .NET version), but it is intended as a simple way to show the potential of DITA Compare's features rather than as an everyday productivity tool.
The API provides a callback feature for monitoring the progress of comparisons involving large topics or maps. For Java, this takes the form of a Listener interface; for .NET, a set of events is available that can be subscribed to. See the API documentation for full details.
The result of a comparison is a DITA document that shows differences using either standard DITA markup or markup/processing-instructions tailored for specific DITA tools. The available output formats are standard DITA (exploiting rev and status attributes), plus tracked change formats for the Arbortext, FrameMaker, Oxygen and XMetaL editors. These are summarised in the table below, and described fully in the Output Formats section of the Reference document.
All tracked change formats are available for all types of comparison. These formats are designed to work for markup of DITA topics and are provided, for completeness, for DITA 'mapfile' comparison. For DITA 'mapfile' comparison, however, only the Oxygen tracked change format has been tested.
Starting a Comparison
This section guides you through three simple DITA comparison scenarios, but let's first have a high-level look at what is required to invoke a comparison:
No matter which interface you use, to start a comparison with DITA Compare you need to supply only four pieces of information:
- The comparison type: Topic, Map-Topicset or Map-File
- The location of Input A
- The location of Input B
- The output destination (except when an 'inplace' map comparison is specified)
Parameters are used to control various aspects of a comparison; these are summarised in the Configuring a Comparison section of this document.
The most significant parameter when performing a comparison is output-format, described in the Output Format section. The value of this also affects the default values of other parameters.
In a topic comparison, two versions of a topic are compared and the result is a topic file with differences marked up in accordance with the selected Output Format. Here's a high-level diagram showing the comparison:
The Java code below shows the compare method being invoked on a new instance of DitaTopicCompare. The method call supplies the two input files and the output file as File arguments:
An extract from the resulting output file is shown below; the output-format is DITA Markup, the default. This extract shows that the misspelled word 'hgh' has been replaced with 'high'. Each of the two words is wrapped in its own ph element, and each element uses DITA's own rev and status attributes.
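As an illustration of how this kind of markup can be consumed by other tools, the following minimal Java sketch parses a small fragment modelled on the extract described above and collects the text of ph elements by their status value. The sample XML, the rev values and the class name StatusScan are assumptions made for illustration only; they are not copied from DITA Compare's actual output. The status values 'new' and 'deleted' are taken from the DITA standard.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the DITA Compare API.
final class StatusScan {

    /** Returns the text of every ph element whose status attribute equals the given value. */
    static List<String> phTextWithStatus(String xml, String status) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList phs = doc.getElementsByTagName("ph");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < phs.getLength(); i++) {
                Element ph = (Element) phs.item(i);
                if (status.equals(ph.getAttribute("status"))) {
                    result.add(ph.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML fragment", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: the rev values are placeholders, not real product output.
        String sample = "<p>A <ph status=\"deleted\" rev=\"example-delete\">hgh</ph>"
                + "<ph status=\"new\" rev=\"example-add\">high</ph> level view</p>";
        System.out.println("deleted: " + phTextWithStatus(sample, "deleted"));
        System.out.println("new: " + phTextWithStatus(sample, "new"));
    }
}
```

Because the sample has no DOCTYPE, the parser needs no DITA DTDs; a real result file may need an entity resolver or catalog to be configured first.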
Viewing the Result
This DITA Markup output is 'vendor neutral', so it can be rendered usefully in any XML editor that supports either conditional processing or attribute-controlled styling of the WYSIWYG (author) view. The screenshot below shows the output as viewed in the 'author' view of Oxygen XML Editor, with the 'Colored revision changes' style selected:
For more details on this specific output format, please see the DITA Markup section of the Reference document.
A Mapfile Comparison is a comparison of the DITA map file itself, as opposed to a comparison of the referenced topics (for this, see the Map Topicset Comparison section).
For this comparison type, the output is a DITA map file with differences marked according to the chosen Output Format.
Map Topicset Comparison
This comparison type compares the set of all topics referenced from the supplied top-level DITA maps. Here's a high-level diagram showing the comparison:
Changes within topics are marked up using the Output Format specified at the time the comparison is invoked. In the example code (below), the OXYGEN_TCS (Oxygen Tracked Changes) output format is used.
In the example above, the MapResultStructure enum type UNIFIED_MAP is used to set the result structure, whilst MapResultOrigin.B_DOCUMENT specifies that the result should be shown as modifications to the second map (map B) passed as an argument to the compare method.
Update Modes: 'Map Copy' vs 'In Place'
In these map comparison scenarios, all topics referenced by each input map are copied to newly created container directories for their respective maps. It is therefore the copies of the DITA map and topic files that are annotated (in the 'Folder B' copy by default) to show the differences found. Whilst this Map Copy approach is inherently safer because the original files are left untouched, an In Place map comparison mode is also provided.
An In Place comparison annotates the files (in 'Folder B' by default) in their original location, but creates backup copies at the same location with a 'bak' suffix. This mode may be useful if you want to perform a comparison on your own copies of the two input maps, or if the input maps are under version control in a repository. To invoke an In Place comparison from the API, the compareInplace method is used instead of the compare method; from the command-line, the output destination path is simply replaced with the 'inplace' string.
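To make the backup behaviour concrete, here is a minimal Java sketch of copying a file to a sibling with a suffix appended before the original is modified. It is only an illustration of the idea, not DITA Compare's implementation: the class name BackupSketch, the choice to replace an existing backup, and the exact form of the suffix (passed in as a parameter, '.bak' in the example) are all assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the DITA Compare API.
final class BackupSketch {

    /** Copies a file to a sibling with the suffix appended to its name; returns the backup path. */
    static Path backup(Path original, String suffix) {
        Path copy = original.resolveSibling(original.getFileName() + suffix);
        try {
            // Assumption: an existing backup of the same name is overwritten.
            return Files.copy(original, copy, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path topic = Files.createTempFile("topic", ".dita");
        Files.writeString(topic, "<topic id=\"t\"/>");
        Path saved = backup(topic, ".bak");
        // The original could now be annotated while the backup keeps the old content.
        System.out.println("Backup written to " + saved);
    }
}
```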
When performing a map comparison, DITA Compare provides a choice of result structures: Topic Set, Map Pair and Unified Map (described in the Map Topicset Result section of the Reference). The map comparison scenarios outlined here use the same input maps but show each of the three available result structures.
The diagram below shows representations of the two DITA maps used as inputs in each of the comparison scenarios that follow.
Topic Set (Result Structure)
In this scenario, three arguments are supplied to the compare method. The first two are input locations referencing the top-level ditamap files for the two DITA map versions; the final argument is the output destination, which should be an empty directory. The only other setting for this scenario is the MapResultStructure property, which is set to TOPIC_SET (see the Map Processing section in the Reference for further details).
Note: A number of other properties control how the result of a map comparison is represented; these are identified by the 'map-' prefix in their names. A full description of these can be found in DITA Compare Parameters.
The diagram below shows that, for this map comparison, the input directories, labelled Folder A and Folder B, are copied to the supplied output directory.
By default, all changes are described in the result in terms of modifications to the second input map supplied, DitaMap B in this case. The map and topics in the Folder B copy are arranged and annotated to describe the map comparison result as outlined below:
- Topic references in DitaMap B show how topics in Folder B align with those in Folder A
- Each topic in the Folder B copy shows any differences to the corresponding Folder A version
- Topic references in DitaMap B are flattened so all topic references occur at the top level
- DITA Compare orders topic references, where possible, so they align with those in the DitaMap B original
- References to deleted topics (not referenced in the DitaMap B original) link to the topic in the Folder A copy
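The flattening described in the list above can be pictured with a short Java sketch that reads a map and lists every topicref href in document order, regardless of nesting depth. This is an illustration only and does not reproduce how DITA Compare builds its result: the class name FlattenSketch, the sample map and the file names are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the DITA Compare API.
final class FlattenSketch {

    /** Returns the href of every topicref in document order, ignoring nesting depth. */
    static List<String> flattenedHrefs(String mapXml) {
        try {
            NodeList refs = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(mapXml)))
                    .getElementsByTagName("topicref");
            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < refs.getLength(); i++) {
                hrefs.add(((Element) refs.item(i)).getAttribute("href"));
            }
            return hrefs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse map XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample map: Topic2 is nested inside Topic1.
        String map = "<map><topicref href=\"Topic1.dita\">"
                + "<topicref href=\"Topic2.dita\"/></topicref>"
                + "<topicref href=\"Topic3.dita\"/></map>";
        System.out.println(flattenedHrefs(map));
    }
}
```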
The DITA map result file (Topic Set)
The ditamap file in Folder B uses rev and status attributes to describe the differences found in the referenced topics. The XML for this is shown below:
Map Pair (Result Structure)
This scenario is the same as the previous one, with one exception: a Map Pair result is specified for the output. This is done by setting the MapResultStructure property to MAP_PAIR.
The result of this map comparison is shown below.
This diagram shows that the DITA map labelled DitaMap B references only topics in Folder B, but that there is now an additional Remainder DITA map. The Remainder map references all topics that are referenced in DitaMap A but not found in DitaMap B; in effect, the deleted topics.
The Map Pair structure has one benefit over a Topic Set: it preserves the hierarchy of topic references, as illustrated by Topic2 still being nested within Topic1. The downside is that deleted topics cannot be seen in the context of DitaMap B; they can only be viewed from the Remainder map.
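Conceptually, the contents of the Remainder map are a set difference: the references found in DitaMap A minus those found in DitaMap B. The Java sketch below shows that idea under a simplifying assumption, namely that topics are matched purely by their href string; DITA Compare's own alignment of topics may work differently. The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the DITA Compare API.
final class RemainderSketch {

    /** Returns the hrefs that appear in map A but not in map B, in map A's order. */
    static Set<String> remainder(List<String> hrefsInA, List<String> hrefsInB) {
        Set<String> rest = new LinkedHashSet<>(hrefsInA);
        rest.removeAll(hrefsInB);
        return rest;
    }

    public static void main(String[] args) {
        // Assumed sample data: Topic2 was removed in map B, Topic4 was added.
        List<String> mapA = List.of("Topic1.dita", "Topic2.dita", "Topic3.dita");
        List<String> mapB = List.of("Topic1.dita", "Topic3.dita", "Topic4.dita");
        System.out.println("Remainder: " + remainder(mapA, mapB));
    }
}
```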
Unified Map (Result Structure)
This scenario is the same as the previous with one exception: a Unified Map result is specified for the output. This is done by setting the MapResultStructure property to UNIFIED_MAP.
The result of this map comparison is shown below:
This diagram shows that the DITA map labelled DitaMap B references topics in both Folder A and Folder B. The topicref elements in DitaMap A that reference topics not referenced in DitaMap B are adjusted and inserted into DitaMap B, but marked as deletions.
This Unified Map structure aspires to combine the benefits of the Topic Set and Map Pair result structures. So it preserves the hierarchy of topic references in DitaMap B and attempts to insert 'missing' DitaMap A topic references (and their hierarchy also) as close as possible to the DitaMap B location where they are found to be missing.
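The placement idea can be sketched on flat lists: start from map B's references and insert each reference found only in map A directly after the nearest preceding map A reference already in the result. The Java below is a deliberately simplified illustration; it ignores hierarchy entirely, matches topics by name, and is not DITA Compare's algorithm. The class name, the ' (deleted)' marker and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the DITA Compare API.
final class UnifySketch {

    /** Merges A-only references into B's order, marking them as deleted. */
    static List<String> unify(List<String> refsA, List<String> refsB) {
        List<String> result = new ArrayList<>(refsB);
        Set<String> inB = new HashSet<>(refsB);
        String anchor = null; // most recent map A entry that is present in the result
        for (String ref : refsA) {
            if (inB.contains(ref)) {
                anchor = ref;
            } else {
                String marked = ref + " (deleted)";
                // Insert right after the anchor, or at the start if there is none yet.
                int pos = (anchor == null) ? 0 : result.indexOf(anchor) + 1;
                result.add(pos, marked);
                anchor = marked;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> mapA = List.of("Topic1", "Topic2", "Topic3");
        List<String> mapB = List.of("Topic1", "Topic3", "Topic4");
        System.out.println(unify(mapA, mapB));
    }
}
```

With the sample data, Topic2 lands between Topic1 and Topic3, which is where it sat in map A.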
This result structure is most suitable for cases where the structure of the two DITA maps (A and B) is broadly similar, so missing topic references can be inserted close to their original position without compromising the original structure. The screenshot below shows a 'Unified Map' comparison result in Oxygen XML Editor:
Configuring a Comparison
DITA Compare provides a set of parameters used to control the way comparisons are performed and results are formatted. Initially these parameters have default values for the most common use cases, but these values can be configured via the command-line, GUI or API. A summary of the features controlled by parameters is given in the table below:
|Lexical Preservation||Control preservation of processing instructions, comments, whitespace, entity references, DTD declarations etc.|
|Text Comparison||Control the granularity of a comparison (e.g. word by word)|
|Table Processing||Validation and processing of DITA tables|
|Representation of Revisions||How the revision of elements, attributes or other node types is represented in the result.|
|Map Control||For map comparisons. Affects the way DITA maps are copied and results structured|
|Editor Optimization||Optimise the output to suit specific DITA editors|
|Control features for specialization, element ordering, reference resolving etc.|
The command-line syntax for invoking a comparison with comparison parameters is shown in the topic comparison example below, where param is the parameter name and value is the value to assign to it.
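For readers wrapping the command line in their own tooling, the param=value convention is straightforward to handle. The Java sketch below splits such arguments into an ordered map of names and values; it is not DITA Compare's own argument parser. The class name ParamArgs and the sample values are hypothetical; output-format is the only parameter name taken from this guide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the DITA Compare API.
final class ParamArgs {

    /** Parses param=value arguments into an ordered map; rejects anything else. */
    static Map<String, String> parse(String... args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected param=value but got: " + arg);
            }
            // Split on the first '=' only, so values may themselves contain '='.
            params.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        // The values here are placeholders, not documented DITA Compare values.
        Map<String, String> params = parse("output-format=example-format", "example-param=true");
        params.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```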
Comparison parameters are available (for setting or getting) via setter/getter methods available on the top-level
DitaTopic objects. Naturally, parameters applicable only to map comparisons are not accessible from the
Comparison parameters are exposed as read/write properties on the top-level DitaCompareDotNet, DitaMapCompareDotNet and DitaTopicCompareDotNet objects. Naturally, parameters applicable only to map comparisons are not accessible from the DitaTopicDotNet object.