User Guide
Overview
This User Guide provides a high-level description of the DITA Compare product and includes links to more detailed descriptions contained in the Reference document.
What DITA Compare does
DITA Compare provides features for comparing either:
two different versions of a DITA topic, or
two different sets of DITA topics referenced from two different versions of a DITA map, or
two different DITA map files (and not referenced topics or sub-maps)
The result of a comparison is a DITA document with differences marked up in one of several output formats.
Product Versions
DITA Compare can be downloaded in three versions: Mac, Java/Unix and .NET. The Mac and Java/Unix versions provide a Java API, whereas the .NET version provides a .NET API and comes without the GUI application (which requires Java). The Mac and Java/Unix versions are builds optimised for the Mac and Unix operating systems respectively; the minor differences between them concern how the application is installed and how the GUI application is started. The versions are summarised below:
Interfaces
DITA Compare provides an API designed for seamless integration with other systems. Whilst the API provides the most features for controlling and monitoring a comparison, a command-line interface is also provided for convenience.
A GUI application is also provided (except in the .NET version), but this is intended to provide a simple way to show the potential of DITA Compare's features, rather than as an every-day productivity tool.
Progress Monitoring
The API provides a callback feature for monitoring the progress of comparisons of large topics or maps. For Java, this takes the form of a Listener interface; for .NET, a set of events is available to subscribe to. See the API documentation for full details.
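To illustrate the general listener pattern described above, here is a minimal, self-contained Java sketch. The names `ProgressListener`, `onProgress` and `ProgressDemo` are hypothetical stand-ins invented for this illustration; they are not the DITA Compare API, whose actual interface and method names are given in the API documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface for illustration only; NOT the DITA Compare API.
interface ProgressListener {
    void onProgress(int completed, int total);
}

// Stand-in for a long-running comparison that reports progress to its listeners.
final class ProgressDemo {
    private final List<ProgressListener> listeners = new ArrayList<>();

    void addListener(ProgressListener listener) {
        listeners.add(listener);
    }

    // Pretends to process `total` topics, notifying every listener after each one.
    void run(int total) {
        for (int done = 1; done <= total; done++) {
            for (ProgressListener listener : listeners) {
                listener.onProgress(done, total);
            }
        }
    }

    // Runs the stand-in comparison and returns the percentages the listener saw.
    static List<Integer> collectPercentages(int total) {
        List<Integer> seen = new ArrayList<>();
        ProgressDemo demo = new ProgressDemo();
        demo.addListener((completed, t) -> seen.add(completed * 100 / t));
        demo.run(total);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collectPercentages(4)); // prints [25, 50, 75, 100]
    }
}
```

With the real API, the listener would be registered on the comparison object before the comparison is started, so that long map comparisons can drive a progress bar or log output.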
Output Formats
The result of a comparison is a DITA document that shows differences either using standard DITA markup or markup/processing-instructions tailored for specific DITA tools. The available output formats are standard DITA (exploiting rev and status attributes), plus tracked change formats for the Arbortext, FrameMaker, Oxygen and XMetaL editors. These are summarised in the table below, and described fully in the Output Formats section of the Reference document.
All tracked change formats are available for all types of comparison. They are designed for marking up DITA topics and are provided for DITA 'mapfile' comparison for completeness; however, only the Oxygen tracked change format has been tested with 'mapfile' comparison.
Starting a Comparison
This section guides you through three simple DITA comparison scenarios, but let's first have a high-level look at what is required to invoke a comparison:
Essential Arguments
Whichever interface you use, you need to supply only four pieces of information to start a comparison with DITA Compare:
The comparison type: Topic, Map-Topicset or Map-File
The location of Input A
The location of Input B
The output destination (except when an 'inplace' map comparison is specified)
Optional Parameters
Parameters control various aspects of a comparison; they are summarised in the Configuring a Comparison section of this document.
The most significant parameter when performing a comparison is output-format, described in the Output Formats section. Its value also affects the default values of other parameters.
Topic Comparison
In a topic comparison, two versions of a topic are compared and the result is a topic file with differences marked up in accordance with the selected Output Format. Here's a high-level diagram showing the comparison:
The Java code below shows the compare method being invoked on a new instance of DitaTopicCompare. The method call supplies the two input files and the output file as File arguments:
Invoking a topic comparison using the DitaTopicCompare class
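// Requires java.io.File plus the DitaTopicCompare class from the DITA Compare API.
DitaTopicCompare dtc = new DitaTopicCompare();
// Arguments: input A, input B, then the output file.
dtc.compare(new File("C:\\test\\topic-scenario\\topic-a.dita"),
            new File("C:\\test\\topic-scenario\\topic-b.dita"),
            new File("C:\\test\\topic-scenario\\out-file.dita"));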
An extract from the resulting output file is shown below; the output format is DITA Markup, the default. The extract shows that the misspelled word 'hgh' has been replaced with 'high'. Each word is wrapped in its own ph element, and each element uses DITA's own rev and status attributes.
<p>In this User Guide we provide a
<ph status="deleted" rev="deltaxml-delete">hgh</ph>
<ph status="new" rev="deltaxml-add">high</ph>
-level description
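Because the DITA Markup output is ordinary XML, the marked changes can also be listed programmatically. The sketch below uses only the JDK's built-in DOM parser and makes no use of the DITA Compare API. The class name `PhChangeLister` is an illustrative assumption, and the sample string is the extract above closed with a `</p>` tag so that it is well-formed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative helper (not part of DITA Compare): lists ph elements that carry a status attribute.
final class PhChangeLister {

    // Returns entries such as "deleted:hgh" for every ph element with a status attribute.
    static List<String> listChanges(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList phs = doc.getElementsByTagName("ph");
            List<String> changes = new ArrayList<>();
            for (int i = 0; i < phs.getLength(); i++) {
                Element ph = (Element) phs.item(i);
                if (ph.hasAttribute("status")) {
                    changes.add(ph.getAttribute("status") + ":" + ph.getTextContent());
                }
            }
            return changes;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample based on the extract in this guide, with the paragraph closed.
        String sample = "<p>In this User Guide we provide a "
                + "<ph status=\"deleted\" rev=\"deltaxml-delete\">hgh</ph>"
                + "<ph status=\"new\" rev=\"deltaxml-add\">high</ph>"
                + "-level description</p>";
        System.out.println(listChanges(sample)); // prints [deleted:hgh, new:high]
    }
}
```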
Viewing the Result
This DITA Markup output is 'vendor neutral': it can be rendered usefully in any XML editor that supports either conditional processing or attribute-controlled styling of the WYSIWYG (author) view. The screenshot below shows the output in the 'author' view of Oxygen XML Editor, with the 'Colored revision changes' style selected:
For more details on this specific output format, please see the DITA Markup section of the Reference document.
Mapfile Comparison
A Mapfile Comparison is a comparison of the DITA map file itself, as opposed to a comparison of the referenced topics (for this, see the Map Topicset Comparison section).
For this comparison type, the output is a DITA map file with differences marked according to the chosen Output Format.
Initiating a map file comparison using the DitaMapfileCompare class
DitaMapfileCompare dmc = new DitaMapfileCompare();
File in1 = new File("c:\\test\\folderA\\maps\\ditamapA.ditamap");
File in2 = new File("c:\\test\\folderB\\maps\\ditamapB.ditamap");
File out = new File("c:\\test\\map-file-scenario\\result.ditamap");
dmc.compare(in1, in2, out);
Map Topicset Comparison
This comparison type compares the full set of topics referenced from the supplied top-level DITA maps. Here's a high-level diagram showing the comparison:
Changes within topics are marked up using the Output Format specified at the time the comparison is invoked. In the example code (below), the OXYGEN_TCS (Oxygen Tracked Changes) output format is used.
Invoking a map topicset comparison using the DitaMapTopicsetCompare class
DitaMapTopicsetCompare dmc = new DitaMapTopicsetCompare();
dmc.setMapResultStructure(MapResultStructure.UNIFIED_MAP);
dmc.setMapResultOrigin(MapResultOrigin.B_DOCUMENT);
dmc.setOutputFormat(OutputFormat.OXYGEN_TCS);
File in1 = new File("c:\\test\\folderA\\maps\\ditamapA.ditamap");
File in2 = new File("c:\\test\\folderB\\maps\\ditamapB.ditamap");
File out = new File("c:\\test\\map-scenario\\result");
dmc.compare(in1, in2, out);
In the example above, the MapResultStructure enum value UNIFIED_MAP sets the result structure, whilst MapResultOrigin.B_DOCUMENT specifies that the result should be shown as modifications to the second map (map B) passed to the compare method.
Update Modes: 'Map Copy' vs 'In Place'
In these map comparison scenarios, all topics referenced by each input map are copied to newly created container directories for their respective maps. It is therefore the copies of the DITA map and topic files that are annotated (in the 'Folder B' copy by default) to show the differences found. This Map Copy approach is inherently safer because the original files are left untouched, but an In Place map comparison mode is also provided.
An In Place comparison annotates the files (in 'Folder B' by default) in their original location, but creates backup copies at the same location with a 'bak' suffix. This mode may be useful if you want to perform a comparison on your own copies of the two input maps, or if the input maps are under version control in a repository. To invoke an In Place comparison from the API, use the compareInplace method instead of the compare method. From the command line, simply substitute the 'inplace' string for the output destination path.
Result Structures
When performing a map comparison, DITA Compare provides a choice of result structures: Topic Set, Map Pair and Unified Map (described in the Map Topicset Result section of the Reference). The map comparison scenarios outlined here use the same input maps but show each of the three available result structures.
The diagram below represents the two DITA maps used as inputs in each of the comparison scenarios in this section.
Topic Set (Result Structure)
In this scenario, three arguments are supplied to the compare method: the first two are input locations referencing the top-level ditamap files for the two DITA map versions, and the final argument is the output destination, which should be an empty directory. The only other setting for this scenario is the MapResultStructure property, which is set to TOPIC_SET (see the Map Processing section in the Reference for further details).
Note: A number of other properties control how the result of a map comparison is represented; they are identified by the 'map-' prefix in their names. A full description can be found in DITA Compare Parameters.
The diagram below shows that, for this map comparison, the input directories, labelled Folder A and Folder B, are copied to the supplied output directory.
By default, all changes are described in the result in terms of modifications to the second input map supplied, DitaMap B in this case. The map and topics in the Folder B copy are arranged and annotated to describe the map comparison result as outlined below:
Topic references in DitaMap B show how topics in Folder B align with those in Folder A
Each topic in the Folder B copy shows any differences to the corresponding Folder A version
Topic references in DitaMap B are flattened so all topic references occur at the top level
DITA Compare orders topic references, where possible, so they align with those in the DitaMap B original
References to deleted topics (not referenced in the DitaMap B original) link to the topic in the Folder A copy
The DITA map result file (Topic Set)
The ditamap file in Folder B uses rev and status attributes to describe the differences found in the referenced topics; the XML for this is shown below:
<!DOCTYPE map
PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
<topicref href="../topics/topic1.dita" status="unchanged"/>
<topicref href="../topics/topic2.dita" status="unchanged"/>
<topicref href="../../_a-0-file-/topics/topic4A.dita"
rev="deltaxml-delete"
status="deleted"/>
<topicref href="../topics/topic3.dita" status="changed"/>
<topicref href="../topics/topic4B.dita" rev="deltaxml-add" status="new"/>
</map>
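A result map like the one above can be summarised by counting topicref elements per status value. The following sketch does this with the JDK's DOM parser only; the class name `TopicrefStatusCounter` is an illustrative assumption, not part of DITA Compare. The embedded sample omits the DOCTYPE declaration so that the parser does not try to fetch map.dtd; when reading real result files, DTD resolution would need to be handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative helper (not part of DITA Compare): counts topicref elements by status.
final class TopicrefStatusCounter {

    // The Topic Set result map from this guide, without its DOCTYPE declaration.
    static final String SAMPLE_MAP = "<map>"
            + "<topicref href=\"../topics/topic1.dita\" status=\"unchanged\"/>"
            + "<topicref href=\"../topics/topic2.dita\" status=\"unchanged\"/>"
            + "<topicref href=\"../../_a-0-file-/topics/topic4A.dita\""
            + " rev=\"deltaxml-delete\" status=\"deleted\"/>"
            + "<topicref href=\"../topics/topic3.dita\" status=\"changed\"/>"
            + "<topicref href=\"../topics/topic4B.dita\" rev=\"deltaxml-add\" status=\"new\"/>"
            + "</map>";

    // Returns a sorted map from status value to the number of topicrefs carrying it.
    static Map<String, Integer> countByStatus(String mapXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(mapXml.getBytes(StandardCharsets.UTF_8)));
            NodeList refs = doc.getElementsByTagName("topicref");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < refs.getLength(); i++) {
                Element ref = (Element) refs.item(i);
                if (ref.hasAttribute("status")) {
                    counts.merge(ref.getAttribute("status"), 1, Integer::sum);
                }
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse map XML", e);
        }
    }

    public static void main(String[] args) {
        // prints {changed=1, deleted=1, new=1, unchanged=2}
        System.out.println(countByStatus(SAMPLE_MAP));
    }
}
```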
Map Pair (Result Structure)
This scenario is the same as the previous one with one exception: a Map Pair result is specified for the output. This is done by setting the MapResultStructure property to MAP_PAIR.
The result of this map comparison is shown below.
This diagram shows that the DITA map labelled DitaMap B only references topics in Folder B, but there is now an additional Remainder DITA map. The Remainder DITA map references all topics that are referenced in DitaMap A but not found in DitaMap B: effectively, the deleted topics.
This Map Pair structure has a benefit over a Topic Set in that it preserves the hierarchy of topic references, as illustrated by Topic2 still being nested within Topic1. The downside is that deleted topics cannot be seen in the context of DitaMap B; they can only be viewed from the Remainder map.
The DITA map result file (Map Pair)
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/">
<title>Simple DITA Map Sample</title>
<topicref href="../topics/topic1.dita" status="unchanged">
<topicref href="../topics/topic2.dita" status="unchanged"/>
</topicref>
<topicref href="../topics/topic3.dita" status="changed"/>
<topicref href="../topics/topic4B.dita" rev="deltaxml-add" status="new"/>
</map>
Unified Map (Result Structure)
This scenario is the same as the previous one with one exception: a Unified Map result is specified for the output. This is done by setting the MapResultStructure property to UNIFIED_MAP.
The result of this map comparison is shown below:
This diagram shows that the DITA map labelled DitaMap B references topics in both Folder A and Folder B. The topicref elements in DitaMap A that reference topics not referenced in DitaMap B are adjusted and inserted into DitaMap B, but marked as deletions.
The DITA map result file (Unified Map)
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/">
<title>Simple DITA Map Sample</title>
<topicref href="../topics/topic1.dita" status="unchanged">
<topicref href="../topics/topic2.dita" status="unchanged"/>
</topicref>
<topicref href="../../_a-0-file-/topics/topic4A.dita"
rev="deltaxml-delete" status="deleted"/>
<topicref href="../topics/topic3.dita" status="changed"/>
<topicref href="../topics/topic4B.dita" rev="deltaxml-add" status="new"/>
</map>
This Unified Map structure aims to combine the benefits of the Topic Set and Map Pair result structures: it preserves the hierarchy of topic references in DitaMap B and attempts to insert 'missing' DitaMap A topic references (along with their hierarchy) as close as possible to the location in DitaMap B where they are found to be missing.
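The preserved hierarchy and the inserted deletions in a Unified Map result can be inspected with a short recursive walk over the topicref elements. This is a sketch using the JDK's DOM parser only; the class name `MapOutline` is an illustrative assumption, and the sample is the Unified Map result from this guide with the DOCTYPE declaration and namespace attribute omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Illustrative helper (not part of DITA Compare): prints an indented outline of a result map.
final class MapOutline {

    static final String SAMPLE_MAP = "<map><title>Simple DITA Map Sample</title>"
            + "<topicref href=\"../topics/topic1.dita\" status=\"unchanged\">"
            + "<topicref href=\"../topics/topic2.dita\" status=\"unchanged\"/>"
            + "</topicref>"
            + "<topicref href=\"../../_a-0-file-/topics/topic4A.dita\""
            + " rev=\"deltaxml-delete\" status=\"deleted\"/>"
            + "<topicref href=\"../topics/topic3.dita\" status=\"changed\"/>"
            + "<topicref href=\"../topics/topic4B.dita\" rev=\"deltaxml-add\" status=\"new\"/>"
            + "</map>";

    // Returns one line per topicref: two spaces of indent per nesting level, then status and href.
    static List<String> outline(String mapXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(mapXml.getBytes(StandardCharsets.UTF_8)));
            List<String> lines = new ArrayList<>();
            walk(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse map XML", e);
        }
    }

    // Visits the direct topicref children of `parent`, then recurses into each of them.
    private static void walk(Element parent, int depth, List<String> lines) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "topicref".equals(n.getNodeName())) {
                Element ref = (Element) n;
                lines.add("  ".repeat(depth) + ref.getAttribute("status")
                        + " " + ref.getAttribute("href"));
                walk(ref, depth + 1, lines);
            }
        }
    }

    public static void main(String[] args) {
        outline(SAMPLE_MAP).forEach(System.out::println);
    }
}
```

In the outline for the sample, topic2 appears indented beneath topic1, and the deleted topic4A reference sits between the topic1 group and topic3, which is where it was inserted into DitaMap B.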
This result structure is most suitable for cases where the structure of the two DitaMaps (A and B) is broadly similar, so missing topic references can be inserted close to their original position without compromising the original structure. The screenshot below shows a 'Unified Map' comparison result in Oxygen XML Editor:
Configuring a Comparison
DITA Compare provides a set of parameters used to control the way comparisons are performed and results are formatted. Initially these parameters have default values for the most common use cases, but these values can be configured via the command-line, GUI or API. A summary of the features controlled by parameters is given in the table below:
Category | Description |
---|---|
Lexical Preservation | Control preservation of processing instructions, comments, whitespace, entity references, DTD declarations etc. |
Text Comparison | Control the granularity of a comparison (e.g. word by word) |
Table Processing | Validation and processing of DITA tables |
Representation of Revisions | How the revision of elements, attributes or other node types is represented in the result. |
Map Control | For map comparisons. Affects the way DITA maps are copied and results structured |
Editor Optimization | Optimise the output to suit specific DITA editors |
DITA settings | Control features for specialization, element ordering, reference resolving etc. |
Full details on all these parameters can be found in DITA Compare Parameters. In addition to these parameters, a set of Configuration Properties can be used to control DITA Compare at a low level.
Command Line
The command-line syntax for invoking a comparison with comparison parameters is shown in the topic comparison example below, where param is the parameter name and value is the value to assign to it. Zero or more param=value pairs may follow the output file.
java -jar deltaxml-dita.jar compare topic in1.xml in2.xml out.xml [param=value]*
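When a comparison is launched from another program, the command line above can be assembled as an argument list. The sketch below follows the syntax shown in this guide; the class name `CompareCommand` is an illustrative assumption, and the value `FORMAT` is a placeholder, since the accepted values of output-format are listed in the Reference rather than here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper (not part of DITA Compare): builds the command line shown in this guide.
final class CompareCommand {

    // Assembles: java -jar deltaxml-dita.jar compare <type> <inA> <inB> <out> [param=value]*
    static List<String> build(String type, String inA, String inB, String out,
                              Map<String, String> params) {
        List<String> cmd = new ArrayList<>(List.of(
                "java", "-jar", "deltaxml-dita.jar", "compare", type, inA, inB, out));
        // Each parameter becomes one param=value argument, in the map's iteration order.
        params.forEach((name, value) -> cmd.add(name + "=" + value));
        return cmd;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("output-format", "FORMAT"); // placeholder value; see the Reference
        List<String> cmd = build("topic", "in1.xml", "in2.xml", "out.xml", params);
        System.out.println(String.join(" ", cmd));
        // The list could be passed to new ProcessBuilder(cmd) to run the comparison.
    }
}
```

For an In Place map comparison, this guide states that the output destination is replaced by the string inplace, so a caller would pass that string as the `out` argument. The command-line keywords for the map comparison types are not shown in this guide and should be taken from the Reference.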
Java API
Comparison parameters can be read and set via getter/setter methods on the top-level DitaCompare, DitaMapTopicsetCompare and DitaTopic objects. Naturally, parameters applicable only to map comparisons are not accessible from the DitaTopic object.
.NET API
Comparison parameters are exposed as read/write properties on the top-level DitaCompareDotNet, DitaMapCompareDotNet and DitaTopicCompareDotNet objects. Naturally, parameters applicable only to map comparisons are not accessible from the DitaTopicDotNet object.