Output Formats
DITA Compare can represent the differences between two inputs using a variety of output formats.
Each output format has its own limitations on what types of change are supported and where changes are allowed to take place. For example, DITA's own change markup language does not support changes in attributes.
DITA Compare currently provides the following output formats:
DITA Markup | DITA's own change markup scheme. This is the default option. |
Arbortext Tracked Changes | Arbortext Change Tracking Markup Specification. |
FrameMaker Tracked Changes | FrameMaker Tracked Changes format, which is supported by the Adobe FrameMaker Editor. |
Oxygen Tracked Changes | Oxygen Tracked Changes format, which is supported by the Oxygen Editor and Author products. |
XMetaL Tracked Changes | XMetaL Tracked Changes format, which is supported by the XMetaL Editor. |
DITA Markup
When using DITA Markup, the output of a comparison is itself a DITA document whose changes are marked up using a DITA-specific scheme.
The DITA Markup output format uses rev and status attributes to identify change. These attributes can be added to many DITA elements. The rev attribute can take any value and the status attribute can take one of the following values: 'changed', 'new', 'deleted', 'unchanged'. These values can be used to highlight changes between two given versions of a document.
Each of these attributes is optionally added to elements that have changed (as detailed below). The rev attribute values default to 'deltaxml-delete' for elements that existed in the original document but are no longer present, and 'deltaxml-add' for elements that exist in the new version of the document but were not present in the original. These values can be changed using the 'old-revision' and 'new-revision' parameters.
The status values are used on elements as follows:
when an element existed in the original document but is no longer present, a status="deleted" attribute is added to that element
when an element exists in the new version of the document but was not present in the original, a status="new" attribute is added to that element
for certain elements, attribute changes cause a status="changed" attribute to be added to the element
Changes to text content are handled slightly differently. Since it is not possible to add attributes to text, it must first be wrapped in an element that can carry the attribute. The phrase (<ph>) element is intended for this kind of purpose. Text that was in the original document but is no longer present is wrapped with <ph status="deleted">...</ph>, and text that is only included in the new version of the document is wrapped with <ph status="new">...</ph>.
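As an illustration of how a consumer might read this markup, the following minimal Python sketch groups elements by their status attribute. The function name collect_status and the sample input are hypothetical and are not part of DITA Compare; the sketch assumes the result parses as a standalone XML document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_status(xml_text):
    """Group (tag, text content) pairs by the value of the status attribute."""
    root = ET.fromstring(xml_text)
    found = defaultdict(list)
    for elem in root.iter():
        status = elem.get("status")
        if status is not None:
            found[status].append((elem.tag, "".join(elem.itertext())))
    return dict(found)


# Hypothetical fragment of a comparison result using the default <ph> wrapper.
sample = (
    '<p>The <ph status="deleted">quick</ph>'
    '<ph status="new">slow</ph> fox</p>'
)
changes = collect_status(sample)
```

The same loop could filter on the rev attribute instead if revision-based flagging is preferred.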
It is possible to change the element used to mark phrases. This should be used if the inputs are specialized DITA and the <ph> element has been specialized into a new element type. The element name to use can be set using the 'phrase-element-name' parameter.
If text changes are made in a context where <ph> is not a valid element, there is the option to use textual markers to display the change. This option is controlled using the 'show-non-phrase-changes' parameter, which is set to true by default. Text changes are marked up as follows:
text that existed in the original document but is no longer present is wrapped like this: -[[...]]-
text that exists in the new version of the document but was not present in the original is wrapped like this: +[[...]]+
As well as marking the text in this fashion, the element that contains the text will also have a status="changed" attribute added to it.
If the 'show-non-phrase-changes' parameter is set to false, then only the text from the new version of the document will be output, without any marking.
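The textual markers described above are regular enough to be post-processed. The following Python sketch, which is an illustration rather than anything shipped with DITA Compare, recovers the old and new wording from a marked-up string. The names MARKER and split_markers are hypothetical, and the sketch assumes the marker sequences do not otherwise occur in the text.

```python
import re

# Matches -[[deleted text]]- or +[[added text]]+ (non-greedy, may span lines).
MARKER = re.compile(
    r"-\[\[(?P<deleted>.*?)\]\]-|\+\[\[(?P<added>.*?)\]\]\+",
    re.DOTALL,
)


def split_markers(text):
    """Return (old_text, new_text) reconstructed from text with change markers."""
    old = MARKER.sub(lambda m: m.group("deleted") or "", text)
    new = MARKER.sub(lambda m: m.group("added") or "", text)
    return old, new


old, new = split_markers("Press -[[Enter]]-+[[Return]]+ to continue")
```

Here old holds the wording from the original document and new holds the wording from the new version.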
Tracked Changes
DITA Compare also supports a variety of change tracking output formats.
Here, the intention is to enable the differences to be viewed, accepted, and rejected in an editor or word-processor that supports the given tracked changes format.
Arbortext
When using Arbortext Tracked Changes Markup, the output of the comparison is an Arbortext tracked change version of a DITA document. Here, DITA elements can contain Arbortext tracked change elements, and vice versa.
One consequence of using XML elements to represent tracked changes is that the resulting tracked change document does not conform to the DITA specification. In order to return an Arbortext tracked change document back to the DITA specification all changes need to be accepted or rejected (and the tracked change author information has to be removed).
Assuming that the inputs to the comparison are valid DITA documents and all the changes to the output are accepted (or rejected) as previously discussed, then the resulting document will be a valid DITA document. Note that in general it is not possible to guarantee that an arbitrary combination of 'accepted' and 'rejected' changes will result in a valid document, due to the granularity of change.
The generated tracked changes use three of the available tracked change elements:
atict:add | For inserted content |
atict:del | For deleted content |
atict:chgm | For attribute modification (outside the context of a table). |
Changes within comments and CDATA Sections result in the whole of the old version of the text being marked as deleted, and the whole of the new version of the text being marked as inserted.
The Arbortext tracked change format does not support changes to processing instructions or those comments that are outside the body of the DITA document. It does, however, support both cell and row level changes within tables.
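To make the idea of accepting every change concrete, here is a rough Python sketch that unwraps atict:add elements and drops atict:del elements. It is not how Arbortext or DITA Compare implement acceptance. The namespace URI bound to the atict prefix is an assumption that should be checked against real output, the helper names are hypothetical, and atict:chgm elements and tracked change author information are deliberately left unhandled.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the atict prefix; verify it against your own output.
ATICT_NS = "http://www.arbortext.com/namespace/atict"
ADD = f"{{{ATICT_NS}}}add"
DEL = f"{{{ATICT_NS}}}del"


def _attach_text(parent, index, text):
    """Append text at the point just before parent[index]."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + text


def accept_all(parent):
    """Accept changes in place: unwrap atict:add, drop atict:del."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag == DEL:
            # Discard the deleted content but keep the text that follows it.
            del parent[i]
            _attach_text(parent, i, child.tail)
        elif child.tag == ADD:
            # Replace the wrapper with its own text and children.
            del parent[i]
            _attach_text(parent, i, child.text)
            promoted = list(child)
            for offset, node in enumerate(promoted):
                parent.insert(i + offset, node)
            _attach_text(parent, i + len(promoted), child.tail)
        else:
            accept_all(child)
            i += 1


sample = (
    f'<p xmlns:atict="{ATICT_NS}">Hello '
    "<atict:del>old</atict:del><atict:add>new</atict:add> world</p>"
)
root = ET.fromstring(sample)
accept_all(root)
accepted_text = "".join(root.itertext())
```

Swapping the roles of ADD and DEL gives the corresponding reject-all behaviour.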
Adobe FrameMaker
The FrameMaker Tracked Changes Markup output format is a valid DITA document that includes annotations to represent changes in the document.
This format employs FrameMaker's method for tracking changes, exploiting XML processing-instructions and comments to mark additions and deletions within documents.
The FrameMaker tracked change format is restricted to the Author and WYSIWYG views. Because these views do not support edits within XML marked as CDATA, changes to CDATA sections are converted to normally parsed XML content. This format uses a pseudo-entity, '&fm-double-hyphen;', to allow two adjacent hyphen characters to be represented within comments, which the tracked change format uses to contain deleted content.
As with most editors, FrameMaker has a few limitations on what types of change can be tracked for different element types; one example is the addition or deletion of table rows. For this specific example, the output format defaults to showing changes as changes in the text content of the row cells (this behaviour is controlled by the framemaker-tcs-table-change-mode parameter). Other limitations have not been fully explored, however, and it is possible that some changes marked in the output format will be ignored by FrameMaker.
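The pseudo-entity is needed because XML does not allow the sequence "--" inside a comment. The following Python sketch shows one way such a substitution could work; it only illustrates the idea and is not taken from DITA Compare. The function names are hypothetical, and the exact rules the product applies (for example, to runs of three or more hyphens) are not stated in this document.

```python
PSEUDO_ENTITY = "&fm-double-hyphen;"


def encode_for_comment(text):
    """Replace each pair of adjacent hyphens so the text is safe inside a comment."""
    return text.replace("--", PSEUDO_ENTITY)


def decode_from_comment(text):
    """Restore the original hyphen pairs from the pseudo-entity."""
    return text.replace(PSEUDO_ENTITY, "--")


encoded = encode_for_comment("use the --verbose flag")
restored = decode_from_comment(encoded)
```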
Oxygen
When using Oxygen Tracked Changes Markup, the output of the comparison is itself a DITA document.
This output format uses processing instructions to identify change, where deleted content is typically contained within the processing instruction and inserted content is typically sandwiched between two processing instructions, one marking the start of the insertion and the other the end. Hence, removing (or ignoring) the processing instructions has the effect of accepting all changes to the document.
Comments and CDATA Sections are handled specially, as processing instructions cannot be placed inside their content. Instead, changes are identified by a sequence of processing instructions that immediately follow the Comment or CDATA Section, which mark the location of the change by using a character counting technique. Here, deleted content is contained in the processing instructions, whereas inserted content is already in the Comment or CDATA Section text itself. This preserves the principle of being able to accept all the changes within a document by either ignoring or removing the tracked change processing instructions.
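Because accepting all changes amounts to discarding the tracked change processing instructions, the operation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a supported tool: the function name accept_all_changes is hypothetical, and the 'oxy_' target prefix and the processing instruction names in the sample are assumptions about Oxygen's naming that should be checked against real output.

```python
import re


def accept_all_changes(xml_text, target_prefix="oxy_"):
    """Remove every processing instruction whose target starts with target_prefix."""
    pattern = re.compile(
        r"<\?" + re.escape(target_prefix) + r".*?\?>",
        re.DOTALL,
    )
    return pattern.sub("", xml_text)


# Hypothetical sample; the processing instruction names are assumed.
sample = (
    '<?xml version="1.0"?>'
    '<p>A <?oxy_delete content="quick"?>'
    "<?oxy_insert_start?>slow<?oxy_insert_end?> fox</p>"
)
accepted = accept_all_changes(sample)
```

The XML declaration and any unrelated processing instructions are left in place because only targets with the given prefix are matched.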
The Oxygen tracked change format does not support changes to attributes, processing instructions, or those comments that are outside the body of the DITA document. It does, however, support both cell and row level changes within tables.
XMetaL
When using XMetaL Tracked Changes Markup, the output of the comparison is itself a DITA document.
This output format uses processing instructions to identify change in a similar manner to Oxygen's tracked change format. However, it does not support changes to comments, changes within CDATA sections, or row or cell level table changes.
Changes within CDATA Sections are handled by moving the change to the CDATA Section level as a whole. Therefore, any textual change within a CDATA Section results in the whole of the old version of the CDATA Section being marked as deleted, and the whole of the new version being marked as inserted.
There is a special XMetaL specific parameter (xmetal-tcs-table-change-mode) which controls what happens when row or cell level table changes are present. These changes can be pushed down to the cell content level, where the content of each cell within the changed region is appropriately deleted and inserted; this is the 'default' behaviour. The second option is that changes to rows or cells can be pushed up to the table level, so that the old and new versions of the table as a whole are tracked. The third option is that changes can simply be ignored (which mirrors what the XMetaL editor would do). However, selecting the ignore mode means that all changes within a table are ignored, not just those that are at the 'row' or 'cell' level. This is deliberate, as we believe that partial tracking of changes within a table would be confusing.