Technical Specification

1. Overview

DeltaXML DITA Merge takes three or more well-formed Darwin Information Typing Architecture (DITA) inputs and generates a single well-formed XML file describing the differences between them. This file is known as a delta file, and it can be post-processed to create a merged DITA document. This specification uses the term 'file', but the inputs and outputs may use other representations, including strings, in-memory trees or event streams.

The DITA Merge software provides a procedural interface that can be embedded in other Java-based software.

2. DITA Versions and Inputs

DITA Merge targets the language features of OASIS DITA 1.1 and DITA 1.2. XML catalog support is provided by the tool. For other versions or specialisations, some configuration of the catalog system will be necessary. DITA Merge merges DITA topic inputs only; it does not merge DITA Maps.

3. Delta Files

A DeltaXML delta file has the same basic structure as the files that have been compared, with some additional attributes and elements. An XML namespace (the DeltaXML namespace) distinguishes these additional elements and attributes from those found in the input files. The delta file includes unchanged elements and attributes. The delta file provides a structured representation of the input files as a single file in which common data is shared.

4. XML Processing

DITA Merge is built on XML Compare and handles whitespace in the same way. 

Comments and processing instructions can be preserved so that they appear in the delta file. Internal parsed general entities can be expanded or preserved. CDATA sections can be expanded or preserved.

DeltaXML handles namespaces and will detect elements in the same namespace even if the namespace prefix values are different. An element or attribute in a namespace may have a different namespace prefix in the delta file from that used in the input file.
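
The prefix-independent matching described above can be illustrated with the standard JAXP DOM parser. This is a minimal sketch, not the DeltaXML API: the class NamespaceMatch and its methods are hypothetical names introduced for illustration. It compares two elements by namespace URI and local name and ignores the prefix.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the DeltaXML API.
final class NamespaceMatch {

    // Parses an XML string with namespace awareness and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI/getLocalName
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Two elements have the same name when namespace URI and local name agree.
    // The prefix is deliberately not consulted.
    static boolean sameName(Element a, Element b) {
        return Objects.equals(a.getNamespaceURI(), b.getNamespaceURI())
            && Objects.equals(a.getLocalName(), b.getLocalName());
    }
}
```

With this sketch, the roots of `<x:topic xmlns:x='urn:example'/>` and `<y:topic xmlns:y='urn:example'/>` are reported as having the same name, while an unprefixed `<topic/>` in no namespace is not.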

5. Merge Process

DITA Merge merges the DITA files, taking account of the tree structure of the files and identifying corresponding elements in the files. Corresponding elements will have the same element local name and namespace and will have corresponding parent elements. The root elements of the files must have the same local name and namespace. DITA Merge determines the alignment at each level in the tree structure between the files. The alignment algorithm determines the longest common subsequence of corresponding elements. The alignment algorithm gives precedence to elements that are exactly equal over those that have just the same element name and namespace.
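
The longest-common-subsequence step can be sketched with the textbook dynamic-programming algorithm. This is an illustration under stated assumptions, not the DITA Merge implementation: each child element is reduced to a single string (for example its namespace and local name), the class LcsAlign is a hypothetical name, and the precedence given to exactly equal elements is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of LCS alignment; not the DeltaXML algorithm itself.
final class LcsAlign {

    // Returns index pairs {i, j} with a.get(i).equals(b.get(j)) that together
    // form a longest common subsequence of the two child sequences.
    static List<int[]> align(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();
        // len[i][j] holds the LCS length of the suffixes a[i..] and b[j..].
        int[][] len = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (a.get(i).equals(b.get(j))) {
                    len[i][j] = len[i + 1][j + 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i + 1][j], len[i][j + 1]);
                }
            }
        }
        // Walk forward through the table, collecting the aligned positions.
        List<int[]> pairs = new ArrayList<int[]>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (a.get(i).equals(b.get(j))) {
                pairs.add(new int[] {i, j});
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++; // a[i] has no partner in the chosen subsequence
            } else {
                j++; // b[j] has no partner in the chosen subsequence
            }
        }
        return pairs;
    }
}
```

For the child sequences title, p, ul, p and title, ul, p, section, the sketch aligns title with title, ul with ul, and the final p with p. The first p and the section element are left unaligned.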

The DITA inputs are loaded into DITA Merge in order. The order is recorded in an attribute on the root element of the merged delta file.

For a delta with type 'merge-concurrent', one input file is considered to be the common ancestor from which the other input files have been derived. As each successive file is loaded into the delta, it is first aligned with the common ancestor, and this alignment takes precedence over alignment between this file and other files previously loaded into the delta.

DITA Merge can use key values, identified to the software using an attribute in the DeltaXML namespace, to identify corresponding elements in the inputs. Alignment of elements with the same namespace, local name and key will take precedence in the alignment process over other alignment criteria. Elements with different keys in the files will not be considered to correspond.

DITA Merge treats elements as ordered, i.e. a change in order is identified as a change. Optionally any element can be identified to DeltaXML as orderless, using an attribute in the DeltaXML namespace which must be present in all files. In this case the child elements may appear in any order in the files and DeltaXML will match corresponding elements. Within an orderless element, a corresponding element is an element with the same name, namespace and key or an element that is exactly equal through its tree structure. Orderless elements must have element-only content.
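
Matching within an orderless element can be sketched as a position-independent lookup. This is a simplified illustration, not the DeltaXML implementation: the class OrderlessMatch and its methods are hypothetical, each child is reduced to an identity string built from namespace, local name and key, and the alternative rule (children that are exactly equal through their tree structure) is not modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of orderless matching; not part of the DeltaXML API.
final class OrderlessMatch {

    // Builds one identity string from namespace URI, local name and key.
    static String identity(String namespaceUri, String localName, String key) {
        return "{" + namespaceUri + "}" + localName + "#" + key;
    }

    // Pairs each child of a with the first unused child of b that has the same
    // identity, regardless of position. Returns index pairs {i, j}.
    static List<int[]> match(List<String> a, List<String> b) {
        // Index the positions in b by identity.
        Map<String, List<Integer>> positions = new HashMap<String, List<Integer>>();
        for (int j = 0; j < b.size(); j++) {
            List<Integer> list = positions.get(b.get(j));
            if (list == null) {
                list = new ArrayList<Integer>();
                positions.put(b.get(j), list);
            }
            list.add(j);
        }
        List<int[]> pairs = new ArrayList<int[]>();
        for (int i = 0; i < a.size(); i++) {
            List<Integer> candidates = positions.get(a.get(i));
            if (candidates != null && !candidates.isEmpty()) {
                // remove(0) takes the earliest unused position, so it is used once only.
                pairs.add(new int[] {i, candidates.remove(0)});
            }
        }
        return pairs;
    }
}
```

Given children keyed a, b, c in one input and c, a, d in another, the sketch pairs the two a children and the two c children even though their positions differ. The children keyed b and d have no counterpart.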

DITA Merge ignores the order of attributes. Changes to attributes are represented using elements in the DeltaXML namespace.
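
An order-independent attribute comparison can be sketched by treating each element's attributes as a name-to-value map. This is an illustration only: the class AttributeDiff is a hypothetical name, and the sketch reports which attribute names differ without attempting to reproduce the DeltaXML element representation of those changes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of order-independent attribute comparison.
final class AttributeDiff {

    // Returns the names of attributes that were added, removed or changed in
    // value. The iteration order of either input map has no effect on the result.
    static Set<String> changedNames(Map<String, String> a, Map<String, String> b) {
        Set<String> names = new TreeSet<String>(a.keySet());
        names.addAll(b.keySet());
        Set<String> changed = new TreeSet<String>();
        for (String name : names) {
            String va = a.get(name);
            String vb = b.get(name);
            // A missing attribute (null) differs from any present value.
            boolean differs = (va == null) ? (vb != null) : !va.equals(vb);
            if (differs) {
                changed.add(name);
            }
        }
        return changed;
    }
}
```

Two maps holding the same attributes in a different insertion order produce an empty result. A map that changes one value and adds one attribute produces exactly those two names.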

6. System Requirements

DITA Merge requires Java Standard Edition JRE version 6.0 or later. We test on Solaris (Intel 64 bit) and macOS (Intel 64 bit). To qualify for support, any reported problem should be reproducible on at least one of these platforms.
