The XML output from DITA Merge conforms to an extended version of the DeltaV2 format, which is common to other DeltaXML products. This section describes the parts of DeltaV2 that are particularly significant in the DITA Merge context; please read the DeltaV2 Reference for a full description of the format.
The formatting-element feature in DITA Merge uses DeltaV2 format extensions to represent formatting-element differences. See the Formatting Element Representations section for more details.
2. Features of the DeltaV2 output
- Contains all of the data from all of the input files; any of the input files can be extracted without loss.
- Structure follows that of the input files (e.g. the root element will be the same).
- Describes all of the changes to elements, text and attributes using an expanded DeltaV2 format that supports n-way merge.
- DeltaV2 version identifiers are specified through the API prior to processing and appear in the output.
- Format is versatile and optimised for further processing to resolve differences.
3. DeltaV2 attributes
The deltaxml:deltaV2 attributes in the DeltaV2 format contain a sequence of one or more 'version identifiers' joined by the '=' character or the '!=' character pair.
The deltaV2 attributes conform to the following rules:
- Where document versions have the same content at the level of the attribute, they are referred to as being within an 'equality group', and the version identifiers within this group are joined using the '=' character.
- Document versions with different content are separated by the '!=' character-pair.
- Within and between equality groups (i.e. groups of versions separated by '!='), the versions are ordered according to the sequence specified in the deltaxml:version-order attribute defined on the document root element. The version-order attribute contains a comma-separated list of version identifiers.
A sample deltaV2 attribute value from DITA Merge:
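As an illustration only, the value `ancestor=edit1!=edit2` used below is hypothetical and not taken from real DITA Merge output. This minimal Python sketch splits such a value into its equality groups and rebuilds it from a version order. The function names are invented for this example, and ordering the groups by their earliest member is an assumption based on the rules above.

```python
def parse_delta_v2(value):
    """Split a deltaV2 attribute value into a list of equality groups."""
    # Split on '!=' first: it separates groups whose content differs.
    # Within each group, '=' joins versions that share the same content.
    return [group.split("=") for group in value.split("!=")]


def format_delta_v2(groups, version_order):
    """Serialise equality groups using the positions given in version_order."""
    rank = {version: i for i, version in enumerate(version_order)}
    # Order the members inside every equality group.
    ordered = [sorted(group, key=rank.__getitem__) for group in groups]
    # Assumption: groups themselves are ordered by their earliest member.
    ordered.sort(key=lambda group: rank[group[0]])
    return "!=".join("=".join(group) for group in ordered)


# Hypothetical identifiers, as they might appear in a version-order attribute.
version_order = "ancestor,edit1,edit2".split(",")

print(parse_delta_v2("ancestor=edit1!=edit2"))
print(format_delta_v2([["edit2"], ["edit1", "ancestor"]], version_order))
```

Here the first call yields two equality groups (the ancestor and `edit1` agree, `edit2` differs), and the second call restores the canonical ordering from unordered groups.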
4. Version Identifiers
Version identifiers are user-specified labels assigned to the common ancestor and each revision document. An identifier must be supplied each time a new document is added and each new identifier must have a unique value.
4.1. Choosing values for version identifiers
Version identifiers may be user-specified or machine generated (provided they meet the constraints outlined below). For example, the revision numbers or hash values used in a version control system could be used.
4.2. Constraints on version identifiers
Version identifiers should conform to the NMTOKEN production rule defined in the XML Specification. The same production rules are used in both the XML 1.0 and XML 1.1 specifications. This production rule allows many Unicode characters, but prohibits the '!' (hex value 0x21) and '=' (hex value 0x3D) characters, which are used as the version separators discussed above, and also the space character.
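The constraints above can be checked before identifiers are passed to the product. The following Python sketch is not part of DITA Merge, and its helper names are hypothetical. It transcribes the NameChar ranges from the XML specification into a regular expression and also rejects duplicate identifiers.

```python
import re

# NameStartChar ranges from the XML specification (XML 1.0 fifth edition / XML 1.1).
_NAME_START = (
    ":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds '-', '.', digits, U+00B7 and two further ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"
# An NMTOKEN is one or more NameChar characters.
_NMTOKEN = re.compile("[" + _NAME_CHAR + "]+")


def is_valid_version_id(identifier):
    """Return True if the identifier matches the NMTOKEN production."""
    return _NMTOKEN.fullmatch(identifier) is not None


def check_version_ids(identifiers):
    """Raise ValueError for an invalid or repeated identifier."""
    seen = set()
    for identifier in identifiers:
        if not is_valid_version_id(identifier):
            raise ValueError(f"not an NMTOKEN: {identifier!r}")
        if identifier in seen:
            raise ValueError(f"duplicate identifier: {identifier!r}")
        seen.add(identifier)


# Revision numbers or hashes pass; separators and spaces do not.
print(is_valid_version_id("rev-1.2"), is_valid_version_id("3f9a2c"))
print(is_valid_version_id("a=b"), is_valid_version_id("two words"))
```

Because '!', '=' and the space character are not NameChar characters, a successful NMTOKEN match already excludes the separators.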
5. Other Functionality in the DITA Merge variant
- Supports n-way merge (instead of 2-way and 3-way for XML Compare and the legacy Sync respectively)
- DeltaV2 version identifiers specified using the API appear in the result deltaV2 attributes instead of the single characters in XML Compare and Sync.
- The version-order attribute is used to provide a persistent record of the order in which documents were added to the merger object.
- Additional attributes, showing the results of a merge analysis, may also be present in the result.
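To illustrate how the version-order record could be read back from a result, the Python sketch below parses a small hypothetical result fragment with the standard library. The namespace URI, the `topic` root element and the identifiers are assumptions made for this example; check them against an actual result file.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the deltaxml prefix in the result.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hypothetical result root; real output will carry further attributes and content.
SAMPLE = (
    f'<topic xmlns:deltaxml="{DELTAXML_NS}" '
    'deltaxml:version-order="ancestor,edit1,edit2" '
    'deltaxml:deltaV2="ancestor!=edit1!=edit2"/>'
)


def read_version_order(xml_text):
    """Return the version identifiers listed on the root element, in order."""
    root = ET.fromstring(xml_text)
    value = root.get(f"{{{DELTAXML_NS}}}version-order", "")
    # The attribute holds a comma-separated list; ignore empty entries.
    return [v.strip() for v in value.split(",") if v.strip()]


print(read_version_order(SAMPLE))
```

The returned list reflects the order in which the documents were added, so it can be used to rank versions when interpreting deltaV2 attribute values.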
5.1. Content Groups
A new element, the content group (deltaxml:contentGroup), is used to describe changes involving entity references, processing instructions and comments. This element is modelled on the deltaxml:textGroup element, but relaxes the restriction that the child elements (in this case deltaxml:content being equivalent to deltaxml:text) must only contain text() nodes.
A contentGroup provides alternative content that appears in similar positions in each of the files. The deltaxml:content child elements provide the alternative content and their deltaV2 attributes indicate which of the input files contained that content.
The following example indicates how a contentGroup is used to show that different entity references are used in corresponding locations in the merge inputs:
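As a rough illustration of how a contentGroup might be consumed, the Python sketch below uses a hypothetical fragment in which comments, rather than entity references, differ between versions, because the serialisation of entity references is not described in this section. The namespace URI, the `p` wrapper element, the identifiers, and the assumption that comments appear as literal XML comments inside deltaxml:content are all illustrative; real output may differ.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the deltaxml prefix in the result.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hypothetical fragment: ancestor and edit1 share one comment, edit2 has another.
FRAGMENT = (
    f'<p xmlns:deltaxml="{DELTAXML_NS}" deltaxml:deltaV2="ancestor=edit1!=edit2">'
    '<deltaxml:contentGroup deltaxml:deltaV2="ancestor=edit1!=edit2">'
    '<deltaxml:content deltaxml:deltaV2="ancestor=edit1"><!--draft--></deltaxml:content>'
    '<deltaxml:content deltaxml:deltaV2="edit2"><!--final--></deltaxml:content>'
    '</deltaxml:contentGroup>'
    '</p>'
)


def qname(local_name):
    """Build the ElementTree form of a name in the deltaxml namespace."""
    return f"{{{DELTAXML_NS}}}{local_name}"


def comment_alternatives(xml_text):
    """Map each version identifier to the comments it contributed."""
    # Keep comments in the tree; ElementTree discards them by default.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    result = {}
    for group in root.iter(qname("contentGroup")):
        for content in group.findall(qname("content")):
            comments = [node.text for node in content if node.tag is ET.Comment]
            # Each deltaxml:content carries a single equality group.
            for version in content.get(qname("deltaV2"), "").split("="):
                result[version] = comments
    return result


print(comment_alternatives(FRAGMENT))
```

Each deltaxml:content child is attributed to every version named in its deltaV2 attribute, which mirrors how the alternatives are described above.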
6. Differences from the shared format
- DeltaV2 member values: user-selected version identifier strings used instead of auto-generated single characters
- Number of members: There may be more than 3 members in a DeltaV2 attribute (to support n-way merge).
- The required top-level attribute deltaxml:content-type will have different values in DITA Merge results. The values 'simplified-merge-concurrent' and 'simplified-merge-sequential' represent concurrent and sequential editing respectively.