The DeltaV2 Format for DITA Merge

Introduction

The XML output from DITA Merge conforms to the DeltaV2 format, which is common to other DeltaXML products. This section describes the parts of DeltaV2 that are particularly significant in the DITA Merge context. Please read the DeltaV2 Reference for a full description of the format.

Features of the DeltaV2 output

  • Contains all of the data from all of the input files; any of the input files can be extracted without loss.
  • Structure follows that of the DITA inputs (e.g. topic is the root element).
  • Describes all of the changes to elements, text and attributes using an expanded deltaV2 format that supports n-way merge.
  • DeltaV2 version identifiers are specified through the API prior to processing and appear in the output.
  • Format is versatile and optimised for further processing to resolve differences.

DeltaV2 attributes

The deltaxml:deltaV2 attributes in the DeltaV2 format contain a sequence of one or more 'version identifiers' joined by the '=' character or the '!=' character-pair.

The deltaV2 attributes conform to the following rules:

  • Where document versions have the same content at the level of the attribute, they are said to be in the same 'equality group', and the version identifiers within the group are joined using the '=' character.
  • Document versions with different content are separated by the '!=' character-pair.
  • Within equality groups and between equality groups (i.e. groups of versions separated by '!='), the versions are ordered according to the sequence specified in the deltaxml:version-order attribute on the document root element. The version-order attribute contains a comma-separated list of version identifiers.

A sample deltaV2 attribute value from DITA Merge:

base=anna!=ben=chris!=david
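
The rules above can be illustrated with a small sketch that splits an attribute value into its equality groups and checks the ordering against a version-order list. This is not part of the DITA Merge API: the function names are hypothetical, the version-order value used here is invented for the example, and ordering the groups by their first member is an assumption rather than something stated in this section.

```python
def parse_deltav2(value):
    """Split a deltaV2 attribute value into a list of equality groups."""
    # Split on '!=' first; each remaining piece is one equality group
    # whose members are joined by '='.
    return [group.split("=") for group in value.split("!=")]


def is_canonically_ordered(groups, version_order):
    """Check members and groups against the version-order sequence."""
    rank = {version: i for i, version in enumerate(version_order)}
    # Members inside each group must follow the version-order sequence.
    for group in groups:
        ranks = [rank[version] for version in group]
        if ranks != sorted(ranks):
            return False
    # Assumption: groups themselves are ordered by their first member.
    firsts = [rank[group[0]] for group in groups]
    return firsts == sorted(firsts)


# Hypothetical version-order value for the sample attribute above.
version_order = "base,anna,ben,chris,david".split(",")
groups = parse_deltav2("base=anna!=ben=chris!=david")
print(groups)  # [['base', 'anna'], ['ben', 'chris'], ['david']]
print(is_canonically_ordered(groups, version_order))  # True
```

Splitting on '!=' before '=' is safe because, as described under the constraints on version identifiers, neither character may appear inside an identifier.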

Version Identifiers

Version identifiers are user-specified labels assigned to the common ancestor and each revision document. These identifiers must be supplied each time a new document is added and each new identifier must have a unique value, until the DITA Merge process is reset.

Choosing values for version identifiers

Version identifiers may be user-specified or machine-generated, provided they meet the constraints outlined below. For example, the revision numbers or hash values used in a version control system could be used.

Constraints on version identifiers

Version identifiers should conform to the NMTOKEN production rule defined in the XML Specification. The same production rules are used in both the XML 1.0 and XML 1.1 specifications. This production rule allows many Unicode characters, but prohibits space characters as well as the '!' (hex value 0x21) and '=' (hex value 0x3D) characters, which are used as the version separators discussed above.
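
Candidate identifiers can be checked before they are passed to the API. The following is a minimal sketch, not part of DITA Merge: the function name is hypothetical and the character ranges are transcribed from the NameChar production in the XML 1.0 (Fifth Edition) grammar.

```python
import re

# NameStartChar ranges from the XML 1.0 (Fifth Edition) grammar.
_NAME_START = (
    "A-Za-z_:"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds '-', '.', digits and a few combining/extender characters.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"
# Nmtoken is one or more NameChar.
_NMTOKEN = re.compile("[" + _NAME_CHAR + "]+")


def is_valid_version_id(identifier):
    """Return True if the identifier matches the Nmtoken production."""
    return _NMTOKEN.fullmatch(identifier) is not None


for candidate in ["base", "rev-1.2", "3f9c2e1", "a!b", "a=b", "two words"]:
    print(candidate, is_valid_version_id(candidate))
```

Unlike XML names, an Nmtoken may begin with a digit, so revision numbers and hexadecimal hash values pass the check unchanged.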

New functionality in the DITA Merge variant

  • Supports n-way merge (instead of 2-way and 3-way for XML Compare and the legacy DeltaXML Sync respectively)
  • DeltaV2 version identifiers specified using the API appear in the result deltaV2 attributes instead of the single characters in XML Compare and DeltaXML Sync.
  • The version-order attribute is used to provide a persistent record of the order in which documents were added to the merger object.
  • Additional attributes, showing the results of a merge analysis, may also be present in the result. These are documented in Concurrent Merge Analysis.

Differences from the shared format

  • DeltaV2 member values: user-selected version identifier strings used instead of auto-generated single characters
  • Member sort order: the version identifiers within a DeltaV2 attribute appear in a consistent, canonical order throughout the document.
  • Number of members: There may be more than 3 members in a DeltaV2 attribute (to support n-way merge).
  • The required top-level attribute deltaxml:content-type will have different values in Merge results. The value 'merge-concurrent' represents the current Merge 1.0 algorithm corresponding to concurrent editing. Future versions of the Merge product will also support a 'travelling draft' model where there is not necessarily the concept of a common ancestor version. It is likely the value 'merge-consecutive' will be used for this algorithm. Other values may also be introduced for subsequent Merge developments.
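
As an illustration of the root-level attributes mentioned in this section, the sketch below reads deltaxml:version-order and deltaxml:content-type from a merge result using Python's standard library. The sample document is hypothetical and heavily simplified, the function name is invented, and the namespace URI bound to the deltaxml prefix is an assumption that should be checked against the DeltaV2 Reference.

```python
import xml.etree.ElementTree as ET

# Assumption: the deltaxml prefix is bound to this namespace URI.
DELTAXML_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hypothetical, simplified merge result; real output carries more markup.
SAMPLE = """\
<topic xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
       deltaxml:deltaV2="base=anna!=ben"
       deltaxml:version-order="base,anna,ben"
       deltaxml:content-type="merge-concurrent">
  <title deltaxml:deltaV2="base=anna=ben">Example</title>
</topic>
"""


def read_root_metadata(xml_text):
    """Return the version order and content type found on the root element."""
    root = ET.fromstring(xml_text)

    def attr(name):
        # ElementTree addresses namespaced attributes as '{uri}local-name'.
        return root.get("{%s}%s" % (DELTAXML_NS, name))

    order = [v.strip() for v in attr("version-order").split(",")]
    return {"content_type": attr("content-type"), "version_order": order}


metadata = read_root_metadata(SAMPLE)
print(metadata["content_type"])
print(metadata["version_order"])
```

Downstream processing could branch on the content-type value, so that results produced by a different merge algorithm are not interpreted as concurrent-merge output.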