XML Compare Delta Format (deltaV1)
XML Compare Delta Format Description (deltaV1 for versions of XML Compare prior to 5.0 (June 2008))
NOTE: If you are using DeltaXML Core version 5.0 or later, please refer to the documentation on the deltaV2 format.
The XML Compare Delta format is a representation of the changes between two XML documents. A Delta can be re-combined with either of the original documents to generate the other, i.e. 'old'+delta to generate 'new' or 'new'-delta to generate 'old'. Delta files have the same overall structure as the files being compared, with a few additional attributes and elements. These special attributes and elements are introduced to represent the differences between the files.
In this description we denote the input documents as 'old' and 'new'.
Namespace and Prefix
The namespace for a Delta document is http://www.deltaxml.com/ns/well-formed-delta-v1 and the preferred prefix is deltaxml:.
Namespaces in the delta file are declared on the top (root) element. In general, the namespace prefixes used will be the same as those used in the input files. If the input files use different prefixes for the same namespace, the first one encountered will be adopted in the delta file. For all versions prior to 4.2, all elements that have a namespace are assigned a prefix in the delta file, and this is true even if some elements in the input files did not have prefixes. In cases where there is no prefix defined in the input files, a prefix will be generated, for example p0:.
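As a rough illustration of working with this namespace, the following Python sketch reads the deltaxml:delta attribute from a delta document using the standard library. The sample document and the helper name `delta_value` are made up for this example; only the namespace URI comes from the text above.

```python
import xml.etree.ElementTree as ET

# Namespace URI as given in the text above.
DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hypothetical delta document, invented for illustration only.
SAMPLE = (
    '<example xmlns:deltaxml="' + DELTA_NS + '" deltaxml:delta="WFmodify">'
    '<name deltaxml:delta="unchanged"/>'
    '</example>'
)

def delta_value(element):
    """Return the deltaxml:delta value of an element, or None if absent."""
    # ElementTree exposes namespaced attributes in Clark notation: {uri}local.
    return element.get("{" + DELTA_NS + "}delta")

root = ET.fromstring(SAMPLE)
print(delta_value(root))     # WFmodify
print(delta_value(root[0]))  # unchanged
```

Looking attributes up by namespace URI rather than by prefix means the code does not depend on whether the delta file uses deltaxml: or some other prefix.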
Elements and Attributes
This is a list of the elements used by this format:
Element name | Content | Purpose |
---|---|---|
deltaxml:PCDATAmodify | A sequence of one deltaxml:PCDATAold and one deltaxml:PCDATAnew element | To indicate a change to the parsed character data (PCDATA) within an element |
deltaxml:PCDATAold | PCDATA, i.e. text | To record a text item that appeared in the 'old' input document. |
deltaxml:PCDATAnew | PCDATA, i.e. text | To record a text item that appeared in the 'new' input document. |
deltaxml:exchange | A sequence of one deltaxml:old and one deltaxml:new element | To indicate an exchange, at an equivalent place in the two input documents, of two elements or of an element and a PCDATA string. A deltaxml:exchange will be used whenever two items in the files being compared are deemed to correspond with each other, because of their positions in the files, but they have a different type. For some processing of a delta it may be convenient to remove this wrapper element and a filter is provided to do this. |
deltaxml:old | A single element or PCDATA string. | To record an item that appeared in the 'old' input document. |
deltaxml:new | A single element or PCDATA string. | To record an item that appeared in the 'new' input document. |
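The element table above suggests that a text change can be read back by looking for deltaxml:PCDATAmodify wrappers. The Python sketch below assumes that layout; the sample document and the function name `text_changes` are hypothetical and are not part of XML Compare.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
Q = "{" + DELTA_NS + "}"  # Clark-notation prefix for delta element names

# Hypothetical delta fragment, built from the element descriptions above.
SAMPLE = (
    '<person xmlns:deltaxml="' + DELTA_NS + '" deltaxml:delta="WFmodify">'
    '<firstName deltaxml:delta="WFmodify">'
    '<deltaxml:PCDATAmodify>'
    '<deltaxml:PCDATAold>J</deltaxml:PCDATAold>'
    '<deltaxml:PCDATAnew>John</deltaxml:PCDATAnew>'
    '</deltaxml:PCDATAmodify>'
    '</firstName>'
    '</person>'
)

def text_changes(root):
    """List (parent tag, old text, new text) for each PCDATAmodify element."""
    changes = []
    for parent in root.iter():
        # findall with a plain tag name only matches direct children.
        for mod in parent.findall(Q + "PCDATAmodify"):
            old = mod.findtext(Q + "PCDATAold", default="")
            new = mod.findtext(Q + "PCDATAnew", default="")
            changes.append((parent.tag, old, new))
    return changes

print(text_changes(ET.fromstring(SAMPLE)))
```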
This is a list of the attributes used by Core Delta:
Attribute name | Content | Purpose |
---|---|---|
deltaxml:delta | One of the values: add, delete, unchanged, WFmodify or WFmodifyUnordered. The value "add" means the element appears only in the new document. The value "delete" means the element appears only in the old document. The value "unchanged" means that the element appears in both documents and there are no differences in attributes or child elements in the two documents. The value "WFmodify" (Well Formed modify) means that the element appears in both old and new documents and is different. The value "WFmodifyUnordered" is similar to WFmodify except that this is used for an element which is orderless, i.e. it had an attribute deltaxml:ordered="false". | To indicate how the containing element has been changed. |
deltaxml:new-attributes | The value will be a list of any attributes (name and value) that appear in the new input document. If an attribute appears also in deltaxml:old-attributes then this means it has been modified. An attribute that appears only in deltaxml:new-attributes has been added. | To show changes to attributes. |
deltaxml:old-attributes | The value will be a list of any attributes (name and value) which appear in the old input document. If an attribute appears also in deltaxml:new-attributes then this means it has been modified. An attribute that appears only in deltaxml:old-attributes has been deleted. | To show changes to attributes. |
deltaxml:ordered | The value may be 'true', the default, meaning all child elements are ordered. Or it may have the value 'false', meaning the child elements will be compared as if the order is not important (orderless). This is a control attribute and is not subject to change. | This attribute may be used in the input documents to indicate whether or not the order of the child elements is significant. |
deltaxml:key | Any string which represents a key for the enclosing element within the context of the parent element. This is a control attribute and is not subject to change. | This attribute may be used in the input documents to provide a key for an element, in order to identify correspondence between two elements at the same hierarchical level in each of the two input documents. |
Description
There is no DTD or Schema for a Delta document, but the Delta will have the same look and feel as the original documents. There is a set of simple rules which apply to the Delta format.
Elements, attributes and text that are identified by DeltaXML as common to both input documents are shared in the Delta. A subtree that appears unchanged in one or more documents will appear in the Delta almost exactly as it appeared in the original document(s).
Added, deleted or changed attributes are encoded and their values are delimited using a single character. The character used to delimit attribute values will generally be a double quote (represented as the entity &quot;), a single quote or a vertical bar. The character is picked according to the content of the attribute value, i.e. if the value contains a double quote then the double quote cannot be used as a delimiter. If an attribute value that is changed includes all the delimiter characters, this will cause an error. From Version 2.4, the possible choice of delimiters has been increased to include: "'|~%^+`/\$?,;!
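The delimiter rule described above can be sketched in Python as follows. This is only an illustration: the candidate list is the Version 2.4 set from the text, but the order of preference, the name=value layout and the function names `pick_delimiter` and `encode_attribute` are assumptions rather than documented behaviour.

```python
# Candidate delimiters listed in the text for Version 2.4 onwards.
# The order of preference is an assumption.
DELIMITERS = "\"'|~%^+`/\\$?,;!"

def pick_delimiter(value, candidates=DELIMITERS):
    """Return the first candidate character that does not occur in value."""
    for ch in candidates:
        if ch not in value:
            return ch
    # Mirrors the error case described in the text.
    raise ValueError("attribute value contains every candidate delimiter")

def encode_attribute(name, value):
    """Encode one attribute as name=<delim>value<delim> (assumed layout)."""
    delim = pick_delimiter(value)
    return name + "=" + delim + value + delim

print(encode_attribute("age", "37"))
print(encode_attribute("note", 'say "hi"'))
```

Because the delimiter is chosen per value, a reader of the encoded string can recover the value by taking the character after the equals sign as the delimiter and scanning to its next occurrence.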
The handling of whitespace needs to be understood to avoid unexpected results. Whitespace is considered significant in XML except when a DTD or Schema is provided and the parser can identify some whitespace, e.g. between elements, as ignorable. In this case, DeltaXML ignores it. Often it is best to remove all extra whitespace before comparison using one of the standard filters provided with DeltaXML.
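To make the whitespace point concrete, the Python sketch below removes text nodes that consist only of whitespace before a comparison. It is not one of the standard filters shipped with DeltaXML; the function name `strip_whitespace_only_text` and the sample input are made up, and text that contains any non-whitespace character is left untouched.

```python
import xml.etree.ElementTree as ET

def strip_whitespace_only_text(element):
    """Drop whitespace-only text between elements, in place."""
    for node in element.iter():
        # .text is the text before the first child; .tail follows the end tag.
        if node.text is not None and node.text.strip() == "":
            node.text = None
        if node.tail is not None and node.tail.strip() == "":
            node.tail = None
    return element

# Hypothetical input with indentation between elements.
doc = ET.fromstring(
    "<person>\n  <firstName>John</firstName>\n"
    "  <lastName> Smith </lastName>\n</person>"
)
strip_whitespace_only_text(doc)
print(ET.tostring(doc, encoding="unicode"))
```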
Note that there is no representation of 'move' where an element is repositioned within its siblings. Such situations are represented using the delete, add or exchange options shown above.
Rules
Summary of Delta format
- The root element has a deltaxml:delta attribute with a value showing whether or not the two documents are the same.
- The deltaxml:delta attribute takes one of five allowed values.
- All elements will have a deltaxml:delta attribute unless there can be no changes to child elements or text, i.e. when the value of the attribute on an enclosing element is either add, delete or unchanged.
- The value of each deltaxml:delta attribute will be consistent with the deltaxml:delta attribute on its parent, i.e. the value of the deltaxml:delta attribute on the parent will be either deltaxml:delta="WFmodifyUnordered" or deltaxml:delta="WFmodify". Note that child elements of any element with deltaxml:delta="add", deltaxml:delta="delete" or deltaxml:delta="unchanged" will not have a deltaxml:delta attribute.
- An element with deltaxml:delta="delete" will appear with all its attributes and child elements exactly as it was in the old input document.
- An element with deltaxml:delta="add" will appear with all its attributes and child elements exactly as it was in the new input document.
- No child elements or attributes (except deltaxml:key and deltaxml:ordered attributes) from the input documents will appear on an element with deltaxml:delta="unchanged" unless the delta is a full context delta.
- A child element of an element with deltaxml:delta="WFmodifyUnordered" can only have an attribute deltaxml:delta="WFmodify" if it also has a deltaxml:key attribute.
- Any text (PCDATA) that is different in the two input documents will appear as a grand-child within either a deltaxml:exchange element or a deltaxml:PCDATAmodify element.
- Unchanged attributes of any element remain as attributes and appear only in the full context delta.
- Changed attributes are held in the deltaxml:old-attributes and deltaxml:new-attributes attributes.
- The deltaxml:key attribute and other control attributes remain as attributes and always have the same value as in the input files.
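A few of these rules lend themselves to a mechanical check. The Python sketch below tests three of them: the root carries a deltaxml:delta attribute, every value is one of the five allowed values, and nothing inside an add, delete or unchanged element carries the attribute. The function name `check_delta` and the sample documents are invented for illustration, and the remaining rules are not covered.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
DELTA_ATTR = "{" + DELTA_NS + "}delta"

ALLOWED = {"add", "delete", "unchanged", "WFmodify", "WFmodifyUnordered"}
CLOSED = {"add", "delete", "unchanged"}  # no delta attributes expected below

def check_delta(root):
    """Return a list of rule violations found in a delta tree."""
    problems = []
    if root.get(DELTA_ATTR) is None:
        problems.append("root element has no deltaxml:delta attribute")

    def walk(element, inside_closed):
        value = element.get(DELTA_ATTR)
        if value is not None:
            if value not in ALLOWED:
                problems.append(f"<{element.tag}>: unknown value {value!r}")
            if inside_closed:
                problems.append(
                    f"<{element.tag}>: deltaxml:delta inside add/delete/unchanged"
                )
        for child in element:
            walk(child, inside_closed or value in CLOSED)

    walk(root, False)
    return problems

# Hypothetical documents: the first follows the rules, the second does not.
GOOD = ('<a xmlns:deltaxml="' + DELTA_NS + '" deltaxml:delta="WFmodify">'
        '<b deltaxml:delta="add"><c/></b></a>')
BAD = ('<a xmlns:deltaxml="' + DELTA_NS + '" deltaxml:delta="WFmodify">'
       '<b deltaxml:delta="add"><c deltaxml:delta="unchanged"/></b></a>')

print(check_delta(ET.fromstring(GOOD)))
print(check_delta(ET.fromstring(BAD)))
```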
Full Context Delta
A delta file normally represents just the changes between two files, and does not include data that has not changed. DeltaXML provides an option to generate a 'full delta' which includes unchanged data. The 'full delta' provides a structured representation of two files within a single file where the common data is shared.
Examples
Examples of Delta for Elements
Document A | Document B |
---|---|
<example> </person> | <example> |
And the Delta for this will be as follows:
Delta | Comments |
---|---|
<example deltaxml:delta="WFmodify"> | Element <lastName> is added in the 'new' document. |
Examples of Delta for Text
Document A | Document B |
---|---|
<example> | <example> |
And the Delta for this will be as follows:
Delta | Comments |
---|---|
<example deltaxml:delta="WFmodify"> | The text in <firstName> is "J" in the old document and "John" in the new. The text in <lastName> is the same in both documents, and is shown here because this is a Full Context delta. |
Examples of Delta for Attributes
Document A | Document B |
---|---|
<example> | <example> |
And the Delta for this will be as follows:
Delta | Comments |
---|---|
<example deltaxml:delta="WFmodify"> deltaxml:new-attributes="age="37"" | The attribute 'gender' is unchanged and so appears as a regular attribute. The attribute 'age' has a value of 36 in the old document and 37 in the new document. |
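The attribute example can be read back programmatically along the following lines. This Python sketch assumes that each encoded entry has the form name, equals sign, delimiter, value, delimiter, and that multiple entries are separated by whitespace; the separator, the sample element (including the gender value) and the function names `parse_encoded` and `attribute_changes` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DELTA_NS = "http://www.deltaxml.com/ns/well-formed-delta-v1"
Q = "{" + DELTA_NS + "}"

def parse_encoded(text):
    """Parse 'name=<d>value<d> ...' into a dict (assumed encoding)."""
    attrs = {}
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        eq = text.index("=", i)
        delim = text[eq + 1]             # character right after '='
        end = text.index(delim, eq + 2)  # matching closing delimiter
        attrs[text[i:eq]] = text[eq + 2:end]
        i = end + 1
    return attrs

def attribute_changes(element):
    """Classify each attribute as modified, added or deleted."""
    old = parse_encoded(element.get(Q + "old-attributes", ""))
    new = parse_encoded(element.get(Q + "new-attributes", ""))
    changes = {}
    for name in sorted(set(old) | set(new)):
        if name in old and name in new:
            changes[name] = ("modified", old[name], new[name])
        elif name in new:
            changes[name] = ("added", None, new[name])
        else:
            changes[name] = ("deleted", old[name], None)
    return changes

# Hypothetical element modelled on the example above.
SAMPLE = (
    '<person xmlns:deltaxml="' + DELTA_NS + '" deltaxml:delta="WFmodify" '
    'gender="male" '
    'deltaxml:old-attributes="age=&quot;36&quot;" '
    'deltaxml:new-attributes="age=&quot;37&quot;"/>'
)

print(attribute_changes(ET.fromstring(SAMPLE)))
```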