Getting Started With XML Merge
Introduction
Welcome to the XML Merge product. XML Merge allows you to easily manage the complexities that arise when multiple authors work, either concurrently or sequentially, on the same document. The resulting document variants can be merged, or their differences monitored as required.
What can XML Merge do?
XML Merge takes two or more files and merges the changes that have been made into a single file. It aligns content, identifies potential conflicts and shows differences in the output.
There are two broad scenarios:
Concurrent merge is useful when there is one ancestor document that has been amended by a number of editors concurrently.
Sequential merge is useful in a scenario in which each of the versions is a derivative of the previous version.
A common use case for XML Merge is to merge two variants of a file. A variety of output types are provided to support this case. These outputs include optional resolution of the complex three-way changes into a simpler two-way representation familiar to users of change tracking in document editors.
Changes detected by XML Merge are then typically resolved in an external process using rule-based approaches or interactively in an authoring or reviewing tool. For more complex requirements with more than three input files, a simple rule-based resolver can be configured and run within the merge process. Again, different types of output can be produced depending on the intended use of the result.
See the samples and guides to learn more about XML Merge and to see examples of how XML Merge can be run.
License Setup
In addition to a product download (usually a ZIP file) you will also need to obtain a license file before you can run the software.
License files can be obtained either from the download and licensing support pages on our website, or by contacting DeltaXML support. Once you have obtained a license file, place it either in the directory created by unpacking the product download (where you will find the product .jar files) or in the home directory of the user(s) who will run the software. The software looks for the license file in both of these locations.
The above procedure should allow the included developer applications and samples to run. More advanced forms of licensing are also supported including configuring the license using the product APIs and using concurrent or floating license servers. For further information on licensing, see our Licensing User Guide.
XML Merge Java API
The Java API provides a simple way to integrate merge into your existing processing workflow or Content Management System. It provides access to all of the parameters and allows the use of various combinations of Java objects as input/output types.
See the API documentation for details on parameter settings and their effects.
XML Merge REST API
XML Merge also has a REST API, which allows merge operations to be invoked from a wide range of programming languages and systems. The XML Merge REST service allows you to invoke concurrent merge, three-way concurrent merge and sequential merge, either synchronously or asynchronously.
Please note that the initial release of the XML Merge REST API does not support formatting elements.
See the REST documentation for details.
Using the Command-line Tool
XML Merge can be run using a command-line interface that is invoked from a terminal window or Windows command line.
In the commands below, replace x.y.z with the major.minor.patch version number of your release, e.g. deltaxml-merge-7.0.0.jar.
There are two types of command-line interfaces:
Java API-based command-line interface
java -jar deltaxml-merge-x.y.z.jar command mergeType arguments
REST API-based command-line interface
java -jar deltaxml-merge-rest-client-x.y.z.jar command mergeType arguments
The source code for this interface is available in Bitbucket. This was designed to give you some understanding of the XML Merge REST API.
The two jars, deltaxml-merge-x.y.z.jar and deltaxml-merge-rest-client-x.y.z.jar, are included in the distribution.
These interfaces are provided so that a merge can be run quickly from the command line. However, they do not expose all the features of the Java or REST APIs, such as rule configuration.
Supported Merge Types
Merge Type | Description |
---|---|
concurrent | N-Way concurrent merge |
concurrent3 | Three way concurrent merge |
sequential | N-Way sequential merge |
Supported Commands
describe
This command is used to see the description of the available parameters for the specified merge type.
java -jar <jar-name> describe mergeType
For example:
java -jar <jar-name> describe concurrent
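The describe command can be repeated for each of the three supported merge types. The following is a minimal sketch rather than part of the product: the JAR variable is a placeholder for the jar name in your release, and the commands are printed with echo instead of being executed.

```shell
#!/bin/sh
# Placeholder jar name: substitute the version number of your release.
JAR="deltaxml-merge-x.y.z.jar"

# Print (rather than run) one describe command per supported merge type.
describe_commands() {
  for merge_type in concurrent concurrent3 sequential; do
    echo "java -jar $JAR describe $merge_type"
  done
}

describe_commands
```

To run the commands for real, replace the echo line with the command itself once JAR points at an actual jar file.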
merge
Runs a merge for the specified merge type. Each file needs to have a name which is used to identify the version in the generated DeltaV2 result.
java -jar <jar-name> merge mergeType
ancestorName/version1Name ancestorFile/version1File
(versionName versionFile)+
resultFile
params
The example below merges three revisions ('anna', 'ben' and 'chris') against the 'ancestor' version using the concurrent merge type. The inputs used in the example are included in your release in the samples/html-data directory. Note: the command has been wrapped to aid readability.
java -jar <jar-name> merge concurrent
ancestor samples/html-data/four-edits.html
anna samples/html-data/four-edits-anna.html
ben samples/html-data/four-edits-ben.html
chris samples/html-data/four-edits-chris.html
result.xml
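Because the merge command takes an open-ended list of name and file pairs, it can be convenient to assemble it in a small wrapper. The sketch below is an illustration only: build_merge_cmd and the JAR placeholder are hypothetical names, and the function prints the command instead of running it. It assumes names and paths contain no spaces.

```shell
#!/bin/sh
# Placeholder jar name: substitute the version number of your release.
JAR="deltaxml-merge-x.y.z.jar"

# usage: build_merge_cmd MERGE_TYPE RESULT_FILE NAME FILE NAME FILE [NAME FILE]...
# Prints the merge command; it does not execute it.
build_merge_cmd() {
  merge_type=$1
  result_file=$2
  shift 2
  # Names and files must come in pairs, and the syntax needs at least two pairs.
  if [ $(( $# % 2 )) -ne 0 ] || [ $# -lt 4 ]; then
    echo "expected at least two NAME FILE pairs" >&2
    return 1
  fi
  cmd="java -jar $JAR merge $merge_type"
  while [ $# -gt 0 ]; do
    cmd="$cmd $1 $2"
    shift 2
  done
  # The result file follows the last input pair.
  echo "$cmd $result_file"
}

build_merge_cmd concurrent result.xml \
  ancestor samples/html-data/four-edits.html \
  anna samples/html-data/four-edits-anna.html \
  ben samples/html-data/four-edits-ben.html \
  chris samples/html-data/four-edits-chris.html
```

The call at the end reproduces the concurrent example above on a single line; any trailing param=value options would be appended after the result file.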
license
Prints the details of the currently activated license. NOTE: The REST API-based command-line interface does not support this command.
Merge command parameters
Optional parameters can be added to the end of the command line; they set options for the merge process.
The command-line syntax for parameters is param=value, where param is the parameter name and value is the parameter value.
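A wrapper script might check the shape of these trailing arguments before passing them on. The sketch below is a hypothetical helper, not part of the product: it only checks for the param=value form and does not validate parameter names. Rejecting an empty name or an empty value is this sketch's own choice, not documented behaviour.

```shell
#!/bin/sh
# usage: check_params PARAM=VALUE...
# Returns non-zero if any argument is not of the form param=value.
check_params() {
  for arg in "$@"; do
    case $arg in
      =*|*=) echo "bad parameter: $arg" >&2; return 1 ;;  # empty name or empty value
      *=*) ;;                                               # has the param=value shape
      *) echo "bad parameter: $arg" >&2; return 1 ;;       # no '=' at all
    esac
  done
}

check_params ResultType=ANALYZED_DELTAV2 WordByWord=true && echo "parameters ok"
```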
The supported parameters are listed below:
ResultType
Specifies the type of post-processing applied to the merge result. The available types, and the merge types that support each of them, are:
Result Type | Description | concurrent | concurrent3 | sequential |
---|---|---|---|---|
DELTAV2 | the raw result with no post-processing. | ✓ | ✓ | ✓ |
ANALYZED_DELTAV2 | performs first-line analysis of the result and adds attributes to indicate types of change. | ✓ | ✓ | ✓ |
RULE_PROCESSED_DELTAV2 | applies default rules to auto-resolve the simple change types 'add' and 'delete'. | ✓ | ✓ | |
SIMPLIFIED_DELTAV2 | a simplified form of the deltaV2 format. | ✓ | ✓ | ✓ |
SIMPLIFIED_RULE_PROCESSED_DELTAV2 | the simplified format with the simple change types 'add' and 'delete' auto-resolved. | ✓ | ✓ | |
DoctypePreservationMode
Controls how DOCTYPE declarations appear in the result. The available modes are:
Mode | Description |
---|---|
REMOVE_ALWAYS | no DOCTYPE appears in the result, irrespective of what is in the inputs. |
PRESERVE_WHEN_UNCHANGED | the DOCTYPE is preserved if it is unchanged; otherwise it is removed. |
ERROR_WHEN_CHANGED | an error is signalled if the DOCTYPE has changed; otherwise it is preserved in the result. |
EntityReferencePreservationMode
Controls how general entity references appear in the result. The available modes are:
Mode | Description |
---|---|
USE_REPLACEMENT_TEXT | Entity references are replaced with their 'replacement text' (which may actually include general XML such as text, attributes and elements). |
PRESERVE_REFERENCES | Entity references remain in the body of the XML content. Declarations in the internal subset will also be preserved where possible. If multiple declarations with different values are used in the inputs then multiple declarations may appear in the result. |
PRESERVE_REFERENCES_ENCODED_FORM | Entity references remain in the body of the XML content in encoded output format. Declarations in the internal subset will also be preserved where possible. If multiple declarations with different values are used in the inputs then multiple declarations may appear in the result. |
WordByWord
Controls the granularity of text/PCDATA comparison, alignment and change reporting:
Value | Description |
---|---|
true | Text is segmented into words (as described in Unicode Annex 29, Section 4), compared, and the results are then reported at this granularity. |
false | Text is compared and changes reported corresponding to the text/PCDATA structure found in the comparison inputs. |
ElementSplitting
Sets whether elements containing significantly modified text should be split.
Value | Description |
---|---|
true | Enable element splitting when WordByWord is true and the amount of unchanged text in an element falls below 10%. |
false | Disable element splitting. |
TableProcessing
Controls whether HTML and CALS table processing is enabled.
Value | Description |
---|---|
true | Enables table processing. |
false | Disables table processing. |
InvalidCalsTableBehaviour
Specifies how invalid CALS tables are processed.
Mode | Description |
---|---|
PROPAGATE_UP | Propagate the changes to the |
COMPARE_AS_XML | Compare tables as 'plain' XML. |
FAIL | Throw an Exception when invalid CALS tables are encountered. |
CalsValidationLevel
Controls the validation level to use for CALS table validation.
Level | Description |
---|---|
RELAXED | Performs relaxed validation. Invalidities which are known to have no effect on subsequent processing will not cause that processing to be bypassed. |
STRICT | Performs strict validation. All invalidities will cause the appropriate subsequent processing to be bypassed. |
WarningReportMode
Specifies how CALS table invalidity warnings should be reported.
Mode | Description |
---|---|
PROCESSING_INSTRUCTIONS | Reports warnings using processing instructions with the format |
COMMENTS | Reports warnings using XML comments. |
MESSAGE | Reports warnings using <xsl:message/>. |
Debug
Controls the generation of intermediate pipeline debug files. This parameter is not available for the REST API-based command-line interface.
Value | Description |
---|---|
true | Intermediate pipeline debug files are generated. |
false | Intermediate pipeline debug files are not generated. |
Additional Three Way Merge Parameters
The 'merge' command with merge type 'concurrent3' supports all the parameters listed in the Merge command parameters section above. In addition, it supports a ResultFormat parameter and the additional result types described below. These are used in the same way as the other merge command parameters.
ResultType
Specifies the type of post-processing applied to the merge result. The types available are:
Result Type | Description |
---|---|
OXYGEN_TRACK_CHANGES | produces a merge result with oXygen Author track change processing instructions. |
ALL_CHANGES | performs three to two-way simplification and shows as many changes as possible. |
CONFLICTING_CHANGES | performs three to two-way simplification and shows conflicts for further resolution. Simple, non-conflicting adds, deletes and modifications are automatically resolved. |
THEIR_CHANGES | performs three to two-way simplification and shows conflicts for further resolution. Additionally, changes in the third input are displayed. Simple, non-conflicting changes in the second input are automatically resolved. This is designed for merge scenarios where the third input corresponds to the 'remote' branch, or another user's ('their') branch. |
ResultFormat
Specifies the output format for some of the result types produced by the three-way merge. It is only applicable when ResultType is ALL_CHANGES, CONFLICTING_CHANGES or THEIR_CHANGES. The formats available are:
Result Format | Description |
---|---|
XML_DELTA | produces either a deltaV2 result or a simplified delta result. |
OXYGEN_TRACK_CHANGES | produces a result format which is an XML file with processing instructions used in the accept/reject interface of the oXygen XML editor/author. |
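Since ResultFormat only applies to three of the result types, a wrapper script could guard against passing it with any other. The following is a minimal sketch under stated assumptions: result_format_allowed is a hypothetical helper, and the values are assumed to be spelled on the command line in the upper-case form used in the tables.

```shell
#!/bin/sh
# usage: result_format_allowed RESULT_TYPE
# Succeeds only for the result types that ResultFormat applies to.
result_format_allowed() {
  case $1 in
    ALL_CHANGES|CONFLICTING_CHANGES|THEIR_CHANGES) return 0 ;;
    *) return 1 ;;
  esac
}

# Build the trailing parameters for a concurrent3 merge.
result_type=CONFLICTING_CHANGES
params="ResultType=$result_type"
if result_format_allowed "$result_type"; then
  params="$params ResultFormat=OXYGEN_TRACK_CHANGES"
fi
echo "$params"
```

The resulting string would be appended after the result file on a merge command with merge type concurrent3.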
JAR Files
This section lists the '.jar' files in the release. They should always be included on the classpath while executing merge.
x.y.z represents the major.minor.patch version number of your release.
JAR File | Description |
---|---|
deltaxml-merge-x.y.z.jar | This jar file contains the main XML Merge API classes and associated resources (such as Java filters). |
deltaxml.jar | This jar file contains the main XML Compare API classes and associated resources (such as Java filters). |
resolver.jar | This modified version of the Apache catalog resolver is needed when using catalogs through the deltaxml-merge-x.y.z.jar API. Please see Catalog Resolver Customizations for further details of the modifications we have made. |
flexlm.jar EccpressoAll.jar | These jar files are required for the Flexera based licensing capabilities introduced in the XML Merge 5.3 and later releases. They should always be included on the classpath. |
saxon9pe.jar xercesImpl.jar xml-apis.jar icu4j.jar fastutil-8.4.2.jar | These jar files are required by the client applications and must be included on the classpath when using the merge packages. |
deltaxml-merge-rest-x.y.z.jar deltaxml-merge-rest-client-x.y.z.jar | These are XML Merge REST API jars. |
gson-2.8.5.jar | This JAR file is required when Usage Logging is enabled. |
istack-commons-runtime-3.0.12.jar, | These JAR files are required when Usage Logging is enabled and you’re using Java 11+ |
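When the jars are put on the classpath explicitly, the classpath string can be built from the directory that contains them. The sketch below is illustrative only: build_classpath is a hypothetical helper, it assumes all the jars sit in one directory, and it uses ':' as the separator (as on Linux and macOS; Windows uses ';'). It only builds the string and does not start Java.

```shell
#!/bin/sh
# usage: build_classpath DIR
# Joins every .jar file found directly in DIR into a ':'-separated classpath.
build_classpath() {
  cp=""
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue      # skip the unexpanded pattern when no jars exist
    cp="${cp:+$cp:}$jar"           # add a separator only between entries
  done
  echo "$cp"
}

build_classpath .
```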
Feedback
We are always keen to improve our products - please contact us if you have any comments or suggestions while working with DeltaXML. We hope you enjoy it!
The DeltaXML Team.
Updating your license
The license file may need to be upgraded or replaced, for example when changing from an evaluation license to a full license or when renewing an annual subscription. It is usually located in the top-level directory of the installation. It can also be located in a user's home directory, although in that case the license is only available to that user. To upgrade, simply replace the original license file with the updated license file.
Licensing and Legal Notices
The DeltaXML software is licensed under the terms described in the file Licence.html. The software also includes components developed by others; their redistribution and copyright terms, including the necessary notices, are described in legal-notices.html.
Input Restrictions
A few restrictions are imposed on the input XML for a merge operation; see Input XML Restrictions.