Merging tables

1. Introduction

XML Merge currently supports CALS, HTML, and DITA simple table processing.

CALS table processing ensures that when the input tables are syntactically and semantically valid (as per the OASIS CALS table model documentation), the result will be a valid CALS table.

Similarly, HTML table processing ensures that when valid input tables are provided, according to the HTML 4 or HTML 5 documentation, the result will be a valid HTML 4/5 table. Note that both inputs need to follow the same standard (i.e. both HTML 4 or both HTML 5).

Simple changes to a table, such as changing the contents of an entry or adding a row or column, are generally represented as fine-grained changes.

Some types of change, such as table entries that overlap or span multiple rows and columns, are difficult to represent at fine granularity whilst ensuring validity. In these cases, the changes are represented at row granularity (i.e. as groups of added/deleted rows) or even at whole-table granularity.

In the case of DITA simple tables, the syntactic constraints ensure that cells cannot overlap or span either rows or columns; changes are therefore represented at a fine-grained level of detail.

2. Change representation

Changes to tables are represented differently according to the type of change.

See Comparing Document Tables for details on how changes are represented, along with links to a set of example table comparisons. Please note that, while most of the linked table comparisons are only two-way comparisons, the same principles apply to three-way or n-way merge operations.

3. Table processing configuration

In XML Merge, CALS table processing and HTML table processing are configured separately. This section explains how to turn table processing on and off and how to set the different CALS table processing modes.

The settings described here are those of the ConcurrentMerge class. Similar table configuration settings are also available for the SequentialMerge class.

3.1. CALS tables

CALS table processing is enabled/disabled using setCalsTableProcessing.

3.1.1. Invalid CALS table behaviour

In order to ensure that only valid CALS tables are passed to our specialized CALS table processing, each input table is marked either valid or invalid. This parameter declares what type of processing should be used for those tables that are marked as invalid. The 'warning report mode' parameter configures how recoverable errors are reported.

Three options are provided:

  • FAIL: The fail option stops the comparison by throwing an appropriate exception (that includes the errors identified by the validity checker).
  • PROPAGATE_UP: The propagate up option ensures that changes to an invalid table (or more specifically tgroup) are represented at the table level.
  • COMPARE_AS_XML: The compare as XML option essentially compares the tables as if they were well-formed XML.

This can be configured using setInvalidCalsTableBehaviour.
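
The three options can be pictured with a small, self-contained Java sketch. It is only an illustration of the behaviour described above and does not use the XML Merge API: the class name, the handleInvalidTable method, the InvalidTableException type and the returned strings are all hypothetical stand-ins, and only the three option names come from this document.

```java
import java.util.List;

public class InvalidTableBehaviourSketch {

    // The three options named in this section.
    enum InvalidCalsTableBehaviour { FAIL, PROPAGATE_UP, COMPARE_AS_XML }

    // Hypothetical exception type; stands in for whatever exception the product throws.
    static class InvalidTableException extends RuntimeException {
        InvalidTableException(List<String> errors) {
            super("Invalid CALS table: " + String.join("; ", errors));
        }
    }

    // Hypothetical helper: describes how a table marked invalid would be handled,
    // or throws (carrying the validity errors) when the behaviour is FAIL.
    static String handleInvalidTable(InvalidCalsTableBehaviour behaviour, List<String> errors) {
        switch (behaviour) {
            case FAIL:
                throw new InvalidTableException(errors);
            case PROPAGATE_UP:
                return "changes represented at table (tgroup) level";
            case COMPARE_AS_XML:
                return "table compared as plain well-formed XML";
            default:
                throw new IllegalArgumentException("unknown behaviour: " + behaviour);
        }
    }

    public static void main(String[] args) {
        // Made-up validity error, used purely as sample input.
        List<String> errors = List.of("entry spans beyond last column");
        System.out.println(handleInvalidTable(InvalidCalsTableBehaviour.PROPAGATE_UP, errors));
        System.out.println(handleInvalidTable(InvalidCalsTableBehaviour.COMPARE_AS_XML, errors));
        try {
            handleInvalidTable(InvalidCalsTableBehaviour.FAIL, errors);
        } catch (InvalidTableException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

In the sketch, FAIL is the only option that interrupts processing; the other two let the merge continue, with the invalid table handled at a coarser or more generic level.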

3.1.2. Warning report mode

This mode specifies the way in which invalid table warnings should be reported.

Different options such as comments, messages or processing instructions are available to report warnings.

This can be configured using setWarningReportMode.

3.1.3. CALS table validation level

The CALS invalid table behaviour depends on the CALS table validation level.

The CALS table validation level can either be STRICT or RELAXED.

This can be configured using setCalsValidationLevel.

3.2. HTML tables

HTML table processing is enabled/disabled using setHtmlTableProcessing.

