
Technical Specification

Overview

HTML Compare compares two well-formed HTML files, identified as 'A' and 'B', and generates an HTML file describing the differences between them.

The HTML Compare software provides a REST interface, which can be accessed from any other software, including non-Java systems.

XML Processing

HTML Compare passes each document through an HTML parser and then an XML parser before processing. If the document starts with a DOCTYPE declaration or a reference to an XML Schema, the parser processes the DTD or Schema and returns a SAX stream with all entities expanded and any unspecified attributes added with their default values. HTML Compare normalises whitespace before the comparison.
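The exact whitespace rule is not stated here. As a rough illustration only, the sketch below assumes a rule similar to XPath's normalize-space(): each run of whitespace collapses to a single space and leading and trailing whitespace is dropped. The function name `normalise_whitespace` is hypothetical and is not part of HTML Compare.

```python
import re


def normalise_whitespace(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends.

    Assumption: this mirrors XPath normalize-space(); the rule that
    HTML Compare actually applies may differ.
    """
    return re.sub(r"\s+", " ", text).strip()


print(normalise_whitespace("  Hello \n\t world  "))
```

Under this assumed rule, two text nodes that differ only in line breaks or indentation would compare as equal.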

Document Comparison

HTML Compare compares the two HTML or XHTML files, taking account of the tree structure of the files and identifying corresponding elements in the two files. Corresponding elements will have the same element local name and namespace and will have corresponding parent elements. The root elements of the two files must have the same local name and namespace. HTML Compare determines the best fit at each level in the tree structure between the two files. The best fit algorithm determines the longest common subsequence of corresponding elements. The best fit gives precedence to elements that are exactly equal over those that have just the same element name and namespace.
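The longest-common-subsequence step can be sketched for a single level of the tree. This is a textbook dynamic-programming version and not the product's implementation: it assumes each child element is reduced to a hypothetical (namespace, local name) key, and it omits the precedence given to exactly equal elements.

```python
from typing import List, Tuple

Key = Tuple[str, str]  # hypothetical key: (namespace URI, local name)


def lcs_pairs(a: List[Key], b: List[Key]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) forming one longest common subsequence."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover the matched positions.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # skip a[i]: it is unmatched (deleted from A)
        else:
            j += 1  # skip b[j]: it is unmatched (added in B)
    return pairs


children_a = [("", "h1"), ("", "p"), ("", "ul"), ("", "p")]
children_b = [("", "h1"), ("", "ul"), ("", "p"), ("", "div")]
print(lcs_pairs(children_a, children_b))
```

In this example the first `p` in A and the `div` in B are left unmatched, so they would be reported as a deletion and an addition respectively.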

HTML Compare treats elements as ordered, i.e. a change in order is identified as a change.

HTML Compare ignores the order of attributes. Changes to attributes are represented using elements in the DeltaXML namespace.
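The two ordering rules above can be illustrated with Python's standard `xml.etree.ElementTree` module. This is only a sketch of the behaviour described, not of how HTML Compare works internally; the helper names `same_attributes` and `same_child_order` are hypothetical.

```python
import xml.etree.ElementTree as ET


def same_attributes(x: ET.Element, y: ET.Element) -> bool:
    """Attributes are compared as a name-to-value mapping, so order is ignored."""
    return x.attrib == y.attrib


def same_child_order(x: ET.Element, y: ET.Element) -> bool:
    """Child elements are compared as a sequence, so order matters."""
    return [child.tag for child in x] == [child.tag for child in y]


doc_a = ET.fromstring('<div><p class="x" id="1"/><span/></div>')
doc_b = ET.fromstring('<div><span/><p id="1" class="x"/></div>')

# The two <p> elements carry the same attributes in a different order.
print(same_attributes(doc_a[0], doc_b[1]))
# The children of <div> appear in a different order in each document.
print(same_child_order(doc_a, doc_b))
```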

Text Handling

PCDATA items are treated as a whole and are not subdivided into words or characters. XSLT filters may be used to modify the markup before the files are compared and thus provide a word-by-word comparison. The XML parser interprets CDATA sections and expands entity references before HTML Compare performs the comparison.
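The idea behind a word-by-word pre-processing filter can be sketched as follows. The text above describes XSLT filters; this Python version only illustrates the transformation such a filter might perform. The `word` element name and the `wrap_words` helper are hypothetical, and the sketch handles only an element's leading text, discarding the whitespace between words.

```python
import xml.etree.ElementTree as ET


def wrap_words(element: ET.Element, tag: str = "word") -> None:
    """Replace the element's leading text with one child element per word.

    Assumption: `tag` is an illustrative element name, not one defined
    by HTML Compare. Text that follows child elements is left untouched.
    """
    words = (element.text or "").split()
    element.text = None
    for position, word in enumerate(words):
        child = ET.Element(tag)
        child.text = word
        element.insert(position, child)


paragraph = ET.fromstring("<p>the quick fox</p>")
wrap_words(paragraph)
print([child.text for child in paragraph])
```

Once each word is its own element, a comparison that treats PCDATA as a whole can still report changes at word granularity.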

System Requirements

HTML Compare requires either:

  • A Java Standard Edition JRE, version 8 or later. We test on Oracle Solaris 10 (Intel Xeon), Mac OS X (10.6 or higher on Intel), Windows Server 2008 R2 and Windows 7. To qualify for support, any reported problem should be reproducible on at least one of these platforms.

Patents granted: 2001270901; EP1325432; 60134999.7; US 8,196,135 B2; CA 2416876; US 8,423,518 B2; EP2174238; 602008031420.0. Patents pending: 1315520.5; 14275178.3; 14/474,377.
