Asynchronous Comparison


Synchronous comparison is used when the response is obtained quickly and where the Delta is returned as part of the response.

However, in other cases asynchronous comparison is useful:

  • If there is a risk of HTTP timeout
  • When using HTTP or Cloud I/O, as the REST service will need to make HTTP requests to retrieve these resources
  • When using large inputs
  • When the comparison result is needed somewhere other than where the response is returned
  • Where progress, logging or resource consumption information is needed
  • When multiple comparisons can be started at once

XML / JSON Request

Whether using XML or JSON, an asynchronous comparison will use the async element to specify the optional values callback and output.

  • callback provides a callback URL that the REST service will make a GET request to after the comparison is complete. This is detailed in Callbacks below.
  • output allows you to specify a location for the result to be written to. For more information, see I/O Types.

The response will consist of the created Job, and a Location header pointing to where you can poll the Job status.


Request (XML)
<comparison>
  <inputA type="http">
    <uri>http://www.example.com/file1.xml</uri>
  </inputA>
  <inputB type="http">
    <uri>http://www.example.com/file2.xml</uri>
  </inputB>
  <configurationParameters>
    <configurationParameter type="boolean">
      <name>Word By Word</name>
      <value>true</value>
    </configurationParameter>
  </configurationParameters>
  <async>
    <callback>http://www.example.com/comparison-callback</callback>
    <output type="file">
      <path>/Users/exampleUser/Documents/result.xml</path>
    </output>
  </async>
</comparison>


Request (JSON)
{
  "inputA": {
    "type": "http",
    "uri": "http://www.example.com/file1.xml"
  },
  "inputB": {
    "type": "http",
    "uri": "http://www.example.com/file2.xml"
  },
  "configurationParameters": [
    {
      "name": "Word By Word",
      "value": true,
      "type": "boolean"
    }
  ],
  "async": {
    "callback": "http://www.example.com/comparison-callback",
    "output": {
      "type": "file",
      "path": "/Users/exampleUser/Documents/result.xml"
    }
  }
}


Response (XML)
HTTP/1.1 202 Accepted
Location: /api/xml-compare/v1/jobs/12346124
<job>
    <links>
        <link href="/api/xml-compare/v1/jobs/12346124" rel="self"/>
    </links>
    <creationTime>2018-11-23T10:39:47.742Z</creationTime>
    <jobId>12346124</jobId>
    <jobStatus>STARTED</jobStatus>
    <numberOfStages>0</numberOfStages>
    <progressInPercentage>0.0</progressInPercentage>
    <startTime>2018-11-23T10:39:47.746Z</startTime>
</job>
Response (JSON)
HTTP/1.1 202 Accepted
Location: /api/xml-compare/v1/jobs/12346124
{
  "startTime": "2018-11-23T10:39:47.746Z",
  "creationTime": "2018-11-23T10:39:47.742Z",
  "jobId": 12346124,
  "numberOfStages": 8,
  "progressInPercentage": 0,
  "jobStatus": "STARTED",
  "links": [{
    "rel": "self",
    "href": "/api/xml-compare/v1/jobs/12346124"
   }]
}
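
As a rough illustration of the request and response above, the following Python sketch builds the JSON request body and resolves the Location header into a URL that can be polled. The function names build_async_request and job_url are hypothetical helpers, not part of the REST service, and sending an empty async object when neither option is given is an assumption. Sending the request is left to whichever HTTP client you use.

```python
import json
from typing import Dict, Optional
from urllib.parse import urljoin


def build_async_request(input_a: str, input_b: str,
                        callback: Optional[str] = None,
                        output_path: Optional[str] = None) -> str:
    """Return a JSON body for an asynchronous comparison of two HTTP inputs."""
    async_options: Dict[str, object] = {}
    if callback is not None:
        async_options["callback"] = callback
    if output_path is not None:
        async_options["output"] = {"type": "file", "path": output_path}
    body = {
        "inputA": {"type": "http", "uri": input_a},
        "inputB": {"type": "http", "uri": input_b},
        # Assumption: an empty "async" object is acceptable when neither
        # optional value is supplied.
        "async": async_options,
    }
    return json.dumps(body, indent=2)


def job_url(base_url: str, headers: Dict[str, str]) -> str:
    """Resolve the Location header of the 202 response against the service base URL."""
    return urljoin(base_url, headers["Location"].strip())
```

For example, job_url("http://localhost:8080", {"Location": "/api/xml-compare/v1/jobs/12346124"}) gives the absolute URL of the Job to poll.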

Form Request

When using multipart/form-data, set the parameter async to true.

You can also use the parameters callback and keepResult in a similar fashion to XML / JSON async requests.

.....
Content-Disposition: form-data; name="async"

true
--boundary
Content-Disposition: form-data; name="callback"

http://www.example.com/comparison-callback
--boundary
Content-Disposition: form-data; name="output"

/Users/exampleUser/Documents/out.xml
--boundary
.....
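
The asynchronous form fields can be encoded by hand, as in this minimal Python sketch. The helper name encode_form_fields is hypothetical, the boundary string simply mirrors the example above, and the input parts that the example elides are not shown here either. In practice an HTTP client library would normally build the multipart body for you.

```python
from typing import Dict


def encode_form_fields(fields: Dict[str, str], boundary: str = "boundary") -> bytes:
    """Encode simple text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # a blank line separates the part headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing delimiter ends the body
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


body = encode_form_fields({
    "async": "true",
    "callback": "http://www.example.com/comparison-callback",
    "output": "/Users/exampleUser/Documents/out.xml",
})
```

The body would be sent with a Content-Type header of multipart/form-data; boundary=boundary.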

Jobs

A Job is created for each asynchronous comparison. Jobs can be read from the /jobs/{jobId} endpoint. For example, you can poll for the Job's status:

Request
GET /api/xml-compare/v1/jobs/12346124
Response (XML)
<job>
    <links/>
    <creationTime>2018-04-19T10:56:01.644+01:00</creationTime>
    <jobStatus>INPUT_FILTER_CHAIN_B</jobStatus>
    <numberOfStages>77</numberOfStages>
    <pipelineStage>dxp-PRE_FLATTENING-mark-important-attributes.xsl</pipelineStage>
    <progressInPercentage>12.0</progressInPercentage>
    <startTime>2018-04-19T10:56:01.656+01:00</startTime>
</job>
Response (JSON)
{
  "startTime": "2018-04-19T09:56:01.656+0000",
  "creationTime": "2018-04-19T09:56:01.644+0000",
  "numberOfStages": 77,
  "progressInPercentage": 12,
  "pipelineStage": "dxp-PRE_FLATTENING-mark-important-attributes.xsl",
  "jobStatus": "INPUT_FILTER_CHAIN_B",
  "links": []
}

You can see the pipeline is currently running the Input Filter Chain for Input B, but some time later...

Response (XML)
<job>
    <links>
        <link href="/api/xml-compare/v1/results/12346124" rel="result"/>
    </links>
    <creationTime>2018-04-19T10:56:01.644+01:00</creationTime>
    <finishedTime>2018-04-19T10:57:24.095+01:00</finishedTime>
    <jobStatus>FINISHED</jobStatus>
    <numberOfStages>77</numberOfStages>
    <pipelineStage>dxml-clean</pipelineStage>
    <progressInPercentage>100.0</progressInPercentage>
    <startTime>2018-04-19T10:56:01.656+01:00</startTime>
</job>
Response (JSON)
{
  "startTime": "2018-04-19T09:56:01.656+0000",
  "creationTime": "2018-04-19T09:56:01.644+0000",
  "finishedTime": "2018-04-19T09:57:24.095+0000",
  "numberOfStages": 77,
  "progressInPercentage": 100,
  "pipelineStage": "dxml-clean",
  "jobStatus": "FINISHED",
  "links": [
    {
      "rel": "result",
      "href": "/api/xml-compare/v1/results/12346124"
    }
  ]
}

...the comparison is now finished - with a HATEOAS link available to view result information.
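
The polling shown above could be sketched in Python as follows. The function wait_for_job and its parameters are hypothetical; fetch_job stands for any callable that performs the GET request and returns the Job as a dictionary. Treating FINISHED, FAILED and CANCELLED as the terminal states is an assumption based on the jobStatus values listed in this page.

```python
import time
from typing import Callable, Dict

# Assumption: these statuses mean the Job will not progress any further.
TERMINAL_STATES = {"FINISHED", "FAILED", "CANCELLED"}


def wait_for_job(fetch_job: Callable[[], Dict],
                 interval_seconds: float = 2.0,
                 max_polls: int = 100,
                 sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll the Job until it reaches a terminal state, then return it."""
    for _ in range(max_polls):
        job = fetch_job()
        if job["jobStatus"] in TERMINAL_STATES:
            return job
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"Job still running after {max_polls} polls")
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.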

Job Model

The Job contains the following objects:

Object: Description

creationTime: ISO 8601 timestamp of the time the Job was created.

startTime: ISO 8601 timestamp of when the comparison started. Depending on the number of queued jobs, there may be significant time between creationTime and startTime.

finishedTime: ISO 8601 timestamp of when the comparison finished.

comparisonTime: Time in ms the comparison took, as a string, e.g. "123 ms".

jobStatus: An enumeration of the state of the comparison. States include:

  • QUEUED
  • STARTED
  • INPUTS_LOADING_A
  • INPUTS_LOADING_B
  • INPUT_FILTER_CHAIN_A
  • INPUT_FILTER_CHAIN_B
  • COMPARISON_RUNNING
  • OUTPUT_FILTERS
  • SAVING
  • FINISHED
  • FAILED
  • CANCELLED

jobId: The ID of the Job.

numberOfStages: Number of stages in the pipeline, as an integer.

pipelineStage: The current pipeline stage of the comparison.

progressInPercentage: Percentage of progress in the pipeline, as an integer. Note: numbers are approximate and are based on the number of stages in the pipeline. Individual stages will vary in the length of time they take to complete.

links / link: HATEOAS links containing a href URI and a rel describing its meaning. A rel of "cancel" indicates you can use HTTP DELETE to cancel the Job, whereas "result" indicates the endpoint you should call GET on to receive result information.

output: Specifies where the result has been output to.

error: Contains error information if an error happened. See the Errors page for more information.
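
A client might read these fields along the lines of the Python sketch below, which parses the timestamps, looks up a link by its rel, and computes the time a Job spent queued. The helper names are hypothetical. The timestamp format string is an assumption chosen to match the examples on this page, all of which include milliseconds and a UTC offset.

```python
from datetime import datetime
from typing import Dict, Optional

# Assumption: timestamps always carry fractional seconds and an offset,
# e.g. "2018-04-19T09:56:01.656+0000", "...+01:00" or "...Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_timestamp(value: str) -> datetime:
    """Parse a Job timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def link_for(job: Dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given rel, or None."""
    for link in job.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def queue_delay_seconds(job: Dict) -> float:
    """Seconds between creationTime and startTime."""
    started = parse_timestamp(job["startTime"])
    created = parse_timestamp(job["creationTime"])
    return (started - created).total_seconds()
```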

When using Cloud I/O, output will contain the Cloud I/O information provided, minus the client ID / secret information for security reasons. For example:

<output type="s3">
    <region>EU-WEST-1</region>
    <bucket>DeltaXML-Bucket</bucket>
    <fileName>output.xml</fileName>
</output>

If not using Cloud I/O for the output, then it will contain a uri and size (which provides the result size in bytes - as an integer). For example:

<output type="http">
    <uri>http://localhost:8080/api/xml-compare/v1/downloads/12346124</uri>
    <size>390</size>
</output>
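
As an illustration, the output element could be read with Python's standard XML parser. This is only a sketch: parse_output is a hypothetical helper, and it assumes the element has the flat shape shown in the two examples above.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def parse_output(xml_text: str) -> Dict[str, str]:
    """Flatten an <output> element into a dictionary of its type and child values."""
    element = ET.fromstring(xml_text)
    info = {"type": element.get("type", "")}
    for child in element:
        info[child.tag] = (child.text or "").strip()
    return info


http_output = parse_output("""
<output type="http">
    <uri>http://localhost:8080/api/xml-compare/v1/downloads/12346124</uri>
    <size>390</size>
</output>
""")
size_in_bytes = int(http_output["size"])  # size is reported in bytes
```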

If the comparison failed, an error element will contain details of the failure, including a stack trace which can help diagnose the issue.

A GET request to the output uri will then complete the asynchronous lifecycle:

Request
GET /api/xml-compare/v1/downloads/12346124
Response
HTTP/1.1 200 OK
<root xmlns:deltaxml="http://deltaxml.com/ns/well-formed-delta-v1" deltaxml:deltaV2="A!=B">
  <child deltaxml:deltaV2="A=B">...</child>
  <child deltaxml:deltaV2="A!=B">...</child>
</root>

Note

If the pipeline POST request specified an output file location, the output file will be available at that location after a successful asynchronous comparison. A GET request to the downloads resource will simply return the same file from that location.

Cancelling a Job

Jobs can be cancelled while they are queued or running. This is indicated through a HATEOAS link in the Job info, for example:

Response (XML)
<job>
    <links>
        <link href="/api/xml-compare/v1/jobs/12346124" rel="cancel"/>
    </links>
    <creationTime>2018-04-06T11:31:20.863+01:00</creationTime>
    <jobStatus>QUEUED</jobStatus>
    <numberOfStages>0</numberOfStages>
    <progressInPercentage>0.0</progressInPercentage>
</job>
Response (JSON)
{
  "creationTime": "2018-04-06T11:31:20.863+01:00",
  "numberOfStages": 0,
  "progressInPercentage": 0,
  "jobStatus": "QUEUED",
  "links": [
    {
      "rel": "cancel",
      "href": "/api/xml-compare/v1/jobs/12346124"
    }
  ]
}



To cancel the Job, send a DELETE request to the specified Job:

Request
DELETE /api/xml-compare/v1/jobs/12346124

The response will be the Job; note that the jobStatus is now CANCELLED:

Response (XML)
<job>
    <links/>
    <creationTime>2018-04-06T11:31:20.863+01:00</creationTime>
    <jobStatus>CANCELLED</jobStatus>
    <numberOfStages>0</numberOfStages>
    <progressInPercentage>0.0</progressInPercentage>
</job>
Response (JSON)
{
  "creationTime": "2018-04-18T13:19:31.989+0000",
  "numberOfStages": 0,
  "progressInPercentage": 0,
  "jobStatus": "CANCELLED",
  "links": []
}
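
A client can check for the "cancel" link before attempting to cancel, roughly as in this Python sketch. The names cancel_job and send_delete are hypothetical; send_delete stands for any callable that issues the HTTP DELETE to the given href and returns the resulting Job.

```python
from typing import Callable, Dict, Optional


def cancel_job(job: Dict, send_delete: Callable[[str], Dict]) -> Optional[Dict]:
    """Cancel the Job if it advertises a "cancel" link; otherwise do nothing."""
    for link in job.get("links", []):
        if link.get("rel") == "cancel":
            # The link's href is the Job resource that accepts DELETE.
            return send_delete(link["href"])
    return None  # no cancel link, so the Job cannot be cancelled
```

Following the link rather than constructing the URL keeps the client from trying to cancel a Job that has already finished.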

Result Files 

In an asynchronous comparison, by default a comparison's result file will be deleted after doing a GET request on the Job's 'downloads' resource. This can be changed by setting the boolean parameter keepResult to true when making the comparison request.

A comparison's result file can also be deleted through an HTTP DELETE request to the Job's 'downloads' resource. This will be indicated by a change of the Job's status to DELETED. Further GET requests to this Job's 'downloads' ID will produce an HTTP 204 status code.

Note

If the output has been specified to be uploaded to a cloud service, it won't be deleted regardless of a DELETE request.
Request
DELETE /api/xml-compare/v1/downloads/12346124
Response (XML)
<result>
    <status>DELETED</status>
</result>


Response (JSON)
{
    "status": "DELETED"
}


Callbacks 

If the user registered a callback URL with the comparison request, the REST service will attempt an HTTP GET to that URL on completion of the comparison.

In this example, the callback URL "http://www.example.com/comparison-callback" was registered in the comparison request. A jobId request parameter is added to the URL, which allows the Job resource to be located:

Request
GET http://www.example.com/comparison-callback?jobId=12346124
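
On the receiving side, a callback endpoint only needs to read the jobId query parameter. The Python sketch below uses the standard library's http.server. The handler class, the helper job_id_from_callback and the status codes returned are assumptions for illustration, since this page does not specify what response the REST service expects from the callback.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def job_id_from_callback(request_target: str) -> Optional[str]:
    """Extract the jobId query parameter from a callback URL or request path."""
    values = parse_qs(urlsplit(request_target).query).get("jobId")
    return values[0] if values else None


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        job_id = job_id_from_callback(self.path)
        if job_id is None:
            self.send_error(400, "missing jobId")
            return
        # A real handler would now fetch the Job by its ID and collect the result.
        print(f"comparison job {job_id} completed")
        self.send_response(200)  # assumption: any success status is acceptable
        self.end_headers()
```

The handler could be served with http.server.HTTPServer on a host and port of your choosing.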