Asynchronous Merges

Synchronous merging is suitable when the merge completes quickly and the result is returned directly in the HTTP response.

In other cases, asynchronous merging is useful:

  • If there is a risk of HTTP timeout

  • When performing a merge with large input documents, or a high number of document versions

  • When the merge result is needed somewhere other than where the response is returned

  • Where progress, logging or resource consumption information is needed

  • When there is a need to start a batch of merges at one time

1. XML / JSON Request

When using XML or JSON, an asynchronous merge is requested by including the Async element. Async may contain two optional values, Callback and Output.

Callback provides a URL to which the REST server will make a GET request after the merge is complete.

Output allows you to specify a file location to which the result will be written.

XML

Request Excerpt
<Concurrent>
  <Async>
    <Callback>http://mycall.back/dita-merge-rest/</Callback>
    <Output type="file">
      <Path>/path/to/output.xml</Path>
    </Output>
  </Async>
  <Versions>
    ....
  </Versions>
  <Configuration>
    ....
  </Configuration>
</Concurrent>

JSON

Request Excerpt
{
  "Async": {
    "Callback": "http://mycall.back/dita-merge-rest/",
    "Output": {
      "type": "file",
      "Path": "/path/to/output.xml"
    }
  },
  "Versions": {....},
  "Configuration": {....}
}
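
As a rough illustration of the structure above, the following Python sketch assembles the JSON body with an optional Callback and Output. The function name build_async_request is hypothetical, and the empty Versions and Configuration values are placeholders only, since their contents are elided in the excerpt.

```python
import json


def build_async_request(versions, configuration, callback=None, output_path=None):
    """Assemble a JSON merge request body that includes an Async element.

    Both Callback and Output are optional, so each is only added when given.
    """
    async_part = {}
    if callback is not None:
        async_part["Callback"] = callback
    if output_path is not None:
        # Output of type "file" with a Path, as in the excerpt above.
        async_part["Output"] = {"type": "file", "Path": output_path}
    return {
        "Async": async_part,
        "Versions": versions,
        "Configuration": configuration,
    }


# Placeholder Versions/Configuration: the real contents are not shown here.
body = build_async_request(
    versions={},
    configuration={},
    callback="http://mycall.back/dita-merge-rest/",
    output_path="/path/to/output.xml",
)
print(json.dumps(body, indent=2))
```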

2. Form Request

When using a multipart/form-data request there are three parameters available:

Parameter        Description

Async            Boolean enabling asynchronous merge.
                 Has to be set to true for the other async parameters to have effect.

AsyncCallback    A URL that the REST service will make a GET request to after the merge is complete.

AsyncOutput      A file location the result will be written to.

Request Excerpt
....
Content-Disposition: form-data; name="Async"
true
---boundary---
Content-Disposition: form-data; name="AsyncCallback"
http://mycall.back/dita-merge-rest/
---boundary---
Content-Disposition: form-data; name="AsyncOutput"
/path/to/output.xml
---boundary---
....
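
The three async parameters are plain text form fields. The sketch below encodes them as a multipart/form-data body using only the Python standard library. The helper name encode_form_fields is hypothetical, and a real merge request would also carry the version and configuration parts that are elided in the excerpt above.

```python
import uuid


def encode_form_fields(fields, boundary=None):
    """Encode text fields as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing delimiter
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


# Only the async-related fields; other parts of the request are omitted.
async_fields = {
    "Async": "true",
    "AsyncCallback": "http://mycall.back/dita-merge-rest/",
    "AsyncOutput": "/path/to/output.xml",
}
body, content_type = encode_form_fields(async_fields, boundary="boundary")
print(content_type)
print(body.decode("utf-8"))
```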

3. Jobs

For each asynchronous merge the service creates a Job, which can be read from the /jobs/{jobId} endpoint. For example, you can poll this endpoint for the Job's status.

3.1. Job Model

Object              Description

jobId               The ID of the Job.

startTime           ISO 8601 timestamp of when the merge started.

finishedTime        ISO 8601 timestamp of when the merge finished.

processingTime      The elapsed time of the merge, in nanoseconds.

status              An enumeration indicating the state of the merge.
                    States include:

                      • QUEUED
                      • STARTED
                      • EXTRACTING
                      • SAVING
                      • SUCCESS
                      • FAILED
                      • DELETED

phaseDescription    Description of the current phase of the merge.
                    E.g. "Processing Version 'version-two'"

stageDescription    The current stage being run.
                    E.g. "Filter input-b/0-deltaV3"

output              An IO object pointing to the location of the result.
                    E.g. if no output was specified, the result is written on the server and a link is supplied to let you access it:

                    <output type="http">
                      <size>13132588</size>
                      <uri>http://localhost:8080/api/ditamerge/v1/downloads/29ce0993-f9a3-4227-b79b-284c40df201c</uri>
                    </output>

error               If there was an error, this will contain an error message object.

links               HATEOAS links containing a href URI and a rel describing its meaning.
                    A rel of "cancel" indicates you can use HTTP DELETE to cancel the Job.
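
Polling a Job until it finishes might look like the Python sketch below. It rests on several assumptions: that the service returns JSON when sent an Accept: application/json header, that SUCCESS, FAILED and DELETED are the only final states, and that the state appears under either jobStatus (as in the example responses in this document) or status (as in the model above). The names fetch_job and wait_for_job are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: these states are final and the Job will not change afterwards.
TERMINAL_STATES = {"SUCCESS", "FAILED", "DELETED"}


def fetch_job(job_url):
    """GET the Job resource and decode it as JSON (assumed response format)."""
    request = urllib.request.Request(job_url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(job_url, fetch=fetch_job, interval=2.0, max_polls=300, sleep=time.sleep):
    """Poll the Job until it reaches a terminal state, then return it."""
    for _ in range(max_polls):
        job = fetch(job_url)
        # Accept either field name for the state of the merge.
        state = job.get("jobStatus", job.get("status"))
        if state in TERMINAL_STATES:
            return job
        sleep(interval)
    raise TimeoutError(f"Job at {job_url} did not finish after {max_polls} polls")
```

The fetch and sleep parameters exist so the loop can be exercised without a running service, for example by passing a function that returns canned Job dictionaries.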

3.2. Cancelling Jobs

Running Jobs can be cancelled by using an HTTP DELETE request on the Job, for example:

Request
DELETE /api/ditamerge/v1/jobs/652a2e28-986c-4039-bf24-c53a971836e9

The response will be the Job as it existed when it was cancelled, with a status of DELETED:

XML

Response
<job>
    <jobId>652a2e28-986c-4039-bf24-c53a971836e9</jobId>
    <startTime>2018-09-18T08:57:49.668+0000</startTime>
    <processingTime>3740586327</processingTime>
    <jobStatus>DELETED</jobStatus>
    <phaseDescription>Processing Version 'version-one'</phaseDescription>
    <stageDescription>Filter result/2-key</stageDescription>
    <links/>
</job>


JSON

Response
{
    "jobId": "652a2e28-986c-4039-bf24-c53a971836e9",
    "startTime": "2018-09-18T08:57:49.668+0000",
    "jobStatus": "DELETED",
    "phaseDescription": "Processing Version 'version-one'",
    "stageDescription": "Filter result/2-key",
    "links": []
}
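
A client could decide whether a Job can be cancelled by looking for a link with a rel of "cancel". The Python sketch below does this and prepares the DELETE request. It assumes each entry in links is an object with rel and href keys, and the function names are hypothetical.

```python
import urllib.request


def cancel_href(job):
    """Return the href of the Job's "cancel" link, or None if there is none."""
    for link in job.get("links", []):
        if link.get("rel") == "cancel":
            return link.get("href")
    return None


def build_cancel_request(job):
    """Prepare an HTTP DELETE request for the Job, or None if not cancellable."""
    href = cancel_href(job)
    if href is None:
        return None
    return urllib.request.Request(href, method="DELETE")
```

The prepared request can be sent with urllib.request.urlopen(request). A Job that has already been cancelled, like the one in the responses above, has an empty links list, so build_cancel_request returns None for it.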

4. Result Files

When you submit a Merge request asynchronously you can specify an output location (using File IO). In this case Merge writes the result to that location when it completes. What you then do with the file is up to you: the REST service will not delete or modify it after writing.

If you want the service to cache the result for you, omit the output location from the Async element. When the Job returns a status of SUCCESS it will contain an output location of type HTTP IO, holding a URL in the service from which the result can be downloaded.

The service keeps the file available for download until you DELETE the Job (see Cancelling Jobs), or until the Job has been unused for a timeout period, which defaults to one hour unless configured otherwise. If you request the Job status after this you will get an HTTP 404 Not Found, and the same is true for the downloads resource, just as if you had deleted the Job yourself.
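
Retrieving a cached result could be sketched in Python as follows. The sketch assumes the JSON form of the Job mirrors the XML output element shown in the Job Model (an output object with type and uri fields), which this document does not confirm. The names result_uri and download_result are hypothetical.

```python
import shutil
import urllib.request


def result_uri(job):
    """Return the download URL of a successful Job's cached result, else None."""
    state = job.get("jobStatus", job.get("status"))
    if state != "SUCCESS":
        return None
    output = job.get("output") or {}
    # Only an HTTP output points at the service's downloads resource.
    if output.get("type") != "http":
        return None
    return output.get("uri")


def download_result(uri, dest_path):
    """Stream the result to a local file.

    urlopen raises urllib.error.HTTPError (for example on a 404) if the Job
    has been deleted or has timed out.
    """
    with urllib.request.urlopen(uri) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```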

5. Callbacks

The callback feature allows the user to specify a URL that will be called via HTTP GET upon completion of the merge.

In this example the callback URL http://www.example.com/merge-callback was registered in the merge request.

When the merge is complete the service calls this URL, with a query parameter specifying the ID of the Job:

Request
GET http://www.example.com/merge-callback?jobId=12346124
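
On the receiving side, a callback endpoint only needs to accept the GET request and read the jobId query parameter. The Python sketch below uses the standard library's http.server. The handler class, the serve function and port 8000 are all illustrative assumptions, and a production endpoint would normally sit behind a proper web framework.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def job_id_from_callback(path):
    """Extract the jobId query parameter from the callback request path."""
    values = parse_qs(urlparse(path).query).get("jobId")
    return values[0] if values else None


class MergeCallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        job_id = job_id_from_callback(self.path)
        if job_id is None:
            self.send_response(400)  # callback arrived without a jobId
        else:
            # A real handler would now fetch /jobs/{jobId} to read the result.
            print(f"Merge finished for job {job_id}")
            self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Run the callback endpoint until interrupted."""
    HTTPServer(("", port), MergeCallbackHandler).serve_forever()
```

Calling serve() starts the listener. The Job ID it receives can then be used to read the Job and locate the result.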
