I/O Types
Various I/O types are available for use with:
inputA
inputB
output
comparison
catalog
With inputs (inputA, inputB, and comparison) you can use all of the available types.
However, with output, only File and Cloud I/O are available.
With catalog, only File and HTTP I/O are available.
HTTP
HTTP or HTTPS URIs can be specified and the comparator will try to load the specified file.
This may involve the downloading of additional entities or other resources that are referenced by the specified file. Examples include:
DOCTYPE DTDs
XML schema files using xsi association
General external XML entities referenced in the file
Resources specified using the XInclude mechanism
multipart/form-data
To use HTTP I/O with multipart/form-data, set the Content-Type to text/plain:
.....
Content-Disposition: form-data; name="inputA"; Content-Type: text/plain;
http://www.example.com/file1.dita
--boundary
In XML this is represented by setting the type attribute on the I/O element to http. The uri child element is then used to set the HTTP or HTTPS URI.
<inputA type="http">
<uri>https://www.example.com/file1.xml</uri>
</inputA>
When specifying ZIP inputs to the Map Topicset comparison, an additional relative path to the master map in the ZIP is required. The masterMap child is then used to identify this master map when processing the input.
<inputA type="http">
<uri>https://www.example.com/inA.zip</uri>
<masterMap>inA/inA.ditamap</masterMap>
</inputA>
In JSON this is represented by setting the type key to http. The uri key is then used to set the HTTP or HTTPS URI.
"inputA": {
"type": "http",
"uri": "https://www.example.com/file1.xml"
}
// FOR ZIP INPUTS
"inputA": {
"type": "http",
"uri": "https://www.example.com/inA.zip",
"masterMap": "inA/inA.ditamap"
}
HTTP Authentication
When content is stored on a server, authentication may be required for access. Authentication data may be provided for comparison inputs in XML and JSON comparison requests, using the authorizationHeader element/value. This header is sent to the server when requesting the inputs. The format of the data depends on the type of authentication used; the header usually consists of a word naming the authentication scheme followed by the actual data. For example, with HTTP basic authentication the username and password are joined with a colon and base64 encoded, and the header line consists of 'Authorization: Basic' followed by a space and then the encoded data.
The authentication mechanism described here is designed to support HTTP basic, digest and OAuth2 authentication. Indeed, any form of authentication that uses the underlying HTTP Authorization header should work. The REST API does not directly provide support for creating base64 encodings of usernames and passwords, or for fetching OAuth2 refresh and access tokens. Standard techniques and code libraries are available for creating the Authorization header data. Here are some examples:
In XML, authentication headers can optionally be added to the inputA, inputB and comparison elements.
XML
<inputA type="http">
<uri>http://www.example.com/file1.xml</uri>
<authorizationHeader>Basic dGVzdyOnNlY3JldA==</authorizationHeader>
</inputA>
An authorizationHeader string value is also available in JSON:
JSON
{
"inputA": {
"type": "http",
"uri": "http://www.example.com/file1.xml",
"authorizationHeader": "Basic dGVzdyOnNlY3JldA=="
}
}
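As a minimal sketch of how the authorizationHeader value for HTTP basic authentication could be produced, the following Python uses only the standard library. The function name basic_authorization_header and the credentials test/secret are illustrative assumptions, not part of the REST API; only the type, uri and authorizationHeader keys come from the examples above.

```python
import base64
import json


def basic_authorization_header(username: str, password: str) -> str:
    """Build an authorizationHeader value for HTTP basic authentication.

    Hypothetical helper: the REST service does not provide this itself.
    """
    # Basic authentication joins username and password with a colon,
    # then base64-encodes the UTF-8 bytes of the result.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Example request body using the keys shown in the JSON example above.
# The username and password are placeholders.
request = {
    "inputA": {
        "type": "http",
        "uri": "http://www.example.com/file1.xml",
        "authorizationHeader": basic_authorization_header("test", "secret"),
    }
}

print(json.dumps(request, indent=2))
```

For digest or OAuth2 authentication the value would be built differently (for example 'Bearer' followed by an access token obtained elsewhere); this sketch covers the basic scheme only.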
HTTP defines the 401 (Unauthorized) status code, which can occur with comparison inputs when:
a resource is accessed that requires authorization, but none is provided
invalid authentication data (such as an incorrect username or password) is provided
Please be aware that in these circumstances the comparison operation provided by the REST service is not itself 'unauthorized', so the service will not return a 401 response but instead a 400 (Bad Request). The returned error data will, however, indicate the 401 response and the resource being accessed, for example:
XML
<error>
<errorMessage>Server returned HTTP response code: 401 for URL: https://www.example.com/file1.xml</errorMessage>
<errorCode>401</errorCode>
</error>
JSON
{
"errorMessage": "Server returned HTTP response code: 401 for URL: https://www.example.com/file1.xml",
"errorCode": 401
}
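Because the service answers with 400 while the body carries the 401, a client may want to look inside the error data before deciding how to react. The following Python is a rough sketch of that check; the function name upstream_error_code is hypothetical, and it assumes the JSON error body has the errorMessage and errorCode keys shown above.

```python
import json
from typing import Optional


def upstream_error_code(http_status: int, body: str) -> Optional[int]:
    """Return the errorCode reported inside a 400 error body, or None.

    Hypothetical helper, assuming the JSON error shape shown above.
    """
    # Only a 400 (Bad Request) response is expected to wrap an input failure.
    if http_status != 400:
        return None
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    code = data.get("errorCode")
    return code if isinstance(code, int) else None


sample_body = json.dumps({
    "errorMessage": "Server returned HTTP response code: 401 for URL: "
                    "https://www.example.com/file1.xml",
    "errorCode": 401,
})

if upstream_error_code(400, sample_body) == 401:
    print("An input requires valid authorization data")
```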
File
This type of I/O is only appropriate when the REST service is being run 'on-premise'. For cloud SaaS use-cases please choose an alternative I/O method.
A file path (on the server) can also be used to specify the input, output, and catalog locations. For example:
multipart/form-data
To use File I/O with multipart/form-data, set the Content-Type to text/plain:
.....
Content-Disposition: form-data; name="inputA"; Content-Type: text/plain;
/Users/exampleUser/Documents/file1.xml
--boundary
In XML this is represented by setting the type attribute on the I/O element to file. The path child element is then used to set the file path on the server.
<inputA type="file">
<path>/Users/exampleUser/Documents/file1.xml</path>
</inputA>
<!-- FOR ZIP INPUTS -->
<inputA type="file">
<path>/Users/exampleUser/Documents/inA.zip</path>
<masterMap>inA/inA.ditamap</masterMap>
</inputA>
In JSON this is represented by setting the type key to file. The path key is then used to set the file path on the server.
"inputA": {
"type": "file",
"path": "/Users/exampleUser/Documents/file1.xml"
}
// FOR ZIP INPUTS
"inputA": {
"type": "file",
"path": "/Users/exampleUser/Documents/inA.zip",
"masterMap": "inA/inA.ditamap"
}
Strings
Raw XML can be used from multipart/form-data. To specify raw XML, use application/xml, text/xml or text/html as the Content-Type of the part, for example:
Content-Type: multipart/form-data; boundary=boundary-id
Content-Length: number_of_bytes_in_entire_request_body
--boundary-id
Content-Disposition: form-data; name="inputA"; Content-Type: application/xml;
<root><a/></root>
--boundary-id
Content-Disposition: form-data; name="inputB"; Content-Type: application/xml;
<root/>
--boundary-id
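A request body like the one above could be assembled programmatically. The sketch below, in Python with the standard library only, assumes the conventional multipart/form-data layout (Content-Disposition and Content-Type as separate part headers, a blank line before the content, CRLF line endings and a closing boundary ending in two hyphens). The function name build_multipart_body is hypothetical, and the exact layout the service accepts should be checked against a working request.

```python
from typing import List, Tuple


def build_multipart_body(parts: List[Tuple[str, str, str]],
                         boundary: str = "boundary-id") -> bytes:
    """Assemble a multipart/form-data body from (name, content type, content).

    Hypothetical helper; assumes the conventional multipart layout.
    """
    lines: List[str] = []
    for name, content_type, content in parts:
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append(f"Content-Type: {content_type}")
        lines.append("")  # blank line separates part headers from content
        lines.append(content)
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


body = build_multipart_body([
    ("inputA", "application/xml", "<root><a/></root>"),
    ("inputB", "application/xml", "<root/>"),
])

# Request headers matching the body; Content-Length is the byte count.
headers = {
    "Content-Type": "multipart/form-data; boundary=boundary-id",
    "Content-Length": str(len(body)),
}
print(body.decode("utf-8"))
```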
Cloud I/O
It is possible to use the REST service with Amazon Web Services (AWS) S3, Microsoft Azure Blob storage and Google Cloud Storage. Other providers may be added in a later release. Contact DeltaXML about your I/O requirements to request access using a different service.
In order to make use of this functionality, you must provide the necessary information for each service and POST a comparison request. For further details on how to generate the necessary information, please refer to the documentation for your chosen provider.
If you do not wish to upload the resultant file to your chosen service's bucket, do not specify the output element, only the input elements. The file can then be downloaded through a GET request on the 'downloads' URI (see Job Model).
Asynchronous comparison is recommended when using Cloud I/O as the REST service needs to make additional HTTP requests to retrieve your data.
AWS S3
XML
<inputA type="aws_s3">
<accessKeyId>your_aws_key_id</accessKeyId>
<secretAccessKey>your_aws_secret_access_key</secretAccessKey>
<region>your_aws_region</region>
<bucket>your_bucket_name</bucket>
<fileName>input_a_file_name</fileName>
</inputA>
JSON
{
"inputA": {
"type": "aws_s3",
"accessKeyId": "your_aws_key_id",
"secretAccessKey": "your_aws_secret_access_key",
"region": "your_aws_region",
"bucket": "your_bucket_name",
"fileName": "input_a_file_name"
}
}
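To illustrate how the output element can be left out so that the result is fetched from the 'downloads' URI instead of being uploaded, here is a small Python sketch that builds the JSON request body. The helper names aws_s3_io and comparison_request are hypothetical, the field values are the placeholders from the example above, and it is an assumption that inputB and output accept the same fields as inputA.

```python
import json
from typing import Optional


def aws_s3_io(access_key_id: str, secret_access_key: str, region: str,
              bucket: str, file_name: str) -> dict:
    """Return an aws_s3 I/O object using the keys from the JSON example."""
    return {
        "type": "aws_s3",
        "accessKeyId": access_key_id,
        "secretAccessKey": secret_access_key,
        "region": region,
        "bucket": bucket,
        "fileName": file_name,
    }


def comparison_request(input_a: dict, input_b: dict,
                       output: Optional[dict] = None) -> dict:
    """Build a request body; omit 'output' to download the result later."""
    request = {"inputA": input_a, "inputB": input_b}
    if output is not None:
        # Assumption: output takes the same aws_s3 fields as an input.
        request["output"] = output
    return request


input_a = aws_s3_io("your_aws_key_id", "your_aws_secret_access_key",
                    "your_aws_region", "your_bucket_name", "input_a_file_name")
input_b = aws_s3_io("your_aws_key_id", "your_aws_secret_access_key",
                    "your_aws_region", "your_bucket_name", "input_b_file_name")

# No output given, so the result would be retrieved via the 'downloads' URI.
print(json.dumps(comparison_request(input_a, input_b), indent=2))
```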
Azure Blob
This connector has been coded to use the Account Key method of authentication. For more information, see Azure's documentation.
XML
<inputA type="azure_blob">
<accountName>your_azure_storage_account_name</accountName>
<accountKey>your_azure_storage_account_key</accountKey>
<container>your_container</container>
<blobName>your_input_a_blob_name</blobName>
</inputA>
JSON
{
"inputA": {
"type": "azure_blob",
"accountName": "your_azure_storage_account_name",
"accountKey": "your_azure_storage_account_key",
"container": "your_container",
"blobName": "your_input_a_blob_name"
}
}
Google Cloud Storage
This connector has been coded to use the Service Account Key method of authentication. For more information, see https://cloud.google.com/docs/authentication/getting-started.
XML
Using default authentication (set by environment variable):
<inputA type="google_cloud">
<bucket>your_bucket_name</bucket>
<fileName>input_a_path</fileName>
</inputA>
OR
By specifying credentials file
<inputA type="google_cloud">
<credentialsFile type="file">
<path>your_credentials_file_path</path>
</credentialsFile>
<bucket>your_bucket_name</bucket>
<fileName>input_a_path</fileName>
</inputA>
JSON
Using default authentication (set by environment variable):
{
"inputA": {
"type": "google_cloud",
"bucket": "your_bucket_name",
"fileName": "input_a_path"
}
}
OR
By specifying credentials file
{
"inputA": {
"type": "google_cloud",
"credentialsFile": {
"type": "file",
"path": "your_credentials_file_path"
},
"bucket": "your_bucket_name",
"fileName": "input_a_path"
}
}