I/O Types

Various I/O types are available for use with:

  • inputA
  • inputB
  • output
  • catalog

With inputs (inputA, inputB, & catalog) you can use all of the available types.

However, with output, only File and Cloud I/O are available.
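The rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the REST API: the helper name check_io_type is hypothetical, and the type strings are the ones used in the type attribute/key of the XML and JSON examples on this page.

```python
# Type identifiers as they appear in the examples on this page.
INPUT_TYPES = {"http", "file", "aws_s3", "azure_blob", "google_cloud"}
# Output accepts File and Cloud I/O only.
OUTPUT_TYPES = {"file", "aws_s3", "azure_blob", "google_cloud"}

INPUT_ROLES = {"inputA", "inputB", "catalog"}


def check_io_type(role: str, io_type: str) -> None:
    """Raise ValueError if io_type cannot be used for role (hypothetical helper)."""
    if role in INPUT_ROLES:
        allowed = INPUT_TYPES
    elif role == "output":
        allowed = OUTPUT_TYPES
    else:
        raise ValueError(f"unknown role: {role!r}")
    if io_type not in allowed:
        raise ValueError(f"type {io_type!r} is not available for {role}")


check_io_type("inputA", "http")    # accepted: inputs may use any type
check_io_type("output", "aws_s3")  # accepted: Cloud I/O is allowed for output
```

Calling check_io_type("output", "http") raises ValueError, because HTTP is an input-only type.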

HTTP 

HTTP or HTTPS URIs can be specified and the comparator will try to load the specified file.
This may involve the downloading of additional entities or other resources that are referenced by the specified file.  Examples include:

  • DOCTYPE dtds
  • XML schema files using xsi association
  • General external XML entities referenced in the file
  • Resources specified using the XInclude mechanism

multipart/form-data

To use HTTP I/O with multipart/form-data, set the Content-Type of the part to text/plain and give the URI as the part content:

.....
Content-Disposition: form-data; name="A"; Content-Type: text/plain;
http://www.example.com/file1.xml
--boundary

In XML this is represented by setting the type attribute on the I/O element to http. The uri child element is then used to set the HTTP or HTTPS URI.

<inputA type="http">
  <uri>https://www.example.com/file1.xml</uri>
</inputA>

In JSON this is represented by setting the type key to http. The uri key is then used to set the HTTP or HTTPS URI.

{
 "inputA": {
   "type": "http",
   "uri": "https://www.example.com/file1.xml"
 }
}

HTTP Authentication

When content is stored on a server, authentication may be required for access. Authentication data can be provided for comparison inputs in XML and JSON comparison requests using the authorizationHeader element/value. This header is sent to the server when the inputs are requested. The format of the data depends on the type of authentication used; the header often consists of a word describing the type of data followed by the actual data. For example, with HTTP basic authentication the colon-separated username and password are base64 encoded, and the header line consists of 'Authorization: Basic', a space, and then the encoded data.

The authentication mechanism described here is designed to support HTTP basic, digest and OAuth2 authentication. Indeed, any form of authentication that uses the underlying HTTP Authorization header should work. The REST API does not itself create base64 encodings of usernames and passwords, or fetch OAuth2 refresh and access tokens. Standard techniques and code libraries are available for creating the Authorization header data. Here are some examples:

In XML authentication headers can optionally be added to the inputA, inputB and catalog elements.

XML
<inputA type="http">
  <uri>http://www.example.com/file1.xml</uri>
  <authorizationHeader>Basic dGVzdDpzZWNyZXQ=</authorizationHeader>
</inputA>

In JSON, authorizationHeader is available as a string value:

JSON
{
 "inputA": {
   "type": "http",
   "uri": "http://www.example.com/file1.xml",
   "authorizationHeader": "Basic dGVzdDpzZWNyZXQ="
 }
}
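Because the REST API does not build the header value itself, the client has to. As a minimal sketch using only the Python standard library, the value for HTTP basic authentication could be produced as follows. The function names basic_authorization_header and http_input are hypothetical; only the type, uri and authorizationHeader keys come from this page.

```python
import base64
import json
from typing import Optional


def basic_authorization_header(username: str, password: str) -> str:
    """Build an HTTP basic authentication value for authorizationHeader."""
    # Basic authentication base64-encodes "username:password".
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def http_input(uri: str, authorization_header: Optional[str] = None) -> dict:
    """Build the JSON object for an HTTP input (hypothetical helper)."""
    io = {"type": "http", "uri": uri}
    if authorization_header is not None:
        io["authorizationHeader"] = authorization_header
    return io


header = basic_authorization_header("test", "secret")
request = {"inputA": http_input("http://www.example.com/file1.xml", header)}
print(json.dumps(request, indent=1))
```

For the username test and password secret, the header value is Basic dGVzdDpzZWNyZXQ=. Other schemes follow the same pattern of a scheme word, a space, and the data, so an OAuth2 access token would typically be passed as Bearer followed by the token.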

HTTP defines the 401 (Unauthorized) status code, and this can occur with comparison inputs when:

  • a resource is accessed that requires authorization, but none is provided
  • invalid authentication data (such as an incorrect username or password) is provided

Please be aware that in these circumstances the comparison operation provided by the REST service is not itself 'unauthorized'. It therefore does not return a 401 response, but a 400 (Bad Request). The returned error data does, however, indicate the 401 response and the resource being accessed, for example:

XML
<errorMessage>
  <errorMessage>Server returned HTTP response code: 401 for URL: https://www.example.com/file1.xml</errorMessage>
  <errorCode>401</errorCode>
  <stackTrace>...</stackTrace>
</errorMessage>

JSON
{
 "errorMessage": "Server returned HTTP response code: 401 for URL: https://www.example.com/file1.xml",
 "errorCode": 401,
 "stackTrace": [...]
}
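A client can use this to tell a rejected input apart from other bad requests. The sketch below assumes the JSON error body has the shape shown above; the function name upstream_unauthorized is hypothetical, and how the status code and body are obtained depends on the HTTP client in use.

```python
import json


def upstream_unauthorized(status_code: int, body: str) -> bool:
    """Return True when a 400 response reports a 401 from the server hosting an input."""
    if status_code != 400:
        return False
    try:
        error = json.loads(body)
    except json.JSONDecodeError:
        # Not a JSON error document, so nothing can be concluded.
        return False
    return isinstance(error, dict) and error.get("errorCode") == 401


# Example body modelled on the JSON error above, with an empty stack trace.
body = json.dumps({
    "errorMessage": "Server returned HTTP response code: 401 for URL: "
                    "https://www.example.com/file1.xml",
    "errorCode": 401,
    "stackTrace": [],
})
print(upstream_unauthorized(400, body))
```

When this returns True, the usual remedy is to supply or correct the authorizationHeader for the input named in errorMessage, rather than to change how the REST service itself is called.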


File 

This type of I/O is only appropriate when the REST service is being run 'on-premise'. For cloud SaaS use-cases please choose an alternative I/O method.

A file path (on the server) can also be used to specify the input, output, and catalog locations. For example:

multipart/form-data

To use File I/O with multipart/form-data, set the Content-Type to text/plain:

.....
Content-Disposition: form-data; name="A"; Content-Type: text/plain;
/Users/exampleUser/Documents/file1.xml
--boundary

In XML this is represented by setting the type attribute on the I/O element to file. The path child element is then used to set the file path on the server.

<inputA type="file">
  <path>/Users/exampleUser/Documents/file1.xml</path>
</inputA>

In JSON this is represented by setting the type key to file. The path key is then used to set the file path on the server.

{
 "inputA": {
   "type": "file",
   "path": "/Users/exampleUser/Documents/file1.xml"
 }
}

Strings

Raw XML can be used from multipart/form-data. To specify raw XML use application/xml, text/xml or text/html as the Content-Type of the part, for example:

Content-Type: multipart/form-data; boundary=boundary-id
Content-Length: number_of_bytes_in_entire_request_body
--boundary-id
Content-Disposition: form-data; name="A"; Content-Type: application/xml;
<root><a/></root>
--boundary-id
Content-Disposition: form-data; name="B"; Content-Type: application/xml;
<root/>
--boundary-id
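As an illustration, a request body of this kind could be assembled by hand as sketched below. This is an assumption-laden sketch: the function name multipart_body is hypothetical, and the code follows the standard multipart/form-data wire layout (each part header on its own line, a blank line before the content, and a closing boundary ending in two hyphens), on the reading that the listing above shows this in condensed form. In practice an HTTP client library would normally build the body.

```python
def multipart_body(parts: dict, boundary: str = "boundary-id") -> bytes:
    """Encode raw XML strings as multipart/form-data parts (hypothetical helper)."""
    lines = []
    for name, xml in parts.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("Content-Type: application/xml")
        lines.append("")  # blank line separates part headers from content
        lines.append(xml)
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    # Multipart bodies use CRLF line endings.
    return "\r\n".join(lines).encode("utf-8")


body = multipart_body({"A": "<root><a/></root>", "B": "<root/>"})
headers = {
    "Content-Type": "multipart/form-data; boundary=boundary-id",
    "Content-Length": str(len(body)),
}
print(body.decode("utf-8"))
```

The part names A and B are the ones used in the example above; Content-Length is computed from the encoded body rather than written by hand.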

Cloud I/O 

The REST service can be used with Amazon Web Services (AWS) S3, Microsoft Azure Blob storage and Google Cloud Storage. Other providers may be added in a later release. Contact DeltaXML about your I/O requirements to request access using a different service.

In order to make use of this functionality you must provide the necessary information for each service and POST a comparison request. For further details on how to generate the necessary information, please refer to the documentation for your chosen provider.

If you do not wish to upload the resultant file to your chosen service's bucket, do not specify the output element, only the input elements. The file can then be downloaded through a GET request on the 'downloads' URI (see Job Model).

Asynchronous comparison is recommended when using Cloud I/O as the REST service needs to make additional HTTP requests to retrieve your data.

AWS S3

XML

<inputA type="aws_s3">
    <accessKeyId>your_aws_key_id</accessKeyId>
    <secretAccessKey>your_aws_secret_access_key</secretAccessKey>
    <region>your_aws_region</region>
    <bucket>your_bucket_name</bucket>
    <fileName>input_a_file_name</fileName>
</inputA>

JSON

{
  "inputA": {
    "type": "aws_s3",
    "accessKeyId": "your_aws_key_id",
    "secretAccessKey": "your_aws_secret_access_key",
    "region": "your_aws_region",
    "bucket": "your_bucket_name",
    "fileName": "input_a_file_name"
  }
}
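To keep credentials out of stored request files, the S3 object can be generated at request time. The following is a sketch under stated assumptions: the helper names s3_io and comparison_request are hypothetical, the environment variable names are the ones conventionally used by AWS tooling (the REST service itself is not stated to read them), and placing inputB and output alongside inputA in the same JSON object is assumed from the structure of the examples.

```python
import os
from typing import Mapping, Optional


def s3_io(bucket: str, file_name: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build an aws_s3 I/O object, reading credentials from env (hypothetical helper)."""
    return {
        "type": "aws_s3",
        "accessKeyId": env["AWS_ACCESS_KEY_ID"],
        "secretAccessKey": env["AWS_SECRET_ACCESS_KEY"],
        "region": env["AWS_DEFAULT_REGION"],
        "bucket": bucket,
        "fileName": file_name,
    }


def comparison_request(input_a: dict, input_b: dict, output: Optional[dict] = None) -> dict:
    """Combine I/O objects; leave output out to download the result instead."""
    request = {"inputA": input_a, "inputB": input_b}
    if output is not None:
        request["output"] = output
    return request


# Placeholder credentials for illustration only.
env = {
    "AWS_ACCESS_KEY_ID": "your_aws_key_id",
    "AWS_SECRET_ACCESS_KEY": "your_aws_secret_access_key",
    "AWS_DEFAULT_REGION": "your_aws_region",
}
request = comparison_request(
    s3_io("your_bucket_name", "input_a_file_name", env),
    s3_io("your_bucket_name", "input_b_file_name", env),
)
```

Passing a third s3_io object as output would upload the result to the bucket; omitting it, as here, leaves the result to be fetched from the 'downloads' URI.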

Azure Blob

This connector has been coded to use the Account Key method of authentication. For more information, see Azure's documentation.

XML

<inputA type="azure_blob">
    <accountName>your_azure_storage_account_name</accountName>
    <accountKey>your_azure_storage_account_key</accountKey>
    <container>your_container</container>
    <blobName>your_input_a_blob_name</blobName>
</inputA>

JSON

{
  "inputA": {
    "type": "azure_blob",
    "accountName": "your_azure_storage_account_name",
    "accountKey": "your_azure_storage_account_key",
    "container": "your_container",
    "blobName": "your_input_a_blob_name"
  }
}

Google Cloud Storage

This connector has been coded to use the Service Account Key method of authentication. For more information, see https://cloud.google.com/docs/authentication/getting-started.

XML

Using default authentication (set by environment variable):
    <inputA type="google_cloud">
        <bucket>your_bucket_name</bucket>
        <fileName>input_a_path</fileName>
    </inputA>



OR

By specifying a credentials file:
    <inputA type="google_cloud">
        <credentialsFile type="file">
            <path>your_credentials_file_path</path>
        </credentialsFile>
        <bucket>your_bucket_name</bucket>
        <fileName>input_a_path</fileName>
    </inputA>

JSON

Using default authentication (set by environment variable):
{
    "inputA": {
        "type": "google_cloud",
        "bucket": "your_bucket_name",
        "fileName": "input_a_path"
    }
}

OR

By specifying a credentials file:
{
    "inputA": {
        "type": "google_cloud",
        "credentialsFile": {
            "type": "file",
            "path": "your_credentials_file_path"
        },
        "bucket": "your_bucket_name",
        "fileName": "input_a_path"
    }
}
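The two variants differ only in the optional credentialsFile object, so one helper can produce both. This is a minimal sketch; the function name google_cloud_io is hypothetical and the keys are taken from the JSON examples above.

```python
import json
from typing import Optional


def google_cloud_io(bucket: str, file_name: str,
                    credentials_path: Optional[str] = None) -> dict:
    """Build a google_cloud I/O object (hypothetical helper).

    Without credentials_path the object relies on default authentication;
    with it, a credentialsFile object of type file is included.
    """
    io = {"type": "google_cloud"}
    if credentials_path is not None:
        io["credentialsFile"] = {"type": "file", "path": credentials_path}
    io["bucket"] = bucket
    io["fileName"] = file_name
    return io


default_auth = {"inputA": google_cloud_io("your_bucket_name", "input_a_path")}
explicit_auth = {
    "inputA": google_cloud_io(
        "your_bucket_name", "input_a_path", "your_credentials_file_path"
    )
}
print(json.dumps(explicit_auth, indent=4))
```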