Guide to JSON Graft


Elevator pitch: JSON Graft applies changes defined in a JSON delta file to any other similar JSON data file.

If you have made/found some changes to one JSON file that you want to apply to another (similar) JSON file, then JSON Graft is what you need.

Another term used for graft is 'cherry pick'. This comes from version control systems where there are related branches and there is a need to cherry pick changes made between two versions in one branch and apply these to the data in another branch. This is not quite the same as a full three-way merge (merging two descendants of a common ancestor), but it is similar.

The JSON delta file (produced by our compare operation) represents a set of changes, or changeset, so applying those changes to a target file performs a graft. By default, all the relevant changes are applied. 'Relevant' here means that the data being changed is present in the target file; a change to data that does not appear in the target file is ignored.

For example, if you have a master file with 500 names and addresses, and a subset with just 50 of them, you can apply changes made to the master file to the subset. In fact you can also apply changes made in the subset to the master file. Similarly, you can take a list of changes to names and phone numbers and apply that to a related list of names, phone numbers and addresses. Or, again, the other way round (and then of course any changes to addresses will be ignored because there are no addresses in the target file).

You could even apply changes made to a list of names of people to a list of zip codes: it will 'work', but no changes will be made!

It doesn't matter if some of the changes have already been made. In fact, it doesn't matter if you apply the same changeset a second time: the result will remain the same (this behaviour is described as 'idempotent').
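As a minimal sketch of these two properties (irrelevant changes are ignored, and applying a changeset twice is harmless), the following Python models a changeset as a plain list of (name, field, new value) tuples. This representation and the function apply_relevant are assumptions made for illustration only; they are not the JSON Graft delta format or API.

```python
import copy


def apply_relevant(changes, target):
    """Apply each change only if the record and field it touches exist in the target."""
    result = copy.deepcopy(target)
    for name, field, new_value in changes:
        record = result.get(name)
        if record is not None and field in record:
            record[field] = new_value
        # otherwise the change is irrelevant to this target and is ignored
    return result


# Hypothetical changes made to a master list of names, phone numbers and addresses.
changes = [
    ("Ann", "phone", "555-0199"),
    ("Ann", "address", "2 New Road"),   # the target holds no addresses
    ("Bob", "phone", "555-0123"),       # the target holds no record for Bob
]

# A related subset that holds only phone numbers.
target = {"Ann": {"phone": "555-0100"}}

once = apply_relevant(changes, target)
twice = apply_relevant(changes, once)

print(once)           # {'Ann': {'phone': '555-0199'}}
print(once == twice)  # True: applying the same changes again alters nothing
```

Only the phone number change for Ann reaches the result, because it is the only change whose data is present in the target.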

There are a few choices you can make, e.g. whether the changeset or the target takes priority, or whether only additions are applied.

Technical Overview

The following diagram depicts a typical graft scenario. There are three successive versions of some data called A, B and C. The compare operation can be applied to two adjacent versions in the sequence, such as A and B, or it could cover a range of versions of the data. In this case versions A and C are compared. There is also a target T1 which will be used as input to the graft and the graft process will create the next, updated version of that file T2 which could be saved as the next version on that branch.

We are comparing files A and C, where C is a newer/later version of A. We now want to apply the same set of changes to a target file T1 to get T2. The principle is:

  • compare A and C to get a delta file delta-AtoC; this is the graft changeset
  • apply delta-AtoC to T1 and the output will be T2
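The two steps above can be sketched in Python for the simplest possible case: flat JSON objects whose members are compared one by one. The functions compare and graft below are toy stand-ins written for this guide, not the product's operations, and the sketch assumes that the changeset takes priority over the target.

```python
MISSING = object()  # sentinel meaning "this member is absent"


def compare(a, c):
    """Toy compare: record (old, new) for every member that differs between a and c."""
    delta = {}
    for key in sorted(a.keys() | c.keys()):
        old = a.get(key, MISSING)
        new = c.get(key, MISSING)
        if old is MISSING or new is MISSING or old != new:
            delta[key] = (old, new)
    return delta


def graft(delta, target):
    """Toy graft: apply each relevant change to a copy of the target."""
    result = dict(target)
    for key, (old, new) in delta.items():
        if new is MISSING:
            result.pop(key, None)   # deletion; nothing to do if already absent
        elif old is MISSING or key in target:
            result[key] = new       # addition, or modification of data the target has
        # otherwise: a modification of data that is not in the target, so it is ignored
    return result


A = {"name": "Ann", "phone": "555-0100", "fax": "555-0101"}
C = {"name": "Ann", "phone": "555-0199", "email": "ann@example.com"}
T1 = {"name": "Ann", "phone": "555-0100", "city": "Leeds"}

delta_AtoC = compare(A, C)
T2 = graft(delta_AtoC, T1)

print(T2)
# {'name': 'Ann', 'phone': '555-0199', 'city': 'Leeds', 'email': 'ann@example.com'}
```

In this example the fax number, which exists only in A, does not appear in T2, while the city, which exists only in T1, is carried through unchanged.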

Two choices need to be made:

  • if a change would override some data in the target, should the change be applied or should the target data be left as it is (see 'overriding change' below)?
  • should all existing data in the target be kept as it is, i.e. should only additions be applied?

The principle is that delta-AtoC contains all the changes, and each one is applied to T1 if it makes sense. So if a change is made to an object in A and there is a corresponding object in T1, then the change is applied. If not, the change is ignored.

Technical Details

JSON has containers (object, array) and leaves (JSON primitive types: string, number, true, false, null). The primitive types are either valueless (true, false, null) or valued (string, number); only valued types have content that can change.

A change is any difference between A and C. Any difference between T1 and T2 will have been driven by a change. However, a change will not always appear as a difference between T1 and T2, because the data in T1 may be the same as C or the data changed may not be in T1.

Any data which is only present in A, i.e. does not exist in C or T1, does not appear in the result. Similarly, any data which is only present in T1, i.e. does not exist in A or C, will appear unchanged in the result.

A change is either an addition to, deletion from or modification of something in A. Additions are new members in objects, new elements in arrays. Deletions are of existing members in objects, existing elements in arrays. Modifications are:

  • changes to the type of a member value or an array element, e.g. object changed to array, null to false, object to true, string to number
  • changes to the value of a valued primitive type, e.g. one string to a different string, or one number to a different number.
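The categories above can be expressed as a small Python sketch. It follows the earlier statement that true, false and null are valueless types, so a change from true to false is treated here as a change of type. The names MISSING, json_type and kind_of_change are hypothetical helpers for illustration, not part of JSON Graft.

```python
MISSING = object()  # sentinel: the member or element is absent


def json_type(value):
    """Return the JSON type name; true, false and null count as separate types."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def kind_of_change(old, new):
    """Name the kind of change from old (in A) to new (in C), or None if there is none here."""
    if old is MISSING and new is MISSING:
        return None
    if old is MISSING:
        return "addition"
    if new is MISSING:
        return "deletion"
    if json_type(old) != json_type(new):
        return "type modification"
    if json_type(old) in ("string", "number") and old != new:
        return "value modification"
    # Same type and either an equal primitive or a container; any changes
    # inside a container are reported against its own members or elements.
    return None


print(kind_of_change(MISSING, {"zip": "12345"}))  # addition
print(kind_of_change("Leeds", MISSING))           # deletion
print(kind_of_change({"a": 1}, [1]))              # type modification
print(kind_of_change(None, False))                # type modification
print(kind_of_change("Leeds", "York"))            # value modification
```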

A change is either simple, identical or overriding.

An identical change is one in which every aspect of the change, including all descendant elements, is identical (equal) in C and T1. Note that if the graft is applied a second time, i.e. to T2, then all the changes will be identical changes. Examples of identical change are the addition of a hierarchy of objects in C where the identical hierarchy already exists in T1 at the same place, or the deletion by C of a member that does not exist in T1.

A simple change is a change made in C where the data in A and T1 is equal, or is not present in either of them. Examples of a simple change are the addition in C of an object (no matter how complex the object is), the deletion in C of a member (no matter how complex the thing held in the member is), or a change from true in A to false in C where the value in T1 is true, i.e. the same value as in A.

Simple or identical changes are non-conflicting.

An overriding change is a change made in C where the data in A and T1 is not equal, so the change in delta-AtoC may override a change already made in the target. An example is a change in the type of a member from string in A to number in C where the type in T1 is neither string nor number. Similarly, if the member is a number in T1 with a different value from that in C, this is an override because A and T1 are not equal. If the value in T1 were equal to that in A, there would be no override, and of course if it were a number equal to that in C this would be an identical change. Where an overriding change occurs, the changeset or target priority is applied to determine the result.
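The three definitions can be summarised as a decision on the values found at one location in A, C and T1. The Python below is a sketch under stated assumptions: the helper names, the priority argument and the "ignored" outcome (one reading of the rule that changes to data absent from the target are ignored) are illustrative and are not the JSON Graft API.

```python
import json

MISSING = object()  # sentinel: nothing exists at this location


def same(x, y):
    """Deep JSON equality that also keeps true distinct from 1."""
    if x is MISSING or y is MISSING:
        return x is y
    return json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True)


def classify(a, c, t1):
    """Classify the change from a to c with respect to the target value t1."""
    if same(c, t1):
        return "identical"    # the target already matches the changed version
    if t1 is MISSING and a is not MISSING:
        return "ignored"      # the changed data is not in the target (assumed reading)
    if same(a, t1):
        return "simple"       # the target still matches the original version
    return "overriding"       # the target differs from both versions


def resolve(a, c, t1, priority="changeset"):
    """Return the value for T2; priority is a hypothetical 'changeset' or 'target'."""
    kind = classify(a, c, t1)
    if kind == "ignored":
        return t1
    if kind == "overriding" and priority == "target":
        return t1
    return c


print(classify(True, False, True))          # simple
print(classify("x", MISSING, MISSING))      # identical
print(classify("5", 5, 5))                  # identical
print(classify("5", 5, 7))                  # overriding
print(resolve("5", 5, 7))                   # 5
print(resolve("5", 5, 7, priority="target"))  # 7
```

Checking for an identical change first matters: in the third example A and T1 differ, yet no override occurs because T1 already holds the value from C.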

Graft Parameters

The graft operation has fewer parameters than the compare operation because, for process consistency, some settings used during compare must also be used during grafting. This is true of the word-by-word and array-priority parameters: these are specified when the changeset is constructed with the compare operation, stored as metadata in the changeset and used during grafting. This currently leaves output mode as the only graft parameter.

How is Graft different from Three-way Merge?

With three-way merge there is a common ancestor, i.e. a file from which the other two files are derived. This means we can detect the changes made in both branches and then merge these according to some set of rules. The rules can get complicated and there can be conflicts between the changes that have been made in the two derived files.

Graft is a bit different, because there is no common ancestor. Therefore there is only the concept of changes made to one branch, which we want to apply to a target. There is no concept of changes that may have been made to the target before we apply the graft; it is simply a target data file. This is important because it means that the target may contain a subset or a superset of the data in the graft (or delta) file: irrelevant changes are simply ignored. So we can have a whole set of related data files and apply changes made to any one of them to any of the others.
