Author’s note: Since publishing this, I’ve learned that the array representation was first proposed in a joint paper by George James and Rob Tweed, A Universal NoSQL Engine, Using a Tried and Tested Technology.
Given the ubiquity of MUMPS in Health IT, and the increasing use of JSON as a data format, there is a need for a simple way of converting between the two formats. But before presenting such a mapping, let us briefly review the two formats.
In MUMPS, arrays are key/value pairs in which keys may be organized hierarchically. Not only are
ROOT(1)="abc" ROOT(2)="def"
and
ROOT("abc")="def"
legal arrays, but so is
ROOT("abc")="d" ROOT("abc",0,12)="ef" ROOT(4,"def")=1.1
There are a few things to notice here. For one thing, numeric and string subscripts may be freely intermixed. In fact, the language itself doesn’t distinguish between 1 and "1". We could have quoted the numbers appearing in the above array and the semantics would have been the same. The next thing to notice is that values may be associated with array nodes at any subscript level, not just the deepest one, although no node is required to have a value.
By contrast, JSON objects are key/value pairs, the keys of which are strings. The values may be strings, numbers, arrays, other objects, or the special values true, false, and null. For a full description of the format, including railroad diagrams for the syntax, see ECMA-404.
Just keep in mind that
- JSON objects are key/value pairs enclosed in curly braces ({}).
- The keys must be strings, but values may be primitives, arrays, or other objects.
- Arrays are sequences of values enclosed in square brackets ([]). Examples include [1, 2, 3], [1, "two", 3] and [].
Now, let’s consider how a JSON object might be encoded in MUMPS. The following approach appears to be folklore, but it is described by Rob Tweed on his blog, The EWD Files, in JSON – Interfacing VistA (and Other Legacy MUMPS Systems). The idea is to use the fact that MUMPS allows strings as subscripts, and represent a JSON object as an array having, as its subscripts, the keys of the object. For example, we would represent
{ "one": 1, "two": 2, "three": 3 }
as
ROOT("one")=1 ROOT("two")=2 ROOT("three")=3
If an object has other objects as values, we can then just add a layer of subscripts. For example,
ROOT("data","one")=1 ROOT("data","two")=2
would represent
{"data": {"one": 1, "two": 2} }
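The object-to-subscript mapping described above can be sketched in a few lines of Python. This is only an illustration, not code from the post or from EWD: the names `quote`, `flatten`, and `to_mumps_lines` are hypothetical, the top-level value is assumed to be an object, and only nested objects with string or number leaves are handled (arrays are discussed next).

```python
import json

def quote(text):
    """Quote a string the MUMPS way: wrap in double quotes, doubling any inside."""
    return '"' + text.replace('"', '""') + '"'

def flatten(value, prefix=()):
    """Yield (subscripts, leaf) pairs for a nested object made of dicts."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Each object key becomes one more subscript level.
            yield from flatten(child, prefix + (key,))
    else:
        yield prefix, value

def to_mumps_lines(obj, root="ROOT"):
    """Render each leaf as a MUMPS-style assignment such as ROOT("a","b")=1."""
    lines = []
    for subs, val in flatten(obj):
        subscripts = ",".join(quote(s) for s in subs)
        rhs = quote(val) if isinstance(val, str) else str(val)
        lines.append(f"{root}({subscripts})={rhs}")
    return lines

doc = json.loads('{"data": {"one": 1, "two": 2}}')
nodes = to_mumps_lines(doc)
print("\n".join(nodes))
# ROOT("data","one")=1
# ROOT("data","two")=2
```

Note that an empty object produces no nodes at all in this sketch, so it would not survive a round trip.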
But what about arrays? An obvious idea is to use numeric subscripts. For example,
ROOT("data",1)=1 ROOT("data",2)=2 ROOT("data",3)=3
would represent
{"data": [1, 2, 3] }
Unfortunately, there is a problem. If we were to write
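ROOT("data","1")=1 ROOT("data","2")=2 ROOT("data","3")=3
we would probably want it to be interpreted as
{"data": { "1": 1, "2": 2, "3": 3 } }
(However unnatural it is to write JSON this way.) But since MUMPS doesn’t distinguish between 1 and "1", these are exactly the same nodes as in the previous example, so there is no way to tell whether an array or an object was intended.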
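To make the ambiguity concrete, here is a small Python model of the problem. It is a simplified assumption about MUMPS behavior, not a full account: `canon` and `set_node` are hypothetical helpers, and only canonical integer strings are folded into numbers. Under that model, the array and the object produce identical nodes.

```python
import re

# Strings that look like canonical integers: no leading zeros, no "-0".
_CANONICAL_INT = re.compile(r"^(0|-?[1-9][0-9]*)$")

def canon(sub):
    """Model MUMPS treating "1" and 1 as the same subscript (integers only)."""
    if isinstance(sub, str) and _CANONICAL_INT.match(sub):
        return int(sub)
    return sub

def set_node(store, subs, value):
    """Store a value under a tuple of canonicalized subscripts."""
    store[tuple(canon(s) for s in subs)] = value

# Nodes written for {"data": [1, 2, 3]} using numeric subscripts.
from_array = {}
for i, v in enumerate([1, 2, 3], start=1):
    set_node(from_array, ("data", i), v)

# Nodes written for {"data": {"1": 1, "2": 2, "3": 3}} using string subscripts.
from_object = {}
for k, v in {"1": 1, "2": 2, "3": 3}.items():
    set_node(from_object, ("data", k), v)

same = from_array == from_object
print(same)  # True: a decoder cannot tell the two apart
```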
As an aside, in JavaScript, indexing with square brackets is equivalent to using attributes, but JSON is not JavaScript.
One possible solution is to add a new node that identifies arrays as arrays and objects as objects. In particular, we could write
ROOT("data",0)="0^array" ROOT("data",1)=1 ROOT("data",2)=2 ROOT("data",3)=3
and
ROOT("data",0)="1^object" ROOT("data","1")=1 ROOT("data","2")=2 ROOT("data","3")=3
This is a bit ugly, but it removes the ambiguity.
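A round trip using the marker node might look like the following Python sketch. Several things here are assumptions made for illustration: the `encode` and `decode` names, the choice to mark every container including the root, and the decision to reject an object key of "0" because it would collide with the marker at subscript 0.

```python
import re

ARRAY, OBJECT = "0^array", "1^object"
_INT = re.compile(r"^(0|[1-9][0-9]*)$")

def encode(value, store, prefix=()):
    """Write value into store (a dict keyed by subscript tuples), adding markers."""
    if isinstance(value, list):
        store[prefix + (0,)] = ARRAY
        for i, item in enumerate(value, start=1):
            encode(item, store, prefix + (i,))
    elif isinstance(value, dict):
        store[prefix + (0,)] = OBJECT
        for key, item in value.items():
            # Model MUMPS: "1" and 1 are the same subscript.
            sub = int(key) if _INT.match(key) else key
            if sub == 0:
                raise ValueError('key "0" collides with the marker node')
            encode(item, store, prefix + (sub,))
    else:
        store[prefix] = value

def decode(store, prefix=()):
    """Rebuild the JSON value rooted at prefix, using the marker to pick the type."""
    marker = store.get(prefix + (0,))
    if marker is None:
        return store[prefix]  # a leaf value
    depth = len(prefix)
    # Distinct child subscripts directly below prefix, skipping the marker itself.
    children = list(dict.fromkeys(
        k[depth] for k in store
        if len(k) > depth and k[:depth] == prefix and k[depth] != 0))
    if marker == ARRAY:
        return [decode(store, prefix + (i,)) for i in sorted(children)]
    return {str(c): decode(store, prefix + (c,)) for c in children}

doc = {"data": [1, 2, 3], "other": {"1": 1, "2": 2}}
store = {}
encode(doc, store)
restored = decode(store)
print(restored == doc)  # True
```

With the marker in place, `store[("data", 0)]` is "0^array" and `store[("other", 0)]` is "1^object", which is what lets the decoder return a list for one and an object with string keys for the other.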
I’m not sure what to make of my working JavaScript abstraction of Global Storage being described as “The following approach appears to be folklore”. It’s being used in real-world production environments. The latest evolution of this approach is ewd-document-store: http://gradvs1.mgateway.com/download/ewd-document-store.pdf, part of the EWD 3 suite of modules. It ain’t no fairy tale.
Mapping of arrays is a pretty trivial issue in my opinion, but you’re right: it’s asymmetric – JavaScript has a formally-implemented array type. Mumps arrays are just a notional convention using consecutive integer subscripts, so there are all kinds of problems if you try to do a symmetric conversion between the two.
Think the other way around (ie instead of being Mumps-centric) and it gets much more interesting – for a JavaScript developer, a Global Storage database turns out to be an interesting beast as it can act as a fine-grained document database and support persistent JavaScript objects: there’s no equivalent in the NoSQL database world that I’m aware of. That’s the concept behind ewd-document-store. The problem is that almost nobody in the JavaScript world has ever heard of Mumps.
Sorry Rob. All I meant is that I can’t definitively trace the origin of the idea of using MUMPS arrays. It may be that EWD was first here. In any case, I certainly did not mean it to be disparaging.
None taken, Greg 🙂
I can tell you categorically where the idea first appeared: in this joint paper in 2010 by George James and me: http://mgateway.com/docs/universalNoSQL.pdf. See the section on Document Databases. To my knowledge, my EWD implementation was the first to demonstrate and productise the concept.
I’ve added an author’s note to that effect.
David Wicksell wrote a blog post about the issue of round-trip encoding of MUMPS data structures into JSON data structures at:
http://fwslc.blogspot.com/
See also the explanation here of how the ewd-document-store Document Database abstraction of Mumps globals works: http://www.slideshare.net/robtweed/ewd-3-training-course-part-25-document-database-capabilities