[HN Gopher] Invisible XML is a language for describing the impli...
___________________________________________________________________
Invisible XML is a language for describing the implicit structure
of data
Author : bryanrasmussen
Score : 46 points
Date : 2022-07-16 17:42 UTC (5 hours ago)
(HTM) web link (invisiblexml.org)
(TXT) w3m dump (invisiblexml.org)
| adamretter wrote:
| I think one of the interesting points missed here, is that if you
| can convert to XML then you can use the large selection of mature
| tooling around XML to query, transform, and process your data.
| solardev wrote:
| Now you can export your SQL dump to YAML, parse it to XML,
| convert it to JSON, put it in NoSQL, messagepack it into a key-
| value store and finally be able to query your customer names!
| ironhaven wrote:
| > Invisible XML (ixml) is a method for treating non-XML documents
| > as if they were XML, enabling authors to write documents and
| > data in a format they prefer while providing XML for processes
| > that are more effective with XML content.
|
| Interesting although this seems a little out of date because xml
| seems to be one of the least desirable data formats for modern
| programming languages. Maybe this is more useful for legacy
| enterprise use cases that I don't know about.
| lucideer wrote:
| The popularity of the target isn't really as relevant here as
| the capability of the target. XML supports annotated trees
| (attributes + child nodes) whereas most popular modern
| interchange formats only support one of those dimensions (child
| nodes). Some of them do support types that xml lacks (integers,
| null, etc.) but these can be annotated in xml so the lack isn't
| critical.
|
| Ultimately what all that means is that all e.g. json documents
| can be represented losslessly in xml, whereas the reverse is
| not true without explicit external schema. Which means
| targeting XML covers other less capable interchange formats
| implicitly.
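
The claim above, that JSON can be carried in XML by annotating types, can be sketched roughly as follows. This is a minimal illustration, not a standard mapping: the element names `value`, `item`, and `entry` and the attributes `type` and `key` are assumptions made up for this example, and strings containing characters that XML 1.0 forbids (most control characters) are not handled.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, tag="value"):
    """Encode a JSON value as an element; a 'type' attribute records the JSON type."""
    elem = ET.Element(tag)
    if value is None:
        elem.set("type", "null")
    elif isinstance(value, bool):  # test bool before int: bool is a subclass of int
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = json.dumps(value)
    elif isinstance(value, str):
        elem.set("type", "string")
        elem.text = value
    elif isinstance(value, list):
        elem.set("type", "array")
        for item in value:
            elem.append(json_to_xml(item, "item"))
    elif isinstance(value, dict):
        elem.set("type", "object")
        for key, item in value.items():
            child = json_to_xml(item, "entry")
            child.set("key", key)  # keys live in an attribute, so any string is allowed
            elem.append(child)
    else:
        raise TypeError(f"not a JSON value: {value!r}")
    return elem


def xml_to_json(elem):
    """Invert json_to_xml by dispatching on the 'type' annotation."""
    kind = elem.get("type")
    if kind == "null":
        return None
    if kind == "boolean":
        return elem.text == "true"
    if kind == "number":
        return json.loads(elem.text)
    if kind == "string":
        return elem.text or ""  # an empty element parses back with text None
    if kind == "array":
        return [xml_to_json(child) for child in elem]
    if kind == "object":
        return {child.get("key"): xml_to_json(child) for child in elem}
    raise ValueError(f"unknown type annotation: {kind!r}")
```

Without the `type` attribute, the text `36` could be read back as either a number or a string; the annotation is what makes the round trip unambiguous.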
| zapperdulchen wrote:
| There are XML workflows in technical documentation outside of
| software development where something like this could be of
| interest.
| tokinonagare wrote:
| XML still has a few advantages over JSON: comments, namespaces, a
| canonical form and slightly better extensibility.
| Dylan16807 wrote:
| Surely you could lay out a canonical form for JSON in like
| five minutes?
|
| Oh, here: https://www.rfc-editor.org/rfc/rfc8785
|
| No duplicates, no whitespace, sort the keys, copy number
| serialization from javascript, a couple other little details.
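
The rules listed above (no whitespace, sorted keys) can be approximated with Python's standard `json` module. This is only a sketch and not a conforming RFC 8785 implementation: Python's number formatting differs from the JavaScript serialization the RFC requires, and the RFC sorts keys by UTF-16 code units while `sort_keys` compares Python strings by code point. The function name `canonical_json` is made up for this example.

```python
import json


def canonical_json(value):
    """Serialize with sorted keys and no insignificant whitespace.

    An approximation of a canonical form, not RFC 8785 itself.
    """
    return json.dumps(
        value,
        sort_keys=True,          # deterministic key order
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # emit non-ASCII characters as-is
        allow_nan=False,         # NaN and Infinity are not valid JSON
    )
```

Two objects that differ only in key order serialize to the same string, which is what makes the output usable for hashing or signing.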
| tartoran wrote:
| At the expense of excessive verbosity. I used xml and still
| do to this day and it is one of my least favorite formats to
| read and edit.
| int_19h wrote:
| And also data query & transformation languages like XSLT and
| XQuery that are designed around its data model.
| mpyne wrote:
| JSON either has those (namespaces, extensibility) or the
| advantage is slight at best (comments, canonical form).
|
| There is a real advantage of XML over JSON not mentioned
| though, which is its usefulness in annotating computer-
| readable data into an otherwise human-editable document.
| There's not a _lot_ of these cases, and where they're at
| you're probably still better off using AsciiDoc, Markdown or
| even HTML instead, but those use cases are out there and JSON
| is awful for those.
| mcswell wrote:
| "the advantage is slight at best (comments...)" I suppose
| that's true if the data is generated by machine and
| consumed by machine, and never edited or looked at by
| humans. But JSON seems to be used in a lot of datasets that
| humans do touch, like config files that you can edit.
| Visual Studio Code is one such user, and I've had to edit
| my own VSC config files. And I use comments, because
| http://catb.org/~esr/writings/unix-koans/prodigy.html.
| Fortunately, VSC allows comments between a // and a
| following newline.
| gavinray wrote:
| The fact that it's XML is not super relevant -- what you're
| working with in code is a tree structure that may as well be
| JSON or YAML
|
| This is essentially a way to write grammars for things and get
| the ability to parse them as trees in a common format that is
| interchangeable with things like JSON.
|
| I honestly don't see much of a difference between this, and
| something like a PEG grammar where you do this:
|     let parsed = peg.parse(input, grammar)
|     let xml = json2xml(parsed)
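
The two-step pipeline described above (parse the input into a tree, then serialize that tree as XML) might look roughly like this in Python. It is a sketch under stated assumptions: a regular expression stands in for a real grammar, it uses neither an ixml processor nor a PEG library, and the names `parse_date`, `tree_to_xml`, and `date_to_xml` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Stand-in "grammar": an ISO-style date with three named parts.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_date(text):
    """Step 1: turn the input into a (name, children-or-text) tree."""
    match = DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a date: {text!r}")
    parts = [(name, match.group(name)) for name in ("year", "month", "day")]
    return ("date", parts)


def tree_to_xml(node):
    """Step 2: map each tree node to an element, recursing into children."""
    name, content = node
    elem = ET.Element(name)
    if isinstance(content, str):
        elem.text = content
    else:
        for child in content:
            elem.append(tree_to_xml(child))
    return elem


def date_to_xml(text):
    return ET.tostring(tree_to_xml(parse_date(text)), encoding="unicode")


print(date_to_xml("2022-07-16"))
# <date><year>2022</year><month>07</month><day>16</day></date>
```

The intermediate tree is format-neutral; only `tree_to_xml` knows about XML, so a JSON or YAML serializer could be swapped in at that step.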
| tannhaeuser wrote:
| Hmm, on the one hand this proposal makes XML come full-circle and
| re-introduce SGML concepts (eg. SHORTREF) that were explicitly
| omitted from XML as a simplified SGML subset for canonical angle-
| bracket markup without the need for markup declarations; OTOH,
| Norm and Steve are fully aware of SGML. I'd really appreciate it if
| whoever wants to re-introduce SGML features to XML would justify
| and align their proposal with SGML, just as XML has been
| introduced as a proper, well-aligned subset of SGML.
| pshc wrote:
| I see you're trying to sneak XML into common use again. Haven't
| we already suffered enough?
| mcswell wrote:
| I suppose you think JSON is better.
| oofbey wrote:
| I think the kids who never really used XML might find it quaint
| and charming.
| PaulHoule wrote:
| This looks like a parser generator that makes an object tree in
| the form of an XML document.
| secondcoming wrote:
| +1 XML is to data what UML is to software
| quesomaster9000 wrote:
| I was going to say the same, it looks like an (E)BNF to AST
| parser which outputs XML.
|
| My only quandary would be whether the output XML structure
| could be ambiguous given the parse tree and input (requiring
| lots of context-dependent if/then logic when interpreting the
| XML). Perhaps some kind of invisible XML stylesheet could pre-
| process the AST before outputting the XML.
|
| And secondly, can it handle CSV? If so, along with a command-
| line app like `jq` it could be an extremely useful addition to
| the general purpose data munging toolkit. Or do I have to pass
| the input through a 1000+ byte `sed` script first to normalize
| it.
___________________________________________________________________
(page generated 2022-07-16 23:00 UTC)