[HN Gopher] TOON - Token Oriented Object Notation
___________________________________________________________________
TOON - Token Oriented Object Notation
Author : royosherove
Score : 55 points
Date   : 2025-10-26 22:19 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| anonymoushn wrote:
| Hello, it's probably better to add leading spaces before all of
| the words rather than none of them
| meander_water wrote:
| I don't get it, can't you just use YAML instead of inventing
| another DSL?
| mhosayny wrote:
| It's more compact than YAML. More like a combination of YAML
| and CSV.
| jscheel wrote:
| For repeating objects of the same structure, yaml will still
| require each key on each object, whereas this is a hybrid with
| csv, so it defines the keys once.
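To make the YAML/CSV hybrid concrete, here is a rough sketch (my own illustration, not the reference encoder; the exact TOON grammar may differ in details) of encoding a uniform array of objects with the keys declared once, CSV-style:

```python
import json

def toonish(key, rows):
    """Encode a uniform list of dicts with the keys declared once,
    roughly in the tabular style shown in the TOON README."""
    fields = list(rows[0])  # assume every row shares the same keys
    lines = [f"{key}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(toonish("users", users))
# users[2]{id,name}:
#   1,Alice
#   2,Bob

# The keys appear once instead of once per object, so the encoding
# is shorter than the equivalent JSON for repeated structures.
print(len(toonish("users", users)) < len(json.dumps(users)))  # True
```

The saving grows with the number of rows, since JSON repeats every key per object while the header line here is paid for once.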
| inopinatus wrote:
| Norway.
| dragonwriter wrote:
| YAML 1.2 has been out for 16 years now, so I would simply not
| assume that the suggestion to use YAML for a new purpose
| means "use YAML 1.1".
| inopinatus wrote:
| I could agree that you would not make poor assumptions.
|
| Your LLM, however, may experience cross-format feature
| superposition and consequential spurious activation.
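For anyone missing the reference: this is the YAML 1.1 "Norway problem", where unquoted scalars like `no` resolve to booleans. A minimal pure-Python sketch of the 1.1 boolean resolution rule (table abbreviated from the spec; real parsers use a regex covering case variants):

```python
# YAML 1.1 resolves these plain scalars to booleans; YAML 1.2's core
# schema narrows the set to true/false, which fixes the Norway problem.
YAML_11_BOOLS = {
    "yes": True, "no": False, "on": True, "off": False,
    "true": True, "false": False, "y": True, "n": False,
}

def resolve_scalar_yaml11(s):
    """Return the boolean a YAML 1.1 plain scalar resolves to, else the string."""
    return YAML_11_BOOLS.get(s.lower(), s)

resolve_scalar_yaml11("no")      # False -- the country code for Norway vanishes
resolve_scalar_yaml11("Norway")  # 'Norway' -- unaffected
```

So `countries: [gb, no, se]` silently becomes `[gb, False, se]` under a 1.1 parser, which is the failure mode being joked about.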
| flyer23 wrote:
| It is, also no one uses it :)
| vessenes wrote:
| I'll be interested to see benchmarks. My expectation is that
| accuracy will take a hit on mid or longer context prompts: I'd
| bet that the heavy use of JSON in fine tuning will end up
| impacting quality of a more terse (less reasoning space) novel
| encoding.
|
| That said: I like the idea!
| brian-bk wrote:
| There are some very light benchmarks in the Readme, or are you
| looking for more?
| Mumps wrote:
| Do you mean the [0] Token Benchmarks section? I only see
| token count numbers.
|
| Which doesn't address the question: do LLMs understand TOON
| as well as they do JSON? It's quite likely that most LLMs
| don't interpret this notation the way they would JSON. So
| benchmarks on, say, data processing tasks would be
| warranted.
|
| [0] https://github.com/johannschopplich/toon?tab=readme-ov-
| file#...
| tujux wrote:
| I think they're talking about these sections:
|
| 1. Retrieval Accuracy -
| https://github.com/johannschopplich/toon?tab=readme-ov-
| file#...
|
| 2. Performance by dataset -
| https://github.com/johannschopplich/toon?tab=readme-ov-
| file#...
| moralestapia wrote:
| [flagged]
| jayd16 wrote:
| I'm not sure which one would win, but it's a bit telling that
| compression isn't mentioned at all.
|
| I guess it's about LLMs, so the idea is it has to be plaintext?
| But if you can train it on TOON, can't you train it on BSON?
| inopinatus wrote:
| JSON unmarshalling often has to consider separately whether an
| attribute is absent, false, zero, null, or the empty string, but
| this was never quite semantically ambiguous enough for my tastes,
| so adding that void-ish values may also now be serialised as a
| tuple of length [0] seems to me an excellent additional
| obfuscation.
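The absent/false/zero/null/empty-string distinction being mocked here is real in JSON unmarshalling. A small Python illustration (hypothetical field name, my own sketch) using a sentinel to keep "key absent" separate from "key explicitly null":

```python
import json

_MISSING = object()  # sentinel: distinguishes an absent key from an explicit null

def describe(payload, key):
    """Classify how `key` appears in a JSON object: absent, null, or a value."""
    value = json.loads(payload).get(key, _MISSING)
    if value is _MISSING:
        return "absent"
    if value is None:
        return "null"
    return f"present: {value!r}"

describe('{}', "flag")               # 'absent'
describe('{"flag": null}', "flag")   # 'null'
describe('{"flag": false}', "flag")  # "present: False"
describe('{"flag": ""}', "flag")     # "present: ''"
```

Each of the five void-ish cases (absent, null, false, zero, empty string) round-trips distinctly in JSON; a serialization that folds any of them together, or adds a new empty-tuple spelling, forces consumers to handle yet another case.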
| joshribakoff wrote:
| The use case here is to reduce the token usage with LLMs, such
| as an agent that outputs a list of commands eg. Tuples with
| files to write and their new contents.
|
| Supporting this use case doesn't require perfectly marshaling
| every data structure ever.
|
| But to your point the tool could have wider use cases without
| the limitations.
| inopinatus wrote:
| If one trains a model to understand it then that model will
| inevitably emit it, which means in turn one shall have to
| parse it, and now the application supports TOON for anything,
| and good luck telling the users/customers any different.
| Pxtl wrote:
| I'm sorry, I don't see this adding value over various other
| formats. I don't really _want_ a new object serialization format,
| I just want the existing ones to have the features I need. YAML
| but with static typing and schema. XML but without crazy internet
| features. TOML but with an object format that doesn't hurt my
| brain. JSON but with decent multiline strings and comments.
| NestedText but with a sub-standard that provides static-typing
| and schema and whatnot.
| hedgehog wrote:
| It would be interesting to compare this to BAML and TOML.
___________________________________________________________________
(page generated 2025-10-27 23:00 UTC)