[HN Gopher] TOON - Token Oriented Object Notation
       ___________________________________________________________________
        
       TOON - Token Oriented Object Notation
        
       Author : royosherove
       Score  : 55 points
       Date   : 2025-10-26 22:19 UTC (1 day ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | anonymoushn wrote:
       | Hello, it's probably better to add leading spaces before all of
       | the words rather than none of them
        
       | meander_water wrote:
       | I don't get it, can't you just use yaml instead of inventing
       | another DSL?
        
         | mhosayny wrote:
         | It's more compact than YAML. More like a combination of YAML
         | and CSV.
        
         | jscheel wrote:
         | For repeating objects of the same structure, yaml will still
         | require each key on each object, whereas this is a hybrid with
         | csv, so it defines the keys once.
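A rough sketch of the savings being described, in Python. The TOON tabular syntax here (a `key[count]{fields}:` header followed by bare comma-separated rows) is approximated from the thread, not taken from the spec:

```python
import json

# Sample data: three objects with identical keys, the repeating-object
# case the comment describes.
users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Carol", "role": "user"},
]

def to_toon_table(key, rows):
    # Header declares the key list once; each row is then a bare
    # comma-separated line (hypothetical encoder, syntax approximated).
    fields = list(rows[0])
    lines = [f"{key}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

json_text = json.dumps({"users": users})
toon_text = to_toon_table("users", users)
print(toon_text)
print(f"JSON: {len(json_text)} chars, TOON-ish: {len(toon_text)} chars")
```

The character (and hence roughly token) savings grow with the number of rows, since JSON repeats every key per object while the tabular form pays for the keys once.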
        
         | inopinatus wrote:
         | Norway.
        
           | dragonwriter wrote:
           | YAML 1.2 has been out for 16 years now, so I would simply not
           | assume that the suggestion to use YAML for a new purpose
           | means "use YAML 1.1".
        
             | inopinatus wrote:
             | I could agree that you would not make poor assumptions.
             | 
             | Your LLM, however, may experience cross-format feature
             | superposition and consequential spurious activation.
        
             | flyer23 wrote:
             | It is, also no one uses it :)
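The allusion in this subthread is the "Norway problem": YAML 1.1's boolean rules treat several unquoted scalars (`no`, `NO`, `off`, `y`, ...) as booleans, and widely used loaders such as PyYAML still default to that behaviour. A minimal illustration:

```yaml
# Under YAML 1.1 boolean resolution (still the default in PyYAML's
# safe_load), this country code silently becomes a boolean:
country: NO          # loaded as false, not the string "NO"
country_quoted: "NO" # quoting keeps it the string "NO"
# YAML 1.2 dropped the extended boolean set, which is dragonwriter's
# point above -- but only if your parser actually implements 1.2.
```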
        
       | vessenes wrote:
       | I'll be interested to see benchmarks. My expectation is that
       | accuracy will take a hit on mid or longer context prompts: I'd
       | bet that the heavy use of JSON in fine tuning will end up
       | impacting quality of a more terse (less reasoning space) novel
       | encoding.
       | 
       | That said: I like the idea!
        
         | brian-bk wrote:
         | There are very light benchmarks in the Readme, or are you
         | looking for more?
        
           | Mumps wrote:
           | Do you mean the [0] Token Benchmarks section? I only see
           | token count numbers.
           | 
           | Which doesn't address the question: do LLMs understand TOON
           | the same as they would JSON? It's quite likely that most
           | LLMs don't interpret this notation the same way they would
           | JSON, so benchmarks on, say, data processing tasks would be
           | warranted.
           | 
           | [0] https://github.com/johannschopplich/toon?tab=readme-ov-
           | file#...
        
             | tujux wrote:
             | I think they're talking about these sections:
             | 
             | 1. Retrieval Accuracy -
             | https://github.com/johannschopplich/toon?tab=readme-ov-
             | file#...
             | 
             | 2. Performance by dataset -
             | https://github.com/johannschopplich/toon?tab=readme-ov-
             | file#...
        
       | moralestapia wrote:
       | [flagged]
        
         | jayd16 wrote:
         | I'm not sure which one would win, but it's a bit telling that
         | compression isn't mentioned at all.
         | 
         | I guess it's about LLMs, so the idea is it has to be
         | plaintext? But if you can train it on TOON, can't you train it
         | on BSON?
        
       | inopinatus wrote:
       | JSON unmarshalling often has to consider separately whether an
       | attribute is absent, false, zero, null, or the empty string, but
       | this was never quite semantically ambiguous enough for my tastes,
       | so adding that void-ish values may also now be serialised as a
       | tuple of length [0] seems to me an excellent additional
       | obfuscation.
        
         | joshribakoff wrote:
         | The use case here is to reduce the token usage with LLMs, such
         | as an agent that outputs a list of commands eg. Tuples with
         | files to write and their new contents.
         | 
         | Supporting this use case doesn't require perfectly marshaling
         | every data structure ever.
         | 
         | But to your point the tool could have wider use cases without
         | the limitations.
        
           | inopinatus wrote:
           | If one trains a model to understand it then that model will
           | inevitably emit it, which means in turn one shall have to
           | parse it, and now the application supports TOON for anything,
           | and good luck telling the users/customers any different.
        
       | Pxtl wrote:
       | I'm sorry I don't see this adding value over various other
       | formats. I don't really _want_ a new object serialization format,
       | I just want the existing ones to have the features I need. YAML
       | but with static typing and schema. XML but without crazy internet
       | features. TOML but with an object format that doesn't hurt my
       | brain. JSON but with decent multiline strings and comments.
       | NestedText but with a sub-standard that provides static-typing
       | and schema and whatnot.
        
       | hedgehog wrote:
       | It would be interesting to compare this to BAML and TOML.
        
       ___________________________________________________________________
       (page generated 2025-10-27 23:00 UTC)