[HN Gopher] Show HN: MTXT - Music Text Format
       ___________________________________________________________________
        
       Show HN: MTXT - Music Text Format
        
       Author : daninet
       Score  : 98 points
       Date   : 2025-11-30 10:21 UTC (4 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | yaoke259 wrote:
       | pretty cool!
        
       | 1313ed01 wrote:
       | I like the idea overall. Looks like something that would be fun
       | to combine with music programming languages (SuperCollider/Of
       | etc).
       | 
       | Not so sure how human-friendly the fractional beats are? Is that
       | something that people more into music than I am are comfortable
       | with? I would have expected something like MIDI's "24 ticks per
       | quarter note" instead. And a format like bar.beat.tick. Maybe
       | just because that is what I am used to.
        
         | bonzini wrote:
         | It should be fine, but fractions (or both fractions and
         | decimals) would be preferable in order to express triplets (3
         | over 2, effectively a duration of 0.3333...)
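
A minimal Python sketch of the triplet point above. A triplet eighth lasts one third of a beat, which a truncated decimal only approximates, while an exact fraction (or a tick grid divisible by three) represents it exactly. The 24 ticks-per-quarter figure comes from the parent comment; the helper name `beats_to_ticks` is made up for illustration and is not part of MTXT.

```python
from fractions import Fraction

TICKS_PER_QUARTER = 24  # resolution quoted in the parent comment


def beats_to_ticks(beats: Fraction, ppq: int = TICKS_PER_QUARTER) -> Fraction:
    """Convert a beat position to ticks (hypothetical helper, not MTXT API)."""
    return beats * ppq


triplet = Fraction(1, 3)      # one triplet eighth = 1/3 of a beat
exact_sum = triplet * 3       # three exact thirds add up to one beat
decimal_sum = 0.3333 * 3      # a truncated decimal falls short of one beat

triplet_ticks = beats_to_ticks(triplet)  # lands on a whole tick at 24 PPQ
```

With fractions, three triplets add up to exactly one beat, and at 24 ticks per quarter note each triplet is a whole number of ticks.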
        
         | daninet wrote:
         | The library has MIT license, I would be more than happy to see
         | people use it in different synths.
         | 
         | I'm planning to add support for math formulas in beat numbers,
         | something like: "15+/3+/4" = 15.58333
        
           | soperj wrote:
           | > "15+/3+/4"
           | 
           | Can you explain how to read that? 15 plus divided by 3 plus
           | divided by 4?
        
             | daninet wrote:
             | It's a shorthand for 15 + (1/3) + (1/4), but I'm still not
             | settled on the syntax.
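
A rough Python sketch of how the shorthand described above could be evaluated, assuming terms are separated by `+` and a term written `/n` means `1/n`. The author says the syntax is not settled, so this is one possible reading rather than MTXT's actual parser; the function name `parse_beat` is hypothetical.

```python
from fractions import Fraction


def parse_beat(expr: str) -> Fraction:
    """Sum '+'-separated terms; a term of the form '/n' is read as 1/n."""
    total = Fraction(0)
    for term in expr.split("+"):
        term = term.strip()
        if not term:
            raise ValueError(f"empty term in {expr!r}")
        if term.startswith("/"):
            # Assumed shorthand: '/3' stands for one third of a beat.
            total += Fraction(1, int(term[1:]))
        else:
            # Fraction() accepts strings such as '15', '15.5' or '7/2'.
            total += Fraction(term)
    return total


beat = parse_beat("15+/3+/4")
```

Here `beat` is the exact fraction 187/12, which prints as roughly 15.58333 when converted to a float, matching the value quoted in the thread.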
        
       | rock_artist wrote:
       | Hey, the idea is nice. It would be great to know what pushed you
       | to start this format.
       | 
       | Also, any apps that use it would benefit from being added to the
       | repo, assuring usability in addition to readability.
        
         | daninet wrote:
         | My initial goal was to fix some mistakes in the MIDI files I
         | recorded from my keyboard. I was also interested in making
         | dynamic tempo and expression changes without dealing with
         | complicated DAW GUIs.
         | 
         | Now I'm working on a synth that uses MTXT as its first-class
         | recording format, and it's also pushing me to fine-tune a
         | language model on it.
        
       | giladvdn wrote:
       | Probably stating the obvious here, but this would be a good way
       | for an LLM to attempt to write or modify music.
        
       | gilrain wrote:
       | How does this compare to standard ABC? More capable, presumably,
       | but a comparison would be useful.
       | 
       | https://en.wikipedia.org/wiki/ABC_notation
       | https://abcnotation.com/
        
         | daninet wrote:
         | ABC notation is more oriented towards traditional sheet music,
         | with regular note lengths, standard Western tuning and a
         | simple, readable syntax. It isn't meant for playing back music
         | that sounds good to the ear. It's hard to catch the nuances of
         | a real human performance with it, but it works well as a lead
         | sheet for musicians. Its expressive markings are relatively
         | limited and interpreted subjectively.
         | 
         | MTXT focuses on editable recordings of live performances,
         | preserving all of those tiny irregularities that make the music
         | human. It can represent arbitrary timings, subtle expressive
         | variations and even arbitrary tuning systems. MTXT can also
         | capture transitions like crescendos and accelerandos exactly as
         | they happened.
        
       | intrasight wrote:
       | I think that for completeness it needs looping and conditional
       | constructs
        
       | dghf wrote:
       | Similar things:
       | 
       | * Perl MIDI::Score -- https://metacpan.org/pod/MIDI::Score
       | 
       | * Csound standard numeric scores --
       | https://csound.com/docs/manual/ScoreTop.html
       | 
       | * CsBeats (alternative score language for Csound) --
       | https://csound.com/docs/manual/CsBeats.html
        
         | bonzini wrote:
         | Lilypond, too. Though it needs a full Scheme interpreter to
         | evaluate macros (provided by both the system and the user), it
         | can emit MIDI files.
        
           | ramses0 wrote:
           | Lilypond isn't well-known enough!
           | 
           | https://en.wikipedia.org/wiki/LilyPond#Integration_into_Medi.
           | ..
           | 
           | https://www.mutopiaproject.org
           | 
           | https://lilypond.org/text-input.html
           | 
           |     \relative c' {
           |       \key d \major
           |       fis4 fis g a
           |       a g fis e
           |       d d e fis
           |       fis4. e8 e2
           |     }
           | 
           | ...but why is it so complicated? A novice interpretation of
           | "music" is "a bunch of notes!" ... my amateur interpretation
           | of "music" is "layers of notes".
           | 
           | You can either spam 100 notes in a row, or you effectively
           | end up with:
           | 
           |     melody   = [ a, b, [c+d], e, ... ]
           |     bassline = [ b, _, b,     _, ... ]
           |     music = melody + bassline
           |     score = [
           |        "a bunch of helper text",
           |        + melody,
           |        + bassline,
           |        + page_size, etc...
           |     ]
           | 
           | ...so Lilypond basically made "Tex4Music", and the format
           | serves a few dual purposes:
           | 
           | Engraving! Basically "typesetting" the music for human
           | eyeballs (ie: `*.ly` => `*.pdf`).
           | 
           | Rendering! Basically "playing" the music for human ears (ie:
           | `*.ly` => `*.mid`)
           | 
           | Librarification! Basically, if your music format has
           | "variables" and "for-loops", you can end up with an end score
           | that's something like: `song = [ intro + chorus + bridge +
           | chorus + outro ]`, and then not have to chase down and modify
           | all the places you use `chorus` when you modify it. (See this
           | answer for more precision:
           | https://music.stackexchange.com/a/130894 )
           | 
           | ...now imagine doing all of the above for multiple
           | instruments and parceling out `guitar.pdf`, `bass.pdf`,
           | `drums.pdf` and `whole-song.pdf`
           | 
           | TL;DR: Music is haaard, and a lot closer to programming than
           | you think!
        
             | sporkl wrote:
             | Lilypond is the only music engraving system I'm aware of
             | that can handle polytempo scores. The TeX-ness really comes
             | in handy.
        
       | vessenes wrote:
       | Cool. My one concern with this is that it has no horizontally
       | scannable note/chord mode. It's super common for humans to read a
       | sequence of notes left to right, or write it that way, but it's
       | also just more efficient in terms of scanning / reading.
       | 
       | Can I suggest a guarded mode that specifies how far apart each
       | given note/chord is by the count, e.g.
       | 
       |     #1.0:verse1
       |     Am - C - G - E - F F F F
       |     #
       | 
       | You could then repeat this or overlay a melody line like
       | 
       |     #0.25:melody1
       |     C4 - C4 - C4 D4 C4 - D4 - D4 - D4 E4 D4 -
       |     #
       | 
       | Etc. I think this would be easier to parse and produce for an
       | LLM, and it would compile back to the original spec easily as
       | well.
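
A small Python sketch of how the proposed grid mode might expand into timed events. It assumes the header `#<step>:<name>` gives the spacing in beats between tokens and that `-` leaves a grid slot empty; the comment does not spell out either rule, and the function name `expand_grid` is hypothetical. The output is plain `(beat, token)` pairs, not actual MTXT syntax.

```python
from fractions import Fraction


def expand_grid(header: str, body: str) -> list[tuple[Fraction, str]]:
    """Turn '#<step>:<name>' plus a row of tokens into (beat, token) pairs."""
    step_text, _name = header.lstrip("#").split(":", 1)
    step = Fraction(step_text)  # spacing between grid slots, in beats
    events = []
    for index, token in enumerate(body.split()):
        if token != "-":  # assumption: '-' means nothing starts in this slot
            events.append((index * step, token))
    return events


verse = expand_grid("#1.0:verse1", "Am - C - G - E - F F F F")
melody = expand_grid("#0.25:melody1", "C4 - C4 - C4 D4 C4 -")
```

With a step of 1.0 the chords Am, C, G and E land on beats 0, 2, 4 and 6; with a step of 0.25 the second C4 lands on beat 1/2. Note durations are left out on purpose, since the reply below points out that durations are the tricky part.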
        
         | daninet wrote:
         | I considered it but decided against it in the first version,
         | because specifying note durations is too tricky. It was more
         | important to get the .mid -> MTXT conversion and live-
         | performance recording working, where notes usually have
         | irregular note lengths. Representations like "C4 0.333 D4 0.333
         | E4 0.25" feel too hard to read.
        
       | Grom_PE wrote:
       | This made me remember an old set of tools called mtx2midi and
       | midi2mtx, I used them to edit some midi files while making sure
       | I'm not introducing any unwanted changes. While roundtrip output
       | was not binary identical, it still sounded the same.
       | 
       | Looks like the MTXT tool here does not quite work for this use case,
       | the result of the roundtrip of a midi I tried has a segment
       | folded over, making two separate segments play at the same time
       | while the total duration got shorter.
       | 
       | https://files.catbox.moe/5q44q0.zip (buggy output starts at 42
       | seconds)
        
         | daninet wrote:
         | Thank you, I will have a look. I consider it important to have
         | the round trip conversion working seamlessly.
         | 
         | I created an issue here:
         | https://github.com/Daninet/mtxt/issues/1
        
         | cestith wrote:
         | It reminded me of ABC and the tools abc2midi and midi2abc.
        
       | lokar wrote:
       | I have been using:
       | 
       | https://www.vexflow.com/
       | 
       | Which has a text format, and typesets it for you nicely.
        
       | xrd wrote:
       | I've been spending the last week casually looking at strudel.cc.
       | 
       | They have a notation that looks similar (basically a JavaScript
       | port of the Haskell version).
       | 
       | I like this, but I'm curious why I would want to use this over
       | strudel. Strudel blends the language with a js runtime and that's
       | really powerful and fun.
        
       | throw7 wrote:
       | It makes no sense to design for LLMs. Do what makes sense for
       | the reader and forget that LLMs exist at all.
        
         | amingilani wrote:
         | What prompted this and why does it not?
        
           | badlibrarian wrote:
           | It's not the 19th Century. You don't need to punch holes in
           | cards to help the machine "think" any more.
        
             | otabdeveloper4 wrote:
             | > You don't need to punch holes in cards to help the
             | machine "think" any more.
             | 
             | That's literally what "prompt engineering" is, though.
        
               | badlibrarian wrote:
               | "Transpose this MIDI file down a third" requires neither
               | a specialized data format nor fancy prompt engineering.
               | ChatGPT asked: "A) Major third up (+4 semitones) or B)
               | Minor third up (+3 semitones)" then did it.
        
       | jan_Sate wrote:
       | Obligatory xkcd: https://xkcd.com/927/
        
       | matheusmoreira wrote:
       | To me it seems like files could get hard to understand if events
       | that happen simultaneously aren't horizontally lined up like
       | this:
       | 
       |     2.0 voice1 | voice2 | ...
       | 
       | Like a text version of old school tracker interfaces:
       | 
       | https://youtu.be/eclMFa0mD1c
       | 
       |     POS | TRACK #1 | TRACK #2 | ...
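
A short Python sketch of the tracker-style layout suggested above: events that share a beat position are printed on one row, one column per track. The event tuples, track names and the function `render_rows` are invented for illustration and do not reflect MTXT's real syntax; if two events share both beat and track, the later one overwrites the earlier one in this sketch.

```python
from collections import defaultdict


def render_rows(events, tracks):
    """Render (beat, track, text) events as one text row per beat position."""
    by_beat = defaultdict(dict)
    for beat, track, text in events:
        by_beat[beat][track] = text
    # Pad every cell to the widest entry so the columns line up.
    width = max(len(text) for _, _, text in events)
    lines = []
    for beat in sorted(by_beat):
        cells = [by_beat[beat].get(track, "").ljust(width) for track in tracks]
        lines.append(f"{beat:>5.2f} | " + " | ".join(cells))
    return lines


events = [(2.0, "voice1", "C4"), (2.0, "voice2", "E4"), (2.5, "voice1", "D4")]
rows = render_rows(events, ["voice1", "voice2"])
```

Printing `rows` line by line gives a grid where simultaneous notes sit side by side and an empty cell means the track is silent at that position.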
        
       | jasonjmcghee wrote:
       | This would lend itself well to a live-coding/live-music
       | experience.
       | 
       | I played around with a similar idea on my own (very simple /
       | poor) text music environment:
       | 
       | https://github.com/jasonjmcghee/vscode-extension-playground?...
       | 
       | in the middle of making an extension to allow making vs code
       | extensions live because I wanted a faster development feedback
       | loop.
        
       | chaosprint wrote:
       | Some simple thoughts:
       | 
       | I feel that one challenge of programming languages is how to
       | remember these rules, formats, and keywords. Even if you're using
       | familiar formats like YAML or JSON, how do you match keywords?
       | 
       | When developing Glicol (http://glicol.org/), I found that if it's
       | based on an audio graph, all node inputs and outputs are
       | signals, which at least reduces the matching problems. The
       | remaining challenge is ensuring that reference documentation is
       | available at minimal cost.
        
       | formula1 wrote:
       | This is pretty neat
       | 
       | I'm wondering if it can be used alongside Strudel
       | https://strudel.cc/ Either mtxt => Strudel or Strudel => mtxt
       | 
       | Here's Strudel in action
       | https://www.youtube.com/shorts/YFQm8Hk73ug
        
       | usrusr wrote:
       | Count me in as another one with a longstanding mostly dream
       | project aiming for human enjoyable notation grammar.
       | 
       | For me it was coming from tracker notation (buzz), where I was
       | wildly underwhelmed by all that whitespace for timing (well,
       | empty cells for timing) and the lack of parameterizable macros. A
       | seriously underexplored field, perhaps because almost everybody
       | who ever started got pulled in by the lure of textually defined
       | _synthesis_.
        
         | chrisjj wrote:
         | Solved 40yrs ago. AMPLE.
         | 
         | www.colinfraser.com/m5000/ample-nucleus-pg.pdf
         | 
         | For macros, see:
         | 
         | https://www.retro-kit.co.uk/user/custom/Acorn/3rdParty/Hybri...
         | 
         | https://www.retro-kit.co.uk/user/custom/Acorn/3rdParty/Hybri...
        
       ___________________________________________________________________
       (page generated 2025-12-04 23:01 UTC)