[HN Gopher] Why Does Claude Speak Byzantine Music Notation?
___________________________________________________________________
Why Does Claude Speak Byzantine Music Notation?
Author : fi-le
Score : 50 points
Date : 2025-04-01 12:06 UTC (3 days ago)
(HTM) web link (fi-le.net)
(TXT) w3m dump (fi-le.net)
| Telemakhos wrote:
| So, let me see what I think I understand here:
|
| 1. AI models are good at Caesar-cypher shifts, because they
| occur often enough in the training data for certain values of
| the cypher offset. Outside those values, AI doesn't handle the
| transformation well.
|
| 2. Somehow AI models perform this cypher within high ranges of
| Unicode as well, because each character there is encoded as
| three tokens, of which only the last one varies, encoding the
| same differences as between alphabetic letters. The first two
| tokens are effectively irrelevant, so by sheer chance the
| alphabet maps perfectly via Caesar-cypher (with a two-token
| offset) onto a specific range of Unicode characters reserved
| for Byzantine music notation.
|
| 3. This is easy to understand for one AI model, because it's
| explicable by chance that the offset between the alphabet and
| Byzantine music notation should coincide perfectly with two less-
| significant tokens. It's harder to understand why this works in
| more than one AI model, though.
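The Caesar shift in point 1 can be sketched in a few lines of Python (a generic illustration, not code from the article or from the models' internals):

```python
# Rotate each letter by a fixed offset within the 26-letter alphabet,
# leaving everything else (spaces, punctuation) untouched.

def caesar(text: str, offset: int) -> str:
    out = []
    for c in text:
        if c.isalpha():
            base = ord("A") if c.isupper() else ord("a")
            out.append(chr(base + (ord(c) - base + offset) % 26))
        else:
            out.append(c)
    return "".join(out)

print(caesar("hello", 3))              # shift by the classic offset of 3
print(caesar(caesar("hello", 3), -3))  # shifting back recovers the text
```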
| yorwba wrote:
| It's not that surprising that models encode Byzantine Music
| Notation characters using separate tokens for each UTF-8 byte,
| since they're unlikely to occur often enough for byte-pair
| encoding to allocate dedicated multi-byte tokens to them.
|
| What isn't clear to me is where ASCII text with 64 added to
| every byte occurs in the wild.
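That byte-level relationship can be checked without any model. Shifting a letter into the Byzantine Musical Symbols block (U+1D000 onward) leaves the leading UTF-8 bytes fixed and sets the final byte to the ASCII code plus 64. A quick sketch (the offset and the `to_byzantine` helper are illustrative, not taken from the article):

```python
# Codepoints U+1D000-U+1D0FF (Byzantine Musical Symbols) encode to four
# UTF-8 bytes. For ASCII letters shifted into the bottom of that block,
# only the last byte varies, and it equals the ASCII code plus 64.

def to_byzantine(c: str) -> str:
    # Valid for ASCII 0x40-0x7F, which covers A-Z and a-z.
    return chr(0x1D000 + ord(c) - 0x40)

for c in "abc":
    encoded = to_byzantine(c).encode("utf-8")
    print(c, encoded.hex(" "), "last byte:", hex(encoded[-1]), "=", hex(ord(c) + 64))
```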
| nestorD wrote:
| I was also surprised to find out (roughly a year ago) that Claude
| is good at Old English (which, despite its misleading name, looks
| nothing like English and is more of a Germanic language) whereas
| ChatGPT would output pure hallucinations.
| GaggiX wrote:
| Claude is much better than ChatGPT at low-resource languages,
| at least it was a year ago, I haven't tested on new models from
| OpenAI but I believe that Claude still has an edge.
|
| For example, when ChatGPT was outputting nonsense in Georgian,
| Claude was speaking it fluently; by the time ChatGPT learned
| Georgian, Claude was able to speak Mingrelian.
| rdlw wrote:
| Interesting. I was using ChatGPT to try to come up with a
| possible reconstruction of the Ketef Hinnom scrolls (I don't
| know Ancient Hebrew at all), with some mixed results. I had to
| prompt it with things like "What do you think that 'YHWH' bit
| could mean?", and then it sort of caught on. Maybe I'll see if
| Claude can do better.
|
| Your description of Old English is a bit odd. It's certainly
| very different from modern English, but it's its direct
| ancestor and both languages are Germanic.
| fcatalan wrote:
| Another thing that LLMs are (surprisingly?) good at is randomly
| mixed language conversations, here's something I tried a few days
| ago:
|
| Prompt: Kan du comprender questo text? Anata wa quelque inkling
| habe about lo que se parla dans la presente oracao? Repondez med
| sam way of goboriu.
|
| ChatGPT:
|
| Ja, ich kann verstehen este mix de idiomas! Tu es usando une
| combinacion de varias linguas pour expresar una pregunta. Je vais
| responder en the same modo:
|
| Si tu veux kommunicieren en cette maniere, c'est possible, aber
| il peut etre ein bisschen confusing pour andere people!
| 你觉得这个方式有趣吗? O preferes-tu un idioma mas claro?
|
| Deepseek:
|
| Ja, jeg kan forsta en lille bit av cio que tu esta dizendo. Anata
| no message wa molto interessante, pero un poco confusing aussi.
| Si tu peux ecrire in una lingua sola, seria plus facile para mi
| at comprender. Danke, merci, arigatou!
|
| Claude:
|
| Oui, je kan entender algo of this texte mezclado! C'est un
| interessante modo de communicare using differentes languages
| together. Je trouve ca quite divertido y puedo responder en la
| misma manera mixta. Ist das what you were looking for? Watashi wa
| happy to continue questa conversation in whatever linguistic
| style anata wa prefer!
|
| Perplexity:
|
| Ja, ich kann diesen Text verstehen! Vous avez cree una frase muy
| interesante con palabras from different languages. Let me
| rispondere using a similar mishmash of lingue:
|
| Questo e un esempio fascinante di code-switching e language
| mixing. Watashi wa impressed by la creativita y el modo en que
| vous avez combine various idiomas in one sentence. Es ist ein
| poco challenging zu decifrar, aber anche molto divertente!
|
| Ich wurde sagen que el mensaje principal e about comprehension
| linguistique et la capacite de entender verschiedene Sprachen
| when they're mixed together. C'est une maniere interessante de
| tester language skills et cognitive flexibility.
|
| Grazie for questa stimolante question! J'espere que mi respuesta
| was equally entertaining und in lo mismo estilo multilingue.
| mistrial9 wrote:
| this fits the supposition -- since LLMs can be fed patterns of
| nonsense and learn to reply in those patterns, LLMs are not
| intelligent.
|
| YNews corollary: since posters cannot resist making new
| swathes of Look At This LLM Output, the open nature of tech
| boards is doomed in some ways (?)
| fcatalan wrote:
| Those texts aren't nonsense. The prompt has a meaning, the
| LLMs are able to understand it, and are able to reply with
| coherent and understandable responses crafted in the same way
| the prompt was written. For me it's a very clear example of
| something that is very far from any training data coming out
| of the models. Intelligent? No, but for me it points to the
| idea that "language is solved".
| TechDebtDevin wrote:
| Language will be solved when LLMs are translating whale
| songs into human language, imo.
| TechDebtDevin wrote:
| As a vegan, maybe I'm a little biased, but I often think
| about the implications a universal translator would have
| if it did in fact give us the ability to understand
| animals. What would it imply if you could drive by a
| slaughterhouse and understand the animals saying goodbye
| to their loved ones (assuming that is what's happening)?
| Would all slaughtering stop? Or would people be okay with
| it? Interesting times ahead if there is any possibility
| of ML translating animal language.
| stavros wrote:
| We don't slaughter animals because we think they don't
| mind dying, we slaughter them because we've outsourced
| the mass killings to people who don't mind doing it, and
| a steak looks enough unlike a cow that we don't think
| that it used to be alive.
|
| Basically, if we had to slaughter our own cows, I doubt
| we'd be eating as much meat.
| schoen wrote:
| I'm also a vegan, but it doesn't seem likely to me that
| other species have languages similar to ours. I think
| people have already used ML to interpret cat and dog
| communications, and they got general emotions more than
| something like syntax.
|
| It's complicated by the fact that other species' throats
| and mouths physically can't form many human language
| phonemes, but even the use or recognition of human
| language by other great apes (and parrots) is very
| controversial, and they probably have cognition and
| sociality most similar to ours. But it's not clear that
| they can do much of what human language does.
| pbhjpbhj wrote:
| >very far from any training data
|
| It's not that far from training data surely. If you're only
| training on next-word basis then you'll "often" see
| individual words from other languages mixed in.
|
| It's like some sort of uber-pidgin.
| rdlw wrote:
| You're proposing that advanced pattern recognition is a sign
| of NOT being intelligent?
|
| Was the above comment nonsense, or did it have a pattern? If
| a real person happened to know ten languages and played along
| in this game with you, would you also see that as evidence
| that they are not intelligent?
| pbhjpbhj wrote:
| Word vectors for the same concept in different languages are
| very close, that's not really surprising. The whole `v_king -
| v_male + v_female = v_queen` idea leads to an intuition that
| `v_grazie - v_italien + v_deutsch = v_danke`.
|
| Code switching probably appears a lot in both literature and
| audio from some contexts too.
|
| Why would it matter that a concept is expressed in a different
| human language? How would you structure a multilingual LLM so
| that it couldn't syncretise languages in this way? You'd
| seemingly need to properly segregate the languages.
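The word-vector intuition in the comment above can be sketched with toy numbers (these 3-d vectors are hand-made for illustration, not real embeddings; real ones have hundreds of dimensions, but the arithmetic is the same):

```python
import math

# Toy "embeddings": a shared "thanks" component plus a language direction.
vecs = {
    "grazie":  [0.9, 0.1, 0.8],  # thanks + Italian direction
    "danke":   [0.9, 0.7, 0.2],  # thanks + German direction
    "italian": [0.0, 0.1, 0.9],
    "german":  [0.0, 0.8, 0.1],
    "cat":     [0.1, 0.4, 0.4],  # an unrelated distractor
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# v_grazie - v_italian + v_german, then find the nearest stored vector.
query = [g - i + d for g, i, d in zip(vecs["grazie"], vecs["italian"], vecs["german"])]
nearest = max((w for w in vecs if w != "grazie"), key=lambda w: cosine(query, vecs[w]))
print(nearest)
```

With real embeddings, this same nearest-neighbour arithmetic is what makes the `v_king - v_male + v_female = v_queen` analogy work.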
| fcatalan wrote:
| Yes, I understand the embeddings will be close and that
| helps; I guess that's why they produce good translations.
| But I'm intrigued by the LLM having so much control of the
| switching without even explicit prompting, just a one-shot
| example. I also guess I'm easily impressed.
| foolswisdom wrote:
| I think the research Anthropic released recently showed that
| language is handled independently of the concepts it conveys,
| so first you get the concept, then you get the translation
| into a particular language.
___________________________________________________________________
(page generated 2025-04-04 23:00 UTC)