[HN Gopher] The Decipherment of the Dhofari Script
___________________________________________________________________
The Decipherment of the Dhofari Script
Author : pseudolus
Score : 55 points
Date : 2025-07-13 10:25 UTC (12 hours ago)
(HTM) web link (www.science.org)
(TXT) w3m dump (www.science.org)
| tinco wrote:
| I wonder if you could decipher these scripts by brute-forcing
| decoding layers until an LLM could predict the next token. That
| would assume the text has a sort of logic to it that would still
| work in a modern language, but the deciphering would be fully
| automatic, so we could throw a bunch of compute at it.
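A minimal sketch of the brute-force idea above, under loud assumptions: the "decoding layers" are just the 26 Caesar shifts, and a character-bigram model trained on a tiny reference text stands in for the LLM. Every name here (`train_bigram_model`, `best_decoding`, `FLOOR`) is hypothetical, and the floor score for unseen pairs is a shortcut rather than proper smoothing.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
FLOOR = 1e-6  # assumed score for character pairs never seen in the reference text


def train_bigram_model(reference):
    """Return log_prob(prev, nxt) estimated from bigram counts in `reference`."""
    pair_counts = Counter(zip(reference, reference[1:]))
    prev_counts = Counter(reference[:-1])

    def log_prob(prev, nxt):
        seen = pair_counts[(prev, nxt)]
        if seen == 0:
            return math.log(FLOOR)  # unseen pair: fall back to the floor score
        return math.log(seen / prev_counts[prev])

    return log_prob


def average_log_prob(text, log_prob):
    """Mean log-probability of each character given the previous one."""
    pairs = list(zip(text, text[1:]))
    return sum(log_prob(a, b) for a, b in pairs) / len(pairs)


def shift(text, k):
    """Caesar-shift lowercase letters by k; leave other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + k) % 26] if c in ALPHABET else c
        for c in text
    )


def best_decoding(ciphertext, log_prob):
    """Try every shift and keep the one the model finds most predictable."""
    scored = [(average_log_prob(shift(ciphertext, k), log_prob), k) for k in range(26)]
    _, k = max(scored)
    return k, shift(ciphertext, k)


# Toy reference text in the "known language" (an assumption, not real data).
reference = "the cat sat on the mat and the dog sat on the log"
model = train_bigram_model(reference)
ciphertext = shift("the cat sat on the mat", 3)
print(best_decoding(ciphertext, model))
```

With this toy reference text the search recovers the original sentence. A real script has a far larger space of candidate decodings, and nothing guarantees that a model of a modern language would score the right one highest.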
| zaik wrote:
| Ok, your LLM can perfectly predict the next token. How do you
| extract the "logic" out of the weights?
| talos wrote:
| I don't think OP's idea would work, but if it did you could
| just ask for a translation.
| yorwba wrote:
| It's possible to identify a surprisingly large number of
| matching words by learning a linear transformation mapping
| word vectors from two different languages into the same space
| (e.g. https://arxiv.org/abs/1805.06297 ).
|
| But the problem with ancient languages is typically that
| there's not enough data to usefully constrain a large enough
| model. Doubly so for undeciphered scripts where scholars
| might not even agree on how many different letters there are.
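A toy illustration of mapping word vectors from two languages into one space, heavily simplified and not the method of the linked paper: it assumes 2-D vectors, a known seed dictionary, and a pure rotation as the linear map, which has a closed-form least-squares solution. Real embeddings have hundreds of dimensions and would normally use an SVD-based orthogonal Procrustes solve. All words, vectors, and function names are hypothetical.

```python
import math


def fit_rotation(pairs):
    """Least-squares rotation angle taking each source vector onto its target."""
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in pairs)
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in pairs)
    return math.atan2(cross, dot)


def rotate(vec, angle):
    """Rotate a 2-D vector counterclockwise by `angle` radians."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def nearest_word(vec, vocab):
    """Word in `vocab` whose vector is closest (Euclidean) to `vec`."""
    return min(vocab, key=lambda w: math.dist(vec, vocab[w]))


# Made-up embeddings: the target space is the source space rotated by 0.7 rad.
source = {"sun": (1.0, 0.1), "moon": (0.2, 1.0), "king": (-0.9, 0.3), "camel": (0.5, -0.8)}
names = {"sun": "sol", "moon": "luna", "king": "rex", "camel": "camelus"}
target = {names[w]: rotate(v, 0.7) for w, v in source.items()}

# Fit on three seed pairs, then look up a word that was held out.
seed = [("sun", "sol"), ("moon", "luna"), ("king", "rex")]
angle = fit_rotation([(source[s], target[t]) for s, t in seed])
guess = nearest_word(rotate(source["camel"], angle), target)
print(round(angle, 3), guess)
```

Three seed pairs pin down the map here only because the toy target space is an exact rotation of the source. With a small, noisy corpus the fit is much less constrained, which is the data problem raised above.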
| yyyk wrote:
| Presumably, they'd want to get at the embeddings and compare the
| embedding spaces somehow, to say: 'the relation between tokens
| a,b,c is close to the relation between tokens a1,b1,c1 in a
| similar model of texts in a known language of apparently the
| same family (same up to aN,bN,cN), and out of these N sequences,
| sequence X makes most sense given existing examples'.
|
| (As you can tell, the argument involves some handwaving, but
| it may be possible?)
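One way to make the handwaving above concrete, as a sketch only: treat "the relation between tokens" as the table of pairwise distances between their embeddings, and brute-force the token assignment whose table best matches a known language. The 2-D vectors, token names, and functions are all made up for illustration, and the search is factorial in the number of tokens.

```python
import itertools
import math


def distance_table(vectors, order):
    """Pairwise Euclidean distances between tokens, listed in `order`."""
    return [[math.dist(vectors[a], vectors[b]) for b in order] for a in order]


def structural_mismatch(unknown, unknown_order, known, known_order):
    """Sum of squared differences between the two distance tables."""
    du = distance_table(unknown, unknown_order)
    dk = distance_table(known, known_order)
    n = len(unknown_order)
    return sum((du[i][j] - dk[i][j]) ** 2 for i in range(n) for j in range(n))


def best_alignment(unknown, known):
    """Try every assignment of known tokens to unknown tokens; keep the best fit."""
    u_order = sorted(unknown)
    best = min(
        itertools.permutations(sorted(known)),
        key=lambda perm: structural_mismatch(unknown, u_order, known, perm),
    )
    return dict(zip(u_order, best))


def rotate(vec, angle):
    """Rotate a 2-D vector counterclockwise by `angle` radians."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


# Hypothetical embeddings: the unknown space is a rotated, relabeled copy.
known = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 2.0), "d": (3.0, 3.0)}
unknown = {
    "x1": rotate(known["c"], 1.1),
    "x2": rotate(known["a"], 1.1),
    "x3": rotate(known["d"], 1.1),
    "x4": rotate(known["b"], 1.1),
}
alignment = best_alignment(unknown, known)
print(alignment)
```

The match is unambiguous here only because the toy points have six distinct pairwise distances and no noise. Embeddings trained on a few thousand short inscriptions would not be so clean.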
| noworld wrote:
| It's LLMs all the way down.
| MohamedMabrouk wrote:
| The available data for some of those lesser-used scripts is
| minuscule. The most common Ancient North Arabian script is
| Safaitic, and only around 50K texts are processed and widely
| available, each with a few words to a few sentences.
| analog31 wrote:
| "Pre-Islamic" is an odd description of a script that predates
| Islam by a millennium. Did they mean "pre-Arabic"?
| idoubtit wrote:
| "Pre-Islamic" is a common term in Near East history. Islam is
| well dated; it introduced many changes and unified the region,
| so it's a powerful marker.
|
| I've never encountered the word "Pre-Arabic" applied to the
| Arabian Peninsula. It would be hard to define precisely. The word
| "Arab" is probably more than 3000 years old. The Arabic
| languages may be older; they're Semitic languages like the
| Akkadian of Mesopotamia. And when did an "Arab" people or
| culture emerge from the Semitic people and culture? I guess
| between 6000 BP and 3000 BP, but it was probably a long
| process, and nomadic tribes didn't leave many vestiges.
| gryn wrote:
| Is it "pre-Arabic" though? It's believed that Old Arabic
| existed back then.
| arp242 wrote:
| Pre-Islamic Arabia is, as far as I know, a fairly widely
| accepted term. Not that different from pre-Roman Britain, pre-
| Columbian Americas, pre-colonial Africa, pre-imperial China, or
| even Pagan Europe. In all these cases a significant change took
| place which drastically changed the course of the region
| (usually some sort of unification as a nation or religion, not
| always peaceful or voluntary of course).
| comrade1234 wrote:
| Completely unreadable on iOS mobile...
| CharlesW wrote:
| Works fine here. https://imgur.com/a/px7cZAL
| ilinx wrote:
| Interesting. I didn't have any issues. Could you elaborate a
| bit more?
| ahazred8ta wrote:
| It's a form of Thamudic / Ancient North Arabian script:
| https://en.wikipedia.org/wiki/Ancient_North_Arabian
___________________________________________________________________
(page generated 2025-07-13 23:00 UTC)