[HN Gopher] Minoan Language Linear A Linked to Linear B in Groun...
___________________________________________________________________
Minoan Language Linear A Linked to Linear B in Groundbreaking
Research
Author : clouddrover
Score : 172 points
Date : 2022-05-15 12:46 UTC (10 hours ago)
(HTM) web link (greekreporter.com)
(TXT) w3m dump (greekreporter.com)
| motohagiography wrote:
| Feels like a fun intro to comp.sci exercise would be taking texts
| from ancient languages and writing compression schemes, n-gram
| analyzers, regular expressions, and symbol call graphs for them.
| It's a bit like the apocryphal story of some old hackers in the
| 80s "decoding" a Chinese takeout menu (Jobs/Woz?), but it could
| get kids interested in archeology in a way that is smarter than
| that alien TV show.
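A minimal sketch of the n-gram analyzer idea from the comment above, assuming each word is transliterated as hyphen-separated sign names. The function names and the toy words are made up for illustration and are not real inscriptions.

```python
from collections import Counter


def ngrams(signs, n):
    """Return all contiguous n-sign windows in a list of signs."""
    return [tuple(signs[i:i + n]) for i in range(len(signs) - n + 1)]


def ngram_counts(words, n):
    """Count sign n-grams across words written as 'SIGN-SIGN-SIGN'."""
    counts = Counter()
    for word in words:
        counts.update(ngrams(word.split("-"), n))
    return counts


# Hypothetical placeholder data: the letters stand in for sign names.
toy_words = ["A-B-C", "A-B-D", "B-C-A", "A-B-C"]
bigrams = ngram_counts(toy_words, 2)
print(bigrams.most_common(2))
```

Counting within words rather than across word boundaries is a design choice here; on a real corpus one would also have to decide how to treat damaged or uncertain signs.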
| changoplatanero wrote:
| You mean like this? https://arxiv.org/pdf/1906.06718.pdf
| SemanticStrengh wrote:
| wow has this been applied to linear A?? Nice to see google
| doing original work. Also the MIT omnipresence is humiliating
| for other universities, as usual.
| jl6 wrote:
| I wonder if a machine learning model could shed some light on the
| decipherment.
| thom wrote:
| Work has been done on using Markov models etc to predict missing
| symbols in these texts. But it feels like with all the data now
| available, and the fact that some signs' meanings are known, we
| must be able to at least reduce the constellation of possible
| meanings of some of the unknown signs. There are only so many
| things it was possible to say about olives in the ancient world,
| and presumably the semantic space wouldn't be so different to
| other vaguely contemporaneous languages (not just Linear B). Does
| anybody know of any work in this direction?
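As a rough illustration of the "Markov models to predict missing symbols" idea mentioned above, here is a first-order (bigram) sketch. It is not the method of any published study; the function names and the toy sequences are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sequences):
    """Count, for each sign, which signs follow it and how often."""
    model = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, prev, k=2):
    """Return up to k (sign, probability) guesses for the sign after prev."""
    followers = model.get(prev)
    if not followers:
        return []
    total = sum(followers.values())
    return [(sign, count / total) for sign, count in followers.most_common(k)]


# Hypothetical placeholder sequences, not real inscriptions.
toy_sequences = [["X", "Y", "Z"], ["X", "Y", "W"], ["X", "Y", "Z"], ["Q", "Y", "Z"]]
model = train_bigram_model(toy_sequences)
guesses = predict_next(model, "Y")
print(guesses)
```

A model like this only ranks candidate signs for a damaged position; it says nothing about what the signs mean.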
| [deleted]
| jcranmer wrote:
| I spent basically the entire article trying to figure out _what_
| the "groundbreaking" research actually was... this is a pretty
| mangled press release rendering of the scientific research, even
| worse than the kinds you normally see from university research.
|
| To its credit, this doesn't promise that we can (or will shortly
| be able to) actually read Linear A texts, and it actually
| explicates that we won't be able to do that. But that's pretty
| much the limit of credit due to this article.
|
| Linear A's connection to Linear B has been hypothesized since...
| well, at least as far back as when I was taught it in school,
| which considering how long it takes textbooks to update
| themselves to state-of-the-art archaeology may as well be time
| immemorial.
|
| As far as I can tell, the actual "groundbreaking" research here can
| be tracked down through the previous version of the article, which
| leads to an academic review of the work in question
| (here: https://bmcr.brynmawr.edu/2021/2021.04.30/). The layman's
| version is that it's a detailed analysis of the structural
| elements of the script to propose how the (unknown) language was
| encoded into Linear A, combined with some analysis of how
| individual glyphs varied in time and space--and this results in
| the conclusion that Linear B is actually a version of a regional
| [script, not linguistic] dialect of Linear A that was used to
| write a different language.
| pianoraptor wrote:
| Thank you for your take on it. I too was hoping for more
| insights here, having followed the Linear A / Linear B
| developments from my armchair for many years.
| lsrinivas wrote:
| I too was scratching my head about, as you put it, "what the
| groundbreaking research was". The academic review link that you
| posted was rather useful, though it was a little technical.
|
| I think the book needs to be read. But thanks a ton.
| Torkel wrote:
| Yeah, not a good article at all - I bailed half way through and
| went to the comments here in hope of a tldr/abstract...
| ncmncm wrote:
| What it specifically fails to deliver is any indication of
| how "the internet" had any role at all in the work or in any
| progress made.
|
| The only suggestion of progress was that somebody finally
| noticed the LB symbols were mostly about the same as LA
| symbols, so they can now pronounce the LA texts. There is
| no hint why that only just happened.
| bradrn wrote:
| Article from 2021, duplicate of
| https://news.ycombinator.com/item?id=27191364
| turndown wrote:
| >I am afraid there is currently no exact translation of the sign-
| sequences (= words) attested on Linear A tablets (as well as
| other document types). This is primarily because we have not yet
| identified the linguistic family the Minoan language belongs to
| (unless it has to be taken as an 'isolated' language)
|
| Seems as though they may have made some kind of advancement in
| the relationship between symbols, but as always we do not have
| nearly enough written material to approach deciphering.
| SemanticStrengh wrote:
| Linear A seems to derive from Cretan hieroglyphs
| https://en.wikipedia.org/wiki/Cretan_hieroglyphs What do Cretan
| hieroglyphs come from? And how does a population create a
| language? That's absurdly difficult to initiate.
| jcranmer wrote:
| I assume you mean "writing system" and not "language" here.
|
| Writing systems were independently invented no fewer than three
| times (Mesopotamian, Chinese, and Mayan are unquestionably
| independent inventions) and probably more times, while also
| being reinvented from scratch numerous times after that
| (Cherokee being perhaps the most well-documented such
| reinvention--Sequoyah knew _of_ writing from the Americans, but
| had no other conception of how it worked, and his documentation
| of the process of developing the Cherokee syllabary is a nice
| compression of the history of stages of writing systems). It
| does not seem to be a particularly challenging invention.
|
| There appear to be two key hurdles that are required for the
| development of writing. The first is the creation of a
| systematic inventory of stylized representations of objects,
| for example knowing that this symbol represents "sun" and that
| one represents "eye". In particular, I'd draw the "systematic"
| inventory here as the challenge--merely representing concepts
| in visual drawings seems to be a pretty universal capability.
| The second hurdle is the re-encoding of (some of) these symbols
| to represent _phonetic_ values in an abstract way. (Note that
| being able to represent any phonetic utterance of a language is
| the distinguishing characteristic between proto-writing and
| writing.)
|
| If you actually want to know how _language_ is created, well,
| there is a recent community of deaf people who spontaneously
| invented their own sign language de novo, which suggests that
| language is actually incredibly easy to invent.
| capitainenemo wrote:
| "Writing systems were independently invented no fewer than
| three times (Mesopotamian, Chinese, and Mayan are
| unquestionably independent inventions)"
|
| ...
|
| "Note that being able to represent any phonetic utterance of
| a language is the distinguishing characteristic between
| proto-writing and writing."
|
| Huh... does Chinese meet that criterion?
| https://en.wikipedia.org/wiki/Logogram#Differences_in_proces...
| capitainenemo wrote:
| Hm. " Chinese, they are fused with logographic elements
| used phonetically; such "radical and phonetic" characters
| make up the bulk of the script. Both languages relegated
| the active use of rebus to the spelling of foreign and
| dialectical words. "
|
| I guess that counts.
| edgyquant wrote:
| I think after cave paintings and tally marks, modern written
| language is fairly straightforward. It just requires a need to
| pack more information into less space, which farming provides.
| AprilArcus wrote:
| This article is a reasonable summary of the status quo in Linear
| A studies since 1956, but the reporter seemingly deliberately
| obfuscates the nature and scope of Dr. Ester Salgarella's new
| contribution.
|
| It appears to be the creation of an online corpus in
| collaboration with Dr. Simon Castellan, linked near the bottom of the
| article: https://sigla.phis.me/
|
| This will be a great resource and is an important work, but it
| appears that today we are no closer to deciphering Linear A than
| we have ever been.
| Radim wrote:
| I don't know about "deliberately". Might be just a confused,
| inarticulate piece by a confused reporter.
|
| When an article opens with " _the Minoan language known as
| Linear A_ " (no, it's a script) and " _Linear B developed later
| in the prehistoric period_ " (an oxymoron, prehistoric = pre-
| literary)... you know not to expect much.
|
| Did anyone manage to parse out what is even being claimed in
| this article?
| tlb wrote:
| Is it not fair to call a society where writing existed but
| we can't read it prehistoric? From our perspective, we have
| no written history. Or is the distinction that they
| themselves could read it and therefore acted with historical
| awareness?
| irrational wrote:
| No. Was the Egyptian civilization (which lasted for 3000
| years until the time of the Romans) prehistoric because we
| could not read the Egyptian hieroglyphics? Did it only
| cease to be prehistoric once we deciphered the language?
| ncmncm wrote:
| Yes, exactly. Before we have written history is
| prehistory. As we get more history, the boundary of
| prehistory moves back.
|
| So most of the American remnants are prehistoric, despite
| being coeval with history in the "old world".
| SiempreViernes wrote:
| You'd have to give up the notion that "prehistoric" refers
| to some fixed moment in time, but other than that you are
| of course free to try to change its meaning.
| ncmncm wrote:
| Absolutely. If we can read their script, and they wrote a
| history, we can read that history. Without, we are reduced
| to relying on archaeology.
|
| We can read historical texts from Egypt and Mesopotamia
| from the time the Minoans, er, "flourished". In exactly
| that sense they are not prehistoric. I think Egyptians
| mentioned them. (It seems odd if Egyptians did not mention
| Santorini blowing up and wiping them out, but maybe they
| did. Somebody must have mentioned "Thera".)
| jnwatson wrote:
| The author found a relationship between Linear B (the script)
| and Linear A (the script) to the point of being able to
| approximately pronounce Linear A. The actual language written
| in that Linear A script, Minoan, is still unknown, but this
| provides some important tools to better understand it.
| jcranmer wrote:
| > The author found a relationship between Linear B (the
| script) and Linear A (the script) to the point of being
| able to approximately pronounce Linear A.
|
| Except that has been known for like... 50 years? 60 years?
| adrian_b wrote:
| Exactly as you say, the fact that many signs are shared
| by Linear B and Linear A has been known for at least
| half a century.
|
| What I understand from the review of the book is that
| after a more thorough analysis of the graphic variations
| of various Linear A signs, many more signs inherited by
| Linear B from Linear A have been identified than before.
|
| Having better and more phonetic readings of Linear A
| texts increases the hope that the Minoan language could
| be identified and understood, even if this remains highly
| unlikely, unless more Linear A texts would be discovered.
| tremon wrote:
| If we don't know the language it's encoding, how do we know
| the Linear A pronunciation is correct, approximately or
| not? Is this done purely on the assumption that Linear A
| and Linear B might encode similar phonemes in a similar
| way?
| SemanticStrengh wrote:
| Can't they use Zipf's law?
| https://en.wikipedia.org/wiki/Zipf%27s_law The decreasing
| power law should make it possible to find "the" and some
| _closed-class_ POS words, so yeah: determiners, prepositions
| and conjunctions.
| eklitzke wrote:
| I'm skeptical of the claim that this would work at all even
| if there was a larger corpus. Let's say you had a million
| pages of classical Chinese text, but absolutely no context
| about what the text meant or was about. By looking at it
| closely and using statistical analysis you could certainly
| determine various rules of the grammar, and you might even be
| able to guess that certain characters are grammatical
| constructions representing things like conjunctions and
| prepositions. But this isn't really going to let you
| translate anything.
| bloak wrote:
| My guess is that if you had a really big, wide-ranging and
| high-quality corpus of a completely unknown human language
| then you probably would be able to decipher and translate
| it. If you could deduce or guess the grammatical structure
| the next step might be to look at which nouns can be
| subjects of which verbs, for example, and it might then be
| possible to guess which nouns refer to humans and which
| verbs describe actions that can normally only be performed
| by a human, and then ... well, there's lots of statistical
| stuff you can do with a really huge corpus ... It's an
| interesting problem to think about but it's not a problem
| we're ever likely to encounter in real life. It's more
| likely we'll discover a corpus of some alien, non-human
| language than a huge corpus of a completely unknown human
| language.
| inglor_cz wrote:
| The Voynich manuscript is fairly extensive (over 200
| pages), illustrated, and we still do not know what the
| heck of a language it is written in.
|
| https://en.wikipedia.org/wiki/Voynich_manuscript
| ncmncm wrote:
| ...assuming it is really language at all, and not just
| deliberate gibberish.
| vjerancrnjak wrote:
| I wonder can they construct a deep learning model that
| encodes human languages from script shapes and then somehow
| figure out the God language from which Linear A
| script/language is derived.
|
| There's too little data for Linear A. But it might be enough
| if there's a God language oracle waiting to be fed new
| descendant languages.
| liliumregale wrote:
| This is a God (language)-of-the-gaps argument: we can't
| figure out this rarely attested language, but maybe we can figure
| out an entirely unattested language instead, and also learn
| the correspondence between it and Linear A.
|
| Deep learning can predict plenty of phenomena in the world,
| sure, but it needs data, not aspirations.
| vjerancrnjak wrote:
| > figure out an entirely unattested language
|
| I did not say that. Human languages evolve in similar
| ways, use similar vocabulary, grammar etc. Linguistics
| has already unraveled the structure of many languages and
| the structure of evolution of language through time.
|
| I am not saying DL is THE approach to take, but given
| that there's only ~10k characters of Linear A, it is hard
| to tackle the problem without common representation of
| multiple languages that are close to it. That's the whole
| point of DL, how to build better and better
| representations, not how to accurately model uncertainty
| (which is what you get by doing statistics).
|
| I would say XLM [0] builds a common representation of a
| collection of languages and then works better on machine
| translation for languages for which the data is scarce
| but that are related to the languages in the model. (what
| it also does is discover and represent the structure of
| part-of-speech, grammar, entities etc. without being told
| about those particular things)
|
| Does there exist an abundance of data for languages close
| to Linear A? If not, then I admire the work of all that
| try to untangle this with their brains alone.
|
| 0: https://github.com/facebookresearch/XLM
| Contexti wrote:
| > Does there exist an abundance of data for languages
| close to Linear A? If not, then I admire the work of all
| that try to untangle this with their brains alone.
|
| In the article, Dr. Ester Salgarella says: "we have not
| yet identified the linguistic family the Minoan language
| belongs to (unless it has to be taken as an 'isolated'
| language)"
|
| If we knew that the Minoan language belonged to some
| extant language family and we had an abundance of data,
| the mystery of Linear A would already have been solved
| decades ago.
|
| In general, there's very little data for any of the
| Palaeo-European languages that got replaced by Indo-
| European languages.
|
| Linguistic relatives of the Minoan language could have
| gone extinct when their speakers shifted to Greek or some
| other Indo-European language. It is also possible that
| other Minoan languages died out centuries or millennia
| before the arrival of Indo-Europeans. I don't believe we
| will ever know.
| cge wrote:
| In addition to the other comments on difficulties of such
| analyses, an additional difficulty may be the _type_ of
| inscriptions in the corpus. We understand Linear B, for
| example, because it is early Greek. But the texts are not
| narrative prose or poetry: they're administrative records,
| mostly lists and inventories. If Linear A texts are of
| similar types, then trying to decipher the language from them
| alone may be challenging or impossible, unless it can be
| linked to a known language: the forms of speech used may
| simply be too limited.
|
| Trying to understand English grammar by looking only at bare
| financial statements would likely be extremely hard.
| yk wrote:
| That makes assumptions about the language, for the languages
| I know: German has three definite articles, Latin doesn't
| have any, so it is not obvious what looking for "the" would
| result in either.
| sramsay wrote:
| I think people (here and below) are getting hung up on
| definite articles, but Zipf's Law makes no such
| observation. It says only that a word's frequency in a
| natural language corpus tends to be in inverse proportion
| to its rank in a frequency table.
|
| In English, the most frequent words are articles, but the
| general observation about word frequency holds across
| languages (whether those languages have articles or not).
| seoaeu wrote:
| "The most frequently appearing words in this pile of un-
| translateable text are the most common words in the
| language it is written in" seems like it falls somewhere
| between blindingly obvious, and entirely useless. Unless
| you have some clue what those words mean, how does that
| observation help you?
| bee_rider wrote:
| Just from skimming the wikipedia article, it doesn't seem
| useful for translating. But it is slightly stronger than
| "The most frequently appearing words in this pile of un-
| translateable text are the most common words in the
| language it is written in." It tells you that, for
| example, the most popular word should be about twice as
| popular as the second most popular word.
|
| It doesn't tell you what those words are, but it is a
| pretty specific observation about the frequency/rank
| relationship. So, as the wikipedia article linked above
| points out, it can tell us that the Voynich Manuscript
| was probably written in a language (of course, it could
| be a cypher of a real language or something made up, like
| elvish in Lord of the Rings, but it probably isn't just a
| random collection of symbols because it is unlikely that
| a random collection of symbols would happen to follow
| this distribution).
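The rank/frequency relationship described above can be checked numerically. Zipf's law says frequency falls off roughly as f(r) ∝ 1/r^s with s near 1, so a straight-line fit of log(frequency) against log(rank) should have a slope near -1. The sketch below is only an illustration: the function names are invented, and the toy tokens are constructed to follow the law exactly rather than taken from any real text.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return token frequencies sorted from most to least common."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_slope(freqs):
    """Least-squares slope of log(frequency) versus log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Artificial tokens with counts 12, 6, 4, 3, i.e. exactly 12 / rank.
toy_tokens = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
freqs = rank_frequency(toy_tokens)
slope = zipf_slope(freqs)
print(freqs, round(slope, 3))
```

On a corpus as small and formulaic as the one discussed in this thread, such a fit would be noisy, so a slope near -1 is at best weak evidence that the text behaves like natural language.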
| sramsay wrote:
| It doesn't (in this case), and I didn't say it did. And
| there's nothing "blindingly obvious" about the ubiquity
| of the Zipf curve.
| mtlmtlmtlmtl wrote:
| You really think that in decades of linguists studying Linear
| A, no one has thought of trying Zipf's law?
|
| If scientists have studied something for this long, and you
| come up with an idea that fits in a single paragraph, it's
| probably been tried and didn't work. Unless you're the
| field's leading expert in which case you would be off doing
| it, not posting it on HN :)
|
| Edit: typos
| wheelinsupial wrote:
| Neither of these areas are my field, so I could be entirely
| misunderstanding this preprint [1] from 2021. The preprint
| mentions using Zipf's law in the objectives section on
| attempting to decipher Linear A.
|
| The literature survey section mentions there have been good
| results using computational methods in 2020 to
| automatically decipher Linear B. The discussion section
| mentions "To the best of our knowledge, this is the first
| study to discuss and show computational analysis of Linear
| A."
|
| Again, neither of these are my fields, but it looks like if
| these linguists have tried to use Zipf's law or other
| computational methods unsuccessfully in deciphering Linear
| A, the results weren't published. (Or a poor literature
| survey, or other explanations...) I'm not an academic
| either, so I don't know what the practices are for
| publishing unsuccessful results.
|
| [1] https://hal.archives-ouvertes.fr/hal-03207615/document
| im3w1l wrote:
| Interesting. If they have only very recently tried Zipf's
| law then there may be some other more advanced stuff they
| haven't tried.
|
| I'm thinking word embeddings. Like maybe you could do a
| word embedding based on co-occurrence and look for
| similarly shaped clusters in Linear A and Early Greek.
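A very small sketch of the first step of the co-occurrence idea above: build a count vector for each word from its neighbours, then compare words by cosine similarity. Everything here is an assumption for illustration (function names, window size, toy "records"); matching clusters across two languages is a much harder step that is not shown.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo = max(0, i - window)
            hi = min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical bookkeeping-style records, invented for the example.
toy_records = [["wine", "10"], ["oil", "10"], ["wine", "5"], ["oil", "5"],
               ["name", "gives"]]
vectors = cooccurrence_vectors(toy_records)
sim_goods = cosine(vectors["wine"], vectors["oil"])
sim_other = cosine(vectors["wine"], vectors["name"])
print(sim_goods, sim_other)
```

In this toy data the two commodity words end up with identical neighbour profiles, which is the kind of distributional signal the comment is hoping to exploit.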
| SemanticStrengh wrote:
| It's telling that we had to wait until 2021 for them to try
| it. Thanks for reporting anyway!
| escape_goat wrote:
| I understand the impulse to point out the obvious, but when
| the question is asked honestly rather than arrogantly or
| dismissively, it is even better to wait for someone to
| provide the specific answer; in this case, the reason that
| Zipf's law is of no help.
| mtlmtlmtlmtl wrote:
| It wasn't my intent to be overly dismissive. But I see
| this sort of thing all the time, and I find this
| phenomenon interesting, so I wanted to engage with that
| aspect of it, specifically.
|
| I agree with you in general though. Dismissing these
| things out of hand isn't helpful either. But multiple
| people had already made substantive replies to the actual
| content of their idea, anyway.
| SemanticStrengh wrote:
| Useless dismissal: I asked a question, I did not make an assertion.
| Besides, it allows for an exploration of the search space
| of solutions, which stimulates the depth of the discussion
| and might allow finer-grained questions that could then
| turn out to be innovative.
| SemanticStrengh wrote:
| edit: I hope this will make you think twice next time.
| Using Zipf's law for Linear A has only been attempted for the
| first time in 2021
| https://hal.archives-ouvertes.fr/hal-03207615/document so had I
| commented last year it would have been prior art. I agree the
| idea is not very original, and yet we had to wait that long for
| it to be tried.
| jcranmer wrote:
| The total extant corpus of Linear A amounts to fewer than
| 10,000 characters (and this is, I believe, the _largest_
| corpus of any undeciphered script).
|
| There's not enough text to do statistical analysis.
| adrian_b wrote:
| If all that text had been that of a story, there
| would still have been a chance to decipher it.
|
| Even worse than the small number of texts is that all, or
| almost all, are just bookkeeping records, so they contain
| few words besides numbers, symbols for useful goods, e.g.
| wine, olive oil, barley, wool and so on, and proper names
| of places or people.
|
| So even if there might be a few hundred texts, most
| just reproduce the same phrases, only with different
| numbers and names substituted in them.
|
| Any statistics on this handful of stereotyped phrases will
| offer no information about the statistics of the words of
| the Minoan language as used in a normal conversation or
| story telling.
| luma wrote:
| Can we assume that any of those features would be part of
| this language?
| SemanticStrengh wrote:
| a language without those would be hella weird and
| primitive, like stereotypical robotic talking. To answer
| your question: I don't know, does Linear B have them?
| tgv wrote:
| Latin doesn't have articles (the/a), and frequently drops
| the verb. Aramaic encodes the article in a suffix. Arabic
| and Hebrew omit the vowels, leaving the interpretation
| depending on contextual clues. There are languages
| without auxiliary verbs. And there are tons of other
| constructions that English doesn't have.
| Keysh wrote:
| _Written_ Arabic and Hebrew often omit vowels; the actual
| spoken languages do not, of course.
| greenyoda wrote:
| There are languages used today that don't have a separate
| word for "the", such as Hebrew (which uses a prefix to
| denote "the"), or Chinese, which apparently doesn't use
| articles.[1]
|
| Also, knowing what the most common words are wouldn't
| really help you much if you didn't know what the
| documents are about. For example, if they were trade
| records, they might contain a lot of text saying
| something like "X agrees to buy 20 pounds of olives from
| Y for $50 if delivered by next week". But if they were
| historical records of wars, other words may be more
| common.
|
| [1] https://mylanguages.org/chinese_articles.php
| jcranmer wrote:
| Weird, perhaps; primitive, no.
|
| One of the historical issues with linguistics is that it
| analyzed every language as if it were Classical Latin or
| Classical Greek, and if that language had elements that
| didn't work out... well, that can't be proper then, can
| it? You still see some residuum of this in English
| prescriptivist poppycock, like the prohibition against
| ending sentences in prepositions.
|
| As linguists actually started inventorying world
| languages, it became more and more clear that there is a
| very wide variety of grammatical features that don't
| necessarily translate well to familiar languages. There
| are vanishingly few features that are actually universal
| to all languages--the noun may well be the only universal
| part of speech. That a language doesn't choose to mark a
| feature in a particular way doesn't make it more
| primitive than another language. English doesn't have a
| numerical classifier... is it more primitive than an
| Australian Aboriginal language? Or is it more primitive
| than Japanese for not having a way to mark register (~
| politeness)?
|
| (FWIW, Linear B is used to write Mycenaean Greek, and
| this has been known for ~70 years.)
| inglor_cz wrote:
| LOL, my native language (Czech) has no articles, but it
| is so highly inflected and permits so many subtle, meaning-
| carrying changes in the order of words in a sentence that
| it is actually hard to carry over some of those
| subtleties into written English.
|
| The only thing "robotic" about it is the fact that
| "robot" is a Czech word that was adopted worldwide.
| jhgb wrote:
| Did you just call all Slavs "weird and primitive"?
| JadeNB wrote:
| > a language without those would be hella weird and
| primitive, like stereotypical robotic talking. To answer
| your question: I don't know, does Linear B have them?
|
| I think there are very few assumptions of the form
| "_every_ reasonable language has [...]" that hold up even
| for all current languages, let alone historical ones.
| jhgb wrote:
| Anyone claiming "surely every language needs X" had
| better look at Riau Indonesian
| (https://en.wikipedia.org/wiki/Riau#Language) first to
| check if _that_ language has X. If it doesn't, then X is
| almost certainly not required for communication.
| heavenlyblue wrote:
| I think the issue with Linear A is that the amount of
| preserved units of culture in that language is incredibly
| small, so whatever statistics you can obtain from it are
| of limited use.
| mwenge wrote:
| https://lineara.xyz/ is also worth a look
| sabr wrote:
| TED-Ed made a nice video explaining Linear A
| https://youtu.be/iePEw_cHp8s
___________________________________________________________________
(page generated 2022-05-15 23:00 UTC)