[HN Gopher] Exploring Linear A
___________________________________________________________________
Exploring Linear A
Author : mwenge
Score : 144 points
Date : 2023-07-16 19:47 UTC (1 days ago)
(HTM) web link (lineara.xyz)
(TXT) w3m dump (lineara.xyz)
| devoutsalsa wrote:
| On a related note, the Heraklion Archaeological Museum on Crete
| is fantastic. 100% worth going if you like old stuff. One of the
| things on display is the Phaistos Disc, one of the best preserved
| relics depicting Linear A.
|
| https://maps.app.goo.gl/rwJVDVDjaoNJjaNH8?g_st=ic
|
| https://en.m.wikipedia.org/wiki/Phaistos_Disc
| nologic01 wrote:
| I sometimes wonder how much further we will be able to lift the
| veil of ignorance covering early civilizations (assuming our
| ongoing existence, cultural interest in the past and ever more
| powerful technologies in the aeons to come).
|
| Clearly there must be additional Linear A inscriptions in Crete
| and possibly elsewhere. The cost of finding them enters a spiral
| of diminishing returns, but that _may_ be remedied at some point.
|
| But, even so, there is no guarantee that even with all surviving
| artefacts uncovered we would be able to reconstruct the language.
|
| Presumably that "edge of knowledgeable history" calculus plays
| across many regions and sometimes ignorance is annoyingly "close"
| to the modern era. Even long after the invention of writing the
| vast majority of human culture was not recorded and is
| essentially lost.
| p-e-w wrote:
| > Clearly there must be additional Linear A inscriptions in
| Crete and possibly elsewhere.
|
| There are probably many of them in storage in various museums
| and antiquities departments.
|
| The vast majority of ancient inscriptions ever found are
| uncatalogued, and some have never even been looked at. This
| includes inscriptions in languages we already know how to read.
| I remember reading long ago somewhere that well over 90% of all
| known Egyptian hieroglyphics texts haven't been translated yet.
|
| Therefore, my guess is that once AI is good enough to do
| classification and translation automatically, there will be
| rapid progress, without requiring any new discoveries.
| shaftoe444 wrote:
| Very weird to see this, I went to an exhibition about Knossos in
| Oxford only today.
|
| Good episode here that covers a bit about the language and
| translation efforts. The translation of Linear B is a very cool
| story too.
|
| https://www.bbc.co.uk/programmes/b01292ts
| im3w1l wrote:
| I wonder if LLMs would be able to crack it. They should have a
| decent shot I feel.
| davedx wrote:
| Via this post I found the book "The Riddle of the Labyrinth"
| about the people who deciphered Linear B. Thank you Hacker News,
| looking forward to reading this!
| OfSanguineFire wrote:
| Work by amateurs on Linear A does not have a good track record.
| Since the dawn of the internet era it has drawn more crackpots
| than almost anything else language-related. Within the
| professional linguistics community, if someone comes along and
| claims that he has made any progress towards decipherment, it is
| generally met with skepticism so strong that one questions that
| person's mental health. That said, this website has a caveat that
| it is for recreational use only, and it points to John Younger's
| page at the University of Kansas for something serious. Lay
| readers on HN should take that caveat very seriously.
| p-e-w wrote:
| > Work by amateurs on Linear A does not have a good track
| record.
|
| Linear A is completely undeciphered, so amateurs have done
| exactly as well as professionals. Meanwhile, Egyptian
| hieroglyphs, cuneiform, and Linear B were all deciphered by
| people who would be called "amateurs" by today's standards.
|
| But hey, why miss an opportunity for elitist gatekeeping, even
| if the topic is demonstrably one of the least suitable places
| for it?
| greggsy wrote:
| > Linear A is completely undeciphered, so amateurs have done
| exactly as well as professionals.
|
| Academics have absolutely done better than amateurs by virtue
| of continually validating the fact that it isn't
| decipherable.
|
| The comment isn't even bashing amateurs - it's bashing
| crackpots, who tend to be allured towards the mysterious,
| especially if their crackpot ideas won't be inconvenienced by
| facts.
| chickenbittle wrote:
| Yeah I think it's really important to distinguish between
| 'amateur', and 'crackpot' here.
|
| Like that amateur stumbling upon that never-repeating tile
| recently.
|
| https://www.quantamagazine.org/hobbyist-finds-maths-
| elusive-...
|
| It still required academics to confirm it (and, I think, to
| realize its significance).
|
| In short it's okay to be an academic and it's also okay to
| be an amateur investigator. Both can and do contribute to
| the advancement and dissemination of knowledge.
|
| Crackpottery and pseudoscience not so much.
| delhanty wrote:
| Curious, what concrete progress have professional linguists
| made on deciphering Linear A?
| OfSanguineFire wrote:
| None. And that is in spite of massive attempts over the 20th
| century, including some of the first applications of
| computers to a problem of this nature. The conclusion drawn
| from this lack of progress is that the corpus is simply too
| small for decipherment and/or we lack any surviving relatives
| for the language that the script recorded.
| dmarchand90 wrote:
| What kind of corpus do we have? Is it largely fragmented
| segments with a few symbols?
| civilitty wrote:
| _> The extant corpus, comprising some 1,427 specimens
| totalling 7,362 to 7,396 signs, if scaled to standard
| type, would fit easily on two sheets of paper._ [1]
|
| [1] https://en.m.wikipedia.org/wiki/Linear_A#Corpus
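A quick back-of-the-envelope check on the figures quoted above, as a rough sketch only: dividing the sign total by the specimen count gives the average inscription length. The variable names are illustrative, and an average says nothing about how the lengths are distributed.

```python
# Figures quoted above from the Wikipedia "Linear A" corpus section.
specimens = 1427
signs_low, signs_high = 7362, 7396

# Average number of signs per inscribed specimen, for both ends of the range.
avg_low = signs_low / specimens
avg_high = signs_high / specimens

print(f"{avg_low:.2f} to {avg_high:.2f} signs per specimen")
# → 5.16 to 5.18 signs per specimen
```

On these numbers the average specimen carries only about five signs, which fits the "corpus is simply too small" point made upthread.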
| rustymonday wrote:
| An architect decoded Linear B.
| dmvdoug wrote:
| I mean, yeah, but an architect with advanced classical
| language training.
| light_hue_1 wrote:
| It's really misleading to say "An architect decoded Linear
| B."
|
| When Michael Ventris was working by himself he published
| junk. A basically crackpot theory that was immediately
| debunked that Linear B was Etruscan. Then Ventris worked hard
| to become an insider.
|
| Many key observations for the decoding were done by someone
| else, a classicist, Alice Kober, right before her untimely
| death. She worked for 20 years on Linear B and put down all
| of the foundations. The fact that Linear B has grammatical
| roots and suffixes, the language is inflected, has case,
| gender, etc. Kober was one of the first people to work
| systematically finding patterns and documenting her methods.
| The work Ventris did would have been impossible without
| Kober's methods: extending her work is what worked and gave
| Ventris his main idea.
|
| Ventris briefly worked with Kober. It didn't go well. But
| over time Ventris came to know the key players and to be
| accepted in the inner circle. One of these players, Emmett
| Bennett, gave him what Kober did not have: the Pylos tablets.
| By the time they were published she had died.
|
| Ventris extended Kober's work to the Pylos tablets. Her work
| focused on systematically analyzing groups of characters.
| When he looked at the results, he made his first critical
| observation: some groups were unique to the Knossos tablets
| and others were unique to the Pylos tablets. What if these
| are place names?
|
| There aren't that many places to be had on Knossos and he
| knew the Greek names. So he looked for possible combinations
| and used them to guide the decoding. He used Kober's work and
| the place names, along with help from at least Bennett, to
| build a rough mapping from some signs to sound. And then he
| made his second critical observation: what if Linear B is
| Greek? Since the Greek names for places seemed to appear.
|
| Then he could try to decode word after word. And along the
| way he made his third critical observation: many Mycenaean
| scribes were incredibly sloppy spellers. We can even tell now
| that some were much better than others, but everything is
| very messy because even the basic rules of spelling weren't
| agreed on yet. Not only were characters missing, but a single
| character could be one of 30+ different syllables at times.
| Bare statistical methods alone often resulted in a mess
| because of this.
|
| Only small parts of the text could potentially be decoded at
| this point. None of the classicists that Ventris normally
| talked to were convinced.
|
| That's when John Chadwick, a linguist, heard about Ventris
| and tried his idea out. Chadwick was an expert in very old
| Greek, 1000 years older than Plato. Chadwick was quickly
| convinced by Ventris because while the decodings were very
| poor for someone who knew classical Greek, they made a lot
| more sense to him. They worked together for several years to
| fix up the decoding.
|
| An architect did contribute the main idea for the decoding,
| but an architect that was a connected insider, with a
| background in Greek and Latin, who had published in the area
| before, knowledgeable in all of the latest methods, with
| access to privileged information, in conversation with the
| experts.
|
| The way you put it, it sounds like some random architect
| somewhere looked at Linear B, worked hard on their own, and
| came up with the answer. That's not even remotely true.
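The place-name observation described in the comment above (sign groups that turn up at one site but never at the other) amounts to a set difference over the tablets. The sketch below is only an illustration of that idea, not a reconstruction of Ventris's actual procedure. The function name `site_exclusive_groups` and the toy tablets are hypothetical; the transliterations are conventional readings, but the tablet contents are made up.

```python
from collections import defaultdict

def site_exclusive_groups(tablets):
    """Map each site to the sign groups that occur at that site only.

    `tablets` is an iterable of (site, [sign_group, ...]) pairs.
    """
    # First pass: record every site at which each sign group appears.
    sites_by_group = defaultdict(set)
    for site, groups in tablets:
        for group in groups:
            sites_by_group[group].add(site)

    # Second pass: keep only the groups seen at exactly one site.
    exclusive = defaultdict(set)
    for group, sites in sites_by_group.items():
        if len(sites) == 1:
            (only_site,) = sites
            exclusive[only_site].add(group)
    return dict(exclusive)

# Hypothetical toy data, invented for this example.
tablets = [
    ("Knossos", ["ko-no-so", "a-mi-ni-so", "to-so"]),
    ("Knossos", ["ko-no-so", "to-so"]),
    ("Pylos", ["pu-ro", "to-so"]),
    ("Pylos", ["pu-ro"]),
]

result = site_exclusive_groups(tablets)
for site in sorted(result):
    print(site, sorted(result[site]))
```

Groups that occur at both sites (here `to-so`) drop out, leaving the site-specific groups as candidates for local place names.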
| xdennis wrote:
| > The way you put it, it sounds like some random architect
| somewhere looked at Linear B, worked hard on their own, and
| came up with the answer. That's not even remotely true.
|
| Your assumption here is that a normal discoverer works
| entirely by himself, but the norm has always been that a
| discovery is the work of multiple people.
|
| When someone says that an architect decoded Linear B he
| means to say that a non-professional decoded it.
|
| You can't just say: "oh, it's not fair to say that because
| he was actually really good at it".
| goodbyesf wrote:
| > The way you put it, it sounds like some random architect
| somewhere looked at Linear B, worked hard on their own, and
| came up with the answer. That's not even remotely true.
|
| But that's true of everything. Newton didn't invent
| calculus (and neither did Leibniz). He didn't even
| understand the idea of a limit. It took contributions of
| many people over many decades and even centuries to develop
| the discipline of calculus. Not to mention his ideas came
| from ancient Greeks, et al. The same applies to Einstein
| and of course the most overrated and misrepresented Turing.
|
| The idea of a lone genius or a singular great man who works
| by himself to produce something great is a lie. Brady
| didn't win 7 Super Bowls by himself, Jobs didn't create the
| iPhone by himself and Musk really didn't create anything by
| himself. It's just PR which creates heroes out of mere
| mortals.
| hillsboroughman wrote:
| Very informative summary. A stupid question follows though,
| so request your patience. Did Michael Ventris really ask
| the question 'what if Linear B encoded some form of Greek'?
| Didn't Alice Kober already ask and answer this question,
| without seeming to do so. The fact that the underlying
| language was an inflected one and that it seemed to have
| singular, dual and plural forms for nouns etc - wasn't that
| enough? Was it academic carefulness that prevented Kober
| from proclaiming it was ancient Greek?
| light_hue_1 wrote:
| Kober was very determined that systematic analysis of the
| text would eventually work. She rejected the idea that
| you could just hypothesize what language it was. Because
| so many people had tried and failed that way.
|
| Maybe at some point she had this idea. But you really
| must understand how bad of a fit classical Greek, and
| even the early Greek dialects, really is. Like.. a few
| words work out here and there. What convinced Chadwick
| were the place names, some names of Gods, and one
| particularly long 13 symbol patronymic. But for anything
| more you had to start adding, removing, reinterpreting
| characters and assuming that the original text got them
| wrong.
|
| Also Kober was missing most of the text since it hadn't
| been published yet; in her small corpus this would have
| been even worse.
|
| Even after people saw the decoding the main sticking
| point for years was that you need to make so many changes
| for it to work out in Greek that you're just making up
| the text. It took decades of work to make the decoding
| work and many of the decodings Ventris put forward were
| found to be wrong.
|
| Eventually Kober maybe would have worked with Chadwick or
| someone similar who knew a more archaic variant or maybe
| Chadwick himself would have noticed it.
| smallnamespace wrote:
| Most Indo-European languages are inflected and have
| singular, dual, and plural forms (if not in the modern
| language, then in a more archaic form).
|
| Even Latin retains some dual forms for certain words even
| though it had otherwise lost it.
| hillsboroughman wrote:
| Homeric Greek had only sporadic use of the dual. It was
| apparently a matter of metrical convenience. Classical
| Greek had all but lost the use of dual. Dual was lost in
| Latin. Whereas in Mycenaean Greek, the dual number was
| mandatory for both verbs and nouns. Like in Sanskrit. It
| is well known that Miss Kober went to considerable
| personal expense (?) and effort to travel from New York
| to New Haven to learn advanced Sanskrit. I feel there is
| every reason to believe that Miss Kober already guessed
| Linear B encoded a form of archaic Greek and her triplets
| more or less spoke to this informed guess. Just my 2c
| OfSanguineFire wrote:
| An architect with significant training in the field, who did
| his work in close collaboration with the professional scholar
| John Chadwick. Plus that script had a relatively large corpus
| and, moreover, it encoded an earlier form of a language we
| already knew (and we already knew the sound values to expect
| from earlier Greek, like labiovelar consonants, from
| comparative Indo-European reconstruction). Not the case with
| Linear A.
| thaumasiotes wrote:
| It is not clear why his decipherment is accepted as
| meaningful. It has faced significant criticism:
| https://sci-hub.se/https://www.jstor.org/stable/20162981
|
| > The Ventris system thus set forth has been widely accepted
| by Greek scholars, including many of the highest eminence, in
| many countries. It has also been widely rejected by scholars
| of eminence, in varying degrees.
|
| > These Ventrisian rules enable bits of a curious sort of
| Greek to be got out of Lin[ear] B texts; but experiments have
| shown that bits of English or Latin or other tongues, when
| spelt out in syllables according to the Ventrisian system,
| are capable often of yielding bits of Greek just as plausible
| as anything in the Ventris-Chadwick _Documents_ volume. One
| eminent Oxonian, dining at a high table, amused himself by
| taking the names of the Fellows of the College present and
| turning them into Ventrisian syllables, from which he made a
| new translation of them into Greek, in which they all turned
| out to be Greek gods.
|
| > gentle reader, pray perpend the syllable-groups (reference
| number Dy 401), that run: _a-ma wi-ru-qe ka-no to-ro-ja
| qi-pi-ri-mu a-po-ri._ Here we have two specimens of the
| labio-velars, the syllables with _q-_, discovered by Ventris, to
| the astonishment of philologists who had not expected to find
| them in Bronze Age Greek. _qe_ is, of course, equivalent to
| Latin _-que_ , Greek _te_ , while _qi_ doubtless here shows
| the development to a voiced dental noted by Ventris and
| Chadwick in their "Mycenaean Vocabulary,"
|
| > The Greek evaluation of the sentence would be, according to
| Ventris's spelling rules, _halmai wiluite kainos Tholoiai
| Diphilimus apolis:_ "With brine and slime in novel fashion at
| Tholoia (the place of _tholoi_ , beehive tombs) Diphilimus
| (is) cityless." No doubt this is a record of a Bronze Age
| tidal wave.
|
| > It is by coincidence that the acumen of Mr. Michael C.
| Stokes, the Edinburgh authority on ancient philosophy, has
| extracted the Virgilian hexameter, _Arma virumque cano Troiae
| qui primus ab oris..._.
|
| > Note that in this sentence one need assume only two of the
| six words to be names of persons or places, whereas, in the
| Lin B material as a whole, 75 per cent of the sign-groups
| have to be, on Ventris's system, evaluated as names
| OfSanguineFire wrote:
| You cite a 1965 article. That is practically ancient, and
| no, its criticism is not particularly significant. In the
| decades since, Ventris's decipherment has overwhelmingly
| been accepted by scholars. That is not to say that all of
| Ventris's _readings_ are accepted - many are superseded.
| But the fact that Linear B records Mycenaean Greek along
| the general lines that he and Chadwick worked out, has long
| been beyond doubt in the field.
|
| Mycenaean sources and their consensus readings will be
| discussed in any decent introduction to the history of the
| Greek language. I can recommend, for example, the relevant
| chapters in _A Companion to the Ancient Greek Language_ ed.
| Bakker and in Colvin's _A Historical Greek Reader_ as
| fairly accessible to a general audience.
| thaumasiotes wrote:
| > You cite a 1965 article. That is practically ancient
|
| > the fact that Linear B records Mycenaean Greek along
| the general lines that he and Chadwick worked out, has
| long been beyond doubt in the field.
|
| What are the major developments since 1965 that
| strengthened the position of Ventris's decipherment?
| theoldlove wrote:
| Well, for one, a bunch of additional tablets discovered
| at Thebes in the 90s, which broadly match and hence
| confirm the decipherment.
| https://en.m.wikipedia.org/wiki/Thebes_tablets
| thaumasiotes wrote:
| When the criticism is that your paradigm for translating
| Linear B is so unprincipled that your translation will
| say whatever you want it to say (compare _One eminent
| Oxonian, dining at a high table, amused himself by taking
| the names of the Fellows of the College present and
| turning them into Ventrisian syllables, from which he
| made a new translation of them into Greek, in which they
| all turned out to be Greek gods_ -- the destination is
| known before the journey begins), how can the
| confirmation of older Linear B tablets _by newer Linear B
| tablets_ address that criticism?
| theoldlove wrote:
| Just take 10 minutes and skim the book chapters. The
| rules of the script are nowhere near as loose as you say.
| For example, Linear B doesn't differentiate between
| k/g/kh like alphabetic Greek does (κ, γ, χ) -- an
| important distinction, sure, but its loss doesn't let you
| turn anything into anything else.
|
| So with the Theban tablets, if the decipherment were
| false it should have yielded nonsense when applied to
| unknown texts.
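To make the k/g/kh point above concrete, here is a minimal sketch that expands a syllabic spelling into the consonant readings the script leaves open. It is a simplification under stated assumptions: the `SERIES` table lists only a few consonant series, the function name `candidate_readings` is made up for this example, and the sketch ignores the consonants that the spelling conventions omit altogether.

```python
from itertools import product

# Simplified assumption: each written consonant series may stand for
# several Greek consonants. Only a few series are listed here.
SERIES = {
    "k": ["k", "g", "kh"],
    "p": ["p", "b", "ph"],
    "t": ["t", "th"],
    "r": ["r", "l"],
}

def candidate_readings(spelling):
    """Expand a hyphenated syllabic spelling into every consonant reading."""
    options = []
    for syllable in spelling.split("-"):
        head, rest = syllable[0], syllable[1:]
        if head in SERIES and rest:
            # Ambiguous series: one option per possible consonant.
            options.append([consonant + rest for consonant in SERIES[head]])
        else:
            # Pure vowels and unlisted series are kept as written.
            options.append([syllable])
    return ["-".join(combo) for combo in product(*options)]

readings = candidate_readings("ka-ko")
print(len(readings), readings)
```

For `ka-ko` (conventionally read as the word for bronze, khalkos) this yields nine candidates. The ambiguity multiplies the options per sign by a small constant, so the candidate set stays finite and checkable rather than open-ended.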
| thaumasiotes wrote:
| > So with the Theban tablets, if the decipherment were
| false it should have yielded nonsense when applied to
| unknown texts.
|
| How is this claim compatible with the observation that,
| when applied to a text written in Latin, the decipherment
| fails to yield nonsense?
| theoldlove wrote:
| In your very article the decipherment does yield nonsense
| when applied to Latin. Your article converts the first
| line of Vergil to Linear B and then tries to understand
| it as Greek, offering "With brine and slime in novel
| fashion at Tholoia Diphilimus (is) cityless." But that's
| nearly totally meaningless.
|
| And even this sentence requires cheating -- most
| prominently, Greek (both in Linear B and later) doesn't
| use the -us ending like Latin does, so its use here in a
| "Greek" sentence is very suspicious.
| OfSanguineFire wrote:
| Most of the Chadwick part of the Chadwick-Ventris
| collaboration was published after 1965. And I just
| pointed you to two popular references that, in turn, cite
| a number of publications from recent decades. I suggest
| you follow up on that.
| thaumasiotes wrote:
| Oh, I certainly will.
|
| But I was kind of hoping for some indication that
| developments of that kind actually occurred; it would be
| the least surprising thing in the world to see a
| selection effect in the study of Linear B inscriptions
| whereby students who couldn't reconcile themselves with
| the idea that decipherment will happily assign a meaning
| to any text, even where the actual meaning of the text is
| known to be different, left the field, while students who
| didn't mind that stayed in. Over time a strong consensus
| in favor of the position "no, I didn't waste the last 30
| years of my life" is exactly what you'd expect to see.
|
| There are no professions in which the professional
| consensus is "actually, none of this works". But there
| are many in which that is the _truth_.
| heyitsguay wrote:
| Is this site not just a handy visual catalog of known artifacts
| and transcriptions? Is there some speculative decipherment
| implied in the phoneticizations?
| VectorLock wrote:
| Probably getting a bit more popular notice after the mention in
| the latest Indiana Jones movie (at least, they mentioned Linear B
| a few times)
| dghughes wrote:
| I like writing systems and scripts especially obscure or ancient
| ones. It never even dawned on me to think of my local region as I
| did ancient Egypt, Greece, Italy etc.
|
| I was talking to a friend who is Mi'Kmaq. Here in Canada we call
| the people here First Nations; in the USA it's Native American. He
| said that the Mi'Kmaq had an old writing system. I checked into
| it and it predates any contact with Europeans and is one of the
| very few writing systems by native peoples here. It's called
| suckerfish writing or suckerfish script the name inspired by the
| tracks the fish makes in sand.
|
| https://en.wikipedia.org/wiki/Mi%EA%9E%8Ckmaw_hieroglyphic_w...
| AlotOfReading wrote:
| The traditional definitions by linguists tend to exclude
| anything that can't represent "all oral communication" as
| proto- or partial writing systems, which are often pejoratively
| labeled mnemonic systems. Systems that can represent the full
| range of spoken expression are labeled "true" or "full"
| writing.
|
| This had the convenient side effect of neatly classifying all
| the American writing systems as protowriting in the early 20th
| century, as well as some more controversial examples like
| Chinese. Some of those have since been walked back (e.g.
| Mayan), but most remain in that limbo. We have a somewhat
| better understanding today that there was a huge variety of
| visual communication systems across the Americas prior to
| European contact, but properly redefining the term "writing" to
| include them is a slow, ongoing process.
| retrac wrote:
| For the unfamiliar, Linear A was an ancient script that is
| associated with the Minoan civilization of the island of Crete,
| around 1500 - 1800 BC. The later Linear B system encodes archaic
| Greek, and is very similar to Linear A in glyph form. The Minoan
| language written with Linear A is probably unrelated to any other
| language.
|
| Phonetic values are necessarily from Linear B or otherwise
| guesses - it's very likely there was a great deal of overlap,
| that the symbol representing, for example, the syllable "ni" in
| Greek, represented a syllable that sounded a lot like "ni" in
| Minoan. (Linear B is quite unsuited to writing Greek sounds, an
| indicator that it was borrowed from a very different language.)
| But since the language of Linear A remains undeciphered, that is
| really just an educated guess at best.
| djmips wrote:
| https://greekreporter.com/2022/04/20/minoan-language-linear-...
| ocschwar wrote:
| The interface is difficult to deal with, but TIL that a Linear A
| potsherd was found in a Philistine site.
| hudsonhs wrote:
| Hacker News is a Philistine site.
| djmips wrote:
| Too good to downvote...
| cubefox wrote:
| Related thought: Imagine we received a lot of text in an alien
| language with a radio telescope, with no "Rosetta stone" to
| decipher it. Say, 1 TB worth of text.
|
| Now we add to that data another 1 TB of English text, and train
| an LLM on the 2 TB of data. Then we ask the model (in English) to
| translate some text from the alien language to English.
|
| Would it work?
| DemocracyFTW2 wrote:
| No. You always need some kind of Rosetta stone or other
| relationship to a known language plus some context and
| 'plausible guesses' to understand an unknown language. Sure if
| I gave you _III,IIII;VII -- II,II;IIII -- VI,II;VIII_ you would
| be able to guess that these are elementary number signs in what
| amounts to a rudimentary table of additions. That much would be
| true whether the snippet is from a potsherd of an ancient
| civilization or received from outer space via a radio antenna.
| But outside of context--and nothing would be more out of
| context than an extraterrestrial culture--you cannot even tell
| with certainty whether _I_ stands for 'one' or 'ten' or
| 'twelve' or 'thousand', and here we've already reached the end
| of what a text per se can tell you about its meaning if the
| signs are not clearly pictorial (and even pictorial scripts
| like early Chinese or Egyptian hieroglyphs are already
| conventionalized to the degree that for quite a number of signs
| in either script we are to this day not sure what they depict).
|
| Your idea cannot work unless the data that you feed the
| language model contains correlated items. It can't. Imagine I feed
| a predictor with a long list of images on the one hand and, on
| the other hand, a long list of randomly ordered image
| descriptions that may or may not match the images. Do you think
| you could learn a foreign language that way? You absolutely
| need the image of a donkey be associated with the name for that
| animal in the foreign language, and the algorithm is no
| different.
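The addition-table example in the comment above can be checked mechanically. The sketch below assumes a purely additive reading in which `V` is worth five `I`s (that assumption is mine, not something the snippet itself proves) and then tests whether the table still balances when `I` is taken to mean one, ten, twelve, or a thousand. The function names are hypothetical.

```python
def value(numeral, unit):
    """Value of a tally-style numeral if 'I' is worth `unit` and 'V' five of them."""
    worth = {"I": unit, "V": 5 * unit}
    return sum(worth[sign] for sign in numeral)

def table_is_consistent(table, unit):
    """True if every 'a,b;c' entry satisfies a + b == c under this unit."""
    for entry in table.split(" -- "):
        addends, total = entry.split(";")
        a, b = addends.split(",")
        if value(a, unit) + value(b, unit) != value(total, unit):
            return False
    return True

TABLE = "III,IIII;VII -- II,II;IIII -- VI,II;VIII"

results = {unit: table_is_consistent(TABLE, unit) for unit in (1, 10, 12, 1000)}
print(results)
# → {1: True, 10: True, 12: True, 1000: True}
```

Because addition is linear, every choice of unit passes. The internal structure of the text shows that the signs behave like numbers, but it cannot say which numbers they are.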
| cubefox wrote:
| Those are good reasons, yet the language model discussed
| above would presumably understand _Alienese_ just as well as
| it would understand English. So if an LLM understands the
| meaning of an expression X and of an expression Y, wouldn't
| it be able to tell how similar those meanings are?
|
| > here we've already reached the end of what a text per se
| can tell you about its meaning if the signs are not clearly
| pictorial
|
| Note that language models today seem to be quite good at
| understanding English, even though they are only trained on
| symbolic text, not on any images.
| DemocracyFTW2 wrote:
| Your understanding of 'understanding a language' is
| obviously different from mine when you write that "the
| language model discussed above would presumably understand
| Alienese just as well as it would understand English" and
| "language models today seem to be quite good at
| understanding English".
|
| Language models don't understand any natural language,
| they're very good at manipulating it (and us!) in terms of
| continuing patterns across the scale from letter
| (orthography) to phrases and paragraphs of seeming
| utility and correctness. In _that_ regard, yes, the
| aforementioned model will likely have no difficulty in
| reproducing novel outputs that would appear likewise useful
| and correct to Alienese speakers as is the case for
| English. However this assumption, too, should come with the
| disclaimer that unless someone produces a reliable test for
| the utility and correctness of the _same_ LM for a variety
| of natural and invented languages with divergent grammars
| (such as including e.g. polysynthetic languages which have
| a very different view of what constitutes a 'word')
| _without_ having to tweak any of the many finicky
| parameters of these models--we can't be sure the model
| won't produce garbage when trained on the next 'exotic'
| language. So who knows, in English you use very few infixes
| and a lot of grammar takes place between fairly constant,
| fairly short words; a model with a given set of parameters
| that works well for such languages may not be very good at
| languages that have words built from many specific prefixes,
| infixes and suffixes that are as expressive as entire
| phrases in English. Just like the current generation of
| text-to-image generators are pretty good at a lot of things
| but then screw up when asked to picture a cornfield.
| cubefox wrote:
| > Your understanding of 'understanding a language' is
| obviously different from mine when you write that "the
| language model discussed above would presumably
| understand Alienese just as well as it would understand
| English" and "language models today seem to be quite good
| at understanding English".
|
| > Language models don't understand any natural language,
| they're very good at manipulating it (and us!) in terms
| of continuing patterns across the scale from letter
| (orthography) to phrases and paragraphs of seeming
| utility and correctness.
|
| Come on, chatting an hour with GPT-4 should remove all
| doubt that it understands you quite well. Otherwise, what
| would be understanding? Lest it turns out that _we_ are
| stochastic parrots, too!
|
| https://www.bing.com/images/create/cornfield/64b58e89d412420...
| tiluha wrote:
| The trained model would likely be able to understand both
| Alienese and English equally well, but it never learned to
| translate even one word or context. It might have an
| internal representation for "eating food" in both
| languages, but since no links exist between the
| languages the embeddings will not be close.
|
| You could try it on Earth if you train a model on two
| separate languages, being careful that the training data
| does not contain any mixed language. But even then, modern
| Human languages most likely have too much cross-
| contamination. Would be an interesting experiment
| nevertheless
| cubefox wrote:
| That's the question, would the embeddings be close?
|
| It's not clear that they wouldn't. Would an embedding of
| the Alienese word for "and" be close to the embedding of
| the English "and"? This does seem quite possible to me.
|
| > You could try it on Earth if you train a model on
| two separate languages, being careful that the training
| data does not contain any mixed language. But even then,
| modern Human languages most likely have too much cross-
| contamination. Would be an interesting experiment
| nevertheless
|
| I agree. Though shouldn't we be able to answer this _a
| priori_? It sounds like a mathematical question.
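One way to frame the "would the embeddings be close?" question is with a toy model. The sketch below assumes, purely for illustration, that the second language's embedding space is an exact rotation of the first. Real models need not behave this way, and all vectors and names here are made up. Under that assumption the same concept is not close across the two raw spaces, yet the similarity structure inside each space is identical.

```python
import math

def rotate(vec, angle):
    """Rotate a 2-D vector counter-clockwise by `angle` radians."""
    x, y = vec
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

def cosine(u, v):
    """Cosine similarity of two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))

# Hypothetical 2-D "embeddings" for three concepts in language A.
space_a = {"and": (1.0, 0.0), "eat": (0.6, 0.8), "food": (0.5, 0.85)}

# Toy assumption: language B has the same geometry in an arbitrarily
# rotated coordinate system, since nothing ties the two bases together.
ANGLE = 2.0  # radians, chosen arbitrarily
space_b = {word: rotate(vec, ANGLE) for word, vec in space_a.items()}

# Same concept compared across the raw spaces: set by the arbitrary rotation.
cross = cosine(space_a["and"], space_b["and"])

# Two concepts compared within each space: unaffected by the rotation.
within_a = cosine(space_a["eat"], space_a["food"])
within_b = cosine(space_b["eat"], space_b["food"])

print(f"cross-space 'and' vs 'and': {cross:.3f}")
print(f"within A 'eat' vs 'food':   {within_a:.3f}")
print(f"within B 'eat' vs 'food':   {within_b:.3f}")
```

In this toy setting both comments have a point: the raw vectors for "and" are far apart, while the relations between words match exactly. Whether real languages share enough structure for that match to be recoverable is the open part of the question, and the sketch does not settle it.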
| WorldMaker wrote:
| Also, the assumption that math is universal so sharing
| vocabulary in math is helpful for bootstrapping language
| understanding is a fascinating assumption to question. Even
| if you can explain Pi and prove that you can mutually
| understand trigonometry that might give you some small
| portion of engineering insight, but it can't yield most of
| the rest of engineering such as design or aesthetics (or
| emotions) or any number of other things that make for useful
| project communication.
|
| It's something I've often thought about in the way that the
| Voyager record was built and Sagan's novel Contact assumes it
| and many others. Even recently, the novel Project Hail Mary
| borrowed that assumption that math is enough shared language
| to bootstrap understanding. I think the movie Arrival did
| some of the best work of showing why that wouldn't
| necessarily work, but also had the language in question
| designed by a mathematician and still fell into some parts of
| the assumption/trope. I'm not saying any of these examples
| are bad for doing this, I certainly love them all. It's still
| a small something worth criticizing.
|
| It's certainly not a bad thing to want to communicate math,
| and to hope that things like Pi are "constant enough" to
| provide bootstraps to other communications, but it's also
| such a fascinating thing how much science fiction thinking
| (and real world scientific thinking such as the Voyager
| Record) assumes that you can just sort of "yada yada yada" your
| way from "so we established communications of basic
| mathematical constants and concepts" directly as a straight
| line of some sort to "now we can communicate all sorts of
| other things".
| fiddlerwoaroof wrote:
| Looks like there's a parallel site for Linear B:
| https://linearb.xyz/
___________________________________________________________________
(page generated 2023-07-17 23:02 UTC)