[HN Gopher] Exploring Linear A
       ___________________________________________________________________
        
       Exploring Linear A
        
       Author : mwenge
       Score  : 144 points
       Date   : 2023-07-16 19:47 UTC (1 days ago)
        
 (HTM) web link (lineara.xyz)
 (TXT) w3m dump (lineara.xyz)
        
       | devoutsalsa wrote:
       | On a related note, the Heraklion Archaeological Museum on Crete
       | is fantastic. 100% worth going if you like old stuff. One of the
       | things on display is the Phaistos Disc, one of the best preserved
       | relics depicting Linear A.
       | 
       | https://maps.app.goo.gl/rwJVDVDjaoNJjaNH8?g_st=ic
       | 
       | https://en.m.wikipedia.org/wiki/Phaistos_Disc
        
       | nologic01 wrote:
       | I sometimes wonder how much further we will be able to lift the
       | veil of ignorance covering early civilizations (assuming our
       | ongoing existence, cultural interest in the past and ever more
       | powerful technologies in the aeons to come).
       | 
       | Clearly there must be additional Linear A inscriptions in Crete
       | and possibly elsewhere. The cost of finding them enters a spiral
       | of diminishing returns, but that _may_ be remedied at some point.
       | 
       | But, even so, there is no guarantee that even with all surviving
       | artefacts uncovered we would be able to reconstruct the language.
       | 
       | Presumably that "edge of knowledgeable history" calculus plays
       | across many regions and sometimes ignorance is annoyingly "close"
       | to the modern era. Even long after the invention of writing the
       | vast majority of human culture was not recorded and is
       | essentially lost.
        
         | p-e-w wrote:
         | > Clearly there must be additional Linear A inscriptions in
         | Crete and possibly elsewhere.
         | 
         | There are probably many of them in storage in various museums
         | and antiquities departments.
         | 
         | The vast majority of ancient inscriptions ever found are
         | uncatalogued, and some have never even been looked at. This
         | includes inscriptions in languages we already know how to read.
         | I remember reading long ago somewhere that well over 90% of all
         | known Egyptian hieroglyphics texts haven't been translated yet.
         | 
         | Therefore, my guess is that once AI is good enough to do
         | classification and translation automatically, there will be
         | rapid progress, without requiring any new discoveries.
        
       | shaftoe444 wrote:
       | Very weird to see this, I went to an exhibition about Knossos in
       | Oxford only today.
       | 
       | Good episode here that covers a bit about the language and
       | translation efforts. The translation of Linear B is a very cool
       | story too.
       | 
       | https://www.bbc.co.uk/programmes/b01292ts
        
       | im3w1l wrote:
       | I wonder if LLMs would be able to crack it. They should have a
       | decent shot I feel.
        
       | davedx wrote:
       | Via this post I found the book "The Riddle of the Labyrinth"
       | about the people who deciphered Linear B. Thank you Hacker News,
       | looking forward to reading this!
        
       | OfSanguineFire wrote:
       | Work by amateurs on Linear A does not have a good track record.
       | Since the dawn of the internet era it has drawn more crackpots
       | than almost anything else language-related. Within the
       | professional linguistics community, if someone comes along and
       | claims that he has made any progress towards decipherment, it is
       | generally met with skepticism so strong that one questions that
       | person's mental health. That said, this website has a caveat that
       | it is for recreational use only, and it points to John Younger's
       | page at the University of Kansas for something serious. Lay
       | readers on HN should take that caveat very seriously.
        
         | p-e-w wrote:
         | > Work by amateurs on Linear A does not have a good track
         | record.
         | 
         | Linear A is completely undeciphered, so amateurs have done
         | exactly as well as professionals. Meanwhile, Egyptian
         | hieroglyphs, cuneiform, and Linear B were all deciphered by
         | people who would be called "amateurs" by today's standards.
         | 
         | But hey, why miss an opportunity for elitist gatekeeping, even
         | if the topic is demonstrably one of the least suitable places
         | for it?
        
           | greggsy wrote:
           | > Linear A is completely undeciphered, so amateurs have done
           | exactly as well as professionals.
           | 
           | Academics have absolutely done better than amateurs by virtue
           | of continually validating the fact that it isn't
           | decipherable.
           | 
           | The comment isn't even bashing amateurs - it's bashing
           | crackpots, who tend to be allured towards the mysterious,
           | especially if their crackpot ideas won't be inconvenienced by
           | facts.
        
             | chickenbittle wrote:
             | Yeah I think it's really important to distinguish between
             | 'amateur', and 'crackpot' here.
             | 
             | Like that amateur stumbling upon that never-repeating tile
             | recently.
             | 
             | https://www.quantamagazine.org/hobbyist-finds-maths-
             | elusive-...
             | 
             | It still required academics to confirm it (and I think
             | realize its significance).
             | 
             | In short it's okay to be an academic and it's also okay to
             | be an amateur investigator. Both can and do contribute to
             | the advancement and dissemination of knowledge.
             | 
             | Crackpottery and pseudoscience not so much.
        
         | delhanty wrote:
         | Curious, what concrete progress have professional linguists
         | made on deciphering Linear A?
        
           | OfSanguineFire wrote:
           | None. And that is in spite of massive attempts over the 20th
           | century, including some of the first applications of
           | computers to a problem of this nature. The conclusion drawn
           | from this lack of progress is that the corpus is simply too
           | small for decipherment and/or we lack any surviving relatives
           | for the language that the script recorded.
        
             | dmarchand90 wrote:
             | What kind of corpus do we have? Is it largely fragmented
             | segments with a few symbols?
        
               | civilitty wrote:
               | _> The extant corpus, comprising some 1,427 specimens
               | totalling 7,362 to 7,396 signs, if scaled to standard
               | type, would fit easily on two sheets of paper._ [1]
               | 
               | [1] https://en.m.wikipedia.org/wiki/Linear_A#Corpus
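
For a sense of scale, the figures in the quote above work out to roughly
five signs per specimen. A quick check in Python, using only the numbers
from the quote (the variable names are just for illustration):

```python
# Figures taken from the Wikipedia quote above.
specimens = 1427
signs_low, signs_high = 7362, 7396

# Average number of signs per specimen, for both ends of the quoted range.
avg_low = signs_low / specimens
avg_high = signs_high / specimens

print(f"{avg_low:.2f} to {avg_high:.2f} signs per specimen")
```

An average that low suggests most specimens carry only a handful of
signs, which is consistent with the "corpus is too small" conclusion
upthread.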
        
         | rustymonday wrote:
         | An architect decoded Linear B.
        
           | dmvdoug wrote:
           | I mean, yeah, but an architect with advanced classical
           | language training.
        
           | light_hue_1 wrote:
           | It's really misleading to say "An architect decoded Linear
           | B."
           | 
           | When Michael Ventris was working by himself he published
           | junk: a basically crackpot theory, immediately debunked, that
           | Linear B was Etruscan. Then Ventris worked hard to become an
           | insider.
           | 
           | Many key observations for the decoding were done by someone
           | else, a classicist, Alice Kober, right before her untimely
           | death. She worked for 20 years on Linear B and put down all
           | of the foundations. The fact that Linear B has grammatical
           | roots and suffixes, the language is inflected, has case,
           | gender, etc. Kober was one of the first people to work
           | systematically finding patterns and documenting her methods.
           | The work Ventris did would have been impossible without
           | Kober's methods: extending her work is what worked and gave
           | Ventris his main idea.
           | 
           | Ventris briefly worked with Kober. It didn't go well. But
           | over time Ventris came to know the key players and to be
           | accepted in the inner circle. One of these players, Emmett
           | Bennett, gave him what Kober did not have: the Pylos tablets.
           | By the time they were published she had died.
           | 
           | Ventris extended Kober's work to the Pylos tablets. Her work
           | focused on systematically analyzing groups of characters.
           | When he looked at the results, he made his first critical
           | observation: some groups were unique to the Knossos tablets
           | and others were unique to the Pylos tablets. What if these
           | are place names?
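
The "unique to one site" observation is essentially a set difference.
A minimal Python sketch, assuming each tablet collection has already
been reduced to a list of sign groups; the labels G1..G5 are
hypothetical placeholders, not real tablet contents, and the function
name is made up for illustration:

```python
def site_unique_groups(groups_by_site):
    """Return, for each site, the sign groups that occur at no other site."""
    unique = {}
    for site, groups in groups_by_site.items():
        # Collect every sign group seen at any *other* site.
        others = set()
        for other_site, other_groups in groups_by_site.items():
            if other_site != site:
                others |= set(other_groups)
        unique[site] = set(groups) - others
    return unique


# Hypothetical toy data: G1..G5 stand in for sign groups.
toy_corpus = {
    "Knossos": ["G1", "G2", "G3", "G1"],
    "Pylos": ["G2", "G4", "G5", "G4"],
}

candidates = site_unique_groups(toy_corpus)
print(sorted(candidates["Knossos"]))
print(sorted(candidates["Pylos"]))
```

Groups that show up only at one site are merely candidates for local
place names; they could just as well be local personal names or
products, so this narrows the search rather than deciding anything.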
           | 
           | There aren't that many places to be had on Knossos and he
           | knew the Greek names. So he looked for possible combinations
           | and used them to guide the decoding. He used Kober's work and
           | the place names, along with help from at least Bennett, to
           | build a rough mapping from some signs to sound. And then he
           | made his second critical observation: what if Linear B is
           | Greek? Since the Greek names for places seemed to appear.
           | 
           | Then he could try to decode word after word. And along the
           | way he made his third critical observation: many Mycenaean
           | scribes were incredibly sloppy spellers. We can even tell now
           | that some were much better than others, but everything is
           | very messy because even the basic rules of spelling weren't
           | agreed on yet. Not only were characters missing, but a single
           | character could be one of 30+ different syllables at times.
           | Bare statistical methods alone often resulted in a mess
           | because of this.
           | 
           | Only small parts of the text could potentially be decoded at
           | this point. None of the classicists that Ventris normally
           | talked to were convinced.
           | 
           | That's when John Chadwick, a linguist, heard about Ventris
           | and tried his idea out. Chadwick was an expert in very old
           | Greek, 1000 years older than Plato. Chadwick was quickly
           | convinced by Ventris because while the decodings were very
           | poor for someone who knew classical Greek, they made a lot
           | more sense to him. They worked together for several years to
           | fix up the decoding.
           | 
           | An architect did contribute the main idea for the decoding,
           | but an architect that was a connected insider, with a
           | background in Greek and Latin, who had published in the area
           | before, knowledgeable in all of the latest methods, with
           | access to privileged information, in conversation with the
           | experts.
           | 
           | The way you put it, it sounds like some random architect
           | somewhere looked at Linear B, worked hard on their own, and
           | came up with the answer. That's not even remotely true.
        
             | xdennis wrote:
             | > The way you put it, it sounds like some random architect
             | somewhere looked at Linear B, worked hard on their own, and
             | came up with the answer. That's not even remotely true.
             | 
             | Your assumption here is that a normal discoverer works
             | entirely by himself, but the norm has always been that a
             | discovery is the work of multiple people.
             | 
             | When someone says that an architect decoded Linear B he
             | means to say that a non-professional decoded it.
             | 
             | You can't just say: "oh, it's not fair to say that because
             | he was actually really good at it".
        
             | goodbyesf wrote:
             | > The way you put it, it sounds like some random architect
             | somewhere looked at Linear B, worked hard on their own, and
             | came up with the answer. That's not even remotely true.
             | 
             | But that's true of everything. Newton didn't invent
             | calculus (and neither did Leibniz). He didn't even
             | understand the idea of a limit. It took contributions of
             | many people over many decades and even centuries to develop
             | the discipline of calculus. Not to mention his ideas came
             | from ancient Greeks, et al. The same applies to Einstein
             | and of course the most overrated and misrepresented Turing.
             | 
             | The idea of a lone genius or a singular great man who works
             | by himself to produce something great is a lie. Brady
             | didn't win 7 Super Bowls by himself, Jobs didn't create the
             | iPhone by himself and Musk really didn't create anything by
             | himself. It's just PR which creates heroes out of mere
             | mortals.
        
             | hillsboroughman wrote:
             | Very informative summary. A stupid question follows though,
             | so request your patience. Did Michael Ventris really ask
             | the question 'what if Linear B encoded some form of Greek'?
             | Didn't Alice Kober already ask and answer this question,
             | without seeming to do so. The fact that the underlying
             | language was an inflected one and that it seemed to have
             | singular, dual and plural forms for nouns etc - wasn't that
             | enough? Was it academic carefulness that prevented Kober
             | from proclaiming it was ancient Greek?
        
               | light_hue_1 wrote:
               | Kober was very determined that systematic analysis of the
               | text would eventually work. She rejected the idea that
               | you could just hypothesize what language it was. Because
               | so many people had tried and failed that way.
               | 
               | Maybe at some point she had this idea. But you really
               | must understand how bad of a fit classical Greek, and
               | even the early Greek dialects, really is. Like.. a few
               | words work out here and there. What convinced Chadwick
               | were the place names, some names of Gods, and one
               | particularly long 13 symbol patronymic. But for anything
               | more you had to start adding, removing, reinterpreting
               | characters and assuming that the original text got them
               | wrong.
               | 
               | Also Kober was missing most of the text since it hadn't
               | been published yet; in her small corpus this would have
               | been even worse.
               | 
               | Even after people saw the decoding the main sticking
               | point for years was that you need to make so many changes
               | for it to work out in Greek that you're just making up
               | the text. It took decades of work to make the decoding
               | work and many of the decodings Ventris put forward were
               | found to be wrong.
               | 
               | Eventually Kober maybe would have worked with Chadwick or
               | someone similar who knew a more archaic variant or maybe
               | Chadwick himself would have noticed it.
        
               | smallnamespace wrote:
               | Most Indo-European languages are inflected and have
               | singular, dual, and plural forms (if not in the modern
               | language, then in a more archaic form).
               | 
               | Even Latin retains some dual forms for certain words even
               | though it had otherwise lost it.
        
               | hillsboroughman wrote:
               | Homeric Greek had only sporadic use of the dual. It was
               | apparently a matter of metrical convenience. Classical
               | Greek had all but lost the use of dual. Dual was lost in
               | Latin. Whereas in Mycenaean Greek, the dual number was
               | mandatory for both verbs and nouns. Like in Sanskrit. It
               | is well known that Miss Kober went to considerable
               | personal expense (?) and effort to travel from New York
               | to New Haven to learn advanced Sanskrit. I feel there is
               | every reason to believe that Miss Kober already guessed
               | Linear B encoded a form of archaic Greek and her triplets
               | more or less spoke to this informed guess. Just my 2c
        
               | [deleted]
        
           | OfSanguineFire wrote:
           | An architect with significant training in the field, who did
           | his work in close collaboration with the professional scholar
           | John Chadwick. Plus that script had a relatively large corpus
           | and, moreover, it encoded an earlier form of a language we
           | already knew (and we already knew the sound values to expect
           | from earlier Greek, like labiovelar consonants, from
           | comparative Indo-European reconstruction). Not the case with
           | Linear A.
        
           | thaumasiotes wrote:
           | It is not clear why his decipherment is accepted as
           | meaningful. It has faced significant criticism:
           | https://sci-hub.se/https://www.jstor.org/stable/20162981
           | 
           | > The Ventris system thus set forth has been widely accepted
           | by Greek scholars, including many of the highest eminence, in
           | many countries. It has also been widely rejected by scholars
           | of eminence, in varying degrees.
           | 
           | > These Ventrisian rules enable bits of a curious sort of
           | Greek to be got out of Lin[ear] B texts; but experiments have
           | shown that bits of English or Latin or other tongues, when
           | spelt out in syllables according to the Ventrisian system,
           | are capable often of yielding bits of Greek just as plausible
           | as anything in the Ventris-Chadwick _Documents_ volume. One
           | eminent Oxonian, dining at a high table, amused himself by
           | taking the names of the Fellows of the College present and
           | turning them into Ventrisian syllables, from which he made a
           | new translation of them into Greek, in which they all turned
           | out to be Greek gods.
           | 
           | > gentle reader, pray perpend the syllable-groups (reference
           | number Dy 401), that run: _a-ma wi-ru-qe ka-no to-ro-ja qi-
           | pi-ri-mu a-po-ri._ Here we have two specimens of the labio-
           | velars, the syllables with _q-_ , discovered by Ventris, to
           | the astonishment of philologists who had not expected to find
           | them in Bronze Age Greek. _qe_ is, of course, equivalent to
           | Latin _-que_ , Greek _te_ , while _qi_ doubtless here shows
           | the development to a voiced dental noted by Ventris and
           | Chadwick in their "Mycenaean Vocabulary,"
           | 
           | > The Greek evaluation of the sentence would be, according to
           | Ventris's spelling rules, _halmai wiluite kainos Tholoiai
           | Diphilimus apolis:_ "With brine and slime in novel fashion at
           | Tholoia (the place of _tholoi_ , beehive tombs) Diphilimus
           | (is) cityless." No doubt this is a record of a Bronze Age
           | tidal wave.
           | 
           | > It is by coincidence that the acumen of Mr. Michael C.
           | Stokes, the Edinburgh authority on ancient philosophy, has
           | extracted the Virgilian hexameter, _Arma virumque cano Troiae
           | qui primus ab oris..._.
           | 
           | > Note that in this sentence one need assume only two of the
           | six words to be names of persons or places, whereas, in the
           | Lin B material as a whole, 75 per cent of the sign-groups
           | have to be, on Ventris's system, evaluated as names
        
             | OfSanguineFire wrote:
             | You cite a 1965 article. That is practically ancient, and
             | no, its criticism is not particularly significant. In the
             | decades since, Ventris's decipherment has overwhelmingly
             | been accepted by scholars. That is not to say that all of
             | Ventris's _readings_ are accepted - many are superseded.
             | But the fact that Linear B records Mycenaean Greek along
             | the general lines that he and Chadwick worked out, has long
             | been beyond doubt in the field.
             | 
             | Mycenaean sources and their consensus readings will be
             | discussed in any decent introduction to the history of the
             | Greek language. I can recommend, for example, the relevant
             | chapters in _A Companion to the Ancient Greek Language_ ed.
             | Bakker and in Colvin's _A Historical Greek Reader_ as
             | fairly accessible to a general audience.
        
               | thaumasiotes wrote:
               | > You cite a 1965 article. That is practically ancient
               | 
               | > the fact that Linear B records Mycenaean Greek along
               | the general lines that he and Chadwick worked out, has
               | long been beyond doubt in the field.
               | 
               | What are the major developments since 1965 that
               | strengthened the position of Ventris's decipherment?
        
               | theoldlove wrote:
               | Well, for one, a bunch of additional tablets discovered
               | at Thebes in the 90s, which broadly match and hence
               | confirm the decipherment.
               | https://en.m.wikipedia.org/wiki/Thebes_tablets
        
               | thaumasiotes wrote:
               | When the criticism is that your paradigm for translating
               | Linear B is so unprincipled that your translation will
               | say whatever you want it to say (compare _One eminent
               | Oxonian, dining at a high table, amused himself by taking
               | the names of the Fellows of the College present and
               | turning them into Ventrisian syllables, from which he
               | made a new translation of them into Greek, in which they
               | all turned out to be Greek gods_ -- the destination is
               | known before the journey begins), how can the
               | confirmation of older Linear B tablets _by newer Linear B
               | tablets_ address that criticism?
        
               | theoldlove wrote:
               | Just take 10 minutes and skim the book chapters. The
               | rules of the script are nowhere near as loose as you say.
               | For example, Linear B doesn't differentiate between
               | k/g/kh like alphabetic Greek does (κ, γ, χ) -- an
               | important distinction, sure, but its loss doesn't let you
               | turn anything into anything else.
               | 
               | So with the Theban tablets, if the decipherment were
               | false it should have yielded nonsense when applied to
               | unknown texts.
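
The k/g/kh point can be made concrete: a merged consonant series
multiplies the number of possible readings, but only by a small, fixed
factor per syllable. A rough Python sketch; the SERIES table is a
simplified assumption covering only the commonly cited mergers, it
ignores omitted consonants and other spelling conventions, and the
function names are invented for this example:

```python
from itertools import product

# Simplified assumption: one written consonant series can stand for
# several spoken consonants. This is not the full set of spelling rules.
SERIES = {
    "k": ["k", "g", "kh"],
    "p": ["p", "b", "ph"],
    "t": ["t", "th"],  # d has its own series
    "r": ["r", "l"],
}


def readings(syllable):
    """List the spoken syllables one written CV syllable could stand for."""
    consonant, vowel = syllable[:-1], syllable[-1]
    return [c + vowel for c in SERIES.get(consonant, [consonant])]


def candidate_readings(word):
    """Expand a hyphenated transliteration into every consonant-level reading."""
    options = [readings(s) for s in word.split("-")]
    return ["-".join(combo) for combo in product(*options)]


print(readings("ka"))
print(len(candidate_readings("pa-te")))
```

Under these assumptions a two-syllable spelling such as pa-te has a
handful of candidate readings, not an unbounded number, so a candidate
either matches a plausible Greek word or it does not.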
        
               | thaumasiotes wrote:
               | > So with the Theban tablets, if the decipherment were
               | false it should have yielded nonsense when applied to
               | unknown texts.
               | 
               | How is this claim compatible with the observation that,
               | when applied to a text written in Latin, the decipherment
               | fails to yield nonsense?
        
               | theoldlove wrote:
               | In your very article the decipherment does yield nonsense
               | when applied to Latin. Your article converts the first
               | line of Vergil to Linear B and then tries to understand
               | it as Greek, offering "With brine and slime in novel
               | fashion at Tholoia Diphilimus (is) cityless." But that's
               | nearly totally meaningless.
               | 
               | And even this sentence requires cheating -- most
               | prominently, Greek (both in Linear B and later) doesn't
               | use the -us ending like Latin does, so its use here in a
               | "Greek" sentence is very suspicious.
        
               | OfSanguineFire wrote:
               | Most of the Chadwick part of the Chadwick-Ventris
               | collaboration was published after 1965. And I just
               | pointed you to two popular references that, in turn, cite
               | a number of publications from recent decades. I suggest
               | you follow up on that.
        
               | thaumasiotes wrote:
               | Oh, I certainly will.
               | 
               | But I was kind of hoping for some indication that
               | developments of that kind actually occurred; it would be
               | the least surprising thing in the world to see a
               | selection effect in the study of Linear B inscriptions
               | whereby students who couldn't reconcile themselves with
               | the idea that decipherment will happily assign a meaning
               | to any text, even where the actual meaning of the text is
               | known to be different, left the field, while students who
               | didn't mind that stayed in. Over time a strong consensus
               | in favor of the position "no, I didn't waste the last 30
               | years of my life" is exactly what you'd expect to see.
               | 
               | There are no professions in which the professional
               | consensus is "actually, none of this works". But there
               | are many in which that is the _truth_.
        
         | heyitsguay wrote:
         | Is this site not just a handy visual catalog of known artifacts
         | and transcriptions? Is there some speculative decipherment
         | implied in the phoneticizations?
        
       | VectorLock wrote:
       | Probably getting a bit more popular notice after the mention in
       | the latest Indiana Jones movie (at least, they mentioned Linear B
       | a few times)
        
       | dghughes wrote:
       | I like writing systems and scripts especially obscure or ancient
       | ones. It never even dawned on me to think of my local region as I
       | did ancient Egypt, Greece, Italy etc.
       | 
       | I was talking to a friend who is Mi'Kmaq. Here in Canada we call
       | the people here First Nations; in the USA it's Native American. He
       | said that the Mi'Kmaq had an old writing system. I checked into
       | it and it predates any contact with Europeans and is one of the
       | very few writing systems by native peoples here. It's called
       | suckerfish writing or suckerfish script, the name inspired by the
       | tracks the fish makes in sand.
       | 
       | https://en.wikipedia.org/wiki/Mi%EA%9E%8Ckmaw_hieroglyphic_w...
        
         | AlotOfReading wrote:
         | The traditional definitions by linguists tend to exclude
         | anything that can't represent "all oral communication" as
         | proto- or partial writing systems, which are often pejoratively
         | labeled mnemonic systems. Systems that can represent the full
         | range of spoken expression are labeled "true" or "full"
         | writing.
         | 
         | This had the convenient side effect of neatly classifying all
         | the American writing systems as protowriting in the early 20th
         | century, as well as some more controversial examples like
         | Chinese. Some of those have since been walked back (e.g.
         | Mayan), but most remain in that limbo. We have a somewhat
         | better understanding today that there was a huge variety of
         | visual communication systems across the Americas prior to
         | European contact, but properly redefining the term "writing" to
         | include them is a slow, ongoing process.
        
       | retrac wrote:
       | For the unfamiliar, Linear A was an ancient script that is
       | associated with the Minoan civilization of the island of Crete,
       | around 1800 - 1500 BC. The later Linear B system encodes archaic
       | Greek, and is very similar to Linear A in glyph form. The Minoan
       | language written with Linear A is probably unrelated to any other
       | language.
       | 
       | Phonetic values are necessarily from Linear B or otherwise
       | guesses - it's very likely there was a great deal of overlap,
       | that the symbol representing, for example, the syllable "ni" in
       | Greek, represented a syllable that sounded a lot like "ni" in
       | Minoan. (Linear B is quite unsuited to writing Greek sounds, an
       | indicator that it was borrowed from a very different language.)
       | But since the language of Linear A remains undeciphered, that is
       | really just an educated guess at best.
        
         | djmips wrote:
         | https://greekreporter.com/2022/04/20/minoan-language-linear-...
        
       | ocschwar wrote:
       | The interface is difficult to deal with, but TIL that a Linear A
       | potsherd was found in a Philistine site.
        
         | hudsonhs wrote:
         | Hacker News is a Philistine site.
        
           | djmips wrote:
           | Too good to downvote...
        
       | cubefox wrote:
       | Related thought: Imagine we received a lot of text in an alien
       | language with a radio telescope, with no "Rosetta stone" to
       | decipher it. Say, 1 TB worth of text.
       | 
       | Now we add to that data another 1 TB of English text, and train
       | an LLM on the 2 TB of data. Then we ask the model (in English) to
       | translate some text from the alien language to English.
       | 
       | Would it work?
        
         | DemocracyFTW2 wrote:
         | No. You always need some kind of Rosetta stone or other
         | relationship to a known language plus some context and
         | 'plausible guesses' to understand an unknown language. Sure if
         | I gave you _III,IIII;VII -- II,II;IIII -- VI,II;VIII_ you would
         | be able to guess that these are elementary number signs in what
         | amounts to a rudimentary table of additions. That much would be
         | true whether the snippet is from a potshard of an ancient
         | civilization or received from outer space via a radio antenna.
         | But outside of context--and nothing would be more out of
         | context than an extraterrestrial culture--you cannot even tell
         | with certainty whether _I_ stands for 'one' or 'ten' or
         | 'twelve' or 'thousand', and here we've already reached the end
         | of what a text per se can tell you about its meaning if the
         | signs are not clearly pictorial (and even pictorial scripts
         | like early Chinese or Egyptian hieroglyphs are already
         | conventionalized to the degree that for quite a number of signs
         | in either script we are to this day not sure what they depict).
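         | 
         | A minimal sketch of that ambiguity, assuming the signs are
         | additive and that _V_ is worth five _I_ strokes (both are
         | assumptions read into the snippet, not something the text
         | itself states). The helper names are hypothetical. Every
         | candidate unit for _I_ leaves the table of additions equally
         | consistent, so the table alone cannot pick one:

```python
# The snippet from the comment: rows separated by " -- ",
# operands separated by ",", and the sum after ";".
SNIPPET = "III,IIII;VII -- II,II;IIII -- VI,II;VIII"


def parse_rows(snippet):
    """Split the snippet into (left, right, total) sign triples."""
    rows = []
    for row in snippet.split(" -- "):
        operands, total = row.split(";")
        left, right = operands.split(",")
        rows.append((left, right, total))
    return rows


def sign_value(sign, unit):
    """Value of a sign if one I stroke is worth `unit`.

    Assumption: the notation is purely additive and V equals five I.
    """
    return unit * (sign.count("I") + 5 * sign.count("V"))


def table_is_consistent(snippet, unit):
    """True if every row satisfies left + right == total for this unit."""
    return all(
        sign_value(a, unit) + sign_value(b, unit) == sign_value(c, unit)
        for a, b, c in parse_rows(snippet)
    )


# Candidate meanings of I named in the comment: one, ten, twelve, thousand.
CANDIDATE_UNITS = [1, 10, 12, 1000]
consistent_units = [u for u in CANDIDATE_UNITS if table_is_consistent(SNIPPET, u)]
print(consistent_units)
```

         | Because the check is linear in the unit, any scale passes;
         | only outside context could narrow the choice.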
         | 
         | Your idea cannot work unless the data that you feed the
         | language model comes with correlated items, and here it does
         | not. Imagine I feed
         | a predictor with a long list of images on the one hand and, on
         | the other hand, a long list of randomly ordered image
         | descriptions that may or may not match the images. Do you think
         | you could learn a foreign language that way? You absolutely
         | need the image of a donkey to be associated with the name for that
         | animal in the foreign language, and the algorithm is no
         | different.
        
           | cubefox wrote:
           | Those are good reasons, yet the language model discussed
           | above would presumably understand _Alienese_ just as well as
           | it would understand English. So if an LLM understands the
           | meaning of an expression X and of an expression Y, wouldn't
           | it be able to tell how similar those meanings are?
           | 
           | > here we've already reached the end of what a text per se
           | can tell you about its meaning if the signs are not clearly
           | pictorial
           | 
           | Note that language models today seem to be quite good at
           | understanding English, even though they are only trained on
           | symbolic text, not on any images.
        
             | DemocracyFTW2 wrote:
             | Your understanding of 'understanding a language' is
             | obviously different from mine when you write that "the
             | language model discussed above would presumably understand
             | Alienese just as well as it would understand English" and
             | "language models today seem to be quite good at
             | understanding English".
             | 
             | Language models don't understand any natural language,
             | they're very good at manipulating it (and us!) in terms of
             | continuing patterns across the scale from letter
             | (orthography) to phrases and paragraphs of seeming
             | utility and correctness. In _that_ regard, yes, the
             | aforementioned model will likely have no difficulty in
             | reproducing novel outputs that would appear likewise useful
             | and correct to Alienese speakers as is the case for
             | English. However this assumption, too, should come with the
             | disclaimer that unless someone produces a reliable test for
             | the utility and correctness of the _same_ LM for a variety
             | of natural and invented languages with divergent grammars
             | (such as including e.g. polysynthetic languages which have
             | a very different view of what constitutes a 'word')
             | _without_ having to tweak any of the many finicky
             | parameters of these models--we can't be sure the model
             | won't produce garbage when trained on the next 'exotic'
             | language. So who knows, in English you use very few infixes
             | and a lot of grammar takes place between fairly constant,
             | fairly short words; a model with a given set of parameters
             | that works well for such languages may not be very good at
             | languages that have words built from many specific prefixes,
             | infixes and suffixes that are as expressive as entire
             | phrases in English. Just like the current generation of
             | text-to-image generators are pretty good at a lot of things
             | but then screw up when asked to picture a cornfield.
        
               | cubefox wrote:
               | > Your understanding of 'understanding a language' is
               | obviously different from mine when you write that "the
               | language model discussed above would presumably
               | understand Alienese just as well as it would understand
               | English" and "language models today seem to be quite good
               | at understanding English".
               | 
               | > Language models don't understand any natural language,
               | they're very good at manipulating it (and us!) in terms
               | of continuing patterns across the scale from letter
               | (orthography) to phrases and paragraphs of seeming
               | utility and correctness.
               | 
               | Come on, chatting an hour with GPT-4 should remove all
               | doubt that it understands you quite well. Otherwise, what
               | would be understanding? Lest it turns out that _we_ are
               | stochastic parrots, too!
               | 
               | https://www.bing.com/images/create/cornfield/64b58e89d412
               | 420...
        
             | tiluha wrote:
             | The trained model would likely be able to understand both
             | Alienese and English equally well, but it never learned to
             | translate even one word or context. It might have an
             | internal representation for "eating food" in both
             | languages, but since no links exist between the
             | languages the embeddings will not be close.
             | 
             | You could try it on Earth if you train a model on two
             | separate languages, being careful that the training data
             | does not contain any mixed language. But even then, modern
             | human languages most likely have too much
             | cross-contamination. Would be an interesting experiment
             | nevertheless.
        
               | cubefox wrote:
               | That's the question, would the embeddings be close?
               | 
               | It's not clear that they wouldn't. Would an embedding of
               | the Alienese word for "and" be close to the embedding of
               | the English "and"? This does seem quite possible to me.
               | 
               | > You could try it on Earth if you train a model on
               | two separate languages, being careful that the training
               | data does not contain any mixed language. But even then,
               | modern human languages most likely have too much
               | cross-contamination. Would be an interesting experiment
               | nevertheless.
               | 
               | I agree. Though shouldn't we be able to answer this _a
               | priori_? It sounds like a mathematical question.
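               | 
               | One way to make the question concrete, as a toy sketch
               | only: the vectors below are invented for illustration,
               | and the "Alienese" space is assumed to be an exact
               | rotation of the "English" one, which real, separately
               | trained models are not guaranteed to be. Under that
               | assumption the raw cross-space similarity of matching
               | words is uninformative, yet the distances between
               | words inside each space agree:

```python
import math

# Hypothetical 2-D "English" embeddings, made up for this sketch.
english = {"and": (1.0, 0.0), "eat": (0.0, 1.0), "food": (0.6, 0.8)}


def rotate(v, angle):
    """Rotate a 2-D vector counter-clockwise by `angle` radians."""
    x, y = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


# Assumption: the second space has the same shape but an arbitrary
# orientation. Keys name the concept, not the (unknown) Alienese word.
ANGLE = math.radians(90)
alienese = {word: rotate(vec, ANGLE) for word, vec in english.items()}


def cosine(u, v):
    """Cosine similarity of two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


def pairwise_distances(space):
    """Euclidean distances between all word pairs, in a fixed word order."""
    words = sorted(space)
    return [
        math.dist(space[a], space[b])
        for i, a in enumerate(words)
        for b in words[i + 1:]
    ]


# Comparing matching concepts across the two spaces directly.
cross_space_similarity = {w: cosine(english[w], alienese[w]) for w in english}

# Comparing the internal geometry of each space instead.
same_geometry = all(
    math.isclose(d1, d2, abs_tol=1e-9)
    for d1, d2 in zip(pairwise_distances(english), pairwise_distances(alienese))
)
print(cross_space_similarity, same_geometry)
```

               | So "are the embeddings close?" depends on whether one
               | compares raw coordinates or relative structure. Whether
               | two unrelated languages would share enough structure is
               | the empirical part the toy cannot settle.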
        
           | WorldMaker wrote:
           | Also, the assumption that math is universal so sharing
           | vocabulary in math is helpful for bootstrapping language
           | understanding is a fascinating assumption to question. Even
           | if you can explain Pi and prove that you can mutually
           | understand trigonometry that might give you some small
           | portion of engineering insight, but it can't yield most of
           | the rest of engineering such as design or aesthetics (or
           | emotions) or any number of other things that make for useful
           | project communication.
           | 
           | It's something I've often thought about in the way that the
           | Voyager record was built and Sagan's Cosmos novel assumes it
           | and many others. Even recently, the novel Project Hail Mary
           | borrowed that assumption that math is enough shared language
           | to bootstrap understanding. I think the movie Arrival did
           | some of the best work of showing why that wouldn't
           | necessarily work, but also had the language in question
           | designed by a mathematician and still fell into some parts of
           | the assumption/trope. I'm not saying any of these examples
           | are bad for doing this, I certainly love them all. It's still
           | a small something worth criticizing.
           | 
           | It's certainly not a bad thing to want to communicate math,
           | and to hope that things like Pi are "constant enough" to
           | provide bootstraps to other communications, but it's also
           | such a fascinating thing how much science fiction thinking
           | (and real-world scientific thinking such as the Voyager
           | record) assumes that you can just sort of "yada yada yada" your
           | way from "so we established communications of basic
           | mathematical constants and concepts" directly as a straight
           | line of some sort to "now we can communicate all sorts of
           | other things".
        
       | fiddlerwoaroof wrote:
       | Looks like there's a parallel site for Linear B:
       | https://linearb.xyz/
        
       ___________________________________________________________________
       (page generated 2023-07-17 23:02 UTC)