[HN Gopher] Language models and linguistic theories beyond words
       ___________________________________________________________________
        
       Language models and linguistic theories beyond words
        
       Author : Anon84
       Score  : 7 points
        Date   : 2023-11-15 21:25 UTC (1 hour ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | og_kalu wrote:
       | Paraphrasing and summarizing parts of this article,
       | https://hedgehogreview.com/issues/markets-and-the-good/artic...
       | 
        | Some ~72 years ago, in 1951, Claude Shannon published his paper
        | "Prediction and Entropy of Printed English", still an extremely
        | fascinating read today.
       | 
       | It begins with a game. Claude pulls a book down from the shelf,
       | concealing the title in the process. After selecting a passage at
        | random, he challenges his wife, Mary, to guess its contents letter
       | by letter. The space between words will count as a twenty-seventh
       | symbol in the set. If Mary fails to guess a letter correctly,
       | Claude promises to supply the right one so that the game can
       | continue.
       | 
       | In some cases, a corrected mistake allows her to fill in the
       | remainder of the word; elsewhere a few letters unlock a phrase.
       | All in all, she guesses 89 of 129 possible letters correctly--69
       | percent accuracy.
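        | 
        | A minimal sketch of the game's mechanics in Python (the
        | preference order and passage below are my own stand-ins; a
        | context-free guesser like this is far weaker than Mary):
        | 
        |   # 27-symbol alphabet: a-z plus space. The guesser proposes
        |   # symbols in a fixed preference order until the holder
        |   # confirms the right one; first-try hits are the figure
        |   # Shannon reports as 89 of 129 (~69%) for Mary.
        |   PREFERENCE = " etaoinshrdlcumwfgypbvkjxqz"  # rough freq. order
        | 
        |   def play(passage):
        |       counts = [PREFERENCE.index(c) + 1 for c in passage.lower()]
        |       first_try = sum(1 for n in counts if n == 1)
        |       return first_try, sum(counts) / len(counts)
        | 
        |   passage = "the quick brown fox jumps over the lazy dog"
        |   hits, avg = play(passage)
        |   print(hits, "of", len(passage), "on the first try;",
        |         round(avg, 1), "guesses needed on average")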
       | 
       | Discovery 1: It illustrated, in the first place, that a
       | proficient speaker of a language possesses an "enormous" but
       | implicit knowledge of the statistics of that language. Shannon
       | would have us see that we make similar calculations regularly in
       | everyday life--such as when we "fill in missing or incorrect
       | letters in proof-reading" or "complete an unfinished phrase in
       | conversation." As we speak, read, and write, we are regularly
        | engaged in prediction games.
       | 
        | Discovery 2: Perhaps the most striking of all, Claude argues
        | that a complete text and the subsequent "reduced text" consisting
        | of letters and dashes "actually...contain the same information"
        | under certain conditions. How? (Surely the complete text contains
        | more information!) The answer depends on the peculiar notion
       | about information that Shannon had hatched in his 1948 paper "A
       | Mathematical Theory of Communication" (hereafter "MTC"), the
       | founding charter of information theory.
       | 
        | He argues that the transfer of a message's components, rather
        | than its "meaning", should be the engineer's focus. You ought to
       | be agnostic about a message's "meaning" (or "semantic aspects").
       | The message could be nonsense, and the engineer's problem--to
       | transfer its components faithfully--would be the same.
       | 
        | A highly predictable message contains less information than an
        | unpredictable one. More information is at stake in ("villapleach,
       | vollapluck") than in ("Twinkle, twinkle").
       | 
       | Does "Flinkle, fli- - - -" really contain less information than
       | "Flinkle, flinkle" ?
       | 
        | Shannon then concludes that the complete text and the "reduced
       | text" are equivalent in information content under certain
       | conditions because predictable letters become redundant in
       | information transfer.
       | 
       | Fueled by this, Claude then proposes an illuminating thought
       | experiment: Imagine that Mary has a truly identical twin (call
       | her "Martha"). If we supply Martha with the "reduced text," she
       | should be able to recreate the entirety of Chandler's passage,
       | since she possesses the same statistical knowledge of English as
       | Mary. Martha would make Mary's guesses in reverse.
       | 
        | Of course, Shannon admitted, there are no "mathematically
        | identical twins" to be found, _but_, and here's the reveal, "we
        | do have mathematically identical computing machines."
       | 
       | Those machines could be given a model for making informed
       | predictions about letters, words, maybe larger phrases and
       | messages. In one fell swoop, Shannon had demonstrated that
       | language use has a statistical side, that languages are, in turn,
       | predictable, and that computers too can play the prediction game.
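        | 
        | A minimal sketch of that twin setup, with a toy order-1
        | character model standing in for the twins' shared statistical
        | knowledge (the model and sample text are mine, not Shannon's;
        | "-" is reserved as the dash symbol):
        | 
        |   from collections import Counter, defaultdict
        | 
        |   def make_predictor(text):
        |       # Most likely next character after each character. Ties
        |       # break deterministically: the twins need only agree,
        |       # not be right.
        |       follows = defaultdict(Counter)
        |       for a, b in zip(text, text[1:]):
        |           follows[a][b] += 1
        |       return {a: c.most_common(1)[0][0]
        |               for a, c in follows.items()}
        | 
        |   def encode(text, guess):               # Mary's side
        |       reduced = [text[0]]                # first symbol is sent
        |       for prev, actual in zip(text, text[1:]):
        |           reduced.append("-" if guess.get(prev) == actual
        |                          else actual)
        |       return "".join(reduced)
        | 
        |   def decode(reduced, guess):            # Martha's side
        |       text = [reduced[0]]
        |       for symbol in reduced[1:]:
        |           text.append(guess.get(text[-1])
        |                       if symbol == "-" else symbol)
        |       return "".join(text)
        | 
        |   sample = "the theme of the thesis"
        |   twin = make_predictor(sample)  # both sides, same model
        |   reduced = encode(sample, twin) # "t------m--o---------s--"
        |   assert decode(reduced, twin) == sample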
        
         | hyeonwho22 wrote:
          | There was a fun recent variant on this game using LLMs:
          | asking GPT-3 (3.5?) to encode text in a way that it would
          | later be able to decode. Some of the encodings are insane:
         | 
         | https://www.piratewires.com/p/compression-prompts-gpt-hidden...
        
         | esafak wrote:
         | If you want to go there, you could say that natural languages
         | are error-correcting codes -- somewhat robust to corruption
         | (typos). https://en.wikipedia.org/wiki/Error_correction_code
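          | 
          | To make the analogy concrete, here's a toy 3x repetition
          | code -- nothing like how linguistic redundancy actually
          | works, but the same principle of surviving corruption:
          | 
          |   def encode(bits):
          |       return [b for b in bits for _ in range(3)]
          | 
          |   def decode(received):
          |       # majority vote over each triple
          |       return [int(sum(received[i:i + 3]) >= 2)
          |               for i in range(0, len(received), 3)]
          | 
          |   message = [1, 0, 1, 1]
          |   noisy = encode(message)
          |   noisy[4] ^= 1                    # one "typo" in transit
          |   assert decode(noisy) == message  # redundancy absorbs it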
        
       ___________________________________________________________________
       (page generated 2023-11-15 23:00 UTC)