[HN Gopher] Language models and linguistic theories beyond words
___________________________________________________________________
Language models and linguistic theories beyond words
Author : Anon84
Score : 7 points
Date : 2023-11-15 21:25 UTC (1 hour ago)
(HTM) web link (www.nature.com)
(TXT) w3m dump (www.nature.com)
| og_kalu wrote:
| Paraphrasing and summarizing parts of this article,
| https://hedgehogreview.com/issues/markets-and-the-good/artic...
|
| Some 72 years ago, in 1951, Claude Shannon published his paper
| "Prediction and Entropy of Printed English", which is still a
| fascinating read today.
|
| It begins with a game. Claude pulls a book down from the shelf (a
| detective novel by Raymond Chandler), concealing the title in the
| process. After selecting a passage at random, he challenges his
| wife, Mary, to guess its contents letter by letter. The space
| between words counts as a twenty-seventh symbol in the set. If
| Mary fails to guess a letter correctly, Claude promises to supply
| the right one so that the game can continue.
|
| In some cases, a corrected mistake allows her to fill in the
| remainder of the word; elsewhere a few letters unlock a phrase.
| All in all, she guesses 89 of 129 possible letters correctly--69
| percent accuracy.
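|
| To get a feel for what Mary is doing, here is a minimal sketch of
| the same game with a toy statistical model in the guesser's seat:
| a character bigram table over the 27-symbol set. The training
| text and test passage below are placeholders, not Shannon's
| actual setup.
|
|     import collections
|     import string
|
|     ALPHABET = string.ascii_lowercase + " "  # 26 letters + space
|
|     def train_bigram(corpus):
|         """Count, per symbol, which symbols tend to follow it."""
|         corpus = "".join(c for c in corpus.lower() if c in ALPHABET)
|         counts = collections.defaultdict(collections.Counter)
|         for a, b in zip(corpus, corpus[1:]):
|             counts[a][b] += 1
|         return counts
|
|     def play(passage, counts):
|         """Guess letter by letter; a miss reveals the letter."""
|         passage = "".join(c for c in passage.lower() if c in ALPHABET)
|         right, prev = 0, " "
|         for actual in passage:
|             nxt = counts.get(prev, collections.Counter())
|             guess = nxt.most_common(1)[0][0] if nxt else " "
|             right += (guess == actual)
|             prev = actual  # the true letter is revealed either way
|         return right, len(passage)
|
|     # Placeholder training text; any longish English text will do.
|     sample = ("the quick brown fox jumps over the lazy dog and "
|               "then the dog chases the fox back over the hill")
|     print(play("the fox and the dog", train_bigram(sample)))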
|
| Discovery 1: It illustrated, in the first place, that a
| proficient speaker of a language possesses an "enormous" but
| implicit knowledge of the statistics of that language. Shannon
| would have us see that we make similar calculations regularly in
| everyday life--such as when we "fill in missing or incorrect
| letters in proof-reading" or "complete an unfinished phrase in
| conversation." As we speak, read, and write, we are regularly
| engaged in prediction games.
|
| Discovery 2: Perhaps most striking of all, Claude argues that a
| complete text and the resulting "reduced text" consisting of
| letters and dashes "actually...contain the same information"
| under certain conditions. How? (Surely the complete text contains
| more information!) The answer depends on the peculiar notion of
| information that Shannon had hatched in his 1948 paper "A
| Mathematical Theory of Communication" (hereafter "MTC"), the
| founding charter of information theory.
|
| He argues that transfer of a message's components, rather than
| its "meaning", should be the focus for the engineer. You ought to
| be agnostic about a message's "meaning" (or "semantic aspects").
| The message could be nonsense, and the engineer's problem--to
| transfer its components faithfully--would be the same.
|
| On this view, a highly predictable message contains less
| information than an unpredictable one. More information is at
| stake in "villapleach, vollapluck" than in "Twinkle, twinkle".
|
| Does "Flinkle, fli- - - -" really contain less information than
| "Flinkle, flinkle"?
|
| Shannon then concludes that the complete text and the "reduced
| text" are equivalent in information content under certain
| conditions, because the predictable letters are redundant: they
| can be dropped and later filled back in without loss.
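|
| Concretely, Shannon's measure assigns an outcome -log2 of its
| probability in bits, so a next letter the reader is almost sure
| of carries close to nothing, while a toss-up among all 27 symbols
| carries the full load. The probabilities below are made up purely
| for illustration:
|
|     import math
|
|     SYMBOLS = 27  # 26 letters plus the space
|
|     def info_bits(p):
|         """Shannon information of an outcome with probability p."""
|         return -math.log2(p)
|
|     # Illustrative numbers, not measured English statistics:
|     print(info_bits(0.99))        # near-certain letter: ~0.01 bits
|     print(info_bits(1 / SYMBOLS)) # uniform over 27: ~4.75 bits
|
| That is why dashing out the letters Mary gets right costs so
| little: for anyone who shares her statistics, those positions
| carry almost no information.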
|
| Fueled by this, Claude then proposes an illuminating thought
| experiment: Imagine that Mary has a truly identical twin (call
| her "Martha"). If we supply Martha with the "reduced text," she
| should be able to recreate the entirety of Chandler's passage,
| since she possesses the same statistical knowledge of English as
| Mary. Martha would make Mary's guesses in reverse.
|
| Of course, Shannon admitted, there are no "mathematically
| identical twins" to be found, _but_, and here's the reveal, "we
| do have mathematically identical computing machines."
|
| Those machines could be given a model for making informed
| predictions about letters, words, maybe larger phrases and
| messages. In one fell swoop, Shannon had demonstrated that
| language use has a statistical side, that languages are, in turn,
| predictable, and that computers too can play the prediction game.
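|
| To make the twin trick concrete, here is a minimal sketch of the
| reduce/expand round trip. Any deterministic predictor works, so
| long as the "twins" on both ends share it; the toy predictor
| below is a stand-in, not Shannon's model.
|
|     def reduce(text, predict):
|         """Mary's side: keep only letters the predictor misses."""
|         out, seen = [], ""
|         for actual in text:
|             out.append("-" if predict(seen) == actual else actual)
|             seen += actual  # the true letter is known either way
|         return "".join(out)
|
|     def expand(reduced, predict):
|         """Martha's side: the same predictor refills the dashes."""
|         out = ""
|         for symbol in reduced:
|             out += predict(out) if symbol == "-" else symbol
|         return out
|
|     # Toy stand-in predictor: just repeat the previous character.
|     # ("-" is a safe marker because it is outside the 27 symbols.)
|     predict = lambda context: context[-1] if context else " "
|
|     message = "aaab bb"  # placeholder message of letters and space
|     assert expand(reduce(message, predict), predict) == message
|
| A model that predicts better than "repeat the last character"
| turns more letters into dashes, which is the compression story in
| miniature.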
| hyeonwho22 wrote:
| There was a fun recent variant on this game using LLMs: asking
| GPT-3 (3.5?) to encode text in a way that it can later decode
| back into the original meaning. Some of the encodings are insane:
|
| https://www.piratewires.com/p/compression-prompts-gpt-hidden...
| esafak wrote:
| If you want to go there, you could say that natural languages
| are error-correcting codes -- somewhat robust to corruption
| (typos). https://en.wikipedia.org/wiki/Error_correction_code
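|
| A toy illustration with the simplest such code, 3x repetition:
| the redundancy lets the receiver vote away a single corrupted
| symbol, much as a reader shrugs off a typo.
|
|     from collections import Counter
|
|     def encode(bits, r=3):
|         """Repetition code: send each bit r times."""
|         return [b for b in bits for _ in range(r)]
|
|     def decode(received, r=3):
|         """Majority vote per group of r undoes isolated flips."""
|         return [Counter(received[i:i + r]).most_common(1)[0][0]
|                 for i in range(0, len(received), r)]
|
|     sent = encode([1, 0, 1])
|     sent[4] ^= 1  # one symbol flipped in transit
|     assert decode(sent) == [1, 0, 1]  # redundancy absorbs the typo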
___________________________________________________________________
(page generated 2023-11-15 23:00 UTC)