[HN Gopher] Machine learning won't solve natural language understanding
___________________________________________________________________
Machine learning won't solve natural language understanding
Author : andreyk
Score : 36 points
Date   : 2021-08-09 21:01 UTC (1 hour ago)
(HTM) web link (thegradient.pub)
(TXT) w3m dump (thegradient.pub)
| visarga wrote:
| An armchair effort to move the goalposts and judge NLP, but the
| proof is in the pudding. For now, large language models are the
| best flavor. NLP models are already useful even at this early
| stage.
| ampdepolymerase wrote:
| I look forward to when something like Siri or Google Assistant
| is hooked up to GPT-3. The current voice assistant ML systems
| are useless for anything but the most basic of tasks.
| emodendroket wrote:
| They are certainly useful, but it seems quite plausible that
| they could continue being useful without ever "solving" the
| problem, in the general sense.
| criticaltinker wrote:
| > An armchair effort
|
| Given the author's credentials and publication history [1], it's
| a bit disingenuous to call this an 'armchair effort'.
|
| [1]
| https://scholar.google.com/citations?user=i5sEc1YAAAAJ&hl=en...
| ppod wrote:
| That's not a particularly impressive publication record.
| mnky9800n wrote:
| My h index is twice as high and that's a terrible h index
| still lol
| lalaithion wrote:
| > In other words, we must get, from a multitude of possible
| interpretations of the above question, the one and only one
| meaning that, according to our commonsense knowledge of the
| world, is the one thought behind the question some speaker
| intended to ask.
|
| But, can humans do this? I think not; I still disagree with the
| author about what "Do we have a retired BBC reporter that was
| based in an East European country during the Cold War?",
| translated into code, means.
|
| They write "Doing the correct quantifier scoping: we are looking
| not for 'a' (single) reporter who worked in 'some' East European
| country, but to any reporter that worked in any East European
| country"
|
| My interpretation of this requirement is that they want a list of
| all the reporters who meet the criteria. However, I would
| probably write this query to return a boolean, not a list of
| reporters.
|
| And even if my interpretation is wrong... well, my point is still
| correct, because I failed to extract the "the one and only one
| meaning" that the author intended from _that_ sentence.
|
| Even humans are only probably approximately correct.
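The list-vs-boolean ambiguity in the comment above can be made concrete. A minimal Python sketch, where the staff records, field names, and `matches` predicate are all invented for illustration: both readings share the same predicate but return different shapes.

```python
# Hypothetical staff records; names and fields are invented.
EAST_EUROPEAN = {"Poland", "Hungary", "Czechoslovakia", "Romania"}

reporters = [
    {"name": "A", "employer": "BBC", "retired": True,
     "cold_war_base": "Poland"},
    {"name": "B", "employer": "BBC", "retired": False,
     "cold_war_base": "Hungary"},
]

def matches(r):
    # "retired BBC reporter based in an East European country"
    return (r["employer"] == "BBC" and r["retired"]
            and r["cold_war_base"] in EAST_EUROPEAN)

# Reading 1: return the list of all matching reporters.
def as_list(rs):
    return [r["name"] for r in rs if matches(r)]

# Reading 2: return a boolean -- "do we have (at least) one?"
def as_bool(rs):
    return any(matches(r) for r in rs)
```

Both are defensible translations of the English question into code, which is the commenter's point: the "one and only one meaning" is not recoverable from the sentence alone.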
| emodendroket wrote:
| Well, that's true, but I don't think we can pretend that
| computers are anywhere near as good as humans at extracting the
| intended meaning from human utterances.
| burnte wrote:
| > Even humans are only probably approximately correct.
|
| This is very true, more true than we realize. Notice how much
| more "could you repeat that?" we hear with masks on. It's not
| JUST the mild muffling of the speaker's voice, it's also not
| seeing their lips move. We're all lip readers to a small degree,
| and seeing the lips helps inform our decoding. "Fff" and "th"
| sound similar but look very different.
|
| Even without that, think of how often in your life you've said
| "What was that? Oh, right..." and then replied. At first you
| missed part of what they said (for various reasons), but you
| were able to "interpolate" the missing part from the context,
| and most of the time you get it right.
|
| Our communication modes are lossy, and our brains make up for
| that to a large degree. That's the hole in natural language
| decoding, figuring out the hinting needed for an engine,
| because we're not totally aware of how we do it ourselves.
| emodendroket wrote:
| That's a different problem, isn't it? That's more about
| transcription -- getting the speech into words -- than about
| what he's talking about, making sense of the words once you
| have them.
| drdeca wrote:
| The two tasks are interconnected. The reasoning flows both
| ways.
| amirkdv wrote:
| > But, can humans do this? I think not;
|
| > Even humans are only probably approximately correct.
|
| Fair point. But how about this:
|
| It's true that what John would do with the sentence is
| technically an "approximation" of what Alice would do, because
| they have slightly different understandings of correct
| behavior. However, for humans to do what they do, they still
| build an absolutely correct model of meaning in their mind wrt
| their (subjective) notion of correctness.
|
| This may sound like an obtuse play on words, but the point is
| that to even attempt the right _kind_ of reasoning in NLU, you
| need a different framework than PAC. You can't, for example,
| _approximate_ whether "during the Cold War" qualifies "was
| based in" or qualifies "an East European country". You just
| have to decide. And once you decide, you have an absolutely
| correct interpretation, not an approximate one.
|
| EDIT: wording.
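The attachment choice described above can be sketched as two discrete structures (the representations are invented for illustration): either "during the Cold War" modifies the verb phrase "was based in", or it restricts the noun phrase to countries that counted as East European at that time. An interpreter must commit to one reading; there is no meaningful blend of the two.

```python
# Reading 1: "during the Cold War" attaches to the verb phrase --
# the reporter was based there during the war.
verb_attachment = {
    "predicate": "based_in",
    "object": "an East European country",
    "time": "the Cold War",
}

# Reading 2: it attaches to the noun phrase -- a country that
# counted as East European during the Cold War, regardless of
# when the reporter was based there.
noun_attachment = {
    "predicate": "based_in",
    "object": {
        "head": "country",
        "restriction": "East European during the Cold War",
    },
}

# The two readings are distinct discrete objects; choosing between
# them is a decision, not an approximation.
distinct = verb_attachment != noun_attachment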
| 2muchcoffeeman wrote:
| Maybe your computer science brain kicked in and took a CS
| interpretation of the question once they mentioned a query.
| E.g.:
|
| Q: Are you a man or a woman? A: Yes
|
| Despite your interpretations of the article's question, I bet
| you know which one is the more likely answer.
| bopbeepboop wrote:
| I'm confused, but think I concur with you.
|
| The two phrasings read as identical to me:
|
| - do we have a reporter who worked in some East European
| country?
|
| - is there any reporter (among ours) who worked in any East
| European country?
|
| - exists? (reporter in our-reporters) where (reporter.base in
| east-european-countries)
|
| I'm confused where the difference is supposed to be, between
| them.
|
| In every case, naming a person for whom that is true is a
| witness that "yes" is correct; and being unable to name one
| means "no" is correct.
| hamilyon2 wrote:
| The hunter-gatherer brain did not evolve to facilitate
| unambiguous thought transmission.
|
| Speech evolved to be very visual, spatial, quantitative and
| political. The ability to lie efficiently is an evolutionary
| trait. Sometimes we don't even need words for that. And
| sometimes we lie by pronouncing exclusively true words and
| sentences. Ambiguity of speech was always a feature.
|
| None of that makes the work of the NLU researcher easier, of
| course.
|
| Understanding a sentence is not like decoding or decompressing;
| it is more like trying to guess what the utterer is up to,
| politically, and whether he is a friend. Only then comes
| deciding where to steer according to what he says. And for that
| we sometimes have to start decoding the message, but only with
| the sender's goals firmly materialized in mind.
| Barrin92 wrote:
| I think that's very true and it's maybe even more clear when you
| consider mathematics.
|
| You can maybe imitate mathematics empirically, but you cannot
| effectively learn it that way. There is an infinite number of
| mathematical expressions or sequences that can be generated, so
| learning can never be done; you cannot compress your way to
| mathematical understanding. (This is obvious if you try to feed
| language models simple arithmetic: they can maybe do 5+5
| because it shows up somewhere in the data, but then they can't
| do 3792 + 29382, hence they do not understand anything about
| addition at all.)
|
| The correct path to mathematical understanding is
| decompression: understanding the fundamental axioms of
| mathematics and the internal relationships of mathematical
| objects (comparable to the semantic meaning behind language
| artifacts), and then expanding them.
| exporectomy wrote:
| I didn't get beyond his argument that ML is compression while
| NLU is decompression, but that doesn't seem to be right. ML is
| often used to decompress data, such as when increasing image
| resolution. Of course it needs more data than _just_ the
| compressed form, but only for training. For inference, of
| course, you can use ML to add assumed common-knowledge
| information.
| karpierz wrote:
| > For inference, of course you can use ML to add assumed common
| knowledge information.
|
| I think this is easier said than done, and hasn't yet been
| accomplished in any general sense.
| criticaltinker wrote:
| > I have discussed in this article three reasons that proves
| Machine Learning and Data-Driven approaches are not even relevant
| to NLU
|
| This is a pretty hard-line position to take, and given the
| author's credentials I'm inclined to believe this is somehow
| poorly worded and not reflective of the thesis he intended for
| this article.
|
| > Languages are the external artifacts that we use to encode the
| infinite number of thoughts that we might have.
|
| > in building larger and larger language models, Machine Learning
| and Data-Driven approaches are trying to chase infinity in futile
| attempt at trying to find something that is not even 'there' in
| the data
|
| > Ordinary spoken language, we must realize, is not just
| linguistic data
|
| I would be curious to know what the author thinks of multimodal
| representation learning - which is conceptually promising in that
| it opens the door for machine learning models to learn
| relationships that span text, images, video, etc. For example
| OpenAI's CLIP [1], and other models like it.
|
| [1] https://arxiv.org/abs/2103.00020
| fungiblecog wrote:
| "Who was based" surely?
| Eliezer wrote:
| Nobody tell him about GPT-3, I guess...? How do you write this in
| 2021 and not specifically confront the evidence of what modern ML
| systems can do and have already done?
| drdeca wrote:
| Going through this, it has the old "A full understanding of an
| utterance or a question requires understanding the one and only
| one thought that a speaker is trying to convey. " claim, which
| continues to not make any sense, because obviously people don't
| do that; as much as I would like to be understood in precisely
| the way I mean, down to the most subtle nuance/shade of meaning
| and connotation, at least much of the time, this is not something
| we can actually get across in an at all reasonable amount of
| time.
|
| Also, claiming that natural language is infinite would, if
| taken literally, imply a large claim about physics, contrary to
| the common consensus, contradicting the Bekenstein bound and
| all that.
|
| But one thing which seemed, at least initially, like a point that
| could have some merit, was the point about compression vs
| decompression.
|
| But the alleged syllogism about it pretends to be much more
| formal/rigorous than it actually is, and is also kind of
| nonsense?
| Or, like, it conflates "NLU is about decompression" with, "NLU
| \equiv not COMP" which I assume is meant to mean, -- well,
| actually, I'm not sure what it is supposed to mean. Initially I
| thought it was supposed to mean "NLU is nonequivalent to
| compression", but if so, it should be written as like, "NLU
| \not\equiv COMP" (where \not\equiv is the struckthrough version
| of the \equiv symbol)), but if it is supposed to mean "NLU is
| equivalent to the inverse or opposite of compression" (which I
| suppose fits the text description on the right better), then I
| don't think "not" is the appropriate way to express that.
| And, if by "not" the author really means "the inverse of", then,
| well, there's nothing wrong with something being equivalent to
| its own inverse! Nor, does something being equivalent to the
| inverse of something else imply that it is "incompatible" with
| it.
|
| For something talking about communicating ideas and these ideas
| being understood precisely by the recipient of the message, the
| author sure did not work to communicate precisely.
|
| The value in formalization comes not in its trappings, but in
| actually being careful and precise, etc., not merely pretending
| to be.
|
| The part on intensional equality vs extensional equality was
| interesting, but the claim that neural networks cannot represent
| intension is, afaict, not given any justification (other than
| just "because they are numeric").
| foldr wrote:
| > Also, claiming that natural language is infinite, if taken
| literally, would imply a large and contrary to the common
| consensus claim about physics, contradicting the Bekenstein
| bound and all that.
|
| No, it wouldn't. Natural language is infinite in the pretty
| straightforward sense that, say, chess is infinite (there is an
| infinite number of valid chess games - if you ignore arbitrary
| restrictions such as the 50 move rule). This of course doesn't
| mean that a chess computer has to be infinitely large or
| violate any known laws of physics.
|
| I'd also be curious to know how you would propose to represent
| intensions in neural networks.
___________________________________________________________________
(page generated 2021-08-09 23:00 UTC)