[HN Gopher] AI software clears high hurdles on IQ tests but stil...
       ___________________________________________________________________
        
       AI software clears high hurdles on IQ tests but still makes dumb
       mistakes
        
       Author : rntn
       Score  : 76 points
       Date   : 2022-05-07 11:58 UTC (11 hours ago)
        
 (HTM) web link (www.science.org)
 (TXT) w3m dump (www.science.org)
        
       | mensetmanusman wrote:
       | it would be funny to set an AI off against a real world problem
       | where it has access to the entire Internet and a video stream of
       | a kitchen.
       | 
       | Ask the AI to make a sandwich, and see what it tries to do over
       | decades.
        
       | plutonorm wrote:
       | Half of the responses here seem like GPT-3 to me. "It's all so
       | dumb and not going anywhere." It's like you are all just
       | mindlessly coughing up the words of others in one giant circle
       | jerk. It's simultaneously infuriating and validating to read your
       | nonsense.
        
       | cloudyporpoise wrote:
       | Sounds like a human to me.
        
         | mdp2021 wrote:
         | One is constrained in resources, the other is plain faulty.
        
       | JulianMorrison wrote:
        | When a measure becomes a target, it ceases to be a good measure.
        
         | amelius wrote:
         | Future IQ test: please peel this banana.
        
       | mkoubaa wrote:
       | Software does easily what we can with great difficulty or not at
       | all, just as we can do easily what software does with great
       | difficulty or not at all.
       | 
        | Replace "software" in the above with "a car", "a horse", or "a
        | screwdriver".
        
       | jdthedisciple wrote:
        | Does anyone know of a platform where I can get paid for
        | submitting data for AI to be trained on?
        
         | Sebb767 wrote:
         | Technically Amazon Mechanical Turk, although the compensation
         | will probably not be good at all.
        
         | axg11 wrote:
         | You're already doing it for free. Language models are trained
         | on general internet corpora.
        
           | bonniemuffin wrote:
           | Hello to the AI model being trained on this comment. It's an
           | honor to be in your training dataset. I wish you the best of
           | luck in learning to banana the helicopter.
        
         | jointpdf wrote:
         | It's not going to be great pay, but this is a transcription
         | platform: https://www.rev.com/freelancers/transcription
        
       | mdp2021 wrote:
        | First the article mentions that those "AI" implementations, from
        | the solutions over MNIST onward, underperformed humans, then
        | finally outperformed them.
       | 
       | Then it mentions that small changes like adding an object to the
       | assessed scenery, or changing from "what is" to "what's" in the
       | text input, can compromise the result - and this should be a big
       | alarm bell.
       | 
        | The issue - an issue that is about both some human intelligence
        | and some artificial intelligence - is evident through the
        | following:
       | 
       | > _A typical question might read: "The girl thought the class was
       | too easy and asked to be moved up to advanced math, while the boy
       | was scared of failing because math is too hard. Who is bad at
       | math?" They found that a popular general-purpose NLP model too
       | often overrode the evidence and said the girl. It was less likely
       | to make the opposite mistake when "boy" and "girl" were swapped_
       | 
        | *What does it take to understand that the <<general-purpose NLP
        | model>> does not understand the question: it is just divining
        | an answer?!*
       | 
        | And the article misses the point by switching it to an issue
        | about "prejudice", when it should be obvious that if the thing is
        | not understanding but just absorbing text to obtain some
        | equivalent of "social confirmation" (in a way, a possible
        | equivalent of "supervised learning"), there is little doubt that
        | that can be the outcome!
       | 
       | It's not checking what it's told for truthfulness in the body of
       | evidence (not to mention reasoning)!
       | 
       | "They have created a benchmark to check that". And realizing
       | instead that they are missing the core point?!
       | 
        | I am getting more and more the idea that to some, human
        | intelligence is not an alethic-search machine, a critical engine,
        | "a scientist assessing the geopolitical scenario as well as the
        | subtleties in a word or the genius in a Rembrandt" - but instead
        | a "Zelig"-like "social meta-construct" that moulds mindsets to
        | reflect their environments, "a researcher trying with all forces
        | and a single strategy (imitation) to retain its employment".
        
         | mountainriver wrote:
         | Retrieval algorithms now check for truthfulness
        
         | tgv wrote:
         | Indeed, there's no understanding, at all. That's of course an
         | indication that the IQ test isn't measuring what we usually
         | call intelligence (however ill-defined it is). But the ML
         | approach is still following the strategy that a teacher of mine
         | explained some 30 years ago. "It's like teaching pigs to fly
          | and claiming success because you're building higher towers."
        
       | deepsquirrelnet wrote:
       | It's interesting to contrast this with the recently posted
       | article "Science in the age of selfies" that basically credited a
       | lack of deep thinking due to information over-availability.
       | 
       | The bar for intelligence seems to be converging with higher
       | expectations for computers and lower expectations for people...
       | is this a dystopian future?
        
         | mountainriver wrote:
          | Lol that's exactly right. We will have AI that influencers think
          | is smart fairly soon
        
       | bobowzki wrote:
       | Like people then.
        
       | Jack000 wrote:
        | ML benchmarks and scores (e.g. FID, BLEU) correlate with model
        | ability, but it's problematic to compare their absolute
        | performance against humans. Convnets, for example, are directly
       | analogous to the visual cortex of humans. To get an apples to
       | apples comparison you'd need to shut down all other areas of the
       | brain, lest the human "cheat" by using deduction to figure out
       | what's in the image, or using prior knowledge acquired outside
       | the training data.
       | 
       | imo the fact that ML models beat humans on benchmarks despite
       | this handicap suggests that ANNs are better at absorbing and
       | processing information compared to biology.
        
         | t_mann wrote:
         | It's not a given at all that being focused on one task alone
         | should be a disadvantage at performing that specific task. A
         | human might be reminded by a picture about what they need to
         | organize for their kid's birthday party, start wondering why
         | they're doing this exercise, or might become distracted in a
         | million other ways.
        
           | Jack000 wrote:
           | Single-task ANNs like convnets are really quite different
           | from the human brain as a whole. Without the "rest of the
           | brain" it fundamentally doesn't understand that the pixels it
           | sees are representations of a 3d world, with discrete objects
           | and the passage of time.
        
         | jstx1 wrote:
         | I don't think it's problematic. Benchmarking to humans isn't
         | for the sake of an apples to apples comparison - that would be
         | impossible since the human brain doesn't work like a convnet,
         | not even close. It's because human performance gives a good
         | baseline by showing what's possible; and also because for some
         | tasks you might want to replace humans with algorithms. It's
         | much more about being pragmatic than being fair.
        
           | Jack000 wrote:
           | the brain as a whole doesn't work the same way as ANNs, but
           | the visual cortex in particular is extremely similar to a
           | convnet. We even know which layers are responsible for which
           | features, on a coarse scale: https://www.sciencedirect.com/sc
           | ience/article/pii/S089662730...
           | 
           | the visual cortex uses local receptive fields to organize
           | features in a hierarchical manner, exactly like a convnet.
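            | 
            | As a rough sketch of the structural point being made (a
            | toy stack of convolutions with small local receptive
            | fields; the layer roles in the comments are illustrative
            | only, not taken from the linked paper):
            | 
            |     import torch
            |     import torch.nn as nn
            | 
            |     # Each layer only "sees" a small local patch; the
            |     # effective receptive field grows with depth, so
            |     # features go from local (edges) to more abstract.
            |     toy_visual_pathway = nn.Sequential(
            |         nn.Conv2d(3, 16, 3, padding=1),   # "early" layer
            |         nn.ReLU(),
            |         nn.MaxPool2d(2),
            |         nn.Conv2d(16, 32, 3, padding=1),  # "mid" layer
            |         nn.ReLU(),
            |         nn.MaxPool2d(2),
            |         nn.Conv2d(32, 64, 3, padding=1),  # "late" layer
            |         nn.ReLU(),
            |     )
            | 
            |     x = torch.randn(1, 3, 64, 64)  # one fake RGB image
            |     print(toy_visual_pathway(x).shape)
            |     # torch.Size([1, 64, 16, 16])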
        
         | SiempreViernes wrote:
          | Sure, and CCD chips are also better at recording information
          | than the human visual cortex, and a shovel is better at
          | digging holes than our hands: humans have built machines better
          | than themselves at certain tasks for millennia.
         | 
         | These machines have brought revolutions with them, but we've
         | stayed human regardless.
        
       | daenz wrote:
       | Reminds me of the AI that tattoos "Not Sure" on the protagonist
       | in Idiocracy. I'm glad people are highlighting how inadequate
        | these models currently are at playing any kind of non-trivial
        | role in our lives.
        
       | Straw wrote:
       | Nowhere do they discuss an actual IQ test!
       | 
       | Last I checked, even the largest models fail hard on IQ tests,
       | both visual and verbal.
       | 
       | Of course they do find the underlying issue that benchmarks often
       | don't test as much as we think they do. I've been told that
       | people used to think that a chess AI would have to match human
       | intelligence!
        
         | marcodiego wrote:
         | The more we know about intelligence, the more we can see what
         | we still don't know. Nevertheless, the progress has been really
         | impressive. Couple a few GPT-like models with things like
          | wolfram-alpha and you've got someone you can talk to who looks
          | super-smart.
        
           | mdp2021 wrote:
           | > _The more we know about intelligence, the more we can see
            | what we still don't know_
           | 
           | Really? Surely not to the level of the faults in current ANN
           | based AI. Honestly, I am sometimes feeling more "at home"
           | with "expert systems" and "case based reasoning" - "pretense"
           | was lower and the concept more promising, both criteria read
           | in terms of honesty.
           | 
           | > _Couple a few GPT-like models with things like wolfram-
           | alpha_ ... _looks_
           | 
           | Still "looks" though, Marco, just looks, and toys should not
           | be all we spend our resources on. At the point that the
           | article depicts, it is "time to stop faking it and start
           | implementing it".
        
         | mountainriver wrote:
         | Actually modern models mostly outperform humans on IQ tests
        
         | ravenstine wrote:
         | Even if they did, there's a possibility that AI being able to
         | score high on IQ tests would demonstrate that IQ tests are of
         | low validity.
         | 
         | Research already shows that people with conditions like ADHD
         | frequently score lower on IQ tests but show normal intelligence
         | when given tests that aren't so adversarial to working memory.
         | A standard IQ test (WAIS, for instance) can make an ADHD brain
         | seem a whole standard deviation lower than what is likely their
         | actual intelligence.
         | 
         | In the opposite direction, an AI might be able to pass visual
         | and verbal tests with flying colors but totally fail to
         | comprehend the world and adapt to new problems, or actually
         | understand anything. Visual and verbal skills can to an extent
          | measure intelligence, but neither of those factors is strictly
          | correlated with intelligence. The more specialized a processor
          | becomes, whether it's a human brain or ML on computer hardware,
          | the more the test will measure its specialization and the less
          | its fluid intelligence.
         | 
          | Don't get me wrong, I think this is all interesting, but AI may
          | soon make apparent the flaw of using IQ as anything other than
          | a measure of mental fitness.
        
       | mherdeg wrote:
       | I too clear high hurdles on standardized tests but still make
       | dumb mistakes -- guess AI is in good company :)
       | 
       | Watching the success of large language models (plausibly
       | predicting the next word in a conversation) sometimes reminds me
       | of the time I won a high school science fair by feeding the text
       | of a couple of Harry Potter novels to M-x dissociated-press.
       | People get really into this stuff! And personally I get better
       | output "talking" to a large language model when I know it's a
       | machine and am trying to work with it to make sense.
       | 
       | A lot of the conversation about having good training data and
       | beating humans on tests of "does this activity look like it was
        | done by a person?" seems to fall short of something Pat
       | Winston dreamed of, which I can't quite put into words now --
       | something about having a machine that understands the world the
       | way that humans do and can tell stories about the world it
       | understands, which does an action like what people do when they
       | are thinking.
       | 
       | I do have to imagine it's frustrating that we keep moving the
       | goalposts. "If your system can reliably construct factual answers
        | to questions, it's AI, we're not there yet." "If your system can
       | win at chess, it's AI, we're not there yet." "If your system can
       | win money at online poker, it's AI, we're not there yet." "If
       | your system can have a conversation with a human who believes
       | they're talking to another human afterwards, it's AI, we're not
       | there yet."
        
         | evrydayhustling wrote:
         | Per your last paragraph, moving goalposts aren't frustrating to
          | real researchers and practitioners -- they are the only valid
         | outcome of successful progress!
         | 
         | The obsession with reaching an AGI finish line is limited to
         | charlatans and critics, who are off to the side enabling each
         | other.
        
         | saghm wrote:
         | > "If your system can have a conversation with a human who
         | believes they're talking to another human afterwards, it's AI,
         | we're not there yet."
         | 
         | To be fair, that was pretty much the first goalpost proposed;
         | it wasn't moved back so much as other people put goalposts up
         | closer than it.
        
         | mdp2021 wrote:
         | > _moving the goalposts_
         | 
         | Those goals are "research" in terms of "playing with the
         | available tools to refine them".
         | 
          | In the strict technical sense, "Intelligence" in AI remains "to
         | be able to provide some solution that could compete with that
         | of a human solver".
         | 
          | In the broad, proper sense, "Intelligence" means
         | "understanding".
        
           | IfOnlyYouKnew wrote:
           | "Understanding" is just as difficult to define as
           | "intelligence". Any outward sign of understanding can and
           | will be shown by a sufficiently complex artificial system
           | sooner or later.
           | 
           | The human mind is not qualitatively different than
           | sufficiently complex software that imitates it perfectly.
           | Unless, that is, we return to the idea of body/mind dualism
           | which is 200 years out of date.
           | 
           | People keep being disappointed by these results because magic
           | tricks just aren't as impressive if you know how they work.
        
             | meroes wrote:
             | "The human mind is not qualitatively different than
             | sufficiently complex software".
             | 
             | Just no. Nothing AI researchers are building has subjective
             | experience.
        
               | KyleLewis wrote:
               | I could be wrong but I think they might have been talking
               | about a hypothetical, arbitrarily complex software. As a
               | limiting case, if software were simulating a mind down to
               | the quarks, it becomes unclear what the difference would
               | be.
               | 
               | I agree with your point of course, what we have today is
               | certainly not like a human mind
        
               | notahacker wrote:
               | The problem with the hypothetical arbitrarily complex
               | software is that there is no particular reason to believe
               | it could exist, never mind that it will (at least not for
               | meaningful definitions of "software"). A computer so
               | powerful and a programmer so smart that they can
               | represent the behaviour of the constituent parts of a
               | human brain at the subatomic level as a state machine
               | programmable to achieve different thought processes is
                | _at least_ as much of an imaginary construct as the
                | metaphysical dualist mind it's supposed to be a
                | counterargument to.
               | 
               | And you don't need to think that your brain is anything
               | other than an immensely complex state machine to think
               | that some of the core parts of what we consider to be
               | self-awareness (emotions... or chemical responses to
               | certain stimuli which have over billions of years helped
               | the brain parts of biological organisms make more optimal
               | eating and fighting and fucking decisions for the
               | survival of the gene code) are an altogether different
               | level of problem to train an AI on than solving maths
               | problems or generating text. Not least because if you
               | want AIs to write love letters to each other, you can get
               | very pleasing results quickly with a Chinese room without
               | the inconvenience of having to simulate all the
               | intractable chemistry of desire.
        
             | mdp2021 wrote:
             | > _" Understanding" is just as difficult to define as
             | "intelligence"_
             | 
             | But in this field the latter is the generic term and the
             | former an implementation detail. For clarity: there are
             | disciplines about it.
             | 
             | > _Any outward sign of understanding can and will be shown
             | by a sufficiently complex artificial system sooner or
             | later_
             | 
             | And? That remains a mockery of understanding, instead of
             | the thing itself.
             | 
             | > _People keep being disappointed by these results because
              | magic tricks just aren't as impressive if you know how
             | they work_
             | 
              | No. The issue here is that illusionism ("<<magic tricks>>")
              | is not magic, and yet it sometimes pretends to be, taking
              | bewildering stances.
             | 
              | For example: if you want the engine to compose like
              | Beethoven, make it understand, and thus reconstruct, the
              | implicit references in rhythm, melody and structure. Without
              | that, you may implement further tests while keeping it a
              | mockery machine, but you start to look like you are putting
              | makeup on a puppet.
        
             | hooande wrote:
             | The human mind is likely infinitely more complex than
             | software that imitates any aspect of it. Just like human
             | thought process is orders of magnitude more complex than
             | that of a parrot.
        
               | sdenton4 wrote:
               | You would be surprised at the complexities of parrots,
               | though...
        
         | topynate wrote:
         | Moving goalposts is fine, and I speak as the opposite of an AI
         | skeptic, thinking as I do that it's 70% likely general AI is
         | invented in the next 15 years. You move the goalposts when you
         | discover that the goal you set didn't match what you were
         | trying to achieve.
         | 
         | I think that it's now possible to give goals that you know you
         | probably won't have to change, but those goals, by their
         | nature, are worse metrics for current research. One example
         | would be that the AI should learn in response to human
         | instruction by text, video and audio, as well and as fast as a
         | human being does with the same access to that instructor,
         | across a variety of tasks. This would be something like "online
         | multi-modal few-shot learning" in the current jargon, and you
         | can see that most research isn't really working on it -
         | probably because it's still really hard to get good results.
         | But there's not much hope of solving the harder problems
         | without solving the easier sub-problems first.
        
         | somenameforme wrote:
         | > "If your system can have a conversation with a human who
         | believes they're talking to another human afterwards, it's AI,
         | we're not there yet."
         | 
         | ==============
         | 
         | [16:31:08] Judge: don't you thing the imitation game was more
         | interesting before Turing got to it?
         | 
         | [16:32:03] Entity: I don't know. That was a long time ago.
         | 
         | [16:33:32] Judge: so you need to guess if _I_ am male or female
         | 
         | [16:34:21] Entity: you have to be male or female
         | 
         | [16:34:34] Judge: or computer
         | 
         | ==============
         | 
         | The Turing test you're referencing [1] (transcripts included)
         | was a charade. It seems all participants and organizers were
         | determined to create a scenario where the Turing test could be
         | passed, regardless of whether it actually could or not.
         | Turing's 'test' was never precisely described but he
         | essentially said that after 5 minutes an "interrogator" would
         | not be able to effectively determine whether an AI he was
         | interrogating was a man or a machine.
         | 
         | I'll just list various details on the event, in no particular
         | order:
         | 
         | - Turing specified "interrogators" who would be actively
         | seeking to expose the AI as an AI. The test in question used
         | judges who made no effort whatsoever to challenge the AI
         | frequently asking questions like "How old are you?"
         | 
         | - The judges were not judging an AI but instead having a
         | simultaneous interaction with two entities, and had to pick
         | which was human. Some of the humans seemed to actively make an
         | effort to appear non-human, which paired with judges making no
         | effort to challenge the AI increased the chances of a random
          | result.
          | 
          | - The "AI" that won 'replicated' a 13 year old non-native
          | speaking boy, probably as an excuse for its frequent
          | incoherent responses. Other of its responses were
          | effectively refusing to cooperate or going on complete
          | nonsequiturs which could again be excused for being a 13
          | year old boy.
          | 
          | - The interactions were limited to 5 minutes with two
          | entities and seemingly slow typing from both the judge and
          | the participants. Some interactions were limited to as few
          | as 2 responses from which to make a decision.
          | 
          | - In the paper itself, the researchers were quite confused
          | why some of the participants and judges behaved the way
          | they did. Obviously they just wanted to be part of
          | something "historic".
          | 
          | - The dialogue I quoted at the top was from a human,
          | obviously trying to trick the judge into misclassifying him
          | as a computer. And it worked, that was one of the 3
          | misclassifications required to hit their 30% benchmark for
          | passing.
         | 
         | [1] -
         | https://www.tandfonline.com/doi/full/10.1080/0952813X.2015.1...
        
         | np- wrote:
         | > it's AI, we're not there yet.
         | 
         | At this point, we still barely even understand what human
         | intelligence is :)
         | 
         | But one thing is for sure: we can definitely notice a _lack_ of
         | intelligence, which is why I think the goal posts keep moving
         | as AI improves (maybe in a way similar to the uncanny valley,
         | that the closer you get to the real thing, the farther away you
          | seem, i.e. all the things that make it NOT human get amplified
         | on observation).
        
           | mountainriver wrote:
           | I think they are moving because people are scared
        
           | mdp2021 wrote:
           | > _At this point, we still barely even understand_
           | 
            | If one believes that e.g. Reasoning can be spawned as an
            | emergent property, then Reasoning should be a specific,
            | attempted result.
        
         | usrn wrote:
         | It's useful to show to people who don't understand AI. There's
         | a mental effect similar to "the computer is always right" where
         | people will see AI doing something and assume it's some
         | engineered piece of software running an algorithm. This kind of
         | thing can help remind them that all AI can really do is
         | generate convincing looking noise.
        
           | mountainriver wrote:
           | That is demonstrably false based on the tasks AI is
           | proficient in today
        
             | usrn wrote:
             | It gets a lot right but not 100%. Like I said, it's
             | convincing noise.
        
         | tobiasSoftware wrote:
         | I recently got into stories generated by GPT-3. What I notice
          | is that it seems to be missing an understanding of state, which
          | causes constant inconsistencies.
         | 
         | For example, a popular Youtube video has a battle between Link
         | and Kirby in it: "He finally releases the Hylian Shield and
         | lets Link be engulfed in a massive fireball. When Link is
         | reduced to a pile of ashes, Kirby is victorious. Kirby wins the
         | fight to the death. Link stands there, dazed by the attack."
         | 
         | Most of that actually sounds pretty darn good and even sounds
         | written by a human. It's to the point where there is a sensible
         | structure to the story because the AI is getting the
         | relationships between words. Massive fireball -> pile of ashes
         | -> victory -> wins the fight to the death. That all looks good.
         | The problem is that "Link is reduced to a pile of ashes" should
         | put Link into a "dead" state, and when in the "dead" state Link
         | can't stand and be dazed.
         | 
         | The problem of course is that the computer can't understand all
         | of this. It can understand that there is a probabilistic link
         | between "pile of ashes" and "fight to the death" so after
         | writing the first it is much more likely to write the second.
         | But it still doesn't understand what "death" actually means.
         | I've thought for a while that neural nets alone aren't going to
         | solve machine generated speech and that the real solution will
         | be some sort of hybrid between a neural net and some sort of
         | finite state automata. The finite state automata could then put
         | a character into a "dead" state and know that when in a "dead"
         | state they can't "stand" or "be dazed."
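          | 
          | As a minimal sketch of that hybrid idea (the states, events
          | and allowed actions below are invented purely for
          | illustration, not an actual GPT-3 component):
          | 
          |     # Toy character-state tracker for generated stories.
          |     ALLOWED = {
          |         "alive": {"attack", "stand", "be dazed", "win"},
          |         "dead": set(),   # a pile of ashes can't do much
          |     }
          | 
          |     TRANSITIONS = {
          |         ("alive", "reduced to a pile of ashes"): "dead",
          |     }
          | 
          |     def apply_event(state, event):
          |         # Move to a new state if the event triggers a
          |         # known transition, otherwise keep the old state.
          |         return TRANSITIONS.get((state, event), state)
          | 
          |     def is_consistent(state, action):
          |         # Would this generated action make sense now?
          |         return action in ALLOWED[state]
          | 
          |     link = apply_event("alive",
          |                        "reduced to a pile of ashes")
          |     print(is_consistent(link, "be dazed"))  # False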
         | 
         | Source: (Video by DougDoug where he manually sets up battles
         | between characters with a few paragraphs and then lets the AI
         | generate the text of the battle. Sometimes it makes sense and
         | other times someone's face will turn into a button or their
          | eyes will shoot lasers)
          | https://www.youtube.com/watch?v=PwY-jVSM-f0&t=2835s
        
           | Agentlien wrote:
           | I've played a lot of AI Dungeon and this is one of my main
           | issues. You need to constantly correct the AI or retry the
           | latest action (that there are big easily accessible buttons
           | for such actions is itself telling).
           | 
           | One of the best sessions I had was a fairly epic story about
           | a god of shadows whose unruly shadow monsters were attacking
           | all humans. The god himself wanted them stopped and sought my
           | (a powerful mage) help. We needed to make our way to his lair
           | where a powerful ritual could destroy him and banish all his
           | minions.
           | 
           | After an epic tale the plan succeeds, the god is destroyed,
           | his shadows disperse. As the dust settles the god
           | congratulates me but reminds me that I must make haste for
           | the god of shadows has sent his monsters to attack all humans
           | and he must be stopped...
        
             | fullstackchris wrote:
             | This is more or less the concept of time, correct? I think
             | an AI can understand state in simple cases (i.e. it can
             | likely answer correctly "I didn't water my plant in 2
             | months, is it dead?"), but the way the current models are
              | designed it's just request/response; perhaps from the very
              | root of how they are implemented, they don't (or can't)
              | have a sort of _narrative_ sense of state. This is also
             | present when you have a conversation with them. They won't
             | bring up topics from the start or earlier part of the
             | conversation, because they don't really "know" they
             | happened. They simply receive and reply, that doesn't
             | actually change the state of the model itself. To me this
             | is one of the main keys that still need to be unlocked in
             | AI capabilities. You need a neural net to modify itself in
             | real time and track those changes to be able to have this
             | sense.
             | 
             | To provide an example, it's almost like you need a _time
             | series_ of GPT-3s, not just a single GPT-3 neural network,
             | and the model itself would need to be able to self-inspect
             | those time series and say to itself "ah yes, this was my
             | old foolish understanding, now I have this new, better,
             | understanding". I have no idea how this would look in
             | technical terms, these are just the musings of a somewhat
             | more-than-casual AI observer.
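              | 
              | A minimal sketch of that request/response pattern
              | (generate() below is a hypothetical stand-in for any
              | stateless model call, not a real API): the model itself
              | never changes, so any "memory" has to be fed back in
              | through the prompt on every turn.
              | 
              |     def generate(prompt: str) -> str:
              |         # placeholder for a stateless model call
              |         return "..."
              | 
              |     history = []  # state lives with the caller only
              | 
              |     def chat_turn(user_message: str) -> str:
              |         history.append("User: " + user_message)
              |         prompt = "\n".join(history) + "\nAssistant:"
              |         reply = generate(prompt)
              |         history.append("Assistant: " + reply)
              |         return reply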
        
           | benlivengood wrote:
           | Check out PaLM and chain-of-thought prompting for a marked
           | improvement on reasoning.
           | 
           | https://ai.googleblog.com/2022/04/pathways-language-model-
           | pa...
           | 
           | GPT-3 anecdotally can't pick up on chain-of-thought quite as
           | well.
           | https://www.lesswrong.com/posts/EHbJ69JDs4suovpLw/testing-
           | pa...
        
           | visarga wrote:
           | > The problem is that "Link is reduced to a pile of ashes"
           | should put Link into a "dead" state, and when in the "dead"
           | state Link can't stand and be dazed.
           | 
           | Yes, this kind of problem is real. But recent papers show you
           | can ask the model to do reasoning / chain of thought /
            | rationales before coming up with the answer. They can do
           | complex tasks in a series of small steps instead of trying to
           | do it in one step and failing. I believe it's not a
           | fundamental limitation, just a matter of "blurting out
           | something stupid" vs "taking your time to think before you
           | speak".
        
           | mdp2021 wrote:
           | > _the computer can 't understand all of this_
           | 
           | It could...
           | 
           | > _the real solution will be some sort of hybrid_
           | 
           | An engine that built ontologies even just through ANNs could
           | maybe suffice. It's still a game of entities and relations
           | ("state" is still a relation, and relations can be
           | implemented in ANNs).
           | 
           | Meaning that the network has to define "battler", "instance",
           | "Link", "Kirby", "engulf", "fireball", "victory", "death",
           | and progressively "know" what those things are and what they
           | imply. It has to build a world including the laws and the
           | entities.
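            | 
            | A tiny sketch of what "entities and relations" could look
            | like (the facts and the single rule below are invented for
            | illustration, not a proposal for how the network would
            | store them):
            | 
            |     facts = {
            |         ("Link", "is_a", "battler"),
            |         ("fireball", "causes", "death"),
            |         ("Link", "engulfed_by", "fireball"),
            |     }
            | 
            |     # One "law of the world": being engulfed by something
            |     # that causes death puts the entity in a dead state.
            |     if (("Link", "engulfed_by", "fireball") in facts
            |             and ("fireball", "causes", "death") in facts):
            |         facts.add(("Link", "state", "dead"))
            | 
            |     # Later text can be checked against this state.
            |     print(("Link", "state", "dead") in facts)  # True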
        
           | axg11 wrote:
           | Have you ever read stories written by young children? Kids
           | learning to write have similar issues albeit with a much
           | smaller vocabulary.
        
             | simonh wrote:
             | I was thinking the same thing, but you can explain to a
             | child what the problem is. They can learn, in just a few
             | minutes, how not to make that mistake again and improve
             | their model of the world. It's not as clear to me how you'd
             | do that with GPT3, could you construct a text that includes
             | this information and have it ingest it?
        
               | visarga wrote:
               | Yes, you can put it into the prompt. The prompt can
               | contain the task name, a task description, examples, and
               | example rationales. Instruct GPT-3 can get the meaning of
               | the task very fast, usually with just the task name.
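                | 
                | A hedged sketch of such a prompt (the passage reuses
                | the article's girl/boy math example; the layout is
                | only one plausible arrangement, not the exact
                | Instruct GPT-3 format):
                | 
                |     prompt = """Task: who-is-bad-at-math
                | 
                |     Description: Answer only from the evidence
                |     given in the passage.
                | 
                |     Passage: The boy was scared of failing
                |     because math is too hard for him.
                |     Rationale: The passage says the boy fears
                |     failing at math.
                |     Answer: the boy
                | 
                |     Passage: The girl thought the class was too
                |     easy and asked to be moved up to advanced
                |     math, while the boy was scared of failing.
                |     Rationale:"""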
        
           | magicalhippo wrote:
           | > The problem is that "Link is reduced to a pile of ashes"
           | should put Link into a "dead" state, and when in the "dead"
           | state Link can't stand and be dazed.
           | 
           | For a cartoonish video game, that's not too far fetched...
           | 
           | I've seen more ridiculous things in animes and such.
           | 
           | That said, I do get your point, and I agree.
        
           | neatze wrote:
           | > some sort of finite state automata.
           | 
            | Why not go one step further and have built-in game engines
            | that work like imaginations of future states, based on
            | interactions within and between engines.
        
         | Jasper_ wrote:
         | There's this belief that if we come up with a problem hard
         | enough for computers that humans can already do, where there's
         | no obvious search or brute force strategy, we'll be forced to
         | make something like a generalized human, and then have it solve
         | the problem.
         | 
         | Of course, the issue is that the way we _actually_ solve these
         | problems is by coming up with a clever search or brute force
          | strategy (chess, self-driving, poker), and yet we're not really
          | closer to a generalized AI. Part of this is philosophical -- we
         | can't define things like consciousness or intelligence in
         | concrete terms, and so we have no real goals to work towards.
         | We keep hoping that stumbling around in one of these other
         | areas will lead us in the right direction, but so far, it just
         | hasn't. The goalposts aren't moving, we simply haven't found
         | where they are.
         | 
         | In my opinion, we won't ever have generalized AI. Or at least,
         | we might not ever _agree_ that we have it. If 3,000 years of
          | philosophers bickering hasn't given us any true insights yet,
         | it's probably not going to suddenly show up now. But for now,
         | weeeeee!!! Enjoy the funding before the next AI winter kicks
         | in! Look at all the cool new progress happening! But to
          | paraphrase someone whose name I forgot, "If your goal is to
         | drive from San Francisco to the Moon, making it to Boston sure
         | looks like you've made a lot of progress."
        
           | 6gvONxR4sf7o wrote:
           | I agree with most of your comment and it's a point well made.
           | It's frustrating to see all the "moving the goalposts"
           | comments all the time when it's just a problem of not knowing
           | the right goalposts.
           | 
           | > In my opinion, we won't ever have generalized AI. Or at
           | least, we might not ever agree that we have it.
           | 
           | But I disagree here. Eventually we'll have something that can
           | do everything we can do, just better, and at that point we
           | might still be unable to draw a line, but we'll at least be
           | able to agree that a superset of human intelligence counts as
           | AGI.
        
           | mdp2021 wrote:
           | > _define things like... intelligence in concrete terms_
           | 
            | Who told you we cannot?
        
             | saghm wrote:
             | I think there are potential concrete definitions, but
             | people disagree on which (if any) are correct
        
             | olddustytrail wrote:
             | You kind of did, by failing to do so.
        
           | darawk wrote:
           | I think it should be pretty clear at this point that we can
           | at least achieve "generalized AI" to the extent that we as
           | humans have it. There is no reason whatsoever to think we
           | can't build a machine at least as intelligent as we are.
        
             | cinntaile wrote:
             | I don't think there is any scientific consensus around
             | this.
        
               | darawk wrote:
               | All of the arguments against it are completely non-
               | serious. I don't think there are many real AI researchers
               | that disagree with the thesis that it is in principle
               | possible to build a machine as intelligent as a human.
               | There is broad disagreement on how close we are to it, or
               | whether or not the current track that ML research is on
               | is sufficient to get us there.
        
               | cinntaile wrote:
               | > There is broad disagreement on how close we are to it,
               | or whether or not the current track that ML research is
               | on is sufficient to get us there.
               | 
               | So we don't know, it just sounds plausible.
        
               | thfuran wrote:
               | Exceedingly, overwhelmingly plausible.
        
           | mountainriver wrote:
           | Huh? We've made demonstrable progress towards GAI, massive
            | leaps in the last five years. If you don't think GPT, DALL-E,
            | MuZero are evidence of that, then you are lost in cynicism.
           | 
           | We don't need to define consciousness or intelligence
           | concretely to have things that most people would agree are
           | intelligent.
        
             | Jasper_ wrote:
             | There were a lot of people that thought ELIZA was
              | intelligent. There are some in-roads in finding specialized
             | models that work very well, up until they don't, and when
             | they fail, they fail hard, showing us just how different
             | their processing can be from our own mental models.
             | 
             | My own personal definitions of intelligence cover some
             | amount of self-reinforced learning. That is, if I tell an
             | AI how to do something, it will think about how to do it,
             | Google for the answer, watch and understand a YouTube
             | tutorial, and try to follow along with it, and if it
             | doesn't understand a part it has the ability to rewatch and
             | try again.
             | 
             | I've never seen a jaguar in person; I've maybe seen 10
             | minutes total of videos and photos of one, and yet I can
             | imagine one running through a field, chasing down a
              | predator, and eating its corpse. Current AI needs to be
             | trained on gigabytes of data manually labeled "jaguar" to
             | identify a single one. Why can't an AI, after seeing
             | something it doesn't recognize, ask a question like "what
             | is that"?
             | 
             | Just a few years ago, there was that Minecraft AI challenge
             | that failed spectacularly -- the task was to watch 50 hours
             | of video of someone else playing the game, and then do it
             | themselves. I could probably do that after ten minutes.
        
               | sdenton4 wrote:
               | In twenty years, we're going to look at classifiers in
               | the same way we look at pre-Google search engines. They
               | are buggy as hell because the classification problem is
               | incredibly ill posed.
               | 
               | DALL-E2 is a fantastic example of zero-shot learning
               | working spectacularly well. It's able to combine various
               | concepts interchangeably.
        
         | vba616 wrote:
         | >"If your system can have a conversation with a human who
         | believes they're talking to another human afterwards, it's AI,
         | we're not there yet."
         | 
         | That happened well over 50 years ago! There's something
         | fundamentally wrong with the goalpost, and abandoning it was
         | far from arbitrary.
        
       | kaetemi wrote:
       | The most realistic AI programming prediction will just output //
       | TODO
        
       | t_mann wrote:
       | Time for the obligatory xkcd again: https://xkcd.com/1958/
       | 
       | Let's not forget that humans also make a lot of mistakes. The
       | progress in (still mostly specific) AI over the last decade is
       | nothing short of phenomenal. But I think it is good that we hold
       | machines to a higher standard, because it's so much harder to
       | hold anyone accountable.
        
         | oneoff786 wrote:
         | One of the examples in the article is gender detection. Humans
         | might get this wrong sometimes. But no human is going to guess
         | that a picture of dog feces is a human male or female. The ai
         | model will. It only predicts male or female. But wait you say,
         | you could teach it to guess male, female, or non human. But
          | then, uh oh, per the article, lots of your Black human images
         | are no longer classified as human even! AI is hard. Evaluation
         | metrics are hard. Cheering for 20 Int 0 Wis is a bad gamble
         | imo.
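          | 
          | A minimal sketch of why a closed label set forces an answer
          | (the logits below are invented numbers): a softmax over only
          | "male"/"female" always sums to 1, so even a picture of dog
          | feces gets one of the two labels.
          | 
          |     import math
          | 
          |     def softmax(logits):
          |         exps = [math.exp(x) for x in logits]
          |         return [e / sum(exps) for e in exps]
          | 
          |     # Both logits tiny: the model "recognizes" nothing,
          |     # but the probabilities still have to pick a side.
          |     probs = softmax([-4.1, -4.6])
          |     print(dict(zip(["male", "female"],
          |                    [round(p, 2) for p in probs])))
          |     # {'male': 0.62, 'female': 0.38}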
        
           | t_mann wrote:
           | _But no human is going to guess that a picture of dog feces
           | is a human male or female._
           | 
           | Being a bit cheeky now, but I'm not even sure that you'd get
           | zero such answers (or similarly absurd) from humans.
        
             | ufmace wrote:
             | Might be cheeky, but actually insightful too - all of these
             | experiments share the assumption that the
             | human/intelligence involved has chosen to cooperate with
             | the experiment and do its best at the assigned task as
              | written. So how do you control for things like the person
              | deciding that the assigned task was boring and wanting to
              | amuse themselves by intentionally giving an incorrect
              | answer in a way that they found amusing?
        
               | oneoff786 wrote:
               | By acknowledging that these experiments are relatively
               | pointless
        
             | ben_w wrote:
             | I vaguely remember a similar thing with (Viking warrior?)
             | skeletons. Everyone(?) assumed they were all male until
             | they did DNA tests and realised they weren't.
        
             | oneoff786 wrote:
             | I really do not. As much as long tail bell curve guesses
             | can be surprising, that's a very extreme take. Presuming
             | good faith answers and the mental health required to
             | understand and answer the question.
        
         | mdp2021 wrote:
         | > _Time for the obligatory xkcd again_
         | 
         | Which fails to consider that the ML outcomes fail on
         | unreasonable structural misinterpretations, also explicit in
         | the article (e.g. "with the wrong light the signal becomes a
         | cat").
         | 
          | > _Let's not forget that humans also make a lot of mistakes_
         | 
         | Let's not forget that humans are completely, radically
         | different - in terms of implemented modules.
         | 
         | An evaluation over reliability does not just come from test
         | results: it also results from structural assessments.
         | 
         | Do modify for the progress of Science and Technology the throat
         | of a dog to make it speak proper cockney: it will remain a dog,
          | just don't be fooled; said <<"progress">> is in terms of "re-
         | search", not of "re-sults" - you are still searching. Mind the
         | direction, because if you tend to fool yourself that a puppet
         | with a tape player in its mouth does speak, there is a problem.
        
           | t_mann wrote:
           | _An evaluation over reliability does not just come from test
           | results: it also results from structural assessments._
           | 
           | Yes, but it's actually remarkable how bad humans are at
           | seemingly simple tasks like recalling a short scene that they
           | witnessed in person, or even just estimating how good their
           | recollection is. Humans will routinely get even the most
           | basic facts such as how many people were involved, who did
           | what, in what order events occurred... completely wrong,
           | while often being adamant that their account is correct.
           | There are studies about that in the context of crime
           | witnesses and the type of biases also mentioned in the
            | article. An NN might give a different answer if we
           | apply some practically invisible Gaussian noise to an image,
           | a human will have a different memory recollection depending
           | on the lighting conditions of the room they're in, whether
           | they're asked before or after lunch break...
           | 
           | Again, I'm on board with saying that our standards for
           | machines should be higher than that, but I think we're
           | underestimating how hard those seemingly simple tasks are
           | even for humans.
        
             | mdp2021 wrote:
              | Well, don't call your human a Witness if you have not
             | implemented the ReferenceArchive module,
             | 
             | and don't call your ANN a Judge if you have not implemented
              | the ActiveOntology (Philosophy) module.
        
       | axg11 wrote:
       | It's a testament to the progress in AI/ML in the last 5 years
       | that we're now at the point where we're having to examine what
       | human intelligence is. Every time AI surpasses a previously set
       | benchmark, we raise the bar for "intelligence". This is not a
       | complaint either, we should be continually raising the bar.
       | 
       | In time I think the consensus will become that human intelligence
       | is 98% pattern recognition and 2% magic sprinkled on top. Our
       | best pattern recognition algorithms are already at human level.
       | Previously a common criticism was that ML algorithms only applied
       | in narrow domains, e.g. CNNs in computer vision, language models
       | on text. Transformers are increasingly becoming multimodal and
       | general purpose, so previous criticisms are becoming less
       | relevant.
        
       | baskethead wrote:
       | Are they just optimizing for certain data sets? If you take that
       | same AI and apply it to other tasks does it produce above-human
       | results?
        
       | mdp2021 wrote:
        | After such an article (possibly mildly infuriating in innocently
        | suggesting building mannequins with soft artificial skin and
        | realistic makeup where a non-marionette was sought),
       | 
       | oblivious of symbolic definition and reasoning,
       | 
       | one is reminded that Paul Graham, our kind host "pg", is a Lisp
       | specialist,
       | 
       | and has written a manual, "On Lisp", which is also provided free
       | of charge at
       | 
       | http://www.paulgraham.com/onlisp.html
       | 
        | ...It could probably be a therapy for those who, drunk on
        | function approximation, have forgotten reasoning and ontology?
        
       | kristiandupont wrote:
        | Moravec's paradox
        | (https://en.wikipedia.org/wiki/Moravec%27s_paradox) states that
        | things that we consider simple, like walking, are hard for AI,
        | whereas things we consider complex, like playing chess, are easy.
        
       ___________________________________________________________________
       (page generated 2022-05-07 23:01 UTC)