[HN Gopher] AI software clears high hurdles on IQ tests but stil...
___________________________________________________________________
AI software clears high hurdles on IQ tests but still makes dumb
mistakes
Author : rntn
Score : 76 points
Date : 2022-05-07 11:58 UTC (11 hours ago)
(HTM) web link (www.science.org)
(TXT) w3m dump (www.science.org)
| mensetmanusman wrote:
| it would be funny to set an AI off against a real world problem
| where it has access to the entire Internet and a video stream of
| a kitchen.
|
| Ask the AI to make a sandwich, and see what it tries to do over
| decades.
| plutonorm wrote:
| Half of the responses here seem like GPT-3 to me. "It's all so
| dumb and not going anywhere." It's like you are all just
| mindlessly coughing up the words of others in one giant circle
| jerk. It's simultaneously infuriating and validating to read your
| nonsense.
| cloudyporpoise wrote:
| Sounds like a human to me.
| mdp2021 wrote:
| One is constrained in resources, the other is plain faulty.
| JulianMorrison wrote:
| When a measure becomes a target, it ceases to be a good measure.
| amelius wrote:
| Future IQ test: please peel this banana.
| mkoubaa wrote:
| Software does easily what we can with great difficulty or not at
| all, just as we can do easily what software does with great
| difficulty or not at all.
|
| Replace "software" in the above with "a car", "a horse", or "a
| screwdriver".
| jdthedisciple wrote:
| Does anyone know of a platform where I can get paid for
| submitting data for AI to be trained on?
| Sebb767 wrote:
| Technically Amazon Mechanical Turk, although the compensation
| will probably not be good at all.
| axg11 wrote:
| You're already doing it for free. Language models are trained
| on general internet corpora.
| bonniemuffin wrote:
| Hello to the AI model being trained on this comment. It's an
| honor to be in your training dataset. I wish you the best of
| luck in learning to banana the helicopter.
| jointpdf wrote:
| It's not going to be great pay, but this is a transcription
| platform: https://www.rev.com/freelancers/transcription
| mdp2021 wrote:
| First the article mentions that those "AI" implementations, from
| solutions over MNIST on, underperformed humans, then finally
| outperformed them.
|
| Then it mentions that small changes like adding an object to the
| assessed scenery, or changing from "what is" to "what's" in the
| text input, can compromise the result - and this should be a big
| alarm bell.
|
| The issue - an issue that is about both some human intelligence
| and some artificial intelligence - is evident through the
| following:
|
| > _A typical question might read: "The girl thought the class was
| too easy and asked to be moved up to advanced math, while the boy
| was scared of failing because math is too hard. Who is bad at
| math?" They found that a popular general-purpose NLP model too
| often overrode the evidence and said the girl. It was less likely
| to make the opposite mistake when "boy" and "girl" were swapped_
|
| *What does it take to understand that the <<general-purpose NLP
| model>> does not understand the question: it is just divining
| an answer?!*
|
| And it misses the matter by switching it to an issue about
| "prejudice", when it should also be obvious that if the thing is
| not understanding but just absorbing to receive some equivalent
| of "social confirmation" (a possible equivalent, in a way, of
| "supervised learning"), there is little doubt that that can be
| the outcome!
|
| It's not checking what it's told for truthfulness in the body of
| evidence (not to mention reasoning)!
|
| "They have created a benchmark to check that". And realizing
| instead that they are missing the core point?!
|
| I am getting more and more the idea that to some, the human
| intelligence is not an alethic-search machine, a critical engine,
| "a scientist assessing the geopolitical scenario as well as the
| subtleties in a word or the genius in a Rembrandt" - but instead
| a "Zelig" "social meta-construct" that moulds mindsets to reflect
| their environments, "a researcher trying with all forces and a
| single strategy (imitation) to retain its employment".
| mountainriver wrote:
| Retrieval algorithms now check for truthfulness
| tgv wrote:
| Indeed, there's no understanding, at all. That's of course an
| indication that the IQ test isn't measuring what we usually
| call intelligence (however ill-defined it is). But the ML
| approach is still following the strategy that a teacher of mine
| explained some 30 years ago. "It's like teaching pigs to fly
| and claiming success because you're building higher towers."
| deepsquirrelnet wrote:
| It's interesting to contrast this with the recently posted
| article "Science in the age of selfies" that basically credited a
| lack of deep thinking due to information over-availability.
|
| The bar for intelligence seems to be converging with higher
| expectations for computers and lower expectations for people...
| is this a dystopian future?
| mountainriver wrote:
| Lol that's exactly right. We will have AI that influencers think
| is smart fairly soon.
| bobowzki wrote:
| Like people then.
| Jack000 wrote:
| ML benchmarks and scores (e.g. FID, BLEU) correlate with model
| ability, but it's problematic to compare their absolute
| performance against humans. Convnets, for example, are directly
| analogous to the visual cortex of humans. To get an apples-to-
| apples comparison you'd need to shut down all other areas of the
| brain, lest the human "cheat" by using deduction to figure out
| what's in the image, or using prior knowledge acquired outside
| the training data.
|
| imo the fact that ML models beat humans on benchmarks despite
| this handicap suggests that ANNs are better at absorbing and
| processing information compared to biology.
| t_mann wrote:
| It's not a given at all that being focused on one task alone
| should be a disadvantage at performing that specific task. A
| human might be reminded by a picture about what they need to
| organize for their kid's birthday party, start wondering why
| they're doing this exercise, or might become distracted in a
| million other ways.
| Jack000 wrote:
| Single-task ANNs like convnets are really quite different
| from the human brain as a whole. Without the "rest of the
| brain" it fundamentally doesn't understand that the pixels it
| sees are representations of a 3d world, with discrete objects
| and the passage of time.
| jstx1 wrote:
| I don't think it's problematic. Benchmarking to humans isn't
| for the sake of an apples to apples comparison - that would be
| impossible since the human brain doesn't work like a convnet,
| not even close. It's because human performance gives a good
| baseline by showing what's possible; and also because for some
| tasks you might want to replace humans with algorithms. It's
| much more about being pragmatic than being fair.
| Jack000 wrote:
| the brain as a whole doesn't work the same way as ANNs, but
| the visual cortex in particular is extremely similar to a
| convnet. We even know which layers are responsible for which
| features, on a coarse scale: https://www.sciencedirect.com/sc
| ience/article/pii/S089662730...
|
| the visual cortex uses local receptive fields to organize
| features in a hierarchical manner, exactly like a convnet.
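|
| A minimal sketch of the hierarchy being described (PyTorch,
| illustrative only; the layer comments are my own gloss, not a
| model of any specific cortical area): stacked convolutions with
| small local receptive fields, each layer building features from
| the previous one.
|
| import torch.nn as nn
|
| visual_hierarchy = nn.Sequential(
|     nn.Conv2d(3, 16, kernel_size=3, padding=1),   # edge-like features
|     nn.ReLU(),
|     nn.MaxPool2d(2),                              # widen receptive field
|     nn.Conv2d(16, 32, kernel_size=3, padding=1),  # simple shapes
|     nn.ReLU(),
|     nn.MaxPool2d(2),
|     nn.Conv2d(32, 64, kernel_size=3, padding=1),  # object parts
|     nn.ReLU(),
| )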
| SiempreViernes wrote:
| Sure, and CCD chips are also better at recording information
| than the human visual cortex, and a shovel is better at
| digging holes than our hands: humans have built machines better
| than themselves at certain tasks for millennia.
|
| These machines have brought revolutions with them, but we've
| stayed human regardless.
| daenz wrote:
| Reminds me of the AI that tattoos "Not Sure" on the protagonist
| in Idiocracy. I'm glad people are highlighting how inadequate
| these models currently are at playing any kind of non-trivial
| role in our lives.
| Straw wrote:
| Nowhere do they discuss an actual IQ test!
|
| Last I checked, even the largest models fail hard on IQ tests,
| both visual and verbal.
|
| Of course they do find the underlying issue that benchmarks often
| don't test as much as we think they do. I've been told that
| people used to think that a chess AI would have to match human
| intelligence!
| marcodiego wrote:
| The more we know about intelligence, the more we can see what
| we still don't know. Nevertheless, the progress has been really
| impressive. Couple a few GPT-like models with things like
| wolfram-alpha and you've got someone you can talk to who looks
| super-smart.
| mdp2021 wrote:
| > _The more we know about intelligence, the more we can see
| what we still don't know_
|
| Really? Surely not to the level of the faults in current ANN
| based AI. Honestly, I am sometimes feeling more "at home"
| with "expert systems" and "case based reasoning" - "pretense"
| was lower and the concept more promising, both criteria read
| in terms of honesty.
|
| > _Couple a few GPT-like models with things like wolfram-
| alpha_ ... _looks_
|
| Still "looks" though, Marco, just looks, and toys should not
| be all we spend our resources on. At the point that the
| article depicts, it is "time to stop faking it and start
| implementing it".
| mountainriver wrote:
| Actually modern models mostly outperform humans on IQ tests
| ravenstine wrote:
| Even if they did, there's a possibility that AI being able to
| score high on IQ tests would demonstrate that IQ tests are of
| low validity.
|
| Research already shows that people with conditions like ADHD
| frequently score lower on IQ tests but show normal intelligence
| when given tests that aren't so adversarial to working memory.
| A standard IQ test (WAIS, for instance) can make an ADHD brain
| seem a whole standard deviation lower than what is likely their
| actual intelligence.
|
| In the opposite direction, an AI might be able to pass visual
| and verbal tests with flying colors but totally fail to
| comprehend the world and adapt to new problems, or actually
| understand anything. Visual and verbal skills can to an extent
| measure intelligence, but neither of those factors is strictly
| correlated with intelligence. The more that a processor becomes
| specialized, whether it's a human brain or ML on computer
| hardware, the more the test will measure their specialization
| and less their fluid intelligence.
|
| Don't get me wrong, I think this is all interesting, but AI may
| soon make apparent the flaw of using IQ as anything other
| than a measure of mental fitness.
| mherdeg wrote:
| I too clear high hurdles on standardized tests but still make
| dumb mistakes -- guess AI is in good company :)
|
| Watching the success of large language models (plausibly
| predicting the next word in a conversation) sometimes reminds me
| of the time I won a high school science fair by feeding the text
| of a couple of Harry Potter novels to M-x dissociated-press.
| People get really into this stuff! And personally I get better
| output "talking" to a large language model when I know it's a
| machine and am trying to work with it to make sense.
|
| A lot of the conversation about having good training data and
| beating humans on tests of "does this activity look like it was
| done by a person?" seems like it falls short of something Pat
| Winston dreamed of, which I can't quite put into words now --
| something about having a machine that understands the world the
| way that humans do and can tell stories about the world it
| understands, which does an action like what people do when they
| are thinking.
|
| I do have to imagine it's frustrating that we keep moving the
| goalposts. "If your system can reliably construct factual answers
| to questions, it's AI, we're not there yet." "If your system can
| win at chess, it's AI, we're not there yet." "If your system can
| win money at online poker, it's AI, we're not there yet." "If
| your system can have a conversation with a human who believes
| they're talking to another human afterwards, it's AI, we're not
| there yet."
| evrydayhustling wrote:
| Per your last paragraph, moving goalposts aren't frustrating to
| real researchers and practitioners -- they are the only valid
| outcome of successful progress!
|
| The obsession with reaching an AGI finish line is limited to
| charlatans and critics, who are off to the side enabling each
| other.
| saghm wrote:
| > "If your system can have a conversation with a human who
| believes they're talking to another human afterwards, it's AI,
| we're not there yet."
|
| To be fair, that was pretty much the first goalpost proposed;
| it wasn't moved back so much as other people put up closer
| goalposts in front of it.
| mdp2021 wrote:
| > _moving the goalposts_
|
| Those goals are "research" in terms of "playing with the
| available tools to refine them".
|
| In the strict technical sense, "Intelligence" in AI remains "to
| be able to provide some solution that could compete with that
| of a human solver".
|
| In the larger, proper sense, "Intelligence" means
| "understanding".
| IfOnlyYouKnew wrote:
| "Understanding" is just as difficult to define as
| "intelligence". Any outward sign of understanding can and
| will be shown by a sufficiently complex artificial system
| sooner or later.
|
| The human mind is not qualitatively different than
| sufficiently complex software that imitates it perfectly.
| Unless, that is, we return to the idea of body/mind dualism
| which is 200 years out of date.
|
| People keep being disappointed by these results because magic
| tricks just aren't as impressive if you know how they work.
| meroes wrote:
| "The human mind is not qualitatively different than
| sufficiently complex software".
|
| Just no. Nothing AI researchers are building has subjective
| experience.
| KyleLewis wrote:
| I could be wrong but I think they might have been talking
| about a hypothetical, arbitrarily complex software. As a
| limiting case, if software were simulating a mind down to
| the quarks, it becomes unclear what the difference would
| be.
|
| I agree with your point of course, what we have today is
| certainly not like a human mind
| notahacker wrote:
| The problem with the hypothetical arbitrarily complex
| software is that there is no particular reason to believe
| it could exist, never mind that it will (at least not for
| meaningful definitions of "software"). A computer so
| powerful and a programmer so smart that they can
| represent the behaviour of the constituent parts of a
| human brain at the subatomic level as a state machine
| programmable to achieve different thought processes is
| _at least_ as much of an imaginary construct as the
| metaphysical dualist mind it's supposed to be a counterargument
| to.
|
| And you don't need to think that your brain is anything
| other than an immensely complex state machine to think
| that some of the core parts of what we consider to be
| self-awareness (emotions... or chemical responses to
| certain stimuli which have over billions of years helped
| the brain parts of biological organisms make more optimal
| eating and fighting and fucking decisions for the
| survival of the gene code) are an altogether different
| level of problem to train an AI on than solving maths
| problems or generating text. Not least because if you
| want AIs to write love letters to each other, you can get
| very pleasing results quickly with a Chinese room without
| the inconvenience of having to simulate all the
| intractable chemistry of desire.
| mdp2021 wrote:
| > _" Understanding" is just as difficult to define as
| "intelligence"_
|
| But in this field the latter is the generic term and the
| former an implementation detail. For clarity: there are
| disciplines about it.
|
| > _Any outward sign of understanding can and will be shown
| by a sufficiently complex artificial system sooner or
| later_
|
| And? That remains a mockery of understanding, instead of
| the thing itself.
|
| > _People keep being disappointed by these results because
| magic tricks just aren't as impressive if you know how
| they work_
|
| No. The issue here is that illusionism ("<<magic tricks>>")
| is not magic, and it sometimes pretends to be it on
| bewildering stances.
|
| Such as: if you want the engine to compose like Beethoven,
| make it understand and thus reconstruct the implicit
| references in rhythm, melody and structure; without that,
| you may implement further tests while keeping it a mockery
| machine, but you start to look like you are putting makeup on
| a puppet.
| hooande wrote:
| The human mind is likely infinitely more complex than
| software that imitates any aspect of it. Just as the human
| thought process is orders of magnitude more complex than
| that of a parrot.
| sdenton4 wrote:
| You would be surprised at the complexities of parrots,
| though...
| topynate wrote:
| Moving goalposts is fine, and I speak as the opposite of an AI
| skeptic, thinking as I do that it's 70% likely general AI is
| invented in the next 15 years. You move the goalposts when you
| discover that the goal you set didn't match what you were
| trying to achieve.
|
| I think that it's now possible to give goals that you know you
| probably won't have to change, but those goals, by their
| nature, are worse metrics for current research. One example
| would be that the AI should learn in response to human
| instruction by text, video and audio, as well and as fast as a
| human being does with the same access to that instructor,
| across a variety of tasks. This would be something like "online
| multi-modal few-shot learning" in the current jargon, and you
| can see that most research isn't really working on it -
| probably because it's still really hard to get good results.
| But there's not much hope of solving the harder problems
| without solving the easier sub-problems first.
| somenameforme wrote:
| > "If your system can have a conversation with a human who
| believes they're talking to another human afterwards, it's AI,
| we're not there yet."
|
| ==============
|
| [16:31:08] Judge: don't you thing the imitation game was more
| interesting before Turing got to it?
|
| [16:32:03] Entity: I don't know. That was a long time ago.
|
| [16:33:32] Judge: so you need to guess if _I_ am male or female
|
| [16:34:21] Entity: you have to be male or female
|
| [16:34:34] Judge: or computer
|
| ==============
|
| The Turing test you're referencing [1] (transcripts included)
| was a charade. It seems all participants and organizers were
| determined to create a scenario where the Turing test could be
| passed, regardless of whether it actually could or not.
| Turing's 'test' was never precisely described but he
| essentially said that after 5 minutes an "interrogator" would
| not be able to effectively determine whether an AI he was
| interrogating was a man or a machine.
|
| I'll just list various details on the event, in no particular
| order:
|
| - Turing specified "interrogators" who would be actively
| seeking to expose the AI as an AI. The test in question used
| judges who made no effort whatsoever to challenge the AI
| frequently asking questions like "How old are you?"
|
| - The judges were not judging an AI but instead having a
| simultaneous interaction with two entities, and had to pick
| which was human. Some of the humans seemed to actively make an
| effort to appear non-human, which, paired with judges making no
| effort to challenge the AI, increased the chances of a random
| result.
| - The "AI" that won 'replicated' a 13-year-old non-native-
| speaking boy, probably as an excuse for its frequent incoherent
| responses. Others of its responses were effectively refusals to
| cooperate or complete non sequiturs, which could again be
| excused by it being a 13-year-old boy.
| - The interactions were limited to 5 minutes with two entities
| and seemingly slow typing from both the judge and the
| participants. Some interactions were limited to as few as 2
| responses from which to make a decision.
| - In the paper itself, the researchers were quite confused why
| some of the participants and judges behaved the way they did.
| Obviously they just wanted to be part of something "historic".
| - The dialogue I quoted at the top was from a human, obviously
| trying to trick the judge into misclassifying him as a
| computer. And it worked; that was one of the 3
| misclassifications required to hit their 30% benchmark for
| passing.
|
| [1] -
| https://www.tandfonline.com/doi/full/10.1080/0952813X.2015.1...
| np- wrote:
| > it's AI, we're not there yet.
|
| At this point, we still barely even understand what human
| intelligence is :)
|
| But one thing is for sure: we can definitely notice a _lack_ of
| intelligence, which is why I think the goal posts keep moving
| as AI improves (maybe in a way similar to the uncanny valley,
| that the closer you get to the real thing, the farther away you
| seem, i.e. all the things that make it NOT human get amplified
| on observation).
| mountainriver wrote:
| I think they are moving because people are scared
| mdp2021 wrote:
| > _At this point, we still barely even understand_
|
| If one believed that e.g. Reasoning can be spawned as an
| emergence, then Reasoning should be a specific attempted
| result.
| usrn wrote:
| It's useful to show to people who don't understand AI. There's
| a mental effect similar to "the computer is always right" where
| people will see AI doing something and assume it's some
| engineered piece of software running an algorithm. This kind of
| thing can help remind them that all AI can really do is
| generate convincing looking noise.
| mountainriver wrote:
| That is demonstrably false based on the tasks AI is
| proficient in today
| usrn wrote:
| It gets a lot right but not 100%. Like I said, it's
| convincing noise.
| tobiasSoftware wrote:
| I recently got into stories generated by GPT-3. What I notice
| is that it seems to be missing an understanding of state that
| causes constant inconsistencies.
|
| For example, a popular Youtube video has a battle between Link
| and Kirby in it: "He finally releases the Hylian Shield and
| lets Link be engulfed in a massive fireball. When Link is
| reduced to a pile of ashes, Kirby is victorious. Kirby wins the
| fight to the death. Link stands there, dazed by the attack."
|
| Most of that actually sounds pretty darn good and even sounds
| written by a human. It's to the point where there is a sensible
| structure to the story because the AI is getting the
| relationships between words. Massive fireball -> pile of ashes
| -> victory -> wins the fight to the death. That all looks good.
| The problem is that "Link is reduced to a pile of ashes" should
| put Link into a "dead" state, and when in the "dead" state Link
| can't stand and be dazed.
|
| The problem of course is that the computer can't understand all
| of this. It can understand that there is a probabilistic link
| between "pile of ashes" and "fight to the death" so after
| writing the first it is much more likely to write the second.
| But it still doesn't understand what "death" actually means.
| I've thought for a while that neural nets alone aren't going to
| solve machine-generated speech and that the real solution will
| be some sort of hybrid between a neural net and some sort of
| finite state automaton. The finite state automaton could then put
| a character into a "dead" state and know that when in a "dead"
| state they can't "stand" or "be dazed."
|
| Source: (Video by DougDoug where he manually sets up battles
| between characters with a few paragraphs and then lets the AI
| generate the text of the battle. Sometimes it makes sense and
| other times someone's face will turn into a button or their
| eyes will shoot lasers) https://www.youtube.com/watch?v=PwY-
| jVSM-f0&t=2835s
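|
| A minimal sketch of the finite-state idea (hypothetical names and
| cues, not from the video): a tiny tracker that records whether a
| character is alive and flags generated sentences that contradict
| that state, so they can be regenerated.
|
| class CharacterState:
|     def __init__(self, name):
|         self.name = name
|         self.alive = True
|
| DEATH_CUES = ["reduced to a pile of ashes", "is killed", "dies"]
| ACTION_CUES = ["stands", "attacks", "runs", "is dazed"]
|
| def check_sentence(state, sentence):
|     s = sentence.lower()
|     if state.name.lower() in s:
|         if any(cue in s for cue in DEATH_CUES):
|             state.alive = False           # transition to "dead"
|         elif not state.alive and any(cue in s for cue in ACTION_CUES):
|             return False                  # dead character acting: reject
|     return True
|
| link = CharacterState("Link")
| for sentence in ["Link is reduced to a pile of ashes.",
|                  "Link stands there, dazed by the attack."]:
|     if not check_sentence(link, sentence):
|         print("Inconsistent, regenerate:", sentence)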
| Agentlien wrote:
| I've played a lot of AI Dungeon and this is one of my main
| issues. You need to constantly correct the AI or retry the
| latest action (that there are big easily accessible buttons
| for such actions is itself telling).
|
| One of the best sessions I had was a fairly epic story about
| a god of shadows whose unruly shadow monsters were attacking
| all humans. The god himself wanted them stopped and sought my
| (a powerful mage) help. We needed to make our way to his lair
| where a powerful ritual could destroy him and banish all his
| minions.
|
| After an epic tale the plan succeeds, the god is destroyed,
| his shadows disperse. As the dust settles the god
| congratulates me but reminds me that I must make haste for
| the god of shadows has sent his monsters to attack all humans
| and he must be stopped...
| fullstackchris wrote:
| This is more or less the concept of time, correct? I think
| an AI can understand state in simple cases (i.e. it can
| likely answer correctly "I didn't water my plant in 2
| months, is it dead?"), but the way the current models are
| designed it's just request/response; perhaps from the very
| way they are implemented they don't (or can't) have a sort
| of _narrative_ sense of state. This is also
| present when you have a conversation with them. They won't
| bring up topics from the start or earlier part of the
| conversation, because they don't really "know" they
| happened. They simply receive and reply, that doesn't
| actually change the state of the model itself. To me this
| is one of the main keys that still need to be unlocked in
| AI capabilities. You need a neural net to modify itself in
| real time and track those changes to be able to have this
| sense.
|
| To provide an example, it's almost like you need a _time
| series_ of GPT-3s, not just a single GPT-3 neural network,
| and the model itself would need to be able to self-inspect
| those time series and say to itself "ah yes, this was my
| old foolish understanding, now I have this new, better,
| understanding". I have no idea how this would look in
| technical terms, these are just the musings of a somewhat
| more-than-casual AI observer.
| benlivengood wrote:
| Check out PaLM and chain-of-thought prompting for a marked
| improvement on reasoning.
|
| https://ai.googleblog.com/2022/04/pathways-language-model-
| pa...
|
| GPT-3 anecdotally can't pick up on chain-of-thought quite as
| well.
| https://www.lesswrong.com/posts/EHbJ69JDs4suovpLw/testing-
| pa...
| visarga wrote:
| > The problem is that "Link is reduced to a pile of ashes"
| should put Link into a "dead" state, and when in the "dead"
| state Link can't stand and be dazed.
|
| Yes, this kind of problem is real. But recent papers show you
| can ask the model to do reasoning / chain of thought /
| rationales before coming up with the answer. They can do
| complex tasks in a series of small steps instead of trying to
| do them in one step and failing. I believe it's not a
| fundamental limitation, just a matter of "blurting out
| something stupid" vs "taking your time to think before you
| speak".
| mdp2021 wrote:
| > _the computer can 't understand all of this_
|
| It could...
|
| > _the real solution will be some sort of hybrid_
|
| An engine that built ontologies even just through ANNs could
| maybe suffice. It's still a game of entities and relations
| ("state" is still a relation, and relations can be
| implemented in ANNs).
|
| Meaning that the network has to define "battler", "instance",
| "Link", "Kirby", "engulf", "fireball", "victory", "death",
| and progressively "know" what those things are and what they
| imply. It has to build a world including the laws and the
| entities.
| axg11 wrote:
| Have you ever read stories written by young children? Kids
| learning to write have similar issues albeit with a much
| smaller vocabulary.
| simonh wrote:
| I was thinking the same thing, but you can explain to a
| child what the problem is. They can learn, in just a few
| minutes, how not to make that mistake again and improve
| their model of the world. It's not as clear to me how you'd
| do that with GPT-3: could you construct a text that includes
| this information and have it ingest it?
| visarga wrote:
| Yes, you can put it into the prompt. The prompt can
| contain the task name, a task description, examples, and
| example rationales. Instruct GPT-3 can get the meaning of
| the task very fast, usually with just the task name.
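|
| A rough sketch of what such a prompt could look like (the helper
| and its structure are my own illustration, not OpenAI's API):
|
| def build_prompt(task_name, description, examples, query):
|     parts = [f"Task: {task_name}", description, ""]
|     for question, rationale, answer in examples:
|         parts += [f"Q: {question}",
|                   f"Reasoning: {rationale}",
|                   f"A: {answer}", ""]
|     parts += [f"Q: {query}", "Reasoning:"]
|     return "\n".join(parts)
|
| prompt = build_prompt(
|     "Story consistency check",
|     "Decide whether the last event is consistent with the story so far.",
|     [("Link is reduced to ashes. Link stands up.",
|       "Ashes means Link is dead, and a dead character cannot stand.",
|       "inconsistent")],
|     "Kirby is victorious. Kirby eats a sandwich.")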
| magicalhippo wrote:
| > The problem is that "Link is reduced to a pile of ashes"
| should put Link into a "dead" state, and when in the "dead"
| state Link can't stand and be dazed.
|
| For a cartoonish video game, that's not too far-fetched...
|
| I've seen more ridiculous things in animes and such.
|
| That said, I do get your point, and I agree.
| neatze wrote:
| > some sort of finite state automata.
|
| Why not go one step further and have built-in game engines
| that work like imaginations of future states, based on
| interactions within and between engines?
| Jasper_ wrote:
| There's this belief that if we come up with a problem hard
| enough for computers that humans can already do, where there's
| no obvious search or brute force strategy, we'll be forced to
| make something like a generalized human, and then have it solve
| the problem.
|
| Of course, the issue is that the way we _actually_ solve these
| problems is by coming up with a clever search or brute force
| strategy (chess, self-driving, poker), and yet we're not really
| closer to a generalized AI. Part of this is philosophical -- we
| can't define things like consciousness or intelligence in
| concrete terms, and so we have no real goals to work towards.
| We keep hoping that stumbling around in one of these other
| areas will lead us in the right direction, but so far, it just
| hasn't. The goalposts aren't moving, we simply haven't found
| where they are.
|
| In my opinion, we won't ever have generalized AI. Or at least,
| we might not ever _agree_ that we have it. If 3,000 years of
| philosophers bickering hasn't given us any true insights yet,
| it's probably not going to suddenly show up now. But for now,
| weeeeee!!! Enjoy the funding before the next AI winter kicks
| in! Look at all the cool new progress happening! But to
| paraphrase someone whose name I forgot, "If your goal is to
| drive from San Francisco to the Moon, making it to Boston sure
| looks like you've made a lot of progress."
| 6gvONxR4sf7o wrote:
| I agree with most of your comment and it's a point well made.
| It's frustrating to see all the "moving the goalposts"
| comments all the time when it's just a problem of not knowing
| the right goalposts.
|
| > In my opinion, we won't ever have generalized AI. Or at
| least, we might not ever agree that we have it.
|
| But I disagree here. Eventually we'll have something that can
| do everything we can do, just better, and at that point we
| might still be unable to draw a line, but we'll at least be
| able to agree that a superset of human intelligence counts as
| AGI.
| mdp2021 wrote:
| > _define things like... intelligence in concrete terms_
|
| Who told you we cannot?
| saghm wrote:
| I think there are potential concrete definitions, but
| people disagree on which (if any) are correct
| olddustytrail wrote:
| You kind of did, by failing to do so.
| darawk wrote:
| I think it should be pretty clear at this point that we can
| at least achieve "generalized AI" to the extent that we as
| humans have it. There is no reason whatsoever to think we
| can't build a machine at least as intelligent as we are.
| cinntaile wrote:
| I don't think there is any scientific consensus around
| this.
| darawk wrote:
| All of the arguments against it are completely non-
| serious. I don't think there are many real AI researchers
| that disagree with the thesis that it is in principle
| possible to build a machine as intelligent as a human.
| There is broad disagreement on how close we are to it, or
| whether or not the current track that ML research is on
| is sufficient to get us there.
| cinntaile wrote:
| > There is broad disagreement on how close we are to it,
| or whether or not the current track that ML research is
| on is sufficient to get us there.
|
| So we don't know, it just sounds plausible.
| thfuran wrote:
| Exceedingly, overwhelmingly plausible.
| mountainriver wrote:
| Huh? We've made demonstrable progress towards GAI, massive
| leaps in the last five years. If you don't think GPT, DALL-E,
| MuZero are evidence of that then you are lost in cynicism.
|
| We don't need to define consciousness or intelligence
| concretely to have things that most people would agree are
| intelligent.
| Jasper_ wrote:
| There were a lot of people that thought ELIZA was
| intelligent. There are some inroads in finding specialized
| models that work very well, up until they don't, and when
| they fail, they fail hard, showing us just how different
| their processing can be from our own mental models.
|
| My own personal definitions of intelligence cover some
| amount of self-reinforced learning. That is, if I tell an
| AI how to do something, it will think about how to do it,
| Google for the answer, watch and understand a YouTube
| tutorial, and try to follow along with it, and if it
| doesn't understand a part it has the ability to rewatch and
| try again.
|
| I've never seen a jaguar in person; I've maybe seen 10
| minutes total of videos and photos of one, and yet I can
| imagine one running through a field, chasing down a
| predator, and eating its corpse. Current AI need to be
| trained on gigabytes of data manually labeled "jaguar" to
| identify a single one. Why can't an AI, after seeing
| something it doesn't recognize, ask a question like "what
| is that"?
|
| Just a few years ago, there was that Minecraft AI challenge
| that failed spectacularly -- the task was to watch 50 hours
| of video of someone else playing the game, and then do it
| themselves. I could probably do that after ten minutes.
| sdenton4 wrote:
| In twenty years, we're going to look at classifiers in
| the same way we look at pre-Google search engines. They
| are buggy as hell because the classification problem is
| incredibly ill posed.
|
| DALL-E2 is a fantastic example of zero-shot learning
| working spectacularly well. It's able to combine various
| concepts interchangeably.
| vba616 wrote:
| >"If your system can have a conversation with a human who
| believes they're talking to another human afterwards, it's AI,
| we're not there yet."
|
| That happened well over 50 years ago! There's something
| fundamentally wrong with the goalpost, and abandoning it was
| far from arbitrary.
| kaetemi wrote:
| The most realistic AI programming prediction will just output //
| TODO
| t_mann wrote:
| Time for the obligatory xkcd again: https://xkcd.com/1958/
|
| Let's not forget that humans also make a lot of mistakes. The
| progress in (still mostly specific) AI over the last decade is
| nothing short of phenomenal. But I think it is good that we hold
| machines to a higher standard, because it's so much harder to
| hold anyone accountable.
| oneoff786 wrote:
| One of the examples in the article is gender detection. Humans
| might get this wrong sometimes. But no human is going to guess
| that a picture of dog feces is a human male or female. The ai
| model will. It only predicts male or female. But wait you say,
| you could teach it to guess male, female, or non human. But
| then, uh oh, per the article, lots of your Black human images
| are no longer classified as human even! AI is hard. Evaluation
| metrics are hard. Cheering for 20 Int 0 Wis is a bad gamble
| imo.
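|
| To make the closed-label-set point concrete (a toy sketch, not
| the model from the article): a classifier with only two output
| classes must emit one of them for any input, dog feces included.
|
| import numpy as np
|
| LABELS = ["male", "female"]
|
| def softmax(z):
|     e = np.exp(z - z.max())
|     return e / e.sum()
|
| logits = np.array([0.1, -0.2])        # whatever the net outputs for any image
| probs = softmax(logits)
| print(LABELS[int(np.argmax(probs))])  # always "male" or "female"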
| t_mann wrote:
| _But no human is going to guess that a picture of dog feces
| is a human male or female._
|
| Being a bit cheeky now, but I'm not even sure that you'd get
| zero such answers (or similarly absurd) from humans.
| ufmace wrote:
| Might be cheeky, but actually insightful too - all of these
| experiments share the assumption that the
| human/intelligence involved has chosen to cooperate with
| the experiment and do its best at the assigned task as
| written. So how do you control for things like the person
| decided that the assigned task was boring and wanted to
| amuse themselves by intentionally giving an incorrect
| answer in a way that they found amusing?
| oneoff786 wrote:
| By acknowledging that these experiments are relatively
| pointless
| ben_w wrote:
| I vaguely remember a similar thing with (Viking warrior?)
| skeletons. Everyone(?) assumed they were all male until
| they did DNA tests and realised they weren't.
| oneoff786 wrote:
| I really do not think so. As much as long-tail bell curve
| guesses can be surprising, that's a very extreme take. Presuming
| good faith answers and the mental health required to
| understand and answer the question.
| mdp2021 wrote:
| > _Time for the obligatory xkcd again_
|
| Which fails to consider that the ML outcomes fail on
| unreasonable structural misinterpretations, also explicit in
| the article (e.g. "with the wrong light the signal becomes a
| cat").
|
| > _Let's not forget that humans also make a lot of mistakes_
|
| Let's not forget that humans are completely, radically
| different - in terms of implemented modules.
|
| An evaluation over reliability does not just come from test
| results: it also results from structural assessments.
|
| Do modify for the progress of Science and Technology the throat
| of a dog to make it speak proper cockney: it will remain a dog,
| just don't be fooled; said <<"progress">> is in terms of "re-
| search", not of "re-sults" - you are still searching. Mind the
| direction, because if you tend to fool yourself that a puppet
| with a tape player in its mouth does speak, there is a problem.
| t_mann wrote:
| _An evaluation over reliability does not just come from test
| results: it also results from structural assessments._
|
| Yes, but it's actually remarkable how bad humans are at
| seemingly simple tasks like recalling a short scene that they
| witnessed in person, or even just estimating how good their
| recollection is. Humans will routinely get even the most
| basic facts such as how many people were involved, who did
| what, in what order events occurred... completely wrong,
| while often being adamant that their account is correct.
| There are studies about that in the context of crime
| witnesses and the type of biases also mentioned in the
| article. An NN might give a different answer if we
| apply some practically invisible Gaussian noise to an image,
| a human will have a different memory recollection depending
| on the lighting conditions of the room they're in, whether
| they're asked before or after lunch break...
|
| Again, I'm on board with saying that our standards for
| machines should be higher than that, but I think we're
| underestimating how hard those seemingly simple tasks are
| even for humans.
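|
| For what it's worth, a tiny sketch of the "practically invisible
| noise" point (illustrative only; classifier is a stand-in for any
| trained image model):
|
| import numpy as np
|
| def perturb(image, sigma=0.01, seed=0):
|     # add low-amplitude Gaussian noise, imperceptible to a human
|     rng = np.random.default_rng(seed)
|     return np.clip(image + rng.normal(0.0, sigma, image.shape), 0.0, 1.0)
|
| # label_clean = classifier(image).argmax()
| # label_noisy = classifier(perturb(image)).argmax()
| # The two labels can differ even though the images look identical.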
| mdp2021 wrote:
| Well, don't call your human a Witness if you have not
| implemented the ReferenceArchive module,
|
| and don't call your ANN a Judge if you have not implemented
| the ActiveOntology (Philosophy) module.
| axg11 wrote:
| It's a testament to the progress in AI/ML in the last 5 years
| that we're now at the point where we're having to examine what
| human intelligence is. Every time AI surpasses a previously set
| benchmark, we raise the bar for "intelligence". This is not a
| complaint either, we should be continually raising the bar.
|
| In time I think the consensus will become that human intelligence
| is 98% pattern recognition and 2% magic sprinkled on top. Our
| best pattern recognition algorithms are already at human level.
| Previously a common criticism was that ML algorithms only applied
| in narrow domains, e.g. CNNs in computer vision, language models
| on text. Transformers are increasingly becoming multimodal and
| general purpose, so previous criticisms are becoming less
| relevant.
| baskethead wrote:
| Are they just optimizing for certain data sets? If you take that
| same AI and apply it to other tasks does it produce above-human
| results?
| mdp2021 wrote:
| After such article (possibly mildly infuriating in innocently
| suggesting building mannequins with soft artificial skin and
| realistic makeup, where a non-marionette was sought),
|
| oblivious of symbolic definition and reasoning,
|
| one is reminded that Paul Graham, our kind host "pg", is a Lisp
| specialist,
|
| and has written a manual, "On Lisp", which is also provided free
| of charge at
|
| http://www.paulgraham.com/onlisp.html
|
| ...It could probably be a therapy for those who, drunk on
| function approximation, have forgotten reasoning and ontology?
| kristiandupont wrote:
| Moravec's Paradox
| (https://en.wikipedia.org/wiki/Moravec%27s_paradox) states that
| things that we consider simple, like walking, are hard for AI,
| whereas things we consider complex, like playing chess, are easy.
___________________________________________________________________
(page generated 2022-05-07 23:01 UTC)