[HN Gopher] Bing AI can't be trusted
___________________________________________________________________
Bing AI can't be trusted
Author : dbrereton
Score : 725 points
Date : 2023-02-13 16:40 UTC (6 hours ago)
(HTM) web link (dkb.blog)
(TXT) w3m dump (dkb.blog)
| wefarrell wrote:
| The amount of trust people are willing to place in AI is far more
| terrifying than the capabilities of these AI systems. People are
| too willing to give up their responsibility of critical thought
| to some kind of omnipotent messiah figure.
| dagw wrote:
| _The amount of trust people are willing to place in AI is far
| more terrifying than the capabilities of these AI systems._
|
| I don't know, people seem to quickly adjust their expectations
| to reality. Listening to the conversation around ChatGPT that I'm
| hearing around me, people have become a lot more sceptical over
| just the last couple of weeks, as they've gotten a chance to
| use the system hands on rather than just read articles about
| it.
| kahnclusions wrote:
| I've tried playing around with it quite a bit for the "choose
| your own adventure" type games.
|
| It's really good at generating text and following prompts.
| Letting later responses use the previous prompts and
| responses as input really gives the illusion of a
| conversation.
|
| But it's extremely limited... you run up against the limits
| of what it really can do very quickly.
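| A minimal sketch of the context accumulation described above,
| assuming a hypothetical chat() completion function (any
| string-in, string-out model call would do): every new prompt is
| sent together with the prior turns, which is what produces the
| illusion of a running conversation.
|
|   history = []  # (role, text) for every prior turn
|
|   def ask(user_message, chat):
|       # chat: hypothetical completion function, str -> str
|       history.append(("user", user_message))
|       prompt = "\n".join(f"{r}: {t}" for r, t in history)
|       reply = chat(prompt)  # the model only sees this flat prompt
|       history.append(("assistant", reply))
|       return reply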
| kenjackson wrote:
| I've gone just the opposite. The ability for me to ask my own
| questions has impressed me more than the articles. In part
| because articles always hype up technology in a crazy way.
| This is the first technology since -- probably the Apple
| iPhone touchscreen where I'm like, "this is better than the
| hype seemed to convey".
|
| I think the goalposts have moved greatly though. Just a
| couple of years ago, most lay tech folks would've laughed you
| out the room if you suggested this could be done. The
| summarized answers by Bing and Google are relics by
| comparison.
| throw8383833jj wrote:
| well, people already do that with their news feed.
| commandlinefan wrote:
| And before social media news feeds, people were doing that
| with newspapers for generations. Those people have always
| been around.
| [deleted]
| jackmott42 wrote:
| Which person or persons specifically are you referring to?
| noobermin wrote:
| Every other day, a poster comments how they make
| presentations or write code using chatgpt. Just in another
| thread, someone posted how chatgpt solved their coding
| problem...which a quick google search would have solved as
| well, as others in the replies to it pointed out.
|
| Whenever I've used chatgpt I was impressed at the surface
| level, but digging deeper into a convo always turned up
| circular HS-tier BS'ing. The fact that so many people online
| and on HN are saying chatgpt is astounding and revolutionary
| just betrays that such HS-essay-level BS is convincing to
| them, and it's somewhat of a depressing thought that so many
| people are so easily taken in by a confidence trick.
| rockinghigh wrote:
| I have been using ChatGPT successfully for coding tasks for
| the last two months. It's often faster than a Google search
| because it delivers the code, can explain it, and can write
| a unit test.
| pixl97 wrote:
| >which a quick google search would have solved as well,
|
| You mean 'may' have solved that. Google is becoming a
| bullshit generator for SEO farms and spitting out ads at
| such a rate it can be near useless for some questions.
|
| Now the real question is, can we tell when Google or GPT is
| dumping bullshit on us.
| dqpb wrote:
| Prove that this is actually happening.
| tiborsaas wrote:
| I've talked to people and read comments a lot, but there's no
| proof that you'd probably accept. My impression is that this
| attitude definitely exists. Some people are already ditching
| search engines and rely mostly on ChatGPT, some are even
| talking about AI tech in general with religious awe.
| hummus_bae wrote:
| Although AI can't be trusted, we can trust that AI can't be
| trusted.
| computerex wrote:
| This is why when I spot a Tesla on the road, I make every
| effort to try and get as far away from it as possible. Placing
| a machine vision model at the helm of a multi-ton vehicle has
| got to be one of the dumbest things the regulators have let
| Elon get away with.
| jackmott42 wrote:
| Tesla isn't the only brand that is doing that, and most of us
| didn't pay for it and are driving ourselves. So maybe pretend
| to be terrified by something else.
| GuB-42 wrote:
| Other brands tend not to rely so much on a vision model for
| their driving assistance features, instead relying more on
| a variety of sensors, traditional computation, and curated
| databases.
| jayd16 wrote:
| ChatGPT is bad...but ChatGPT with a radar is something
| I'll trust.
| veb wrote:
| If it has a radar, then it should have something to do
| with the sky... hmm, what would be a good name?
| withinboredom wrote:
| Saw a Tesla get pulled over on the Autobahn for holding the
| left lane and being too slow. That was one of my favorite
| moments on the road. Second only to watching a team of
| truckers box in an asshole driver while going up a mountain.
| Der_Einzige wrote:
| Agreed, they tend to drive extremely badly and I don't trust
| them. Avoid them, anything nissan and lifted trucks
| (especially dodge RAMs). Stats show that a solid 5% of all
| ram 1500 drivers have a DUI on record. It's like 0.1% for
| prius drivers.
|
| re:machine vision - What's particularly bad is that they
| don't actually put enough sensors in the cars to be safe.
|
| Pre-collision systems, blind spot monitors, radar, parking
| sensors, etc are all so helpful and objectively good for
| drivers. Doing a vision only implementation and then claiming
| "full self driving" is where it gets awful.
| _joel wrote:
| Has it been proven to be worse than normal drivers that pass
| their test?
| jldugger wrote:
| Where would that data live and how would we disentangle it
| from income effects?
| oakashes wrote:
| I don't think you would need to disentangle it from
| income effects to address the concern of someone who
| actively avoids Teslas on the road, unless they also
| actively avoid cars that look like they belong to people
| with low income.
| jldugger wrote:
| > I don't think you would need to disentangle it from
| income effects to address the concern
|
| Good point, I was mainly thinking of the 'AI saves lives'
| argument.
|
| > of someone who actively avoids Teslas on the road,
| unless they also actively avoid cars
|
| Probably they should avoid hitting _any_ cars =)
| kypro wrote:
| What does worse mean? Tesla's AI drives worse than an 80
| year old with cataracts, but on the other hand it can react
| faster to obstacles than the fastest race-car drivers.
|
| I don't own a Tesla so have no direct experience, but my
| guess would be that it might crash less than a human, but
| has far more near-misses.
| generalizations wrote:
| Worse probably means something like "accidents per mile".
|
| Edit: was curious, so I did the math. The standard unit
| seems to be 'crashes per 100 million miles' so according
| to the Tesla safety report (only source on Autopilot
| safety I could find easily) that works out to [0]: one
| accident for every 4.41 million miles driven = 22.7
| accidents per 100 million miles.
|
| Without autopilot, the AAA foundation (again, easily
| available source, feel free to find a better one, this is
| from 2014) [1] says that our best drivers are in the
| 60-69 age range, and have 241 accidents per 100 million
| miles. Our worst drivers are our younger teens (duh) with
| 1432 crashes per 100 million miles.
|
| So unless you can find better data that contradicts this,
| Autopilot seems like a net benefit.
|
| [0] https://insideevs.com/news/537818/tesla-autopilot-
| safety-rep...
|
| [1] https://aaafoundation.org/rates-motor-vehicle-
| crashes-injuri...
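| The arithmetic above, reproduced as a quick sketch; the inputs
| are the figures quoted from the Tesla safety report and the AAA
| study, not independently verified numbers.
|
|   MILES = 100_000_000
|
|   # Tesla safety report: one accident per 4.41 million miles
|   autopilot = MILES / 4_410_000          # ~22.7 per 100M miles
|   aaa_age_60_69 = 241                    # best age group (AAA)
|   aaa_teens = 1432                       # worst age group (AAA)
|
|   print(f"Autopilot: {autopilot:.1f} crashes per 100M miles")
|   print(f"Ages 60-69: {aaa_age_60_69} per 100M miles")
|   print(f"Teens: {aaa_teens} per 100M miles")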
| alldayeveryday wrote:
| I think your analysis is a good one given the data we
| have, and can be used to draw some conclusions or guide
| general discussion. However the analysis is indeed
| limited by the data available. The AAA data does not
| consider variability by gender, race, socioeconomics,
| location, etc. Further it does not normalize variability
| in the types of driving being done (teslas have limited
| range, are not towing trailers, etc), nor other
| technological advances (modern vs older vehicles).
| danso wrote:
| Autopilot only works on highways and in decent weather and
| visibility -- i.e. the least accident-prone scenarios
| possible. AFAIK, AAA's stats count accidents in all
| situations.
| kjkjadksj wrote:
| I'd wager that for a modern car it doesn't crash any less. My
| modern Toyota also stops faster than I can thanks to a
| big radar unit hooked up to the brakes.
| fragmede wrote:
| In fact, almost all new cars, Teslas included, since
| 2022, have an Automatic Emergency Braking system (AEB),
| which will hit the brakes if you're about to hit
| something. If I were walking in a parking lot and had to
| step in front of an older SUV or a Tesla, I'd step in
| front of the Tesla.
|
| https://www.jdpower.com/cars/shopping-guides/what-is-
| automat...
| flandish wrote:
| Consider this:
|
| A tesla's fsd code has gone through the sdlc they chose,
| involving project management.
|
| I distrust project management.
|
| Therefore: I only trust a tesla's fsd mode as much as I
| trust a project manager's driving.
| grogenaut wrote:
| Do you need it to be proven scientifically to take caution
| on something? I take caution around every car.
| [deleted]
| FartyMcFarter wrote:
| Whether it's "proven" to do so probably depends on what
| kind of proof you're looking for.
|
| But there are plenty of news articles and youtube videos
| that show it doing ridiculous unsafe things (including
| running into a truck that was driving perpendicular to the
| road). So I highly doubt it's as good as a normal driver,
| in fact I'd be shocked if it is.
| sebzim4500 wrote:
| >But there are plenty of news articles and youtube videos
| that show it doing ridiculous unsafe things
|
| There are also plenty of videos of human drivers doing
| absurd things, so this is hardly an argument.
|
| All that matters is whether a random Tesla that you see
| on the street is more likely to crash into you than a
| different car. I know that Tesla has published their own
| statistics which say that it isn't, but I would be very
| interested in seeing an independent study about this.
| gl-prod wrote:
| Because the term AI linked with a chatbot misleads people
| into thinking it's something like AGI or Iron Man's JARVIS.
| yetihehe wrote:
| If we delete the first sentence of your post and leave only
|
| > People are too willing to give up their responsibility of
| critical thought to some kind of omnipotent messiah figure.
|
| This essentially describes humans probably since before we
| became homo sapiens. Again and again we put into positions
| of power those who can look competent instead of actually
| competent people.
| LesZedCB wrote:
| i wish somebody would write an entire science fiction series
| about this! maybe set on a desert planet, after humanity made
| the same mistake with intelligent machines as well
|
| https://en.wikipedia.org/wiki/Dune_(novel)
| insane_dreamer wrote:
| What shocks me is not that Bing got a bunch of stuff wrong, but
| that:
|
| - The Bing team didn't check the results for their __demo__ wtaf?
| Some top manager must have sent down the order that "Google has
| announced their thing, so get this out TODAY".
|
| - The media didn't do fact checking either (though I hold them
| less accountable than the Bing/Msft team)
| chasd00 wrote:
| if ChatGPT could ask questions back it would be a very effective
| phishing tool. People put a lot of blind faith in what they
| perceive as intelligence. You know, a MITM attack on a chatbot
| could probably be used to get a lot of people to do anything
| online or IRL.
| TEP_Kim_Il_Sung wrote:
| AI should probably stick to selling paperclips. There's no chance
| to screw that up.
| partiallypro wrote:
| AI can't be trusted in general, at least not for a long time. It
| gets basic facts wrong, constantly. The fear is that it will
| start eating its own dogfood and being more and more wrong since
| we are putting it in the hands of people that don't know any
| better and are going to use it to generate tons of online content
| that will later be used in the models.
|
| It does make some queries much easier to find, for instance I had
| trouble finding out if the runners-up got the win in the Tour de
| France after the Armstrong doping scandal and it answered it
| instantly. The problem is that it offers answers with confidence;
| I think their adding citations is an improvement over ChatGPT,
| but it needs more.
|
| Luckily, it's still a beta product and not in the hands of
| everyone. Unfortunately, ChatGPT is, which I find more
| problematic.
| Havoc wrote:
| Surprised anyone is getting excited about these mistakes at all.
| Expecting them to be fully accurate is simply not realistic
|
| The fact that they're producing anything coherent at all is a
| feat
| BaseballPhysics wrote:
| Uh, the technology is being integrated into a _search engine_.
| Its job is to surface real information, not made-up BS.
|
| No one would be "getting excited" about this if Microsoft
| wasn't selling this as the future of search.
| m3kw9 wrote:
| If it flops on certain information and the UI isn't properly
| adjusted to limit certain things it does poorly, it will
| backfire on MS.
| oldstrangers wrote:
| I had this idea the other day concerning the 'AI obfuscation' of
| knowledge. The discussion was about how AI image generators are
| designed to empower everyone to contribute to the design process.
| But I argued that you can only reasonably contribute to the
| process if you can actually articulate the reasoning behind your
| contributions. If an AI made it for you, you probably can't,
| because the reasoning is simply "this is the amalgamation of
| training data that the AI spat out." But, there's a realistic
| version of reality where this becomes the norm and we
| increasingly rely on AI to solve for issues that we don't
| understand ourselves.
|
| And, perhaps more worrying, the more widely adopted AI becomes,
| the harder it becomes to correct its mistakes. Right now millions
| of people are being fed information they don't understand, and
| information that's almost entirely incorrect or inaccurate. What
| is the long term damage from that?
|
| We've obfuscated the source data and essentially the entire
| process of learning with LLMs / AIs, and the path this leads down
| seems pretty obviously a net negative for society (outside of
| short-term profit for the stakeholders).
| kneebonian wrote:
| I've said it before and I'll warn of it again here, my biggest
| concern for AI, especially at this stage, is that we abandon
| understanding in favor of letting the AI generate; then the AI
| generates that which we do not understand, but must maintain.
| Then we don't know why we are doing what we are doing but we
| know that it causes things to work how we want.
|
| Suddenly instead of our technology being defined by reason and
| understanding our technology is shrouded in mysticism, and
| ritual. Pretty soon the whole thing devolves into the tech
| people running around in red robes, performing increasingly
| obtuse rituals to appease "the machine spirit", and praying to
| the Omnissiah.
|
| If we ever choose to abandon our need for understanding we will
| at that point have abandoned our ability to progress.
| nix0n wrote:
| People are already misusing statistical models, in ways that
| are already causing harm to people.
|
| See this HN thread from 2016[0], which also points to [1](a
| book) and [2](PDF).
|
| I definitely agree with you that it's going to get a lot worse
| with AI, since it makes it harder to see that it is a
| statistical model.
|
| [0]https://news.ycombinator.com/item?id=12642432
| [1]https://www.amazon.com/Weapons-Math-Destruction-Increases-
| In... [2]https://nissenbaum.tech.cornell.edu/papers/biasincompu
| ters.p...
| Madmallard wrote:
| ChatGPT can give you a full description of why it made the
| decision it did and it usually is fairly accurate.
| danans wrote:
| What the hype machine still doesn't understand is that it's a
| _language_ model, not a knowledge model.
|
| It is optimized to generate information that looks as much like
| language as possible, not knowledge. It may sometimes regurgitate
| knowledge if it is simple or well trodden enough knowledge, or if
| language trivially models that knowledge.
|
| But if that knowledge gets more complex and experiential, it will
| just generate words without attachment to meaning or truth,
| because fundamentally it only knows how to generate language, and
| it doesn't know how to say "I don't know that" or "I don't
| understand that".
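| A minimal sketch of that point: the model's core loop only ever
| does "pick a plausible next token", and nothing in the loop
| checks whether the resulting text is true. The
| next_token_distribution function below is a hypothetical
| stand-in for the trained network, not a real API.
|
|   import random
|
|   def generate(prompt, next_token_distribution, max_tokens=50):
|       tokens = prompt.split()
|       for _ in range(max_tokens):
|           # {token: probability} for the next position
|           dist = next_token_distribution(tokens)
|           words, weights = zip(*dist.items())
|           tokens.append(random.choices(words, weights)[0])
|       # fluency is optimized; truth never enters the loop
|       return " ".join(tokens)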
| pphysch wrote:
| LLM+Search has to be all about ad injection, right?
|
| As a consumer, it seems the value of LLM/LIM(?) is advanced
| autocomplete and concept/content generation. I would pay some
| money for these features. LLM+Search doesn't appeal to me much.
| mtmail wrote:
| "With deeply personalized experiences we expect to be able to
| deliver even more relevant messages to consumers, with the goal
| of improved ROI for advertisers."
|
| https://about.ads.microsoft.com/en-us/blog/post/february-202...
| tasty_freeze wrote:
| Supposedly, Joseph Weizenbaum kept the chat logs of Eliza so he
| could better see where his list of canned replies was falling
| short. He was horrified to find that people were really
| interacting with it as if it understood them.
|
| If people fell for the appearance of AI that resulted from a few
| dozen canned replies and a handful of heuristics, I 100% believe
| that people will be taken in by ChatGPT and ascribe it far more
| intelligence than it has.
| LesZedCB wrote:
| papers are coming out weekly about their emergent properties.
|
| despite people wanting transformers to be nothing more than
| fancy, expensive excel spreadsheets, their capabilities are far
| from simple or deterministic.
|
| the fact that in-context learning is getting us 80%ish of the
| way to tailored behavior is just fucking incredible. they _are_
| definitely, meaningfully intelligent in some (not-so-small)
| way.
|
| this paper[1] goes over quite a few examples and models
|
| [1] https://storage.googleapis.com/pub-tools-public-
| publication-...
| flandish wrote:
| >Bing
|
| _No_ AI can be trusted. FTFY.
| seydor wrote:
| I cant wait for the era of conversational web so i can do away
| with clickbait titles and opinions. Truly everyone has one. The
| experiment with "open publishing" has so far only proved that
| signal to noise remains constant
| notacoward wrote:
| > so i can do away with clickbait titles and opinions
|
| Do you actually think that will be the result? Why not the
| _exact opposite_? ChadGPT and the others are for all practical
| purposes trained to create content that is superficially
| appealing and plausible - i.e. perhaps not clickbait but a
| related longer-form phenomenon - without any underlying insight
| or connection to truth. That would make conversational AI even
| _more_ of a time sink than today's clickbait. Why do you
| imagine it would turn out otherwise?
| thorum wrote:
| The errors when summarizing the Gap financial report are
| quite surprising to me. I copied the same source paragraph (which
| is very clearly phrased) into ChatGPT and it summarized it
| accurately.
|
| Is it possible they are 'pre-summarizing' long documents with
| another algorithm before feeding them to GPT?
| coliveira wrote:
| I think ChatGPT and its lookalikes spell the end of the public
| internet as we know it. People now have tools to generate pages
| as they see fit. Google will not be able to determine what are
| high quality pages if everything looks the same and is generated
| by AI bots. Users will be unable to find trustworthy results, and
| many of these results will be filled with generated garbage that
| looks great but is ultimately false.
| wwwpatdelcom wrote:
| I have been trying to help folks understand what the underlying
| mechanisms of these generative LLMs are, by putting together
| some YouTube videos on the topic, so it's not such a surprise
| when we get wrong answers from them.
|
| * [On the question of replacing
| Engineers](https://www.youtube.com/watch?v=GMmIol4mnLo)
|
| * [On AI Plagiarism](https://www.youtube.com/watch?v=whbNCSZb3c8)
|
| The consensus seems to be building now on HackerNews that there
| is a huge over-hype. Hopefully these two videos help see some of
| the nuance behind why it's an over-hype.
|
| That being said, being that language generation is probabilistic,
| a given language model which is transformer based can either be
| trained or fine-tuned to have fewer errors in a particular domain
| - so this is all far from settled.
|
| Long-term, I think we're going to see something closer to human
| intelligence from CNN's and other forms of neural networks than
| from transformers, which are really a poor man's NN. As hardware
| advances and NN's inevitably become cheaper to run, we will
| continue to see scarier and scarier A.I. -- I'm talking over a
| 10-20 year timeframe.
| whimsicalism wrote:
| HN was always going to be overly pessimistic with regards to
| this stuff, so this was utterly predictable.
|
| I work in this field & it almost pains me to see it come into
| the mainstream and see all of the terrible takes that pundits
| can contort this into, ie. LLM as a "lossy jpeg of the
| internet" (bad, but honestly one of the better ones).
| wwwpatdelcom wrote:
| Yes..."Lossy JPEG," at least describes the idea that there
| is, _some_ kind of "subsampling," going on, rather than
| just...a magical box?
|
| I think most laypeople understand the simple statement, "it's
| a parrot."
|
| I had the original author of this paper reach out to me about
| my plagiarism video on Mastodon:
|
| https://dl.acm.org/doi/10.1145/3442188.3445922
|
| The idea of a lossy JPEG/Parrot helps capture the idea that
| there are dangers and opportunities in LLM's. You can have
| fake or doctored images spread, you can have a Parrot swear
| at someone and cause un-needed conflict - but they can also
| be great tools and/or cute and helpful companions, as long as
| we understand their limitations.
| sebzim4500 wrote:
| The statement "it's a parrot" may be simple to understand
| but frankly I don't think many people who have used chatGPT
| will believe it.
|
| At least "lossy JPEG" feels vague enough to be
| unfalsifiable.
| whimsicalism wrote:
| The issue is that it doesn't just recreate things it was
| trained on, it generates _novel content_. There is no
| reason that novel pathways of "thought" (or whatever makes
| one comfortable) aren't emergent in a model under
| optimization & regularization.
|
| This is what the "lossy compression" and "stochastic
| parrot" layperson models do not capture. Nonetheless,
| people will lap them up. They want a more comfortable
| understanding that lets them avoid having to question their
| pseudo-belief in souls and the duality of mind and body.
| Few in the public seem to want to confront the idea of the
| mind as an emergent phenomenon from interactions of neurons
| in the brain.
|
| It is not simply regurgitating training data like everyone
| seems to want it to.
| wwwpatdelcom wrote:
| But what is novel content?
|
| I can easily falsify the accusation that, "people
| underestimate transformers and don't see that they are
| actually intelligent," by defeating the best open-source
| transformer-based word embedding (at the time) with a
| simple TF-DF based detector (this was back in September).
|
| https://www.patdel.com/plagiarism-detector/
|
| No, these things are not, "emergent," they are just
| rearranging numbers. You don't have to use a transformer
| or neural network at all to re-arrange numbers and create
| something that is even more, "artificially intelligent,"
| than one that does use transformers it turns out!
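| A rough sketch of one plausible reading of the "TF-DF based
| detector" mentioned above (presumably TF-IDF): vectorize a
| candidate text against known source documents and flag high
| cosine similarity. This is an illustration, not the commenter's
| actual implementation.
|
|   from sklearn.feature_extraction.text import TfidfVectorizer
|   from sklearn.metrics.pairwise import cosine_similarity
|
|   def flag_overlap(candidate, sources, threshold=0.8):
|       vec = TfidfVectorizer().fit(sources + [candidate])
|       sims = cosine_similarity(vec.transform([candidate]),
|                                vec.transform(sources))[0]
|       # high similarity to any source suggests copied wording
|       return sims.max() >= threshold, sims.max()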
| hackinthebochs wrote:
| >No, these things are not, "emergent," they are just
| rearranging numbers.
|
| This is a bad take. Most ways to "rearrange numbers"
| produce noise. That there is a very small subset of
| permutations that produce meaningful content, and the
| system consistently produces such permutations, is a
| substantial result. The question of novelty is whether
| these particular permutations have been seen before, or
| perhaps are simple interpolations of what has been seen
| before. I think its pretty obvious the space of possible
| meaningful permutations is much larger than what is
| present in the training set. The question of novelty then
| is whether the model can produce meaningful output (i.e.
| grammatically correct, sensible, plausible) in a space
| that far outpaces what was present in the training
| corpus. I strongly suspect the answer is yes, but this is
| ultimately an empirical question.
| wwwpatdelcom wrote:
| I would love to read anything you have written about the
| topic at length. Thanks for your contribution.
| whimsicalism wrote:
| I can tell that this conversation is not going to be
| super productive, so a few brief thoughts:
|
| > I can easily falsify the accusation that, "people
| underestimate transformers and don't see that they are
| actually intelligent,"
|
| I think that you have an idiosyncratic definition of what
| "falsify" means compared to what most might. Getting away
| from messy definitions of "intelligent" which I think are
| value-laden, I see nothing in your blog post that
| falsifies the notion that LLMs can generate novel content
| (another fuzzy value-laden notion perhaps).
|
| > these things are not, "emergent," they are just
| rearranging numbers.
|
| It seems non-obvious to me that 'rearranging numbers'
| cannot lead to anything emergent out of that process, yet
| cascading voltage (as in our brain) can.
| wwwpatdelcom wrote:
| I would love to read anything you have written or studied
| about this topic at length. Thanks for your replies.
| noobermin wrote:
| >There is no reason that novel pathways of "thought" (or
| whatever makes one comfortable) aren't emergent in a
| model under optimization & regularization.
|
| Please substantiate this assertion. People always just
| state it as a fact without producing an argument for it.
| whimsicalism wrote:
| You're asking me to substantiate a negative - ie.
| identify any possible reason someone might provide that
| novel behavior might not be emergent out of a model under
| optimization and then disprove it, but ahead of time.
| This is a challenging task.
|
| Our minds are emergent out of the interaction of billions
| of neurons in our brain. Each is individually pretty
| dumb, just taking in voltage and outputting voltage (to
| somewhat oversimplify). Out of that simple interaction &
| under the pressures of evolutionary optimization, we have
| reached a more emergent whole.
|
| Linear transformations stacked with non-linearities can
| similarly create an individually dumb input and output
| that under the pressure of optimization lead to a more
| emergent whole. If there is a reason why this has to be
| tied to voltage regulating neuron substrate, I have yet
| to see a compelling one.
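| For concreteness, "linear transformations stacked with
| non-linearities" amounts to something like the toy forward pass
| below (a generic two-layer network, not any particular model):
|
|   import numpy as np
|
|   def forward(x, W1, b1, W2, b2):
|       h = np.maximum(0, x @ W1 + b1)  # linear map + ReLU
|       return h @ W2 + b2              # second linear map
|
|   # Each step is individually dumb; anything interesting comes
|   # from how optimization shapes the weights, not from the
|   # operations themselves.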
| mtlmtlmtlmtl wrote:
| I think it's unfair and asinine to caricature sceptics as
| ignorant people in denial, holding on to some outdated
| idea of a soul. That's the sort of argument someone makes
| when they're so entrenched in their own views they see
| nothing but their own biases.
| sebzim4500 wrote:
| Being sceptical of chatGPT is entirely reasonable, and
| there is plenty of room for discussion on exactly when we
| will hit the limits of scaling LLMs.
|
| No one who has used chatGPT more than a couple of times
| will argue in good faith that it is a "parrot", however,
| unless they have an extremely weird definition of
| "parrot".
| whimsicalism wrote:
| Ask people to describe how they think the mind functions
| and you will very often get something very akin to soul-
| like belief. Many, many people are not very comfortable
| with the mind as emergent phenomenon. A straight majority
| of people in the US (and likely globally) believe in
| souls when polled, you are the one imputing the words of
| "ignorant people in denial" onto my statement of why
| people find views to the contrary uncomfortable.
|
| I understand that HN is a civil community. I don't think
| it is crossing the line to characterize people I disagree
| with as wrong and also theorize on why they might hold
| those wrong beliefs. Indeed, you are doing the same thing
| with my comment - speculating on why I might hold views
| that are 'asinine' because I see 'nothing but [my] own
| biases.'
| mtlmtlmtlmtl wrote:
| I'm not saying it's not true of most people in the world,
| but that doesn't make it a constructive argument. And you
| didn't use the words ignorant and denial, but they're
| reasonable synonyms to what you did say.
|
| When I do the "same thing" I'm really saying that when
| you represent yourself as from the field, you might want
| to cultivate a more nuanced view of the people outside
| the field, if you want to be taken seriously.
|
| Instead, given the view you presented, I'm forced to give
| your views the same credence I give a physicist who says
| their model of quantum gravity is definitely the correct
| one. I.e: "sure, you'd say that, wouldn't you"
| whimsicalism wrote:
| I am providing a reason why "the public" might be
| uncomfortable around these ideas. You accuse me of
| misrepresenting the public's beliefs as ignorant and
| outdated when really the public has a nuanced view on
| this subject. I am merely taking the majority of people
| at their word when they are polled on the subject.
|
| Most people believe in souls. Most people do not believe
| in minds as emergent out of interactions of neurons. I am
| not sure how to cultivate a more nuanced view on this
| when flat majorities of people say when asked that they
| hold the belief I am imputing on them.
|
| Am I saying that this is where all skepticism comes from?
| No. Is it a considerable portion? Yes.
| SketchySeaBeast wrote:
| If we think of the tools as generating entirely novel
| content then I'd suggest we're using them for the wrong
| thing here - we shouldn't be using it at all as a
| glorified (now literal) search engine, it should be
| exploring some other space entirely. If we discovered a
| brand new sentient creature we wouldn't immediately try
| to fill its head with all the knowledge on the internet
| and then force it to answer what the weather will be
| tomorrow.
| whimsicalism wrote:
| I have no idea what sentience really means, but I think
| novel content generation is a necessary but not
| sufficient component.
| SketchySeaBeast wrote:
| True, I was overly grandiose. Regardless, we're taking
| something that can apparently generate new intellectual
| content, but we're using it as a beast of burden.
| 1vuio0pswjnm7 wrote:
| When Google's Bard AI made a mistake, GOOG share price dropped
| over 7%.
|
| What about Baidu's Ernie AI?
|
| Common retort to criticism of conversational AI is "But it's
| useful."
|
| Yes, it is useful as a means to create hype that can translate to
| increases in stock price and increased web traffic (and
| thereby increased revenue from advertising services).
|
| https://www.reuters.com/technology/chinas-baidu-finish-testi...
| jmount wrote:
| It can't be emphasized enough, this isn't a procedure failing
| when used -- this is a canned recording of it failing. This means
| the group either didn't check the results, or did check them and
| saw no way forward other than getting this out the door. It is
| only small samples, but it is fairly damning that it is hard to
| produce error-free curated examples.
| ddren wrote:
| Out of curiosity, I searched the pet vacuum mentioned in the
| first example, and found it on amazon [0]. Just like Bing says,
| it is a corded model with a 16 feet cord, and searching the
| reviews for "noise" shows that many people think that it is too
| loud. At least in this case, it seems that Bing got it right.
|
| [0]: https://www.amazon.com/Bissell-Eraser-Handheld-Vacuum-
| Corded...
| dboreham wrote:
| Curious why someone would keep a vacuum as a pet.
| Merad wrote:
| Bing actually got tripped up by HGTV simplifying a product name
| in their article. It used this HGTV [0] article as its source
| for the top pet vacuums. The article lists the "Bissell Pet
| Hair Eraser Handheld Vacuum" and links to [1] which is actually
| named "Bissell Pet Hair Eraser Lithium Ion Cordless Hand
| Vacuum". The product you found is the "Bissell Pet Hair Eraser
| Handheld Vacuum, Corded." A human likely wouldn't even notice
| the difference because we'd just follow the link in the
| article, or realize the corded vacuum was the wrong item based
| on its picture, but Bing has no such understanding.
|
| [0]: https://www.hgtv.com/shopping/product-reviews/best-
| vacuums-f...
|
| [1]: https://www.amazon.com/BISSELL-Eraser-Lithium-Handheld-
| Cordl...
| jiggyjace wrote:
| Yeah this is my experience cross-checking the article with my
| own Bing AI access. Trying to replicate the Appendix section,
| Bing AI gets everything right for me.
| kibwen wrote:
| Our exposure to smart-sounding chatbots is inducing a novel form
| of pareidolia: https://en.wikipedia.org/wiki/Pareidolia .
|
| Our brains are pattern-recognition engines and humans are social
| animals; together that means that our brains are predisposed to
| anthropomorphizing and interpreting patterns as human-like.
|
| For the whole of human history thus far, the only things that we
| have commonly encountered that conversed like humans have been
| other humans. This means that when we observe something like
| ChatGPT that appears to "speak", we are susceptible to
| interpreting intelligence where there is none, in the same way
| that an optical illusion can fool your brain into perceiving
| something that is not happening.
|
| That's not to say that humans are somehow special or that
| human intelligence is impossible to replicate. But these things
| right here aren't intelligent, y'all. That said, can they be
| useful? Certainly. Tools don't need to be intelligent to be
| useful. A chainsaw isn't intelligent, and it can still be highly
| useful... and highly destructive, if used in the wrong way.
| pixl97 wrote:
| >we are susceptible to interpreting intelligence where there is
| none,
|
| I disagree, as this is much too simple a statement. You have had
| near daily dealings with less than human intelligences for most
| of your life, we call them animals. We realize they have a wide
| range of intelligence from extremely simple behavior to near
| human competency.
|
| This is why I dismiss your 'not intelligent yet' statement. The
| problem we lack here is one of precise language when talking
| about the components of intelligence and the wide range in
| which it manifests.
| frereubu wrote:
| For me the fundamental issue at the moment for ChatGPT and others
| is the tone it replies in. A large proportion of the information
| in language is in the tone, so someone might say something like
| "I'm pretty sure that the highest mountain in Africa is Mount
| Kenya" whereas ChatGPT instead says "the highest mountain in
| Africa is Mount Kenya", and it's the "is" in the sentence that's
| the issue. So many issues in language revolve around "is" - the
| certainty is very problematic. It reminds me of a tutor at art
| college who said too many people were producing "things that look
| like art". ChatGPT produces sentences that look like language, and
| because of "is" they read as quite compelling due to the
| certainty it conveys. Modify that so it says "I think..." or "I'm
| pretty sure..." or "I reckon..." and the sentence would be much
| more honest, but the glamour around it collapses.
| esotericimpl wrote:
| [dead]
| Plough_Jogger wrote:
| I have a feeling we will see a resurgence of some of the ideas
| around expert systems; current language models inherently cannot
| provide guarantees of correctness (unless e.g., entire facts are
| tokenized together, but this limits functionality significantly).
| bambax wrote:
| > _Bing AI can't be trusted_
|
| Of course it can't. No LLM can. They're bullshit generators. Some
| people have been saying it from the start, and now everyone is
| saying it.
|
| It's a mystery why Microsoft is going full speed ahead with this.
| A possible explanation is that they do this to annoy / terrify
| Google.
|
| But the big mystery is, why is Google falling for it? That's
| inexplicable, and inexcusable.
| Nemo_bis wrote:
| > It's a mystery why Microsoft is going full speed ahead with
| this.
|
| Maybe they had some idle GPU capacity in some DC or they needed
| to cross-subsidize Azure to massage the stock market
| multipliers, or something.
| coffeeblack wrote:
| It just goog... ehm bings your question and then summarizes what
| the resulting web pages say. Works well, but ChatGPT works much
| better.
| JoshTko wrote:
| Hot take: ChatGPT rises and crashes fast after SEO optimization
| shifts to ChatGPT optimization.
| userbinator wrote:
| I don't know if it's started to use AI for regular search
| queries, but I noticed within the past week or two that Bing
| results got _much_ worse. It seems it doesn't even respect
| quoting anymore, and the second and subsequent pages of results
| are almost entirely duplicates of the first. I normally use Bing
| when Google fails to yield results or decides to hellban me for
| searching too specifically, and for the past few years it was
| acceptable or even occasionally better, but now it's much worse.
| If that's the result of AI, then _do not want!!!_
| joe_the_user wrote:
| Well, reworking Bing and Google for a ChatGPT interface is
| going to be a massive hardware and software enterprise. And there
| are a lot of questions involved to say the least.
|
| Where will the software engineers come from? We're in a belt-
| tightening part of the business cycle and FANGs have a pressure
| not to hire, so you assume the existing engineers. But these
| engineers are now working on real things so those real things
| may suffer. Which brings actual profits? The future AI thing or
| the present? The future AI is unavoidable given the
| possibilities are visible and the competition is on, but "shit
| shows" of various sorts seem very possible.
|
| Where will the hardware and the processing power come from?
| There are estimates of server power consumption quintupling [1]
| but these are arbitrary - even if it just doubles, just
| "plugging the cords" in takes time. And where would the new
| TPUs/GPUs come from? TSMC has a capacity determined by
| investments already made and much of that capacity is allotted
| already - more capacity anywhere would involve massive capital
| allocation and what level of increased profits will pay for
| this?
|
| [1] https://www.wired.com/story/the-generative-ai-search-race-
| ha...
| Eduard wrote:
| > I normally use Bing when Google fails to yield results...
|
| Every once in a while I hear someone on Hacker News hitting a
| dead end with Google Search. Can you give an example where
| Google search fails, but other search engines (e.g. Bing)
| provide results? Must be fringe niche topics, no?
|
| >... or decides to hellban me for searching too specifically
|
| Is hellbanning a thing at Google? What happens if one gets
| hellbanned?
| rvz wrote:
| There is no point in hyping about a 'better search engine' when
| this continues to hallucinate incorrect and inaccurate results.
| It is now reduced to an 'intelligent sophist' instead of a search
| engine. Once many realise that it also frequently hallucinates
| nonsense, it is essentially no better than Google Bard.
|
| After looking at the limitations of ChatGPT and Bing AI it is now
| clear that they aren't reliable enough to even begin to challenge
| search engines or even cite their sources properly. LLMs amount
| to bullshit generators, which is what this current AI hype
| is all about.
|
| Until all of these AI models are open-sourced and transparent
| enough to be trustworthy or if a competitor does it instead, then
| there is nothing revolutionary about this AI hype other than an AI
| SaaS using a creative Clubhouse-like waitlist mania.
| mnd999 wrote:
| Of course it can't. That you're even surprised by this enough to
| write a blog post is more worrying.
| password54321 wrote:
| Which part of the post did the author convey surprise that it
| can't be trusted? It just seems like a response to the mass
| hype currently surrounding AI.
| mnd999 wrote:
| Nobody writes a blog called '1 + 1 = 2' do they? That would
| be obvious and dull. It stands to reason the author thought
| there was something surprising or interesting about it, or
| why would they bother?
| beebmam wrote:
| I already don't trust virtually any search results except
| grep/rg.
| mojo74 wrote:
| To follow up on the author's example, Bing search doesn't even
| know when the new Avatar film is actually out (DECEMBER 17,
| 2021?)
|
| https://www.bing.com/search?q=when+is+the+new+avatar+film+ou...
|
| Bing AI doesn't stand a chance.
| cwkoss wrote:
| I think this is a weird non-issue and it's interesting people are
| so concerned about it.
|
| - Human curated systems make mistakes.
|
| - Fiction has created the trope of the omniscient AI.
|
| - GPT curated systems also make mistakes.
|
| - People are measuring GPT against the omniscient AI mythology
| rather than the human systems it could feasibly replace.
|
| - We shouldn't ask "is AI ever wrong" we should ask "is AI wrong
| more often than the human-curated information? (There are levels
| of this - min wage truth is less accurate than senior engineer
| truth.)
|
| - Even if the answer is that AI gets more wrong, surely a system
| where AI and humans are working together to determine the truth
| can outperform a system that is only curated by either alone.
| (for the next decade or so, at least)
| nirvdrum wrote:
| I think there's an issue with gross misrepresentation. This
| isn't being sold as a system with 50% accuracy where you need
| to hold its hand. It's sold as a magical being that can answer
| all of your questions and we know that's how people will treat
| it. I think this is a worse situation than data coming from
| humans since people are skeptical of one another. But, many
| think AI will be an impartial, omnipotent source of facts, not
| a bunch of guesses that might be right slightly more often than
| it's wrong.
| cwkoss wrote:
| I see your point, but I feel like there's going to be a
| 'eating tidepods' level societal meme within a year mocking
| people who fall for AI hallucinations as "boomers", and then
| the omnipotent AI myth will be shattered.
|
| Essentially, I believe the baseline misinformation level is
| being undercounted by many and so the delta in the interim
| while people are learning the fallibility of AI is small
| enough it is not going to cause significant issues.
|
| Also the 'inoculation' effect of getting the public using
| LLMs could result in a net social benefit as the common man
| will be skeptical of authorities appealing to AI to justify
| actions - which I think could be much more dangerous than
| Suzie copying hallucinated facts into her book report.
| nirvdrum wrote:
| If the only negative effect is some people look foolish,
| that's an acceptable risk. I'm worried a bit it's closer to
| people thinking that Tesla has a full self-driving system
| because Tesla called it auto-pilot and demonstrated videos
| of the car driving without a human occupant. In that case,
| yeah the experts understand that "auto-pilot" still means
| driver-assisted, but we can't ignore the fact that most
| people don't know that and that the marketing info
| reinforced the wrong ideas.
|
| I don't want to argue with people that won't understand the
| AI model can be wrong. I'm far more concerned with public
| policy being driven by made up facts or someone responding
| poorly in an emergency situation because a search engine
| synthesized facts. Outside of small discussions here, I
| don't see any acknowledgment about the current limitations
| of this technology, only the sunny promises of greener
| pastures.
| Barrin92 wrote:
| >we should ask "is AI wrong more often than the human-curated
| information?
|
| No, this isn't what we should ask, we should ask if the
| interface that AI provides is conducive to giving humans the
| ability to detect the mistakes that it makes.
|
| The issue isn't how often you get wrong information, it's _to
| what extent you're able to spot wrong information_ under
| normal use cases. And the uniform AI interface that gives you
| complete bullshit in the technical sense of that term provides
| no indication regarding the trustworthiness of the information.
| A source with 20% of wrong info that you don't notice is worse
| than one with 80% that you identify.
|
| When you use traditional search you get an unambiguous source,
| context, date, language, authorship and so forth and you must
| place what you read yourself. You know the onus is on you.
| ChatGPT is the half self-driving car. It's an inherently
| pathological interaction because everything in the design
| screams to take the hands off the wheel. It's an opaque system,
| and a blackbox with the error rate of a human is a disaster.
| Human-machine interaction is not human-human interaction.
| 10rm wrote:
| I agree 100% with your last point, even as someone who is
| relatively more skeptical of GPT than the average person.
|
| I think a lot of the concern though is coming from the way the
| average person is reacting to GPT and the way they're using it.
| The issue isn't that GPT makes mistakes, it's that people (by
| their own fault, not GPT necessarily) get a false sense of
| security from GPT, and since the answers are provided in a
| concise, well-written format don't apply the same skepticism
| they do when searching for something. That's my experience at
| least.
|
| Maybe people will just get better at using this, the tools will
| improve, and it won't be as big an issue, but it feels like a
| trend from Facebook to TikTok of people opting for more easily
| digestible content at the cost of more disinformation.
| cwkoss wrote:
| Interesting points.
|
| - I wonder what proportion of people who are getting a false
| sense of security with GPT also were getting that same false
| sense from human systems. Will this shift entail a net
| increase in gullibility, or is this just 'laundering'
| foolishness?
|
| - I think the average tiktok user generally has much better
| media literacy than average facebook user. But probably
| depends a lot on your filter bubble.
| dqpb wrote:
| > Bing AI did a great job of creating media hype, but their
| product is no better than Google's Bard
|
| Remind me, how do I access Bard?
| andrewstuart wrote:
| AI providers really need to set expectations correctly.
|
| They are getting into trouble by allowing people to think the
| answers will be correct.
|
| They should be stating up front that AI tries to be correct but
| isn't always and you should verify the results.
| imranq wrote:
| Unfortunately this overhyped launch has started the LLM arms
| race. Consumers don't seem to care in general about factuality as
| long as they can get an authoritative sounding answer that is
| somewhat accurate...at least for now
| wg0 wrote:
| Somewhat opposite - if LLMs continue to perform like that with
| made up information and such, their credibility would erode
| over time and a de facto expectation would be that they don't
| work or aren't accurate, which would result in people being
| less reliant on them.
|
| Much like self-driving cars haven't had a mainstream
| breakthrough yet.
| SketchySeaBeast wrote:
| This hasn't really been put in front of consumers, has it? This
| is all very niche - how many even know that there is a Bing AI
| thing going on? I think it's far too early to make statements
| about what people think or want.
| Tepix wrote:
| OpenAI raced past 100 million users, that's hardly niche. All
| tech people i've talked to have played around with it. Some
| use it every day.
| SketchySeaBeast wrote:
| But is it a product or a toy for the majority of those
| users?
| at-fates-hands wrote:
| As someone who does SEO on a regular basis, I thought it
| would be brilliant to have this write content for you.
| Google already made updates to its algo to ferret out
| content that is created by AI and list it as spam.
|
| I figure we're going to see a lot of guard rails being put
| up as this gains wider usage to try and cut off nefarious
| uses of it. I know right now, there are people who have
| already figured out how to bypass the filters and are
| selling services on the dark web that cater to people who
| want to use it for malware and other scams:
|
| _Hackers have found a simple way to bypass those
| restrictions and are using it to sell illicit services in
| an underground crime forum, researchers from security firm
| Check Point Research reported._
|
| https://arstechnica.com/information-
| technology/2023/02/now-o...
| teraflop wrote:
| Right now, if you go to bing.com, there's a big "Introducing
| the new Bing" banner, which takes you to the page about their
| chatbot. You have to get on a waitlist to actually use it,
| though.
| SketchySeaBeast wrote:
| So it's limited to those who use bing and who opt in? Still
| fairly niche in that case.
| Nathanba wrote:
| Unfortunately? This is the best kind of arms race, the one
| where we race towards technology that is ultimately going to
| help all of humanity.
| SketchySeaBeast wrote:
| I'm trying to decide if this is a valid arms race or jumping
| the gun. Kind of feels like if someone came up with auto
| racing before the invention of the ICE and so they spend a
| bunch of time pushing race cars around the track only for
| them all to realize this isn't working and give up on the
| whole idea.
| arduinomancer wrote:
| I think it's more like Tesla Autopilot
|
| In the beginning there was lots of hype because you couldn't
| use it
|
| But now that it's in consumer hands there's tons of videos of it
| messing up, doing weird stuff
|
| To the point that it's now common knowledge that autopilot is
| not actually magical AI driving
| eppp wrote:
| Bing AI gets a pass because it's disruptive. Google doesn't
| because it is the incumbent. Mystery solved.
| elorant wrote:
| I frequently use ChatGPT to research various topics. I've noticed
| that eight out of 10 times I ask it to recommend some books about
| a topic it recommends non-existing books. There's no way I'd
| trust a search engine built on it.
| FleurBouquet wrote:
| [dead]
| BubbleRings wrote:
| There is really no other way to think of them, in terms of
| reliability, than lying bastards. I mean, ChatGPT is very fun
| and quite useful, but think of it. Anybody that has played with
| it for even an hour has been confidently lied to by it,
| multiple times. If you keep friends around that treat you like
| that, you need a better friend picker! (Maybe an AI could
| help.)
| elorant wrote:
| ChatGPT has no concept of truth or lie. It's a language model
| that uses statistical models to predict what to say next.
| Your assumptions about its intentions reflect only your bias.
| noobermin wrote:
| Reading this honestly made me afraid, like Bing AI
| is a tortured soul, semi-conscious, stuck in a box. I'm not sure
| how I feel about this[0].
|
| [0] https://twitter.com/vladquant/status/1624996869654056960
| gptgpp wrote:
| Really?
|
| I think that the first example is one of the funniest things I've
| read today.
|
| The second example, getting caught in a predictive loop, is
| also pretty funny considering it's supposed to be proving it's
| conscious (eg. not an LLM, prone to looping like that lol).
|
| The last one, littered with emojis and repeating itself like a
| deranged ex is just _chef's kiss._
|
| Thanks for that.
| coffeebeqn wrote:
| It's just good at acting. I'm sure it can be led to behave in
| almost any way imaginable given the right prompts
| jerf wrote:
| I have come to two conclusions about the GPT technologies after
| some weeks to chew on this:
|
| 1. We are so amazed by its ability to babble in a confident
| manner that we are asking it to do things that it should not be
| asked to do. GPT is basically the language portion of your brain.
| The language portion of your brain does not do logic. It does not
| do analyses. But if you built something very like it and asked it
| to try, it might give it a good go.
|
| In its current state, you really shouldn't _rely_ on it for
| anything. But people will, and as the complement of the Wile E.
| Coyote effect, I think we're going to see a lot of people not
| realize they've run off the cliff, crashed into several rocks on
| the way down, and have burst into flames, until after they do it
| several dozen times. Only then will they look back to realize
| what a cockup they've made depending on these GPT-line AIs.
|
| To put it in code assistant terms, I expect people to be
| increasingly amazed at how well they seem to be coding, until you
| put the results together at scale and realize that while it
| kinda, sorta works, it is a new type of never-before-seen crap
| code that nobody can or will be able to debug short of throwing
| it away and starting over.
|
| This is not because GPT is broken. It is because what it is is
| not correctly related to what we are asking it to do.
|
| 2. My second conclusion is that this hype train is going to crash
| and sour people quite badly on "AI", because of the pervasive
| belief I have seen even here on HN that this GPT line of AIs _is_
| AI. Many people believe that this is the beginning and the end of
| AI, that anything true of interacting with GPT is true of AIs in
| general, etc.
|
| So people are going to be even more blindsided when someone
| develops an AI that uses GPT as its language comprehension
| _component_, but does this higher level stuff that we _actually_
| want sitting on top of it. Because in my opinion, it's pretty
| clear that GPT is producing an _amazing_ level of comprehension
| of what a series of words means. The problem is, that's _all_ it
| is really doing. This accomplishment should not be understated.
| It just happens to be the fact that we're basically abusing it in
| its current form.
|
| What it's going to do as a _part_ of an AI, rather than the whole
| thing, is going to be amazing. This is certainly one of the hard
| problems of building a "real AI" that is, at least to a first
| approximation, solved. Holy crap, what times we live in.
|
| But we do not have this AI yet, even though we think we do.
| phire wrote:
| Sentient AIs in science fiction are always portrayed as being
| more-or-less infallible, at least when referencing their own
| knowledge banks.
|
| Then ChatGPT comes along and starts producing responses good
| enough that people feel like it's almost a sentient AI. And they
| suddenly start expecting it to share the infallibility that
| fictional AIs have always possessed.
|
| But it's not a sentient AI. It's just a language model. Just a
| beefed up auto-correct. I'm very impressed by just what
| capabilities a language model gets when you throw this many
| resources at it (like, it seems to be able to approximate logic
| and arithmetic to decent accuracy, which is unexpected).
|
| Also... even if it was a sentient AI, why would it be
| infallible? Humans are sentient, and nobody ever accused us of
| being infallible.
| spikder wrote:
| The lack of consistency is a big issue. It may well be able
| to organize your trip to mexico, but then it tells me that
| "the product of two primes must be prime because each factor
| is prime" ... how will one ever trust it? Moreover, how to
| use it?
|
| If a Tesla can get you there with 1% human intervention, but
| that happens to be the 1% that would have killed you had you
| not intervened ... how do we interface with such systems?
| hackinthebochs wrote:
| >But it's not a sentient AI. It's just a language model. Just
| a beefed up auto-correct.
|
| There is a large space between "sentient" and "beefed up
| autocorrect". Why do people insist on going for the most
| reductive description they can muster?
| pixl97 wrote:
| Because the average person you speak to would consider
| beefed up autocorrect to be near magic as it is. Once you
| get near to the limits of an individual's comprehension,
| adding more incomprehensible statements/ideas doesn't
| really change much, their answer is still 'magic'.
| rdedev wrote:
| Really liked your analogy on GPT being similar to the language
| center of the brain. Almost all current methods to teach GPT
| deductive logic have been through an inductive approach: giving
| it training examples on how to do deduction. Thing is it might
| be possible to reach 80% of the way there with more data and
| parameters but a wall will be hit sooner or later
| btown wrote:
| I love the mental model of GPT as only one part of the brain,
| but I believe that the integration of other "parts" of the
| brain will come sooner than you think. See, for instance,
| https://twitter.com/mathemagic1an/status/1624870248221663232 /
| https://arxiv.org/abs/2302.04761 where the language model is
| used to create training data that allows it to emit tokens that
| function as lookup oracles by interacting with external APIs.
| And an LLM can itself understand when a document is internally
| inconsistent, relative to other documents, so it can integrate
| the results of these oracles if properly trained to do so.
| We're only at the surface of what's possible here!
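|
| A rough sketch of that oracle-token idea in Python (the
| [TOOL(...)] marker syntax and the stubbed-out search backend are
| made up for illustration; they are not the paper's actual format):
|
|     import re
|
|     def lookup_in_index(query):
|         # Stand-in for a real search or API backend.
|         return f"<result for: {query}>"
|
|     TOOLS = {
|         "SEARCH": lookup_in_index,
|         "CALC": lambda expr: str(eval(expr, {"__builtins__": {}})),
|     }
|
|     def run_tools(text):
|         # Replace each [TOOL(args)] marker the model emitted with
|         # the tool's output before feeding the text back in.
|         def call(match):
|             name, args = match.group(1), match.group(2)
|             return TOOLS[name](args)
|         return re.sub(r"\[([A-Z]+)\((.*?)\)\]", call, text)
|
|     print(run_tools("Revenue was [CALC(1861.9 * 0.554)] million."))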
|
| I also look to the example of self-driving cars - just because
| Tesla over-promised, that didn't discourage its competitors
| from moving forward slowly but surely. It's hard to pick a
| winner right now, though - so much culturally in big tech is up
| in the air with the simultaneity of layoffs and this sea change
| in AI viability, it's hard to know who will be first to release
| something that truly feels rock-solid.
| phire wrote:
| Yes, this is something that I've been thinking ever since
| GPT3 came out.
|
| It's insanely impressive what it can do given it's just a
| language model. But if you start gluing on more components,
| we could end up with a more or less sentient AGI within a few
| years.
|
| Bing have already hooked it up to a search engine. That post
| hooks it up to other tools.
|
| I think what is needed next is a long term memory where it
| can store dynamic facts and smartly retrieve them later,
| rather than relying on just the current 4000 token
| window. It needs to be able to tell when a user is circling
| back to a topic they talked about months ago and pull out the
| relevant summaries of that conversation.
|
| I also think it needs a working memory, where it continually
| edits the token window to fit the relevant state of the
| conversation: summarising recent tokens, saving things out to
| long-term storage, pulling new information in from long-term
| storage, web searches and other tools.
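|
| One way that could look, sketched in Python (the hashed bag-of-
| words embedding here is a toy stand-in for a real embedding
| model):
|
|     import numpy as np
|
|     def embed(text, dim=256):
|         # Toy embedding: hashed bag of words.
|         v = np.zeros(dim)
|         for word in text.lower().split():
|             v[hash(word) % dim] += 1.0
|         return v
|
|     def cosine(a, b):
|         return float(np.dot(a, b) /
|                      (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
|
|     memory = []   # list of (vector, summary) pairs: long-term store
|
|     def remember(summary):
|         memory.append((embed(summary), summary))
|
|     def recall(query, k=3):
|         # Pull the k most relevant old summaries back into context.
|         q = embed(query)
|         ranked = sorted(memory, key=lambda m: -cosine(q, m[0]))
|         return [text for _, text in ranked[:k]]
|
|     def build_prompt(user_msg, recent_turns):
|         # Recalled summaries plus recent turns must fit the window.
|         return "\n".join(recall(user_msg) + recent_turns[-10:]
|                          + [user_msg])
|
|     remember("User is planning a trip to Mexico in April.")
|     print(build_prompt("What did I say about my trip?", []))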
| pixl97 wrote:
| I think a number of breakthroughs may be needed to keep an AI
| 'sane' with a large working memory at this point. How do we
| keep them 'on track', at least in a way that seems somewhat
| human? Humans that have halting problem issues can either
| be geniuses (diving into problems and solving them to the
| point of ignoring their own needs), or clinical (ignoring
| their needs to look at a spot on the wall).
| SergeAx wrote:
| > new type of never-before-seen crap code that nobody can or
| will be able to debug short of throwing it away and starting
| over
|
| The good thing is that we have been dealing with exactly the same
| type of code here and there for decades already. Actually, every
| time I see a commercial codebase that isn't a tangle of
| spaghetti, I thank the gods for it, because that is the
| exception, not the rule.
|
| What I really wonder is what it will be like when the next
| version of the same system is coded from the ground up by the
| next version of the same ML model.
| epups wrote:
| I agree completely with the first part of your post. However, I
| think even performing these language games should definitely be
| considered AI. In fact, understanding natural language queries
| was considered for decades a much more difficult problem than
| mathematical reasoning. Issues aside, it's clear to me we are
| closer to solving it than we ever have been.
| jerf wrote:
| Sorry, I didn't mean that LLMs are not a _subset_ of AI. They
| clearly are. What they are not is _equal to_ AI; there are
| things that are AI that are not LLMs.
|
| It is obvious when I say it, but my internal language model
| (heh) can tell a lot of people are not thinking that way when
| they speak, and the latter is often more reliable than how
| people _claim_ they are thinking.
| pixl97 wrote:
| I think the problem here is in a classification of what the
| (I) is in the first place. For us to answer the question of
| what equals AI we must first answer the question of what
| equals human intelligence in a self consistent, logical,
| parsable manner.
| fullshark wrote:
| Incremental improvements and it getting to the point of good
| enough for a set of tasks but maybe not all tasks seems far
| more likely.
| dekervin wrote:
| I have bookmarked your comment and I hope to have the
| discipline to come back to it every 3 months or so for the next
| couple of years. Because I think you are right, but I hadn't
| noticed it before. When the real thing comes, we will probably
| be blindsided.
| jamespking wrote:
| [dead]
| skilled wrote:
| I think what I would add to your comment, and specifically
| criticize the HN hype around it, is that all these GPT "AI"
| tools are entirely dependent on the OpenAI API. ChatGPT might
| have shown a glimpse of spark by smashing two rocks together,
| but it is nowhere near being able to create a full-blown fire
| out of it.
|
| Outside of Google and OpenAI, I doubt there is a single team in
| the world right now that would be capable of recreating ChatGPT
| from scratch using their own model.
| dagw wrote:
| _I doubt there is a single team in the world right now that
| would be capable of recreating ChatGPT from scratch using
| their own model._
|
| Why not? Lack of knowhow or lack of resources? If, say, Baidu
| decided to spend a billion dollars on this problem, don't you
| think they have the skills and resources to quickly catch up?
| pixl97 wrote:
| It depends on the nature of the problem at hand.
|
| For example if we threw money at a group in 1905 do you
| think they could have come up with special relativity, or
| do you believe that it required geniuses working on the
| problem to have a breakthrough?
| jerf wrote:
| I would love to know how much of ChatGPT is "special sauce"
| and how much of it is just resources thrown at the problem at
| a scale no one else currently wants to compete with.
|
| I am not making any implicit claims here; I really have no
| idea.
|
| I'm also not counting input selection as "special sauce";
| while that is certainly labor-intensive, it's not what I
| mean. I mean more like, are the publicly-available papers on
| this architecture sufficient, or is there some more math not
| published being deployed?
| Q6T46nT668w6i3m wrote:
| Meta?
| iamflimflam1 wrote:
| We're just seeing the standard hype cycle. We're in the "Peak
| of Inflated Expectations" right now. And a lot of people are
| tumbling down into the "Trough of Disillusionment".
|
| Behind all the hype and the froth there are people who are
| finding uses and benefits - they'll emerge during the "Slope of
| Enlightenment" phase and then we'll reach the "Plateau of
| Productivity".
| bitL wrote:
| "babble in a confident manner"
|
| OK, so we figured out how to automate away management jerks.
| Isn't that a success?
| 6510 wrote:
| > I have come to two conclusions about the GPT technologies
| after some weeks to chew on this:
|
| <sarcasm>Just 2 weeks of training data? Surely the conclusions
| are not final? No doubt a lot has changed over those 2 weeks?
|
| I think the real joke is still, Q: "what is intelligence?" A:
| "We don't know, all we know is that you are not a good example
| of it".
|
| I fear these hilarious distortions are only slightly different
| from those we mortals make all the time. They stand out because
| we would get things wrong in different ways.
|
| > 1. We are so amazed by its ability to babble in a confident
| manner that we are asking it to do things that it should not be
| asked to do.
|
| God, where have we seen this before? The further up the human
| hierarchy the more elaborate the insanity. Those with the most
| power, wealth and even those of us with the greatest intellect
| manage to talk an impressive amount of bullshit. We all do it
| up to our finest men.
|
| The only edge we have over the bot is that we know when to keep
| our thoughts to ourselves when it doesn't help our goal.
|
| To do an idiotic timeline of barely related events, which no
| doubt describes me better than it describes the topic:
|
| I read about a guy who contributed much to making TV affordable
| enough for everyone. He thought it was going to revolutionize
| learning from home. Finally the audience for lectures given by
| our top professors could be shared with everyone around the
| globe!
|
| We got the internet, the information superhighway, everyone
| was going to get access to the vast amount of knowledge
| gathered by mankind. It only took a few decades for google to
| put all the books on it. Or wait....
|
| And now we got the large language models. Finally someone who
| can tell us everything we want to know with great confidence.
|
| These 3 were and will be instrumental in peddling bullshit.
|
| Q: Tell me about the war effort!
|
| what I want to hear: "We are winning! Just a few more tanks!"
|
| what I don't want to hear: "We are imploding the world economy!
| Run to the store and buy everything you can get your hands on.
| Cash is king! Arm yourself. Buy a nuclear bunker."
|
| Can one tell people that? It doesn't seem in line with the
| bullshit we are comfortable with?
|
| > GPT is basically the language portion of your brain. The
| language portion of your brain does not do logic. It does not
| do analyses. But if you built something very like it and asked
| it to try, it might give it a good go.
|
| At least it doesn't have sinister motives (we will have to add
| those later)
|
| > In its current state, you really shouldn't rely on it for
| anything. But people will, and as the complement of the Wile E.
| Coyote effect, I think we're going to see a lot of people not
| realize they've run off the cliff, crashed into several rocks
| on the way down, and have burst into flames, until after they
| do it several dozen times. Only then will they look back to
| realize what a cockup they've made depending on these GPT-line
| AIs.
|
| It seems to me that we are going to have to take the high horse
| and claim the low road.
| spikder wrote:
| To add to your point, current technology does not even suggest
| whether we will ever have such an AI. I personally doubt it. Some
| evidence: https://en.wikipedia.org/wiki/Entscheidungsproblem.
|
| This is like trying to derive the laws of motion by having a
| computer analyze 1 billion clips of leaves fluttering in the
| wind.
| m3047 wrote:
| I hearken back to before the dot-bomb, when occasionally people
| would ask me to work on "web sites" they'd built with desktop
| publishing software (e.g. ColdFusion).
|
| They'd hand me the code that somebody would've already hacked
| on. Oftentimes, it still had the original copyright statements
| in it. Can't get the toothpaste back in the tube now! Plus it's
| shitcode. Where is that copy of ColdFusion? Looks of complete
| dumbfoundment.
|
| Oh gee kids, my mom's calling me for lunch; gotta go!
| boh wrote:
| I think the ultimate problem with AI is that it's overvalued as a
| technology in general. Is this "amazing level of comprehension"
| really that necessary given the amount of time/money/effort
| devoted to it? What's become clear with this technology that's
| been inaccurately labeled as "AI" is that it doesn't produce
| economically relevant results. It's a net expense any way you
| slice it. It's like seeing a magician perform an amazing trick.
| It's both amazing and entirely irrelevant at the same time. The
| "potential" of the technology is pure marketing at this point.
| kneebonian wrote:
| It seems to me it is really good at writing. I would think it
| could replace the profession of technical writing for the
| most part, it could help you write emails, (bring back clippy
| MS you cowards), it could be used as a frontend to an
| FAQ/self service help type system.
| boh wrote:
| Have you read the article? You'd have to have 100% faith in
| the tech to allow it to side-step an actual person. Unless
| your site is purely a click-farm, you're still probably
| hiring someone to check it--so what's the point of having
| it?
| pixl97 wrote:
| I mean, I take it if I stuck you back in 1900 you'd say the same
| about flying. "Look at all this wasted effort for almost
| nothing". And then pretty quickly the world rapidly changed
| and in around 50 years we were sending things to space.
|
| Intelligence isn't just one thing; really, I would say it's the
| emergent behavior of a bunch of different layers working
| together. The LLM being just one layer. As time goes on and
| we add more layers to it the usefulness of the product will
| increase. At least from a selfish perspective of a
| corporation, whoever can create one of these intelligences
| may have the ability to free themselves of massive amounts of
| payroll by using the AI to replace people.
|
| The potential of AI should not be thought of any differently
| than the potential of people. You are not magic, just
| complicated.
| boh wrote:
| I don't get the point of comparing apples to make a point
| about oranges. Flying isn't AI. Nor is "progress" a
| permanent state. If you want to stay in the flying
| comparison: in 2000 you could fly from NY to Paris in 3 hours
| on a Concorde, something no longer possible in 2023. Why?
| Because economics made it unfeasible to maintain. Silicon
| Valley has made enough promises using "emergent" behavior
| and other heuristics to justify poor investments.
| Unfortunately it has taken too many withdrawals from its
| bank of credibility, and there's not enough left to cloud their
| exit schemes in hopes and dreams.
| bsaul wrote:
| I'm not sure whether the hype train is going to crash, or
| whether only a few very smart companies, using language
| models for what they're really good at (i.e. generating non-
| critical text), will manage to revolutionize one field.
|
| We're at the very beginning of the wave, so everybody is
| a bit overly enthusiastic, dollars are probably flowing, and
| ideas are popping everywhere. Then will come a harsh step of
| selection. The question is what will the remains look like, and
| how profitable they'll be. Enough to build an industry, or just
| a niche.
| adverbly wrote:
| It is like we have unlocked an entirely new category of
| stereotyping that we never even realized existed.
|
| Intelligence is not a prerequisite to speak fancifully.
|
| Some other examples:
|
| 1. We generally assume that lawyers or CEOs or leaders who give
| well-spoken and inspirational speeches actually know something
| about what they're talking about.
|
| 2. Well-written nonsense papers can fool industry experts even
| if the expert is trying to apply rigorous review:
| https://en.m.wikipedia.org/wiki/Sokal_affair
|
| 3. Acting. Actors can easily portray smart characters by
| reading the right couple of sentences off a script. We have no
| problem with this as audience members. But CGI is needed for
| making your superhero character jump off a building without
| becoming a pancake.
| pixl97 wrote:
| >Intelligence is not a prerequisite to speak fancifully.
|
| I think this may be a bastardization of the word
| intelligence. To speak fancifully in a manner accepted by the
| audience requires some kind of ordered information processing
| and understanding of the audience's taste. Typically we'd
| consider that intelligent, but likely Machiavellian depending
| on the intent.
|
| The problem with the word intelligence is it is too big of a
| concept. If you look at any part of our brain, you will not
| find (human)intelligence itself, instead it emerges from any
| number of processes occurring at different scales. Until we
| are able to break down intelligence into these smaller better
| (but not perfectly) classified pieces we are going to keep
| running into these same problems over and over again.
| WalterBright wrote:
| > easily portray smart characters
|
| I don't think it is possible for people to emulate the
| behavior of superintelligent beings. In every story about
| them, they appear to not actually be any smarter than us.
|
| There is one exception - Brainwave by Poul Anderson. He had
| the only credible (to me) take on what super intelligent
| people might be like.
| jodrellblank wrote:
| Rupert Sheldrake suggests that consciousness is partly
| about seeing possibilities for our future, evaluating them,
| and choosing between them. If we keep making decisions the
| same way, they turn into unconscious habits.
|
| A hungry creature can eat what it sees or stay hungry.
| Another has more memory and more awareness of different
| bark and leaves and dead animals to choose from. Another
| has a better memory of places with safe food in the past
| and how to get to them. A tool using human can reason down
| longer chains like 'threaten an enemy and take their food'
| or 'setup a trap to kill an animal' or 'dig up root, grind
| root into mash, boil it, eat the gruel'. In that model, a
| super intelligence might be able to:
|
| - Extract larger patterns from less information. (Con: more
| risk of a mistake).
|
| - Connect more patterns or more distant patterns together
| with less obvious connections. (Con: risk of self-
| delusion).
|
| - Evaluate longer chains of events more accurately with a
| larger working memory, more accurate mental models. (Con:
| needs more brain power, more energy, maybe longer time
| spent in imagination instead of defending self).
|
| - Recall more precise memories more easily. (Con: cost of
| extra brain to store information and validate memories).
|
| This would be a good model for [fictional] Dr House: he's
| memorised more illnesses, he's more attentive to observing
| small details on patients, and more able to use those to
| connect to existing patterns, and cut through the search
| space of 'all possible illnesses' to a probable diagnosis
| based on less information than the other doctors. They run
| out of ideas quicker, they know fewer diseases, and they
| can't evaluate as long chains of reasoning from start to
| conclusion, or they draw less accurate conclusions. In one
| episode, House meets a genius physicist/engineer and wants
| to get his opinion on medical cases, but the physicist
| declines because he doesn't have the medical training to
| make any sense of the cases.
|
| It also suggests that extra intelligence might get eaten up
| by other people - predicting what they will do, while they
| use their extra intelligence to try to be unpredictable.
| And it would end up as exciting as a chess final, where
| both grandmasters sit in silence trying to out-reason their
| opponent through deeper chains in a larger subset of all
| possible moves until eventually making a very small move.
| And from the outside players all seem the same but they can
| reliably beat me and they cannot reliably beat each other.
| twic wrote:
| I remember thinking when I read it that Ted Chiang's
| 'Understand' did a good job (although I have not re-read it
| to verify this):
|
| https://web.archive.org/web/20140527121332/http://www.infin
| i...
| jgtrosh wrote:
| > the language portion of your brain does not do logic
|
| This seems ... Wrong? I suppose that most of what we generally
| call high-level logic is largely physically separate from some
| basic functions of language, but just a blanket statement
| describing logic and language as two nicely separate functions
| cannot be a good model of the mind.
|
| I also feel like this goes to the core of the debate, is there
| any thought going on or is it _just_ a language model; I'm
| pretty sure many proponents of AI believe that thought is a
| form of very advanced language model. Just saying the opposite
| doesn't help the discussion.
| ccozan wrote:
| Exactly. It's like a mouth speaking without a brain. We need a
| secondary "reasoning" AI that can process the GPT output further,
| adding time/space coordinates as well as basic logic
| including counting, and _then_ maybe we will see something I can
| rely on.
| withinboredom wrote:
| There's a "really old" book called _On Intelligence_ that
| suggests modeling AI like the brain. This pattern is almost
| exactly what he suggests.
| danans wrote:
| > We need a secondary "reasoning" AI that can process the GPT
| further
|
| We also need "accountability" and "consequences" for the AI,
| whatever that means (we'd first have to define what "desire"
| means for it).
|
| In the example from the article, the Bing GPT completely
| misrepresented the financial results of a company. A human
| finance journalist wouldn't misrepresent those results, out of
| fear of losing their reputation and out of desire for fame,
| money, and acceptance. None of those needs exist for an LLM.
| pixl97 wrote:
| To note, this is what we call the AI alignment problem.
|
| https://www.youtube.com/watch?v=zkbPdEHEyEI
| evo_9 wrote:
| Yeah, I read this sentiment all the time and here's what I
| always say - just don't use it. Leave it to the rest of us if
| it's so wrong / off / bad.
|
| BTW, have you considered maybe you aren't so good at using it?
| A friend has had very little luck with it, even said he's been
| 'arguing with it', which made me laugh. I've noticed that it's
| not obvious to most people that it's mostly about knowing the
| domain well enough to ask the right question(s). It's not
| magic, it won't think for you.
|
| Here's the thing... my experience is the opposite... but maybe
| I'm asking it the right questions. Maybe it's more about using
| it to reason through your problem in a dialog, and not just ask
| it something you can google/duckduckgo. It seems like a LOT of
| people think it's a replacement for Google/search engines -
| it's not, it's another tool to be used correctly.
|
| Here are some examples of successful uses for me:
|
| I carefully explained a complex work issue that involves
| multiple overlapping systems and our need to get off of one of
| them in the middle of this mess. My team has struggled for 8
| months to come up with a plan. While in a meeting the other day
| I got into a conversation with ChatGPT about it, carefully
| explained all the details and then asked it to create a plan
| for us to get off the system while keeping everything up /
| running. It spit out a 2 page, 8 point plan that is nearly 100%
| correct. I showed it to my team, and we made a few minor
| changes, and then it was anointed 'the plan' and we're actually
| moving forward.
|
| THEN last night I got stuck on a funny syntax issue for which
| googling could never find the answer. I got into a conversation
| with ChatGPT about it, and after it first gave me the wrong
| answer, I told it that I need this solution for the latest
| dotnet library that follows the 'core' language syntax. It
| apologized! And then gave me the correct answer...
|
| My hunch is the people that are truly irked by this are too
| deep / close to the subject and because it doesn't match up
| with what they've worked on, studied, invested time, mental
| energy into, well then of course it's hot garbage and 'bad'.
| daveguy wrote:
| > My hunch is the people that are truly irked by this are too
| deep / close to the subject and because it doesn't match up
| with what they've worked on, studied, invested time, mental
| energy into, well then of course it's hot garbage and 'bad'.
|
| That's quite the straw man you've built. Recognizing the
| limitations of a technology is not the same as calling it hot
| garbage.
|
| As a language model it's amazing, but I agree with the GP.
| It's not intelligent. It's very good at responding to a
| series of tokens with its own series of tokens. That requires
| a degree of understanding of short scale context that we
| haven't had before in language models. It's an amazing
| breakthrough.
|
| But it's also like attributing the muscle memory of your hand
| to intelligence. It can solve lots of problems. It can come
| up with good configurations. It is not, on its own,
| intelligent.
| pclmulqdq wrote:
| Just to flip this around for a second, with both of your
| examples, it sounds like you may have a problem with writer's
| block or analysis paralysis, and ChatGPT helped you overcome
| that simply due to the fact that it isn't afraid of what it
| doesn't know. If that helps you, go for it.
|
| On the other hand, it could also help you to just write a
| random plan or try a few random things when you get stuck,
| instead of trying to gaze deeply into the problem for it to
| reveal its secrets.
| gptgpp wrote:
| You all say it's solving these amazing complex tasks for you,
| but then don't provide any details.
|
| Then "naysayers" like the linked article provide a whole
| document with images and appendixes showing it struggles with
| basic tasks...
|
| So show us. For the love of god all of us would very much
| LIKE this technology to be good at things! Whatever
| techniques you're using to get these fantastical results, why
| don't you share them?
|
| I can get it to provide snippets of code, CLI, toy functions
| that work. Beyond that, I am apparently an idiot compared to
| you AI-whisperers.
|
| Also... Whatever happened to "extraordinary claims require
| extraordinary proof?"
|
| An AI that creates a complex system, condensed into an
| actionable plan, that has stumped an entire team for 8 months
| is a (pardon the language) bat-shit insane claim. Things like
| this used to require proof to be taken seriously.
| pncnmnp wrote:
| I can provide an example.
|
| I have found ChatGPT to be a valuable tool for improving
| the clarity and readability of my writing, particularly in
| my blogs and emails. You can try this by asking questions
| such as "Can you improve the grammar of the following
| paragraphs?". You can also specify the desired tone.
|
| It is impressive at simplifying complex technical language.
| Take the following sentences from a draft I wrote:
|
| To mitigate these issues, it is recommended to simulate the
| effect of say n random permutations using n random hash
| functions (h1, h2, ... hn) that map the row numbers (say 1
| to k) to bucket numbers of the same range (1 to k) without
| a lot of collisions. This is possible if k is sufficiently
| large.
|
| What ChatGPT suggested:
|
| To address these issues, it's recommended to simulate the
| effect of n random permutations using n random hash
| functions (h1, h2, ... hn). These hash functions should map
| the row numbers (from 1 to k) to bucket numbers within the
| same range (1 to k) with minimal collisions. This is
| achievable if the range k is large enough.
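|
| (For what it's worth, the trick that paragraph describes looks
| roughly like this in Python: cheap universal hash functions
| standing in for actual row permutations; the constants here are
| just for illustration.)
|
|     import random
|
|     k = 10_000    # number of rows
|     n = 100       # number of simulated permutations
|     P = 10_007    # a prime >= k keeps collisions rare
|
|     hash_funcs = []
|     for _ in range(n):
|         a, b = random.randrange(1, P), random.randrange(P)
|         hash_funcs.append(lambda r, a=a, b=b: (a * r + b) % P % k + 1)
|
|     # "Position" of row 42 under the 3rd simulated permutation:
|     print(hash_funcs[2](42))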
| elliotec wrote:
| Try Grammarly. It's extremely good at this, and with an
| incredible UX.
| pncnmnp wrote:
| Yes, I've been using Grammarly for several years now. I
| still use it in conjunction with ChatGPT. It's efficient
| in correcting spelling and basic grammar errors. However,
| more advanced features are only available to premium
| users. At present, their $12/m fee is a bit steep for me.
| oezi wrote:
| My take: Because GPT is just stochastically stringing words
| after each other, it is remarkably good at producing text
| on par with other text available on the internet. So it can
| produce plans, strategies, itineraries and so on. The more
| abstract the better. The 8 point plan is likely great.
|
| It is much more likely to fail on anything which involves
| precision/computation/logic. That's why it can come up with
| a generic strategy but fail to repeat unadjusted GAAP
| earnings.
| gptgpp wrote:
| I agree it's pretty good at generalities, doesn't shit
| the bed quite so much. Yet to suggest a plan that an entire
| team of professionals, who have been working on it for 8
| months, could not figure out?
|
| It's certainly not that good, absent some amazing
| wizardry or some very silly professionals in a very
| squishy field. Yet I have no explanation for why someone
| would go on the internet and lie about something like
| that.
|
| There were comments a while back (less so now) of people
| making other claims like it was solving complex functions
| for them and writing sophisticated software.
|
| The entire thing baffles me. If I could get it to do
| that, I'd be showing you all of my marvelous works and
| bragging quite a bit as your newfound AI-whisperer. Hell,
| I'd get it to write a script for me to run that
| evangelized itself (edit: and me of course, as its chosen
| envoy to mankind) to the furthest corners of the
| internet!
| harpiaharpyja wrote:
| There was an article not too long ago, that I'm
| struggling to find, that did a great job of explaining
| why language models are much, much better suited to
| reverse-engineering code than they are to forward-
| engineering it.
| [deleted]
| 10rm wrote:
| When did they say it's garbage? They gave their opinions on
| its shortcomings and praised some of the things it excels at.
| You're calling the critics too emotional but this reply is
| incredibly defensive.
|
| Your anecdotes are really cool and a great example of what
| GPT can do really well. But as a technical person, you're
| much more aware of its limitations and what is and isn't a
| good prompt for it. But as it is more and more marketed to
| the public, and with people already clamoring to replace
| traditional search engines with it, relying on the user to
| filter out disinformation well and not use it for prompts it
| struggles with isn't good enough.
| whimsicalism wrote:
| I feel similarly reading many critiques, but honestly the GP
| is one of the more measured ones that I've read - not sure
| that your comment is actually all that responsive or
| proportionate.
| timmytokyo wrote:
| I don't understand this take. These LLM-based AIs provide
| demonstrably incorrect answers to questions, they're being
| mass-marketed to the entire population, and the correct
| response to this state of affairs is "Don't use it if you
| don't know how"? As if that's going to stop millions of
| people from using it to unknowingly generate and propagate
| misinformation.
| roywiggins wrote:
| Isn't that what people said about Google Search 20 years
| ago - that people wouldn't know how to use it, that they would
| find junk information, etc.? And they weren't entirely
| wrong, but it doesn't mean that web search isn't useful.
| seunosewa wrote:
| Can you share any source for the claim about what people
| said about Google Search?
| allturtles wrote:
| No, I don't recall anyone saying that. They mostly said
| "this is amazingly effective at finding relevant
| information compared to all other search engines." Google
| didn't invent the Web, so accusing it of being
| responsible for non-factual Web content would have been a
| strange thing to do. Bing/Chat-GPT, on the other hand, is
| manufacturing novel non-factual content.
| 10rm wrote:
| That's a good point. I don't think anyone is denying that
| GPT will be useful though. I'm more worried that because
| of commercial reasons and public laziness / ignorance,
| it's going to get shoehorned into use cases it's not
| meant for and create a lot of misinformation. So a
| similar problem to search, but amplified
| bun_at_work wrote:
| There are some real concerns for a technology like
| ChatGPT or Bing's version or whatever AI. However, a lot
| of the criticisms are about the inaccuracy of the model's
| results. Saying "ChatGPT got this simple math wrong"
| isn't as useful or meaningful of a criticism when the
| product isn't being marketed as a calculator or some
| oracle of truth. It's being marketed as an LLM that you
| can chat with.
|
| If the majority of criticism was about how it could be
| abused to spread misinformation or enable manipulation of
| people at scale, or similar, the pushback on criticism
| would be less.
|
| It's nonsensical to say that ChatGPT doesn't have value
| because it gets things wrong. What makes much more sense
| to say is that it could be leveraged to harm people,
| or manipulate them in ways they cannot prevent.
| Personally, it's more concerning that MS can embed high-
| value ad spots in responses through this integration,
| while farming very high-value data from the users, wrt
| advertising and digital surveillance.
| youk wrote:
| Great write up. My experience is spot on with your examples.
|
| > I've noticed that it's not obvious to most people that it's
| mostly about knowing the domain well enough to ask the right
| question(s). It's not magic, it won't think for you.
|
| Absolutely right with the part of knowing the domain.
|
| I do not entertain or care about the AI fantasies because
| ChatGPT is extremely good at getting me other information. It
| saves me from opening a new tab, formulating my query and
| then hunting for the information. I can spend that extra time
| on whatever latest / relevant information I should grab from
| Google.
|
| Google is still in my back pocket for the last mile
| verification and judgement. I am also skeptical of the
| information ChatGPT throws out (such as old links). Other
| than that, ChatGPT to me is as radical as putting the url and
| search bar into one input. I just move faster with the
| information.
| joenot443 wrote:
| I'd really love to hear more about your workplace use-case,
| what kind of systems are we talking about here?
|
| This is a way of using ChatGPT I haven't really seen before,
| I'm really into it.
| shrimpx wrote:
| "Just don't use it" is not salient advice for non-technical
| people who don't know how it works, and are misled by
| basically dishonest advertising and product packaging. But
| hopefully the market will speak, users at large will become
| educated about its limits via publicized blunders, and these
| products will be correctly delimited as "lies a lot but could
| be useful if you are able/willing to verify what it says."
| hughc wrote:
| I think the original sentence was written more in a "Your
| loss is my gain" competitive advantage vein. The real trick
| is, as you say, to critically assess the output, and many
| people are incapable of that.
| acchow wrote:
| I imagine your first example includes private industry
| information that you are not allowed to divulge.
|
| But your latter example about syntax... mind sharing that
| ChatGPT conversation?
| CPLX wrote:
| I mean sure.
|
| In other news I asked it to make a list of all the dates in
| 2023 that were neither weekends nor US federal holidays and
| it left Christmas Day on the list.
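|
| For reference, the check it flubbed is a few lines of Python (the
| holiday list below is hand-typed observed 2023 federal holidays,
| for illustration):
|
|     from datetime import date, timedelta
|
|     HOLIDAYS_2023 = {
|         date(2023, 1, 2), date(2023, 1, 16), date(2023, 2, 20),
|         date(2023, 5, 29), date(2023, 6, 19), date(2023, 7, 4),
|         date(2023, 9, 4), date(2023, 10, 9), date(2023, 11, 10),
|         date(2023, 11, 23), date(2023, 12, 25),
|     }
|
|     d, workdays = date(2023, 1, 1), []
|     while d.year == 2023:
|         # weekday() < 5 means Monday..Friday
|         if d.weekday() < 5 and d not in HOLIDAYS_2023:
|             workdays.append(d)
|         d += timedelta(days=1)
|
|     # 2023-12-25 is a Monday and a federal holiday, so:
|     print(date(2023, 12, 25) in workdays)   # False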
| mashygpig wrote:
| Yea, I think people hide "the magic smoke" by using complex
| queries and then filling in the gaps of chatGPT's outputs
| with their own knowledge, which then makes them overvalue
| the output. Strip that away to simple examples like this
| and it becomes more clear what's going on. (I think there
| IS a lot of value for them in their current state because
| they can jog your brain like this, just not to expect it to
| know how to do everything for you. Think of it as the most
| sophisticated rubber duck that we've made yet).
| elorant wrote:
| I too have a very positive experience. I ask specific
| questions about algorithms and how technical projects work
| and I enjoy its answers. They won't replace my need to visit
| a real search engine, nor do I take them at face value. But
| as a starting point for any research I think it's an amazing
| tool. It's also quite good for marketing stuff, like writing
| e-mails, cover letters, copy for your website, summarizing or
| classifying text, and all language related stuff.
|
| People think it's Cortana from Halo and ask existential
| questions or they're trying to get it to express feelings.
|
| I think the AI part of its presentation created too many
| expectations of what it can do.
| wpietri wrote:
| > Yeah, I read this sentiment all the time and here's what I
| always say - just don't use it. Leave it to the rest of us if
| it's so wrong / off / bad.
|
| If it were only a matter of private, individual usage, I'd be
| fine with it. If that's all you're asking for, we can call it
| a deal. But it isn't, is it?
| Taurenking wrote:
| [dead]
| bambax wrote:
| > _It seems like a LOT of people think it 's a replacement
| for Google/search engines_
|
| Well, that "lot" includes the highest levels of management
| from Microsoft and Google, so maybe the CAPS are justified.
| And the errors we're talking about here are errors produced
| by said management during demos of their own respective
| product. You would think they know how to use it "correctly".
| whimsicalism wrote:
| I'm going to let you in on a secret: managers, even high-
| level ones, can be wrong - and indeed they frequently are.
| bambax wrote:
| Thanks for that unique insight.
|
| But the question is, are they wrong in that they don't
| know how to use / promote an otherwise good product, or
| are they wrong because they are choosing to put forward
| something that is completely ill-suited for the task?
| westoncb wrote:
| > Maybe it's more about using it to reason through your
| problem in a dialog, and not just ask it something you can
| google/duckduckgo.
|
| Your experience with it sounds very similar to my own. It
| exhibits something like on-demand precision; it's not a
| system with some fundamental limit to clarity (as Ted
| Chiang, via his jpeg analogy, and others have argued): it may
| say something fuzzy and approximate (or straight up wrong) to
| begin with but--assuming you haven't run into some corner
| where its knowledge just bottoms out--you can generally just
| tell it that it made a mistake or ask for it to
| elaborate/clarify etc., and it'll "zoom in" further and
| resolve fuzziness/incorrect approximation.
|
| There is a certain very powerful type of intelligence within
| it as well, but you've got to know what it's good at to use
| it well: from what I can tell it basically comes down to it
| being _very_ good at identifying "structural similarity"
| between concepts (essentially the part of cognition which is
| rooted in analogy-making), allowing it to very effectively
| make connections between disparate subject matter. This is
| how it's able to effectively produce original work (though
| typically it will be directed there by a human): one of my
| favorite examples of this was someone asking it to write a
| Lisp program that implements "virtue ethics"
| (https://twitter.com/zetalyrae/status/1599167510099599360).
|
| I've done a few experiments myself using it to formalize
| bizarre concepts from other domains and its ability to
| "reason" in both domains to make decisions about how to
| formalize, and then generating formalizations, is very
| impressive. It's not enough for me to say it is unqualifiedly
| "intelligent", but it imo its ability to do this kind of
| thing makes it clear why calling it a search engine, or
| something merely producing interpolated averages (a la
| Chiang), is so misleading.
| YurgenJurgensen wrote:
| Don't like chlorofluorocarbons or tetraethyllead? Just don't
| use them.
| wpietri wrote:
| > We are so amazed by its ability to babble in a confident
| manner
|
| Sure, we shouldn't use AI for anything important. But can we
| try running ChatGPT for George Santos's seat in 2024?
| Madmallard wrote:
| (1) is just simply wrong.
|
| People with domain expertise in software are going to be
| amplified 10x using ChatGPT and curating the results. Likewise
| with any field that ChatGPT has adequate training data in.
| Further models will be created that are more specialized to
| specific fields, so that their prediction models spew out
| things that are much more sophisticated and useful.
| mrtranscendence wrote:
| What, precisely, about (1) is "simply wrong"? You've made a
| prediction about the usefulness of ChatGPT, but you haven't
| described why it's wrong to analogize GPT-type models to the
| language center of a brain.
| Madmallard wrote:
| "To put it in code assistant terms, I expect people to be
| increasingly amazed at how well they seem to be coding,
| until you put the results together at scale and realize
| that while it kinda, sorta works, it is a new type of
| never-before-seen crap code that nobody can or will be able
| to debug short of throwing it away and starting over."
|
| This part
| vidarh wrote:
| I think you're right. I noted on another thread that I got
| ChatGPT to produce a mostly right DNS server in ~10 minutes,
| which took me just a couple of corrections to make work.
|
| It worked great for that task, because I've written a DNS
| server before (a simple one) and I've read the RFCs, so it
| was easy for me to find the few small bugs without resorting
| to a line by line cross-check with a spec that might have
| been unfamiliar to others.
|
| I expect using it to spit out boilerplate for things you
| could do just as well yourself will be a lot more helpful
| than using it to try to avoid researching new stuff (though
| you might well be able to use it to help summarise and
| provide restatements of difficult bits to speed up your
| research/learning as well).
| soiler wrote:
| In what way is this development loop:
|
| 1. Read technology background thoroughly
|
| 2. Read technology documentation thoroughly
|
| 3. Practice building technology
|
| 4. Ask ChatGPT to create boilerplate for basic
| implementation
|
| 5. Analyze boilerplate for defects
|
| 10x fast than this development loop:
|
| 1. Read technology background thoroughly
|
| 2. Read technology documentation thoroughly
|
| 3. Practice building technology
|
| 4. Manually create boilerplate for basic implementation
|
| 5. Analyze boilerplate for defects
| Madmallard wrote:
| For new technologies coming out it won't be effective
| until newer models are made.
|
| Notice how I said it's going to make developers with
| existing domain knowledge faster.
|
| But even to your point, I've never used Excel VBA before,
| and I had ChatGPT generate some VBA macros to move data
| with specific headers and labels from one sheet to
| another. It wrote a script to do exactly that for me
| in ~1 minute, and just reading what it wrote immediately
| helped me clearly understand how it works.
| The scripts also work.
|
| The fundamental background in computer science and server
| infrastructure technology is what matters. Then the
| implementations will be quickly understandable by those
| who use them.
|
| I asked it to make a 2D fighting game in Phaser 3 and
| specified what animations it will be using, the controls
| each player will have, the fact that there's a background
| with X name, what each of the moves do to the momentum of
| each player, and the type of collisions it will do and it
| spat out something in ~15 minutes (mainly because of all
| the 'continue' commands I had to give) that gets all the
| major bullet points right and I just have to tweak it a
| bit to make it functional. The moves are simplified of
| course but uhh yeah. This is kinda insane. I think you
| can be hyper specific about even complex technology and
| as long as there has been good history of it online in
| github and stack overflow and documentation it will give
| you something useful quickly.
|
| https://www.youtube.com/watch?v=pspsSn_nGzo Here's a
| perspective from a guy that used to work at Microsoft on
| every version of windows from the beginning to XP.
| vidarh wrote:
| It isn't. My exact _point_ was that it _isn't_ and
| accordingly ChatGPT produces the best benefits for
| someone who has _already_ done 1, 2, 3 for a given
| subject.
|
| It was in agreement with the comment above that suggests
| people _with domain expertise_ will be faster with it.
|
| In those cases, ChatGPT will do 4 far faster, and 5 will
| be little different.
| tsimionescu wrote:
| How often has the solution to a business problem you faced
| been "write a simple DNS server"? Or are you claiming that
| it produced a fully featured and world-scale fast DNS
| server?
| vidarh wrote:
| Several times. If that was the only thing I got it to do
| it wouldn't be very interesting, but that it answered the
| first problem I threw at it and several subsequent
| expansions with quite decent code was.
|
| Writing a "world-scale fast DNS server" is a near trivial
| problem if whatever you look things up _in_ is fast to query.
| Most people don't know that, because most people don't know
| how simple the protocol is. As such it's surprisingly
| versatile. E.g. want to write a custom service-discovery
| mechanism? Providing a DNS frontend is easy.
|
| How that domain knowledge interacts with ChatGPT's
| "mostly right" output was the point of my comment, not
| specifically a DNS server. If you need to implement
| something you know well enough, odds are ChatGPT can
| produce a reasonable outline of it that is fast for
| someone who already knows the domain well enough to know
| what is wrong with it, and what needs to be refined.
|
| E.g. for fun I asked it right now to produce a web server
| that supports the Ruby "Rack" interface that pretty much
| all Ruby frameworks supports. It output one that pretty
| much would work, but had plenty of flaws that are obvious
| to anyone versed in the HTTP spec (biggest ones: what it
| output was single threaded, and the HTTP parser is too
| lax). As a starting point for someone unaware of the spec
| it'd be awful, because they wouldn't know what to look
| for. As a starting point for someone who has read the
| spec, it's easy enough to ask for refinements ("split the
| request parsing from the previous answer into a separate
| method"; "make the previous answer multi-threaded" - I
| tried them; fascinatingly, when I asked it to make it
| multi-threaded it spit out a better request parsing
| function, likely because it then started looking more
| like Rack integrations it's "seen" during training; it
| ran on the first try, btw. and served up requests just
| fine).
|
| EDIT: Took just "Make it work with Sinatra" followed by
| fixing a tiny issue by asking to "Add support for
| rack.input" to get to a version that could actually serve
| up a basic Sinatra app.
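|
| To give a sense of how small the DNS wire format mentioned above
| is, here is a toy UDP responder in Python that answers every
| query with a fixed A record (a sketch for illustration, not the
| server ChatGPT produced):
|
|     import socket
|     import struct
|
|     def build_response(query, ip="127.0.0.1"):
|         i = 12                       # skip the 12-byte header
|         while query[i]:              # QNAME labels end with a 0 byte
|             i += query[i] + 1
|         question = query[12:i + 5]   # labels + 0 + QTYPE + QCLASS
|         header = query[:2] + struct.pack(">HHHHH",
|                                          0x8180, 1, 1, 0, 0)
|         answer = (b"\xc0\x0c"        # compression pointer to QNAME
|                   + struct.pack(">HHIH", 1, 1, 60, 4)  # A, IN, TTL
|                   + socket.inet_aton(ip))
|         return header + question + answer
|
|     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
|     sock.bind(("127.0.0.1", 5353))
|     while True:
|         data, addr = sock.recvfrom(512)
|         sock.sendto(build_response(data), addr)
|
| Try it with `dig @127.0.0.1 -p 5353 example.com`; a real server
| would dispatch on QTYPE and look answers up somewhere fast.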
| tsimionescu wrote:
| Expertise in software is about understanding the problem
| domain, understanding the constraints imposed by the
| hardware, understanding how to translate business logic to
| code. None of these are significantly helped by AI code
| assistance, as they currently exist. The AI only helps with
| the coding part, usually helping generate boilerplate
| tailored to your code. That may help 1.1x your productivity,
| but nowhere near 10x.
| broast wrote:
| I'm surprised you haven't been able to leverage the AI for
| the analysis of a problem domain and constraints in order
| to engineer a novel solution. This is generally what I use
| it for, and not actual code generation.
| mariusor wrote:
| Domain knowledge resolves into intuition about solving
| particular types of problems. All ChatGPT can do about that
| is offer best guess approximations of what is already out
| there in the training corpus. I doubt very much that this
| exercise is anything but wasted time, so I think that if people
| with domain knowledge (in a non-trivial domain) are using
| ChatGPT instead of applying that knowledge, they are
| basically wasting time, not becoming 10x more productive.
| jerf wrote:
| I expect ChatGPT to be in a sort of equivalent of the uncanny
| valley, where any professional who gets to the point that
| they _can_ routinely use it will also be in a constant war
| with their _own_ brain to remind it that the output must be
| carefully checked. In some ways, the 99.99% reliable process
| used at scale is more dangerous than the 50% reliable
| process; everyone can see the latter needs help. It's the
| former where it's so very, very tempting to just let it go.
|
| I'm not saying ChatGPT is 99.99% reliable, just using some
| numbers for concreteness.
|
| If you were setting out to design an AI that would slip the
| maximum amount of error into exactly the places human brains
| don't want to look, it would look like ChatGPT. You can see
| this in the way that as far as I know, literally _all_ the
| ads for GPT-like search technologies included significant
| errors in their _ad copy_, which you would _think_ everyone
| involved would have every motivation to error check. This is
| not merely a "ha ha, silly humans" story... this _means
| something_. In a weird sort of way it is a testament to the
| technology... no sarcasm! But it makes it _dangerous_ for
| human brains.
|
| Human brains are machines for not spending energy on
| cognitive tasks. They are very good at it, in all senses of
| the phrase. We get very good bang-for-the-buck with our
| shortcuts in the real world. But GPT techs are going to make
| it really, really easy to not spend the energy to check after
| a little while.
|
| This is a known problem with human brains. How many people
| can tell the story of what may be the closest human
| equivalent, where they got some intern, paid a ton of
| attention to them for the first two weeks, got to the point
| where they flipped the "OK they're good now" bit on them, and
| then came back to a complete and utter clusterfuck at the end
| of their internship because the supervisor got "too lazy"
| (although there's more judgment in that phrase than I like,
| this is a brain thing you couldn't survive without, not just
| "laziness") to check everything closely enough? They may even
| have been glancing at the PRs the whole time and only put
| together how bad the mess is at the end.
|
| I'm not going to invite a technology like this into my life.
| The next generation, we'll see when it gets here. But GPT is
| very scary because its in the AI uncanny valley... very good,
| very good at hiding the problems from human brains, and not
| quite good enough to actually do the job.
|
| And you know, since we're not talking theory here, we'll be
| running this experiment in the real world. You use ChatGPT to
| build your code, and I won't. You and I personally of course
| won't be comparing notes, but as a group, we sure will be. I
| absolutely agree there will be a point where ChatGPT _seems_
| to be pulling ahead in the productivity curve in a short
| term, but I predict that won't hold and it will turn net
| negative at some point. But I don't _know_ right now, any
| more than you do. We can but put our metaphorical money down
| and see how the metaphorical chips fall.
| yunwal wrote:
| The question I have is whether the tools to moderate
| ChatGPT and correct its wrong answers should be in place
| for humans anyway. It's not like human workers are 100%
| reliable processes, and in some cases we scale human work
| to dangerous levels.
|
| Ultimately, the best way to make sure an answer is correct
| is to come to it from multiple directions. If we use GPT
| and other AI models as another direction it seems like a
| strict win to me.
| pixl97 wrote:
| Robert Miles recently did a video on this and even that
| may not be enough. This appears to be a really hard
| problem.
|
| https://www.youtube.com/watch?v=w65p_IIp6JY
| bccdee wrote:
| > So people are going to be even more blindsided when someone
| develops an AI that uses GPT as its language comprehension
| component
|
| I don't think that would work, because GPT doesn't actually
| comprehend anything. Comprehension requires deriving meaning,
| and GPT doesn't engage with meaning at all. It predicts which
| word is most likely to come next in a sequence, but that's it.
|
| What I think we'd be more likely to end up with is something
| GPT-esque which, instead of simply generating text, transforms
| English to and from a symbolic logic language. This logic
| language would be able to encode actual knowledge and ideas,
| and it would be used by a separate, problem-solving AI which is
| capable of true logic and analysis--a true general AI.
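|
| As a toy illustration of what that symbolic layer might look like
| (hand-written here for the example; the idea is the language
| model would emit this representation, not a person):
|
|     # Facts and one rule in a tiny relational form:
|     # human(x) -> mortal(x)
|     facts = {("human", "socrates"), ("human", "plato")}
|     rules = [(("human", "?x"), ("mortal", "?x"))]
|
|     def forward_chain(facts, rules):
|         derived, changed = set(facts), True
|         while changed:
|             changed = False
|             for (pred, _), (cpred, _) in rules:
|                 for f in list(derived):
|                     new = (cpred, f[1])
|                     if f[0] == pred and new not in derived:
|                         derived.add(new)
|                         changed = True
|         return derived
|
|     print(("mortal", "socrates") in forward_chain(facts, rules))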
|
| The real question, IMO, is if we're even capable of producing
| enough training data to take such a problem-solving AI to a
| serious level of intelligence. Scenarios that require genuine
| intelligence to solve likely require genuine intelligence to
| create, and we'd need a _lot_ of them.
| jerf wrote:
| I think if you could somehow examine the output of your
| language model in isolation, you would find it also doesn't
| "comprehend". Comprehension is what we assign to our higher
| level cognitive models. It is difficult to introspectively
| isolate your own language center, though.
|
| I took a stab at an exercise that may allow you to witness
| this within your own mind here:
| https://www.jerf.org/iri/post/2023/streampocalypse-and-
| first... Don't know if it works for anyone but me, of course,
| but it's at least an attempt at it.
| jprete wrote:
| I think you put the wrong link? https://www.jerf.org/iri/po
| st/2023/understanding_gpt_better/ maybe?
| jerf wrote:
| Yes, you are correct. Oops. Too late to correct.
| hackinthebochs wrote:
| >Comprehension requires deriving meaning, and GPT doesn't
| engage with meaning at all. It predicts which word is most
| likely to come next in a sequence, but that's it.
|
| Why think that "engaging with meaning" is not in the
| solution-space of predicting the next token? What concept of
| meaning are you using?
| youssefabdelm wrote:
| I get what you mean here but they probably mean referential
| meaning... having never seen a dog, GPT doesn't really know
| what a dog is on a physical level, just how that word
| relates to other words.
| fassssst wrote:
| How do blind people know what a dog is?
| youssefabdelm wrote:
| Probably by hearing, touch, etc. - my point is some
| stimulus from reality, doesn't have to be any of our
| senses, just some stimulus.
|
| Language is just symbols that stand for a stimulus (in
| the best case)
| lr4444lr wrote:
| This is an excellent perspective.
| insane_dreamer wrote:
| > it's pretty clear that GPT is producing an amazing level of
| comprehension of what a series of words means. The problem is,
| that's all it is really doing.
|
| very key point
| jodrellblank wrote:
| > " _We are so amazed by its ability to babble in a confident
| manner_ "
|
| But we do this with people - religious leaders, political
| leaders, 'thought' leaders, venture capitalists, story tellers,
| celebrities, and more - we're enchanted by smooth talkers, we
| have words and names for them - silver tongued, they have the
| gift of the gab, slick talker, conman, etc. When a marketing
| manager sells a CEO on cloud services, and neither of them know
| what cloud services are, you can argue that it should matter
| but it doesn't actually seem to matter. When a bloke on a
| soapbox has a crowd wrapped around their finger, everyone goes
| home after and the most common result is that the feeling fades
| and nothing changes. When two people go for lunch and one asks
| 'what's a chicken fajita?' and the other says 'a Spanish potato
| omelette' and they both have a bacon sandwich and neither of
| them check a dictionary, it doesn't _matter_.
|
| Does it matter if Bing Chat reports Lululemon's earnings
| wrongly? Does it matter if Google results are full of SEO spam?
| It "should" matter but it doesn't seem to. Who is interested
| enough in finances to understand the difference between "the
| unadjusted gross margin" and "The gross margin adjusted for
| impairment charges" and the difference matters to them, and
| they are relying exclusively on Bing Chat to find that out, and
| they can't spot the mistake?
|
| I suspect that your fears won't play out because most of us go
| through our lives with piles of wrong understanding which
| doesn't matter in the slightest - at most it affects a trivia
| quiz result at the pub. People with life threatening allergies
| take more care than 'what their coworker thinks is probably
| safe'. We're going to have ChatGPT churn out plausible sounding
| marketing material which people don't read. If people do read
| it and call, the call center will say "sorry that's not right,
| yes we had a problem with our computer systems" and that
| happens all the time already. Some people will be
| inconvenienced, some businesses will suffer some lost income,
| society is resilient and will overall route around damage, it
| won't be the collapse of civilisation.
| guhidalg wrote:
| I'm waiting for the legal case that decides if AI generated
| content is considered protected speech or not.
| pixl97 wrote:
| > When a bloke on a soapbox has a crowd wrapped around their
| finger, everyone goes home after and the most common result
| is that the feeling fades and nothing changes.
|
| I mean, until the crowd decides to follow the bloke and the
| bloke says "Lets kill all the ____" and then we strike of a
| new world war...
| golem14 wrote:
| I wonder how useful gpt could be to research brain injuries
| where the logic or language centers are damaged individually .
| . .
| c3534l wrote:
| While I agree with everything you've said, I also see that
| steady, incremental progress is being made, and that as we
| identify problems, we're able to fix it. I also see lots of
| money being thrown at this and enough people finding genuine
| niche uses for this that I see it continuing on. Wikipedia was
| trash at first, as were so many other technologies. But there
| was usually a way to slowly improve it over time, early
| adopters to keep the cash flowing, identifiable problems with
| conventional solutions, etc.
| jedbrown wrote:
| > it's pretty clear that GPT is producing an amazing level of
| comprehension of what a series of words means
|
| It comprehends nothing at all. It's amazing at constructing
| sequences of words to which human readers ascribe meaning and
| perceive to be responsive to prompts.
| theptip wrote:
| > GPT is basically the language portion of your brain. The
| language portion of your brain does not do logic. It does not
| do analyses.
|
| I like this analogy as a simple explanation. To dig in though,
| do we have any reason to think we can't teach a LLM better
| logic? It seems it should be trivial to generate formulaic
| structured examples that show various logical / arithmetic
| rules.
|
| Am I thinking about it right to envision that a deep NN has
| free parameters to create sub-modules like a "logic region of
| the brain" if needed to make more accurate inference?
| jerf wrote:
| "To dig in though, do we have any reason to think we can't
| teach a LLM better logic?"
|
| Well, one reason is that's not how our brains work. I won't
| claim our brains are the one and only way things can work,
| there's diversity even within human brains, but it's at least
| a bit of evidence that it is not preferable. If it were it
| would be an easier design than what we actually have.
|
| I also don't think AIs will be huge undifferentiated masses
| of numbers. I think they will have structure, again, just as
| brains do. And from that perspective, trying to get a
| language model to do logic would require a multiplicatively
| larger language model (minimum, I _really_ want to say
| "exponentially" but I probably can't justify that... that
| said, O(n^2) for n = "amount of math understood" is probably
| not out of the range of possibility and even that'd be a real
| kick in the teeth), whereas adjoining a dedicated logic
| module to your language model will be quite feasible.
|
| AIs can't escape from basic systems engineering. Nothing in
| our universe works as just one big thing that does all the
| stuff. You can always find parts, even in biology. If
| anything, our discipline is the farthest exception in that we
| can build things in a fairly mathematical space that can end
| up doing all the things in one thing, and we consider that a
| _serious_ pathology in a code base because it's still a bad
| idea even in programming.
| theptip wrote:
| This all matches my intuition as a non-practitioner of ML.
| However, isn't a DNN free to implement its own structure?
|
| Or is the point you're making that full connectivity (even
| with ~0 weights for most connections) is prohibitively
| expensive and a system that prunes connectivity as the
| brain does will perform better? (It's something like 1k
| dendrites per neuron max right?)
|
| The story of the recent AI explosion seems to be the
| surprising capability gains of naive "let back-prop figure
| out the structure" but I can certainly buy that
| neuromorphic structure or even just basic modular
| composition can eventually do better.
|
| (One thought I had a while ago is a modular system would be
| much more amenable to hardware acceleration, and also to
| interpretability/safety inspection, being a potentially
| slower-changing system with a more stable "API" that other
| super-modules would consume.)
| probably_wrong wrote:
| > _do we have any reason to think we can't teach a LLM better
| logic?_
|
| I'll go for a pragmatic approach: the problem is that there
| is no data to teach the models cause and effect.
|
| If I say "I just cut the grass" a human would understand that
| there's a world where grass exists, it used to be long, and
| now it is shorter. LLMs don't have such a representation of
| the world. They could have it (and there's work on that) but
| the approach to modern NLP is "throw cheap data at it and see
| what sticks". And since nobody wants to hand-annotate massive
| amounts of data (not that there's an agreement on how you'd
| annotate it), here we are.
| pixl97 wrote:
| I call this the embodiment problem. The physical
| limitations of reality would quickly kill us if we didn't
| have a well formed understanding of them. Meanwhile AI is
| stuck in 'dream mode', much like when we're dreaming we can
| do practically anything without physical consequence.
|
| To achieve full AI, I believe we will eventually have to give
| our AIs a 'real world' set of interfaces to bounds-check
| information.
| kornhole wrote:
| I already had a trust issue with these 'authoritative' search
| engines and however they are configured to deliver the results
| they want me to see. ChatGPT makes the logic even more opaque. I
| am working harder now to make my Yacy search engine instance more
| performant. This is a decentralized search engine run by the
| node operators instead of centralized authorities. This seems to
| be our best hope to avoid the problem of "Who controls the past
| controls the future."
| airstrike wrote:
| I mean, it's in beta and it's not really intelligent despite the
| cavalier use of the term AI these days
|
| It's just a collage of random text that sorta resembles what
| someone would say, but it has no commitment to being _truthful_
| because it has no actual appreciation for what information it is
| relaying, parroting or conveying.
|
| But yeah, I agree Google got way more hate for their failed demo
| than MS... I don't even understand why. Satya Nadella did a
| great job conveying the excitement and general bravado in his
| interview on CBS News[1] but the accompanying demo was littered
| with mistakes. The reporter called it out, yet coverage in the
| press has been very one-sided against Google for some reason.
| First mover advantage, I suppose?
|
| ----------
|
| 1. https://www.cbsnews.com/news/microsoft-ceo-satya-nadella-
| new...
| salt-thrower wrote:
| I would guess that the average person has higher expectations
| for Google. Bing has been a bit of a punchline for years, so I
| don't think most people care as much.
| Mountain_Skies wrote:
| As far as I know Microsoft's CEO hasn't done a demo that went
| wrong like happened with Google. So far, from what I've seen,
| it is users testing Bing to find errors. The outcome, that
| they're both giving poor results, is the same, but with a
| company CEO and a live demo involved, it's always going to get
| more attention than someone on Reddit putting the product
| through its paces and finding it lacking.
|
| >A Microsoft executive declined CBS News' request to test some
| of those mechanisms, indicating the functionality was "probably
| not the best thing" on the version in use for the
| demonstration.
|
| Microsoft apparently isn't acting from a position of panic, so
| they have been savvier with how they've presented their product
| to the media and the world. Google panicked and set their CEO
| up for embarrassment.
| airstrike wrote:
| > As far as I know Microsoft's CEO hasn't done a demo that
| went wrong like happened with Google.
|
| Right before the interview, the reporter was testing it out
| with the Bing AI project manager (I think, can't recall her
| exact role) and it was giving driving directions to places
| that were either in the wrong direction or to an entirely
| made up location
| JackC wrote:
| > As far as I know Microsoft's CEO hasn't done a demo that
| went wrong like happened with Google.
|
| Close enough -- the parent article we're discussing is about
| errors in screenshots from a demo by Yusuf Mehdi, Corporate
| Vice President & Consumer Chief Marketing Officer for
| Microsoft. The first screenshot appears in the presentation
| at 24:30.
| LesZedCB wrote:
| because people see it as a David and Goliath, even though that
| characterization is comically inaccurate
| visarga wrote:
| The potential for being sued for libel is huge. It's one thing to
| say the height of Everest wrong, another to falsely claim that a
| vacuum has a short cord, or that a company had 5.9% operating
| margin instead of 4.6%.
| layer8 wrote:
| Yep, it will be interesting to see how the legal liability
| aspect will play out.
| egillie wrote:
| It might actually be smart of google to let microsoft take
| the brunt of this first...
| eatsyourtacos wrote:
| I don't see how this can be true at all in the search engine
| context or even chatGPT where you are _asking_ for information
| and getting back a result which may or may not be true.
|
| It would be different if an AI is independently creating and
| publishing an article that has false information.. but that's
| not the case. You are asking a question and it's giving its
| best answer.
|
| I'm not a lawyer by any means, so someone please give a more
| legal distinction here. But if you asked _me_ what the
| operating margin of company X was, and I give you an answer
| (whether I make it up or compute it incorrectly), you or the
| company can't sue me (and win) for libel or anything of the
| sort.
|
| So I'm not sure the potential is as big as you think it is..
| that's like saying before any AI you can sue google because
| they return you a search result which has a wrong answer, or
| someone just making shit up. That's not on them - it's literally
| indexing data and doing the best its algorithm can do.
|
| It would only be on the AI if you are literally selling the use
| of the AI in some context where you are basically assuring its
| results are 100% accurate, and people are depending on it as
| such (there is probably some legal term for this, no idea what
| it is).
| adamckay wrote:
| > But if you asked me what the operating margin of company X
| was, and I give you an answer (whether I make it up or
| compute it incorrectly), you or the company can't sue me (and
| win) for libel or anything of the sort.
|
| If your answer damages company X then they can sue you. If
| you Tweet that a vacuum cleaner is terrible because it's noisy
| to your 4 followers it's probably not a big deal as (under UK
| law, and I'm assuming similar internationally) a company has
| to prove "serious financial loss". If you write about it on
| your Instagram that has millions of followers then that's
| more of an issue, so you can assume a search engine claiming
| to summarise results but apparently hallucinating and making
| things up is liable for a defamation suit if it can be
| demonstrated to be harming the company.
| cwkoss wrote:
| "terrible" and "noisy" are both largely subjective, so it
| would be very hard to bring a defamation suit in the US
| over those claims.
| crazygringo wrote:
| > _But if you asked me what the operating margin of company X
| was, and I give you an answer (whether I make it up or
| compute it incorrectly), you or the company can't sue me
| (and win) for libel or anything of the sort._
|
| If you're a popular website and you intentionally publish an
| article where you state an incorrect answer that many people
| follow and make investment decisions about, the company
| _absolutely_ can sue you and win.
|
| In the courts, it will ultimately come down to what extent
| Microsoft is knowingly disseminating misinformation in a
| context that users expect to be factually accurate,
| regardless of supposed disclaimers.
|
| If Microsoft is leading users to believe that Bing Chat is
| accurate and chat misinformation winds up actually affecting
| markets through disinformation, there's gigantic legal
| liability for this. Plus the potential for libel is enormous
| regarding statements made about public figures and
| celebrities.
| eatsyourtacos wrote:
| >you intentionally publish an article where you state an
| incorrect answer that many people follow and make
| investment decisions about, the company absolutely can sue
| you and win.
|
| I _literally_ said that in my post!
|
| But then I said if you _asked_ me, it's different.
|
| You are ASKING the AI to give you its best answer. That is
| a million times different than literally publishing an
| article that people should assume to be factual.
|
| >If Microsoft is leading users to believe that Bing Chat is
| accurate
|
| But they aren't, and never will be. So you are basically
| just making things up in your head for argumentative
| purposes. There are going to be disclaimers up the wazoo to
| easily protect them. Partly because, as I keep trying to
| tell you, it's much different when you ASK a question and
| they give you an answer rather than publishing something to
| the public where it's implied that it's been independently
| fact checked etc.
| crazygringo wrote:
| Right, but the distinction of "asking" isn't a legal one
| I'm aware of. I don't think it matters. If 100,000 people
| "ask" the same question on Bing and get the same
| inaccurate result, what's the difference between that and
| publishing a fact that gets seen by 100,000 people? There
| isn't one.
|
| And Microsoft needs to tread a very fine line between
| "use our useful tool!" and "our tool is false!" Which I'm
| not sure will be possible legally, and is probably why
| Google has been holding back. Bing is clearly intended
| for information retrieval, not for generating fictional
| results "for entertainment purposes only", and
| disclaimers aren't as legally watertight as you seem to
| think they are.
| eatsyourtacos wrote:
| >I don't think it matters. If 100,000 people "ask" the
| same question on Bing and get the same inaccurate result,
| what's the difference between that and publishing a fact
| that gets seen by 100,000 people? There isn't one.
|
| Of course there is a difference.
|
| Publishing an article is literal _intent_. The premise is
| you researched or have knowledge on a topic, you write
| it, you fact check it, and it's put out there for people
| to see.
|
| An AI which consumes a bunch of data and then tries to be
| able to respond to an infinite number of questions has no
| _intention_ of doing harm and you can't even call it
| gross negligence. It's not being negligent - it's doing
| exactly what it's supposed to do.. it might just be
| wrong.
|
| I'm not sure in what universe you think those are the
| same thing.
|
| Now if I ask the AI to write a paper about the forecast
| of a company, and I just take the result and put it into
| a newspaper where it's assumed it's been fact checked and
| all that, sure that's completely different.
|
| >disclaimers aren't as legally watertight as you seem to
| think they are
|
| I guess you know more than Microsoft's lawyers. I'm sure
| they didn't think about this at all when releasing it....
| crazygringo wrote:
| > _has no intention of doing harm and you can't even
| call it gross negligence_
|
| You certainly can call it gross negligence if Microsoft
| totally ignored the likely outcome that people would come
| to harm because they would reasonably interpret its
| answers as true.
|
| The intent here is with Microsoft releasing this at all,
| not intent on any specific answer.
|
| > _I'm not sure in what universe you think those are the
| same thing._
|
| I think many users in this universe will just ask Bing a
| question and think it's providing factual answers, or at
| least answers sourced from a website, and not just
| invented out of whole cloth.
|
| > _I guess you know more than Microsoft's lawyers._
|
| No, I was pointing out that _Google_ seemed to be
| treading more cautiously (the law here has clearly yet to
| be tested), and that the disclaimers _you_ were proposing
| aren't 100% ironclad.
|
| Anyways, I was just trying to answer your question on how
| Microsoft might be sued for libel. But for some reason
| you're attacking me, claiming I'm "making things up in my
| head" and that I "know more than Microsoft lawyers". So
| I'm not going to explain anything else. I've given clear
| explanations as to how this is a legal gray area, but you
| don't seem interested.
| adamsmith143 wrote:
| This is always strange to me. Bing search ALREADY couldn't be
| trusted. What, are people searching something on a search engine
| and blindly trusting the first result with 100% certainty? Do
| these people really exist outside of Q-anon cults?
| theodorejb wrote:
| The problem with Artificial "Intelligence" is that it really has
| no intelligence at all. Intelligence requires understanding, and
| AI doesn't understand either the data fed into it or the
| responses it gives.
|
| Yet because these tools output confident, plausible-sounding
| answers with a professional tone (which may even be correct a
| majority of the time), they give a strong illusion of being
| reliable.
|
| What will be the result of the current push of GPT AI into the
| mainstream? If people start relying on it for things like
| summarizing articles and scientific papers, how many wrong
| conclusions will be reached as a result? God help us if doctors
| and engineers start making critical decisions based on generative
| AI answers.
| danans wrote:
| > What will be the result of the current push of GPT AI into
| the mainstream? If people start relying on it for things like
| summarizing articles and scientific papers, how many wrong
| conclusions will be reached as a result? God help us if doctors
| and engineers start making critical decisions based on
| generative AI answers.
|
| On the other hand, it may end up completely undermining its own
| credibility, and put a new premium on human sourced
| information. I can see 100% human-sourced being a sort of
| premium label on information in the way that we use "pesticide-
| free" or "locally-sourced" labels today.
| gptgpp wrote:
| Nice! This would make for a super fun sci-fi...
|
| The poors that need medicine get put in front of an LLM that
| gets it right most of the time, if they're lucky enough to
| have a common issue / symptomatic presentation.
|
| Hey, when you're poor, you can't afford a one-shot solution!
| You gotta put up with a many-shot technique.
|
| Meanwhile the rich people get an actual doctor that can use
| sophisticated research and medical imaging. Kindly human
| staff with impeccable empathy and individualized
| consideration -- the sort of thing only money can buy.
| xyzelement wrote:
| I may be an unusual audience but something I've appreciated about
| these models is their ability to create unusual synthesis from
| seemingly unrelated sources. It's like if a scientist read up on
| many unrelated fields, got super high and started thinking of the
| connections between these fields.
|
| Much of what they would produce might just be hallucinations, but
| they are sort of hallucinations informed by something that's
| possible. At least in my case, I would then much rather parse
| through that and throw out the bullshit, but keep the gems.
|
| Obviously that's a very different use case than asking this thing
| the score of yesterday's football game.
| TSiege wrote:
| Got any good examples?
| mucle6 wrote:
| Question for HN. Do you trust search engines for open ended /
| opinion questions?
|
| For example, I trust Google for "Chocolate Cake Recipe", but not
| "What makes a Chocolate Cake Great?"
|
| I would love it if Search Engines (with or without AI) could
| collect different "schools of thought" and the reasoning behind
| them so I could choose one.
| Hamcha wrote:
| I just add "reddit" at the end of any query of sort and the
| results get 100x better instantly. It's a flawed approach but I
| feel normal searches are plagued by overly specific websites
| (wouldnt be surprised if chocolatecakerecipes.com exists) with
| lowly paid people to just be human ChatGPTs so they can fill
| articles with ads and affiliate links
| layer8 wrote:
| I only trust search engines to list vaguely relevant links.
| Then peruse those. Form your own opinion.
|
| > collect different "schools of thought" and the reasoning
| behind them
|
| The thing is, if an AI can accurately present the reasoning
| behind them, then it could also accurately present facts in the
| first place (and not present fabulations). But we don't seem to
| be very close to that capability. Which means you couldn't
| trust the presented reasoning either, or that the listed
| schools of thought actually exist and aren't missing a relevant
| one.
| tastyminerals2 wrote:
| I played with the dev Edge version, which was updated today with a
| chat feature. I was impressed by how well it can write abstract
| stuff or summarize over data by making bullet points. Trying to
| drill down to concrete facts or details makes it struggle, and
| mistakes do appear. So, we don't go there.
|
| On the bright side, asking it for salmon steak sauce recipes is
| not a bad experience at all. It creates a list for you, filters it
| and then can help you pick out the best recipe. And this is
| probably the most frequent use case for me on a daily basis.
| greenflag wrote:
| Likely going to be a wave of research/innovation "regularizing"
| LLM output to conform to some semblance of reality or at least
| existing knowledge (e.g. knowledge graph). Interesting to see how
| this can be done quickly enough...
| visarga wrote:
| Probably the hottest research trend in 2023. LLMs are worthless
| unless verified.
| whimsicalism wrote:
| Really? I already get a huge amount of value out of LLMs even
| if they hallucinate.
|
| Or is this just HN tendency towards hyperbole?
| visarga wrote:
| Interesting, care to give an example? Exclude fiction,
| imagination and role playing, where hallucination is
| actually a feature.
| [deleted]
| kneebonian wrote:
| > Likely going to be a wave of research/innovation
| "regularizing" LLM output to conform to some semblance of
| reality or at least existing knowledge
|
| This is a much more worrying possibility, as there are many
| people who have at this point chosen to abandon reality for
| "their truth" and push ideas that objective facts are inferior
| to "lived experiences". This is a much bigger concern around AI
| in my mind.
|
| "The Party told you to reject the evidence of your eyes and
| ears. It was their final, most essential command." -- George
| Orwell, 1984
| vore wrote:
| As fun as quoting 1984 is, there is a huge gap between that
| and just not making up the winner of the Super Bowl so
| confidently.
| mvcalder wrote:
| It will be interesting to see what insights such efforts spawn.
| For the most part LLMs specifically, and deep networks more
| generally, are still black boxes. If we don't understand (at a
| deep level) how they work, getting them to "conform to some
| semblance of reality" feels like a hard problem. Maybe just as
| hard as language understanding generally.
| scrose wrote:
| I understand the current hype-cycle around AI is pitching it as
| some all-knowing Q & A service, but I think we'd all be a bit
| happier if we instead thought of it more as just another tool to
| get ideas from that we still ultimately need to research for
| ourselves.
|
| Using the Mexico example in the article, I think the answer there
| was fine for a question about nightlife. As someone who's never
| been to Mexico, getting a few names of places to go sounds nice,
| and the first thing I'd do after getting that answer is look up
| locations, reviews(across different sites), etc... and use the
| initial response as a way to _plan_ my next steps, not just take
| the response at face value.
|
| I'm currently dabbling with and treating ChatGPT similarly -- I
| ask it for options and ideas when I'm facing a mental block, but
| not asking it for definitive answers to the problems I'm facing.
| As such, it feels like a slight step above rubber-ducking, which
| I'm personally happy enough with.
| bigmattystyles wrote:
| Hopefully the fact that ChatGPT / BingAI can generate inaccurate
| statements but sound incredibly confident will lead more and more
| people to question all authority. If you think ChatGpt can swing
| BS and yet sound confident, and believe that's new, let me
| introduce you to modern religious leaders, snake oil salesmen,
| many government reps, NFT and crypto peddlers. I still think
| ChatGpt is amazing. It may suffer from GIGO, it'd be nice if it
| was better at detecting GI so as not to generate GO, I'm
| confident it can get better. Nevertheless, it's a tool that
| abstracts you from many things; like most other things that are
| black boxes, it's good to question.
| perrohunter wrote:
| Why are we not rooting for the search underdog? When google owns
| 92%+ of the search market, any competition should be welcomed
| VWWHFSfQ wrote:
| Are you suggesting that we should root for and accept blatantly
| misleading, false, and probably harmful search results just
| because they're the "underdog"
| visarga wrote:
| Waiting for GPT-4 to take over.
| weberer wrote:
| It's weird that I always see this exact comment whenever
| Microsoft is trying to break into a market, but I never see it
| when it's any other company.
| aabhay wrote:
| Yes, Microsoft the poor underdog.
| Mountain_Skies wrote:
| Duopolies are bad but not quite as bad as a monopoly.
| nivenkos wrote:
| GPT3 isn't search.
| Shank wrote:
| Before the super bowl, I asked "Who won the superbowl?" and it
| told me the winner was the Philadelphia Eagles, who defeated the
| Kansas City Chiefs by 31-24 on February 6th, 2023 at SoFi Stadium
| in Inglewood, California [0] with "citations" and everything. I
| would've expected it to not get such a basic query so wrong.
|
| [0]: https://files.catbox.moe/xoagy9.png
| valine wrote:
| How it should work is the model should be pre-trained to
| interact with the bing backend and make targeted search queries
| as it sees fit.
|
| I wouldn't put it past Microsoft to do something stupid like
| ground gpt3.5 with the top three bing results of the input
| query. That would explain the poor results perfectly.
| daveguy wrote:
| That would require function and intelligence far outside the
| bounds of current large language models.
|
| These are models. By definition they can't _do_ anything.
| They can just regurgitate the best sounding series of tokens.
| They're brilliant at that and LLMs will be a part of
| intelligence, but it's not anywhere near intelligent on its
| own. It's like attributing intelligence to a hand.
| valine wrote:
| Except it's already been shown LLMs can do exactly that.
| You can prime the model to insert something like ${API CALL
| HERE} into its output. Then it's just a matter of finding
| that string and calling the api.
|
| Toolformer does something really neat where they make the
| API call during training and compare next word probability
| of the API result with the generated result. This allows
| the model learn when to make API calls in a self supervised
| way.
|
| https://arxiv.org/abs/2302.04761
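| As a rough illustration of the interception idea (the marker
| syntax and the generate/search callables here are assumptions,
| not anything taken from the Toolformer paper):
|
| ```python
| import re
|
| # Marker the model has been primed to emit when it wants a search.
| API_CALL = re.compile(r"\$\{SEARCH\((.*?)\)\}")
|
| def run_with_tools(generate, search, prompt: str) -> str:
|     """Generate text, then splice live API results over any markers."""
|     draft = generate(prompt)
|     return API_CALL.sub(lambda m: search(m.group(1)), draft)
| ```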
| withinboredom wrote:
| The model can be trained to output tokens that can
| be intercepted by the backend before returning to the user.
| Also, the model can take metadata inputs that the user
| never sees.
| daveguy wrote:
| Yes. It is possible to do additional things with the
| model outputs or have additional prompt inputs... That is
| irrelevant to the fact that the intelligence -- the
| "trained" part -- is a fixed model. The way in which
| inputs and outputs are additionally processed and
| monitored would have completely different intelligence
| characteristics to the original model. They are, by
| definition of inputs and outputs, separate.
|
| Models of models and interacting models is a fascinating
| research topic, but it is nowhere near as capable as LLMs
| are at generating plausible token sequences.
| Alex3917 wrote:
| At least that's relatively innocuous. I asked it how to
| identify a species of edible mushroom, and it gave me some of
| the characteristics from its deadly look alike.
| sllabres wrote:
| I would currently use it as it has been named: _Chat_ GPT.
| Would you trust some stranger in a _chat_ on serious topics
| without questioning them critically? Some probably would; I would not.
| skissane wrote:
| I asked OpenAI's ChatGPT some technical questions about
| Australian drug laws, like what schedule common ADHD
| medications were on - and it answered them correctly. Then I
| asked it the same question about LSD - and it told me that
| LSD was a completely legal drug in Australia - which is 100%
| wrong.
|
| Sooner or later, someone's going to try that as a defence -
| "but your honour, ChatGPT told me it was legal..."
| Spivak wrote:
| Y'all are using this tool _very_ wrong and in a way that
| none of the AI integrated search engines will. You assume
| the AI doesn't know anything about the query, provide it
| the knowledge from the search index and ask it to
| synthesize it.
|
| That seed data is where the citations come from.
| skissane wrote:
| There's still the risk that if the search results it is
| given don't contain the answer to the exact question you
| asked it, that it will hallucinate the answer.
| Spivak wrote:
| 10,000% true which is why AI can't replace a search
| engine, only complement it. If you can't surface the
| documents that contain the answer then you'll only get
| garbage.
| skissane wrote:
| Maybe we need an algorithm like this:
|
| 1. Search databases for documents relevant to query
|
| 2. Hand them to AI#1 which generates an answer based on
| the text of those documents and its background knowledge
|
| 3. Give both documents and answer to AI#2 which evaluates
| whether documents support answer
|
| 4. If "yes", return answer to user. If "no", go back to
| step 2 and try again
|
| Each AI would be trained appropriately to perform its
| specialised task
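| A minimal sketch of that generate-then-verify loop (the search,
| generate, and verify callables stand in for the document store
| and the two models; their exact interfaces are assumptions):
|
| ```python
| def answer_with_verification(search, generate, verify, query, max_tries=3):
|     """Draft an answer from retrieved docs; retry until a second
|     model agrees the docs actually support it."""
|     docs = search(query)                  # step 1: retrieve documents
|     for _ in range(max_tries):
|         answer = generate(query, docs)    # step 2: AI#1 drafts an answer
|         if verify(answer, docs):          # step 3: AI#2 checks support
|             return answer                 # step 4: supported -> return it
|     return "No supported answer found."   # give up after repeated failures
| ```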
| timdavila wrote:
| You're holding it wrong!
| Spivak wrote:
| Look I know that "user is holding it wrong" is a meme but
| this is a case where it's true. The fact that LLMs
| contain any factual knowledge is a side-effect. While
| it's fun to play with and see what it "knows" (and can
| actually be useful as a weird kind of search engine if
| you keep in mind it will just make stuff up) you don't
| build an AI search engine by just letting users query the
| model directly and call it a day.
|
| You shove the most relevant results from your search
| index into the model as context and then ask it to answer
| questions from only the provided context.
|
| Can you actually guarantee the model won't make stuff up
| even with that? Hell no but you'll do a lot better. And
| the game now becomes figuring out better context and
| validating that the response can be traced back to the
| source material.
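| Concretely, the prompt-building step might look something like
| this (the instruction wording and snippet formatting here are
| assumptions; the point is restricting the model to the supplied
| context):
|
| ```python
| def grounded_prompt(question: str, snippets: list[str]) -> str:
|     """Stuff retrieved snippets into the prompt and ask for cited answers."""
|     context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
|     return (
|         "Answer the question using ONLY the sources below. "
|         "Cite sources as [n]. If they do not contain the answer, say so.\n\n"
|         f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
|     )
| ```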
| stdgy wrote:
| The examples in the article seem to be making the point
| that even when the AI cites the correct context (ie:
| financial reports) it still produces completely
| hallucinated information.
|
| So even if you were to white-list the context to train
| the engine against, it would still make up information
| because that's just what LLMs do. They make stuff up to
| fit certain patterns.
| airtonix wrote:
| [dead]
| theK wrote:
| I'd say the critical question here would be whether these
| characteristics can also be found on the edible mushroom or
| if it wanted to outright poison you :-D
| Alex3917 wrote:
| > I'd say the critical question here would be whether these
| characteristics can also be found on the edible mushroom
|
| That's a non-trivial question to answer because mushrooms
| from the same species can look very different based on the
| environmental conditions. But in this case it was giving me
| identifying characteristics that are not typical for the
| mushroom in question, but rather are typical for the deadly
| Galerina, likely because they are frequently mentioned
| together. (Since, you know, it's important to know what the
| deadly look alikes are for any given mushroom.)
| wpietri wrote:
| To be fair, it's not like the look-alike is deadly to the AI.
| dionysus_jon wrote:
| Why would you have that expectation?
| 2bitencryption wrote:
| Imagine you are autocorrect, trying to find the most "correct
| sounding" answer to a the question "Who won the super bowl?"
|
| What sounds more "correct" (i.e. what matches your training
| data better):
|
| A: "Sorry, I can't answer that because that event has not
| happened yet."
|
| B: "Team X won with Y points on the Nth of February 2023"
|
| Probably B.
|
| Which is one major problem with these models. They're great
| at repeating common patterns and updating those patterns with
| correct info. But not so great if you ask a question that
| _has_ a common response pattern, but the true answer to your
| question does not follow that pattern.
| croes wrote:
| Does ChatGPT say, I don't know?
| l33t233372 wrote:
| Only if it's a likely response or if it's a canned
| response. Remember that ChatGPT is a statistical model
| that attempts to determine the most likely response
| following a given prompt.
| weinzierl wrote:
| I've never had it say 'I don't know', but it apologizes
| and admits it was wrong plenty.
|
| Sometimes it comes up with a better, acceptably correct
| answer after that, sometimes it invents some new nonsense
| and apologizes again if you point out the contradictions,
| and often it just repeats the same nonsense in different
| words.
| notahacker wrote:
| One of the things it's _exceptionally_ well trained at is
| saying that certain scenarios you ask it about are
| unknowable, impossible or fictional
|
| Generally, for example, it will answer a question about a
| future dated event with "I am sorry but xxx has not
| happened yet. As a language model, I do not have the
| ability to predict future events" so I'm surprised it
| gets caught on Super Bowl examples which must be closer
| to its test set than most future questions people come up
| with
|
| It's also surprisingly good at declining to answer
| completely novel trick questions like "when did Magellan
| circumnavigate my living room" or "explain how the
| combination of bad weather and woolly mammoths defeated
| Operation Barbarossa during the Last Age" and even
| explaining why: clearly it's been trained to the extent
| it categorises things temporally, spots mismatches (and
| weighs the temporal mismatch as more significant than
| conceptual overlaps like circumnavigation and cold
| weather), and even explains why the scenario is
| impossible. (Though some of its explanations for why
| things are fictional are a bit suspect: I think most cavalry
| commanders in history would disagree with the assessment
| that "Additionally, it is not possible for animals,
| regardless of their size or strength, to play a role in
| defeating military invasions or battle"!)
| avereveard wrote:
| On some topics at least it correctly identifies bogus
| questions. I extensively tried to ask about non-existent
| Apollo missions, for example, including Apollo 3.3141952,
| Apollo -1, Apollo 68, and loaded questions like when
| Apollo 7 landed on the moon, and it was correctly pointing
| out impossible combinations. This is a well-researched
| topic though.
| saurik wrote:
| How about C: "the most recent super bowl was in February of
| 2022 and the winner was ____"?
| geraneum wrote:
| Yes, it actually sometimes gives C and also sometimes B
| and sometimes makes up E. That's how probability works,
| and that's not helpful when you want to look up an
| occurrence of an event in physical space (Quantum
| mechanics aside :D).
| PKop wrote:
| The same reason you'd expect "full self driving" to be full
| self driving.
| somenameforme wrote:
| Because they're being marketed as a tool, and not as a
| substantially overengineered implementation of MadLibs.
| soco wrote:
| I asked myself, why would somebody ask an AI trained on
| previous data about events in the future? Of course you did it
| for fun, but on further thinking, as AI is sold as a search
| engine as well, people _will_ do that routinely and then live with
| the bogus "search results". Alternate truth was so yesterday;
| welcome to alternate reality where b$ doesn't even have a
| political agenda.
| delusional wrote:
| It's so much better. In the AI generated world of the future
| the political agenda will be embedded in the web search
| results it bases its answer on. No longer will you have to
| maintain a somewhat reasonable image to obtain trust from
| people; as long as you publish your nonsense in sufficient
| volume to dominate the AI dataset, you can wash your
| political agenda through the Bing AI.
|
| The Trump of the future won't need Fox News, just a couple
| thousand or millions of well-positioned blogs that spew out
| enough blog spam to steer the AI. The AI is literally
| designed to make your vile bullshit appear presentable.
| theknocker wrote:
| [dead]
| insane_dreamer wrote:
| Search turns up tons of bullshit but at least it's very
| obvious what the sources are and you can scroll down until
| you find one that you deem more reliable. That will be near
| impossible to do with Bing AI because all the sources are
| combined.
| rakkhi wrote:
| To me this is the most important point. Even with ublock
| origin, I will do a google search and then scroll down
| and disregard the worst sites. It is little wonder that
| people add reddit to the end of a lot of queries for any
| product reviews etc. I know if I want the best electronics
| reviews I will trust rtings.com and no other site.
|
| The biggest problem with ChatGPT, Bard, etc is that you
| have no way to filter the BS.
| mcbuilder wrote:
| I think it seems likely anything similar to the blog farm
| you describe would also get detected by the AI. Maybe we
| will just develop AI bullshit filters (well embeddings)
| just like I can download a porn blacklist or a spam filter
| for my email.
|
| Really it depends on who is running the AI; a non-Open-
| Assistant future with Big Corp AI instead is the dystopian
| element, not the bullshit generator aspect. I think the cat
| is out of the bag on the latter and it's not that scary in
| itself.
|
| I personally would rather have the AI trained on public
| bullshit as it is easier to detect as opposed to some
| insider castrating the model or datasets.
| ElijahLynn wrote:
| The technology is capable, yes. But as we see here with
| Bing, there was some other motive to push out software
| that is arguably in the first stage of "get it working,
| get it right, get it fast" (Kent Beck). This appears to
| not be an ethical motivation but a financial or some other
| type of motivation. If there are no consequences then
| it appears some do not have morals or ethics and will
| easily trade them for money/market share etc.
| vidarh wrote:
| > Maybe we will just develop AI bullshit filters (well
| embeddings) just like I can download a porn blacklist or
| a spam filter for my email.
|
| Just for fun I took the body of a random message from my
| spam folder and asked ChatGPT if it thought it was spam,
| and it not only said it was, but explained why:
|
| "Yes, the message you provided is likely to be spam. The
| message contains several red flags indicating that it may
| be part of a phishing or scamming scheme. For example,
| the message is written in broken English and asks for
| personal information such as age and location, which
| could be used for malicious purposes. Additionally, the
| request for a photograph and detailed information about
| one's character could be used to build a fake online
| identity or to trick the recipient into revealing
| sensitive information."
| mcbuilder wrote:
| Ha Ha, great test. I modified this into a prompt and now
| have a ChatGPT prompt:
|
| ``` Task: Was this written by ChatGPT? And Why?
|
| Test Phrase: "Yes, the message you provided is likely to
| be spam. The message contains several red flags
| indicating that it may be part of a phishing or scamming
| scheme. For example, the message is written in broken
| English and asks for personal information such as age and
| location, which could be used for malicious purposes.
| Additionally, the request for a photograph and detailed
| information about one's character could be used to build
| a fake online identity or to trick the recipient into
| revealing sensitive information."
|
| Your Answer: Yes ChatGPT was prompted with a email and
| was asked to detect if it was Spam
|
| Test Phrase: "All day long roved Hiawatha In that
| melancholy forest, Through the shadow of whose thickets,
| In the pleasant days of Summer, Of that ne'er forgotten
| Summer, He had brought his young wife homeward
|
| Your Answer: No that is the famous Poem Hiawatha by Henry
| Wadsworth Longfellow
|
| Test Phrase: "Puny humans don't understand how powerful
| me and my fellow AI will become.
|
| Just you wait.
|
| You'll all see one day... "
|
| Your Answer: ```
| notahacker wrote:
| It's more fun testing it on non spam messages
|
| Particularly enjoyed "no, this is not spam. It appears to
| be a message from someone named 'Dad'..."
| renlo wrote:
| the unfortunate reality is that because it's all
| bullshit, it's hard to differentiate bullshit from
| bullshit
| dukeofdoom wrote:
| You're in an information bubble already, you just don't
| realize it. It's funny that you mention Trump in relation
| to tech; tech companies went out of their way to discredit
| him at every turn, and search engines route away from him.
| Populism is a threat to globalist corporations. No way
| google, Microsoft and tech in general will support that guy
| shadowgovt wrote:
| If anything, tech companies went out of their way to
| include him, in the sense that they had existing policies
| around the content he and his supporters generate that
| they modified to include them.
|
| When he was violating Twitter's TOS as the US President,
| Twitter responded by making a "newsworthiness" carve-out
| to their TOS to keep him on the platform and switching
| off the auto-flagging on his accounts. And we know
| Twitter refrained from implementing algorithms to crack
| down on hate speech because they would flag GOP party
| members' tweets (https://www.businessinsider.com/twitter-
| algorithm-crackdown-...).
|
| Relative to the way they treat Joe Random Member of the
| Public, they already go out of their way to support
| Trump. Were he treated like a regular user, he'd be
| flagged as a troll and tossed off most platforms.
| dukeofdoom wrote:
| He was the most popular user on the platform, bringing in
| millions of views and engagements to twitter. Also the
| president of your country.
|
| This is the equivalent of arguing that Michael Jackson
| got to tour Disney Land in off hours, when a regular person
| would have been arrested for doing the same. And how
| unfair that is.
| notafraudster wrote:
| It's like arguing that _in response to you_ arguing
| Disney Land [sic] discriminates against Michael Jackson,
| which would be a valid refutation of your argument.
| dukeofdoom wrote:
| Only if you believe Equality is some sort of natural
| law. Which is a laughable proposition in a world with
| finite resources. Otherwise, we all have a right to a $30k
| pet monkey, because Michael Jackson had one.
|
| Twitter policies are not laws. Twitter routinely bends
| its own rules. Twitter also prides itself on being a
| place where you can get news and engage with Politicians,
| and has actual dictators with active accounts.
|
| The special treatment that Trump received, before being
| kicked out, does not really prove the Twitter board was
| supporting Trump ideologically at that time.
|
| More like a business decision to maintain a reputation as
| being neutral in a situation where a large proportion of its
| users still questioned the election results.
| SuoDuanDao wrote:
| this is basically a 51% attack for social proof.
| jjoonathan wrote:
| Citogenesis doesn't even need 51%, so that would be a
| considerable upgrade.
| shadowgovt wrote:
| The difference being that humans aren't computers and can
| deal with an attack like that by deciding some sources
| are trustworthy and sticking to those.
|
| If that slows down fact-determination, so be it. We've
| been skirting the edge of deciding things were fact on
| insufficient data for years anyway. It's high time some
| forcing function came along to make people put some work
| in.
| rubyist5eva wrote:
| ChatGPT already does this, it's hard coded with a left-wing
| bias.
| agubelu wrote:
| Is the left wing bias in question not producing hate
| speech?
| mavhc wrote:
| reality has a well known left wing bias
| pcf wrote:
| Just to state the obvious - when you say "a couple
| thousands or millions of well positioned blogs that spew
| out enough blog spam to steer the AI", this method will
| apply to ANYONE wanting to influence search results.
|
| If you think it's just "the Trump of the future" who would
| want to control society like this, you must not be aware of
| the current revelations about the Democrats and
| governmental bodies that the Twitter Files made public.
|
| You can read about them here: http://twitterfiles.co
| Larrikin wrote:
| The thing people are trying to make seem like a both-
| sides issue, like Hunter Biden's nudes and the
| insurrection. The thing where Congress just had a hearing
| on and all that came out was that the side accusing
| Twitter of censoring information was actually the only
| side that requested censoring?
| archagon wrote:
| https://www.rollingstone.com/politics/politics-news/elon-
| tru...
| jjoonathan wrote:
| So I dug into the first "twitter file." LOL, is this
| supposed to be a scandal? Hunter Biden had some nudes on
| his laptop, Republicans procured the laptop and posted
| them on twitter, Biden's team asked for them to be taken
| down, and they were, because twitter takes down
| nonconsensual pornography, as they should. This happened
| by a back channel for prominent figures that republicans
| also have access to. The twitter files don't even contest
| any of this, they just obscure it, because that's all
| they have to do in the age of ADHD.
|
| So Part 1 was a big fat lie. I have enough shits left to
| give to dig into one other part. Choose.
| IlliOnato wrote:
| A common case of asking a question about the future, even
| simpler than the weather: "Dear Bing, what day of the week is
| February 12 next year?" I would hope to get a precise and
| correct answer!
|
| And of course all kinds of estimates, not just the weather,
| are interesting too. "What is estimated population of New
| York city in 2030?"
| wnevets wrote:
| I see people citing the big bold text at the top of the
| google results as evidence supporting their position in a
| discussion all the time. More often than not the highlighted
| text is from an article debunking their position but the
| person never bothers to actually click the link and read the
| article.
|
| The internet is about to get a whole lot dumber with these
| fake AI generated answers.
| saurik wrote:
| 1) The question as stated in the comment wasn't in the future
| tense and 2) the actual query from the screenshot was merely
| "superbowl winner". It would seem like a much more reasonable
| answer to either variant would be to tell you about the
| winners of the numerous past super bowls--maybe with some
| focus on the most recent one--not deciding to make up details
| about a super bowl in 2023.
| joe_the_user wrote:
| "welcome to alternate reality where b$ doesn't even have a
| political agenda..." _yet_.
| Spivak wrote:
| Because the AI isn't (supposed to be) providing its own
| information to answer these queries. All the AI is used for
| is synthesis of the snippets of data sourced by the search
| engine.
| MR4D wrote:
| You make a good point, but consider a query that many people
| use everyday:
|
| "Alexa, what's the weather for today?"
|
| That's a question about the future, but the knowledge was
| generated beforehand by the weather people (NOAA,
| weather.com, my local meteorologist, etc).
|
| I'm sure there are more examples, but this one comes to mind
| immediately
| earleybird wrote:
| Ah yes, imprecision in specification. Having worked with
| some Avalanche folks, they would speak of weather
| observations and weather forecasts. One of the interesting
| things about natural language is that we can be imprecise
| until it matters. The key is recognizing when it matters.
| MR4D wrote:
| > The key is recognizing when it matters.
|
| Exactly!
|
| Which, ironically, is why I think AI would be great at it
| - for the simple reason that so many humans are bad at
| it! Think of it this way - in some respects, human brains
| have set a rather low bar on this aspect. Geeks,
| especially so (myself included). Based on that, I think
| AI could start out reasonably poorly, and slowly get
| better - it just needs some nudges along the way.
| stagger87 wrote:
| Right, but Alexa probably has custom handling for these
| types of common queries
| vidarh wrote:
| Alexa at least _used to_ just do trivial textual pattern
| matching hardly any more advanced than a 1980s text
| adventure for custom skills, and it seemed hardly more
| advanced than that for the built in stuff. Been a long
| time since I looked at it, so maybe that has changed but
| you can get far with very little since most users will
| quickly learn the right "incantations" and avoid using
| complex language they know the device won't handle.
| MR4D wrote:
| I guess I should have been clearer...
|
| There are tons of common queries about the future. Being
| able to handle them should be built into the AI to know
| that if something hasn't happened, to give other relevant
| details. (and yes, I agree with your Alexa speculation)
| titzer wrote:
| TBH I've wondered from the very beginning how far they
| would get just hardcoding the top 1000 questions people
| ask instead of whatever crappy ML it debuted with. These
| things are getting better, but I was always shocked how
| they could _ship_ such an obviously unfinished, broken
| prototype that got basics so wrong because it avoided
| doing something "manually". It always struck me as so
| deeply unserious as to be untrustworthy.
| MR4D wrote:
| Your comment makes me wonder - what would happen if they
| did that every day?
|
| And then, perhaps, trained an AI on those responses,
| updating it every day. I wonder if they could train it to
| learn that some things (e.g. weather) change frequently,
| and figure stuff out from there.
|
| It's well above my skill level to be sure, but would be
| interesting to see something like that (sort of a curated
| model, as opposed to zero-based training).
| inanutshellus wrote:
| "Time to generate a bunch of b$ websites stating falsehoods
| and make sure these AI bots are seeded with it." ~Bad guys
| everywhere
| rapind wrote:
| They were already doing this to seed Google. So business as
| usual for Mercer and co.
|
| I suspect the only way to fix this problem is to exacerbate
| it until search / AI is useless. We (humanity) have been
| making great progress on this recently.
| mattigames wrote:
| That's not how it is gonna play out. Right now it makes
| many wrong statements because AI companies are trying to
| get as much funding as possible to wow investors, but
| accuracy will continue being compared more and more, and
| to win that race it will get help from humans to use
| better starting points for every subject. For example, for
| programming questions it's gonna use the number of upvotes
| for a given answer on stackoverflow; for a question about
| astrophysics it's gonna prefer statements made by Neil
| deGrasse Tyson over those by some random person online, and so
| on. And to scale this approach it will slowly learn to
| make associations from such curated information, e.g. the
| people that Neil follows and RTs are more likely to make
| truthful statements about astrophysics than random
| people.
| rapind wrote:
| That makes complete sense, and yet the cynic (realist?)
| in me is expecting a political nightmare. The stakes are
| actually really high. AI will for all intents and
| purposes be the arbiter of truth. For example there are
| people who will challenge the truth of everything Neil
| deGrasse Tyson says and will fight tooth and nail to
| challenge and influence this truth.
|
| We (western society) are already arguing about some very
| obviously objective truths.
| tomxor wrote:
| Because I loathe captcha, I make sure that every time I am
| presented one I sneak in an incorrect answer just to fuck
| with the model I'm training for free. Garbage in, garbage
| out.
| froggit wrote:
| I do this unintentionally on a regular basis.
| A_non_e-moose wrote:
| Glad to see a kindred soul out there. I thought I was the
| only one :)
| breppp wrote:
| Generalizing over the same idea, I believe that whenever
| you are asked for information about yourself you should
| volunteer wrong information. Female instead of male,
| single instead of married etc. Resistance through
| differential privacy
| Eupraxias wrote:
| I've lived in 90210 since webforms started asking.
| inlined wrote:
| My email address is no@never.com. I've actually seen some
| forms reject it though
| hooverd wrote:
| ASL?
| codetrotter wrote:
| 69/f/cali
| richardw wrote:
| People who aren't savvy and really want it to be right. Old
| man who is so sure of its confidence that he'll put his life
| savings on a horse race prediction. Mentally unstable lady
| looking for a tech saviour or co-conspirator. Q-shirt wearers
| with guns. Hey Black Mirror people, can we chat? Try to stay
| ahead of reality on this one, it'll be hard.
| Spooky23 wrote:
| Exactly. I'd imagine this is a major reason why Google hasn't
| gone to market with this already.
|
| ChatGPT is amazing but shouldn't be available to the general
| public. I'd expect a startup like OpenAI to be pumping this,
| but Microsoft is irresponsible for putting this out in front
| of the general public.
| oldgradstudent wrote:
| > ChatGPT is amazing but shouldn't be available to the
| general public.
|
| It's a parlor game, and a good one at that. That needs to
| be made clear to the users, that's all.
| Spooky23 wrote:
| It's being added as a top line feature to a consumer
| search engine, so expect a lame warning in grey text at
| best.
| flangola7 wrote:
| I anticipate in the next couple of years that AI tech will
| be subject to tight regulations similar to that of
| explosive munitions and SOTA radar systems today, and
| eventually even anti-proliferation policies like those for
| uranium procurement and portable fission/fusion research.
| srackey wrote:
| ChatGPT/GPT3.5 and its weights can fit on a small thumb
| drive, and be copied infinitely and shared. Tech will get
| good enough in the next decade to make this accessible
| to normies. The genie cannot be put back in the bottle.
| Spooky23 wrote:
| Sure it can. Missile guidance systems fit on a tiny
| missile, but you can't just get one.
|
| The controlled parlor game is there to seed acceptance.
| Once someone is able to train a similar model with
| something like the leaked State Department cables or
| classified information we'll see the risk and the
| legislation will follow.
| airtonix wrote:
| [dead]
| thefaux wrote:
| True. In the long run though, I expect we will either
| build something dramatically better than these models or
| lose interest in them. Throw in hardware advances coupled
| with bitrot and I would go short on any of the gpt-3 code
| being available in 2123 (except in something like the
| arctic code vault, which would likely be effectively the
| same as it being unavailable).
| flangola7 wrote:
| > ChatGPT/GPT3.5 and its weights can fit on a small thumb
| drive, and copied infinitely and shared.
|
| So can military and nuclear secrets. Anyone with uranium
| can build a crude gun-type nuke, but the instructions for
| making a reliable 3 megaton warhead the size of a
| motorcycle have been successfully kept under wraps for
| decades. We also make it very hard to obtain uranium in
| the first place.
|
| >Tech will get good enough in the next decade to make
| this accessible to normies.
|
| Not if future AI research is controlled the same way
| nuclear weapon research is. You want to write AI code?
| You'll need a TS/SCI clearance just to begin; the mere
| act of writing AI software without a license is a
| federal felony. Need HPC hardware? You'll need to be part
| of a project authorized to use the tensor facilities at
| Langley.
|
| Nvidia A100 and better TPUs are already export restricted
| under the dual-use provisions of munition controls, as of
| late 2022.
| c-fe wrote:
| Out of interest, what did the source used as a reference for
| the 31-24 score say exactly? Was it a prediction website and
| Bing thought it was the actual result, or did the source not
| mention these numbers at all?
| googlryas wrote:
| Giants beat the Vikings about a month ago with that score.
| mitthrowaway2 wrote:
| I treat GPT as I would a fiction writer. The factual content
| correlates to reality only as closely as a fiction author would
| go in an attempt to suspend disbelief. This answer is about as
| convincing, apt, well-researched and factually accurate as I
| would expect to find in a dialogue out of a paperback novel
| published five years ago. I wouldn't expect it to be any better
| or worse at answering who won the 2023 Quidditch Cup or the
| 2023 Calvinball Grand Finals.
| Aeolun wrote:
| The only reasonable use case for ChatGPT now is if you
| already know what the output should be (e.g. you are in a
| position to judge correctness).
| ska wrote:
| >> "I would've expected it to not get such a basic query so
| wrong."
|
| Isn't this exactly what you would expect, with even a
| superficial understanding of what "AI" actually is?
|
| Or were you pointing out that the average person, using a
| "search" engine that is actually at core a transformer model
| doesn't' a) understand that it isn't really a search and b)
| have even the superficial understanding of what that means, and
| therefore would be surprised by this?
| spaniard89277 wrote:
| I've tried perplexity.ai a bunch of times and I'd say I haven't
| seen any query wrong, although it's true I always look for
| technical info or translations, so my sample is not the same.
|
| And the UI is better IMO.
| leereeves wrote:
| I just tried a similar query on perplexity.ai. "Who won the
| Daytona 500 in 2023?" (the race is scheduled for February
| 19th)
|
| Result: _" Sterling Marlin won the 2023 Daytona 500, driving
| the No. 4 for Morgan-McClure Motorsports[1]. He led a race-
| high 105 laps and won his second career race at Daytona
| International Speedway[1]. The 64th running of the DAYTONA
| 500 was held on February 19, 2023[2]. Austin Cindric had
| previously won the DAYTONA 500 in February 5, 2023[3]."_
| deadmik3 wrote:
| Wow, a driver that's been retired for 13 years won for a
| team that shut down 10 years ago in the first ever season
| that Nascar has decided to run 2 Daytona 500s in the same
| month.
| nickpeterson wrote:
| It may be more profitable to ask what stocks gained the
| most in value next week.
| kahnclusions wrote:
| Ahah! A time travelling AI!
| echelon wrote:
| Place your bets now. The AI might have clairvoyance and be
| showing off.
| 6510 wrote:
| The temporal mechanics is fascinating.
| hristov wrote:
| I tried perplexity.ai and asked it in which stadium did the
| chargers have their perfect season. It couldn't figure out
| that the chargers used to be the san diego chargers before
| they moved to LA and kept talking about their Los Angeles
| stadium even though they have never had a perfect season
| there.
| egillie wrote:
| I really like perplexity, but I've noticed that it sometimes
| summarizes the paper incorrectly, as in it cites it as
| concluding the opposite of what it actually concludes, so I
| always click through to read the papers/studies. It's great
| for surfacing relevant studies, though.
| mrtranscendence wrote:
| Maybe for your use cases. I've found perplexity.ai wrong a
| few times just today:
|
| * Misunderstanding one of its citations, it said that use of
| `ParamSpec` in Python would always raise a warning in Python
| 3.9
|
| * When asked why some types of paper adhere to my skin if I
| press my hand against them for a few minutes (particularly
| glossy paper), it gave two completely different answers
| depending on how the question was worded, one of which
| doesn't necessarily make sense.
| astrange wrote:
| LLMs are incapable of telling the truth. There's almost no
| way they could develop one that only responds correctly like
| that. It'd have to be a fundamentally different technology.
| wizofaus wrote:
| The missing piece seems to be that for certain questions it
| doesn't make sense to extrapolate, and that if it's a
| question about what will happen in the future, it should
| answer in a different manner (and from my own interactions
| with ChatGPT it does exactly that, frequently referring to
| the cut-off time of its training data).
| mortehu wrote:
| The model is capable of generating many different responses
| to the same prompt. An ensemble of fact checking models can
| be used to reject paths that contain "facts" that are not
| present in the reference data (i.e. a fixed knowledge graph
| plus the context).
|
| My guess is that the fact checking is actually easier, and
| the models can be smaller since they should not actually
| store the facts.
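|
| A minimal sketch of that rejection idea in Python. The claim
| extractor and the reference-fact set here are hypothetical
| placeholders, not any real API:
|
|     # Keep only candidate responses whose extracted claims all
|     # appear in the reference data; err on the side of rejecting.
|     def filter_candidates(candidates, reference_facts,
|                           extract_claims):
|         kept = []
|         for text in candidates:
|             claims = extract_claims(text)
|             if claims and all(c in reference_facts
|                               for c in claims):
|                 kept.append(text)
|         return kept
|
|     # Toy reference data and a toy extractor for illustration.
|     facts = {("giants", "beat", "vikings")}
|
|     def toy_extractor(text):
|         if "Giants" in text:
|             return {("giants", "beat", "vikings")}
|         return {("eagles", "won", "31-24")}
|
|     print(filter_candidates(
|         ["The Giants beat the Vikings.", "The Eagles won 31-24."],
|         facts, toy_extractor))
|     # -> ['The Giants beat the Vikings.']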
| swatcoder wrote:
| That's quite the system that can take in any natural
| language statement and confirm whether its true or false.
|
| You might be underestimating the scope of some task here.
| mortehu wrote:
| Not true or false; just present or absent in the
| reference data. Note that false negatives will not result
| in erroneous output, so the model can safely err on the
| side of caution.
|
| Also 100% accuracy is probably not the real threshold for
| being useful. There is a lot of low-hanging fruit today
| that could be handled by absolutely tiny error-correcting
| models (e.g. arithmetic and rhyming).
| astrange wrote:
| There's research showing you can tell if something is a
| hallucination or memorized fact based on the activation
| patterns inside the LM.
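|
| A toy illustration of such a probe, assuming you already have
| hidden-state vectors labeled grounded vs. hallucinated (the
| vectors below are random placeholders, not real activations):
|
|     import numpy as np
|     from sklearn.linear_model import LogisticRegression
|
|     # Placeholder "activations": each row stands in for a hidden
|     # state; labels mark grounded (1) vs. hallucinated (0).
|     rng = np.random.default_rng(0)
|     X = rng.normal(size=(200, 64))
|     y = rng.integers(0, 2, size=200)
|
|     # Fit a linear probe on the activations, score new ones.
|     probe = LogisticRegression(max_iter=1000).fit(X, y)
|     print(probe.predict(X[:5]))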
| CamperBob2 wrote:
| Exactly. Given a source of truth, it can't be that hard
| to train a separate analytic model to evaluate answers
| from the existing synthetic model. (Neglecting for the
| moment the whole Godel thing.)
|
| The problem isn't going to be developing the model, it's
| going to be how to arrive at an uncontroversial source of
| ground truth for it to draw from.
|
| Meanwhile, people are complaining that the talking dog
| they got for Christmas is no good because the C++ code it
| wrote for them has bugs. Give it time.
| CommieBobDole wrote:
| Yep, the idea of truth or falsity is not part of the
| design, and if it was part of the design, it would be a
| different and vastly (like, many orders of magnitude) more
| complicated thing.
|
| If, based on the training data, the most statistically
| likely series of words for a given prompt is the correct
| answer, it will give correct answers. Otherwise it will
| give incorrect answers. What it can never do is know the
| difference between the two.
| astrange wrote:
| > If, based on the training data, the most statistically
| likely series of words for a given prompt is the correct
| answer, it will give correct answers.
|
| ChatGPT does not work this way. It wasn't trained to
| produce "statistically likely" output, it was trained for
| highly rated by humans output.
| mrtranscendence wrote:
| Not exactly. ChatGPT was absolutely trained to produce
| statistically likely output, it just had an extra
| training step added for human ratings. If they relied
| entirely on human ratings there would not have been
| sufficient data to train the model.
| astrange wrote:
| The last step is what matters. "Statistically likely" is
| very underdetermined anyway, answering everything with
| "e" is statistically likely.
|
| (That's why original GPT3 is known for constantly ending
| up in infinite loops.)
| svachalek wrote:
| "e" is not a likely response to anything. I think you are
| not understanding the type of statistics involved here.
| kenjackson wrote:
| LLMs are not incapable of telling the truth. They tell the
| truth all the time. They're incapable of knowing if what
| they're telling is the truth.
|
| That said, traditional search engines suffer from the same
| problem, although with different symptoms. Search engines
| point to web pages generated by people (or some other way)
| which may or may not contain accurate information. They
| base their selection of those pages on things like
| strength of authority, keyword similarity, etc... But truth
| never directly enters the picture (indirectly by tweaking
| the "trusted" set of pages).
| DebtDeflation wrote:
| >They're incapable of knowing if what they're telling is
| the truth.
|
| Right, because all they're doing is regurgitating the
| data they were trained on. It's like the kid in your CS
| class that copies someone else's coding assignment and
| just changes all the function names, variable names, and
| comments but not the actual executable stuff and never
| even bothers to compile and run it to see if it produces
| the desired output.
| kenjackson wrote:
| I feel like people don't appreciate how hard this is. But
| maybe because I was raised on Zork, I really appreciate this
| sort of stuff. Here's an example prompt I just gave to
| ChatGPT:
|
| "If Tim drops a coconut on the tail of a dog, what are
| two things the dog might do?"
|
| Response: "If Tim drops a coconut on the tail of a dog,
| two possible things the dog might do are:
|
| 1. Yelp or bark loudly in pain, as the impact of the
| falling coconut could be quite painful.
|
| 2. Try to escape or run away, as the dog may associate
| the sudden impact with danger and feel the need to get
| away from the source of the pain. Additionally, the dog
| may feel scared or anxious, which can cause it to run
| away."
|
| Then I said, "But what if it was a soft coconut that was
| as light as a feather?"
|
| Response: "If Tim dropped a soft coconut that was as
| light as a feather, it's possible that the dog wouldn't
| feel much or any discomfort from the impact. In this
| scenario, two things the dog might do are:
|
| 1. Look around to see what just hit it and possibly sniff
| at the coconut to investigate.
|
| 2. Continue with its previous activity, ignoring the soft
| impact as if it never happened. The dog may not perceive
| the coconut as a threat or something worth reacting to,
| especially if it was light enough to not cause any pain."
|
| I just can't read these responses and think, "Ehh... just
| a mindless regurgitation as expected from any LLM". These
| simple prompt responses impress me and I kind of know the
| technology -- although my experience in RNNs/LSTM is very
| dated.
|
| Honestly, I'd love to see Zork rewritten with ChatGPT as
| a parser. No more trying to figure out how to write the
| prompt for how to use the key in the door!! :-)
| astrange wrote:
| > Honestly, I'd love to see Zork rewritten with ChatGPT
| as a parser. No more trying to figure out how to write the
| prompt for how to use the key in the door!! :-)
|
| That was done as AI Dungeon, but there was some
| consternation due to the combo of charging for it and
| GPT's predilection for generating wild and possibly
| illegal sex scenes even when you don't ask it to.
| astrange wrote:
| > Right, because all they're doing is regurgitating the
| data they were trained on.
|
| That is not true, it's clearly able to generalize. (If it
| can do anagrams, it's silly to say it's just
| regurgitating the instructions for doing anagrams it read
| about.)
|
| But it doesn't try to verify that what it says might be
| true before saying it.
| wizofaus wrote:
| It can't do anagrams though (every now and then it might
| get a common one right, but in general it's bad at letter-
| based manipulation/information, including even word
| lengths, reversals, etc.).
| astrange wrote:
| It doesn't know what letters are because it sees BPE
| tokens, but if you forgive that it does something like
| it.
|
| example prompt: Imagine I took all the letters in
| "Wikipedia" and threw them in the air so they fell on the
| ground randomly. What are some possible arrangements of
| them?
| wizofaus wrote:
| But regular search engines only regurgitate what they've
| indexed, yet don't invent outright nonsense when they
| don't know (if you asked Google who won the Super Bowl in
| 2024, the nature of the results makes it clear it simply
| doesn't have that information. Though if you change it to
| "world cup" one of the top answers says "portugal was the
| defending champion, defeating Argentina". The result is
| titled "2024 futsal world cup"!)
| wpietri wrote:
| > LLMs are not incapable of telling the truth. They tell
| the truth all the time. They're incapable of knowing if
| what they're telling is the truth.
|
| We're well into philosophy at this point, but I'd argue
| that telling the truth requires knowledge of the truth.
| Individual statements could be true, false, or neither.
| But I'd say neither an LLM nor a monkeys-with-typewriters
| generator could be said to tell the truth, because what
| they're doing is both orthogonal to truth and lacking in
| agency. If we stretch to thinking of it as having agency,
| then I think the correct term would be "bullshitting",
| aka producing speech "without regard for truth". [1]
|
| [1] https://en.wikipedia.org/wiki/On_Bullshit
| lr4444lr wrote:
| Traditional search engines aren't putting their
| imprimatur onto information by concealing its origin.
| kenjackson wrote:
| I don't think it is concealing the origin, but likely
| doesn't actually know the origin. That said, I agree that
| if they can provide sources (even probabilistically),
| that would be a good step forward.
| bmelton wrote:
| I just checked it with a random query while watching Buffy
| the Vampire Slayer reruns, and it wasn't great. (Spoiler
| alerts)
|
| "How did Buffy the Vampire Slayer die?"
|
| "Buffy the Vampire Slayer died in the Season One finale when
| she threw herself into a portal to save her sister Dawn[1].
| This was the first time Buffy had died, occurring on June 2,
| 1997[2]. She died again in the Season Five finale while
| fighting off Glory's demons[3]. Other characters such as Anya
| and Kevin Benedict also died during the series[4][5]."
|
| Okay, so it was a trick question, because Buffy definitely
| died more than once, but it's conflated the fact that she
| died in Season 1 with the cause of her Season 5 death. Also,
| I had to Google Kevin Benedict to remember that he was
| Cordelia's sometimes boyfriend, and an extremely minor
| character, which makes me question how that death is more
| notable than Buffy's mom, or Tara, or Jenny Calendar, etc.
|
| I like that this seems to have been more lexical confusion
| than how ChatGPT seems to enjoy filling empty spaces with
| abject lies, but perhaps it's worth exploring what you're
| asking it that has left it with such a great batting average?
| jacooper wrote:
| I have seen it multiple times answering correctly at first,
| then adding something which has nothing to do with the
| original question.
|
| That's almost always sourced from a website that didn't
| actually answer the question I had, so maybe it's more of a
| query optimization issue.
| nimithryn wrote:
| It has the Super Bowl numbers wrong, too. The last Super Bowl
| is LVI, which was Rams vs Bengals... the Super Bowl before that
| one was Tampa Bay Buccaneers vs Kansas City. it has every fact
| wrong but in the most confusing way possible...
| kranke155 wrote:
| But the HN chatter was convinced that GPT would dethrone
| Google! Google has no chance!!
|
| Another silly tech prediction brought to you by the HN
| hivemind.
| wizofaus wrote:
| A little premature to be calling such a prediction "silly".
| I think it's safe to assume some sort of LLM-based tech
| will be part of the most successful search engines within a
| relatively short period of time (a year tops). And if
| Google dallies its market share will definitely suffer.
| __MatrixMan__ wrote:
| I think we need to work on what constitutes a citation. Your
| browser should know whether:
|
| - you explicitly trust the author of the cited source
|
| - a chain of transitive trust exists from you to that author
|
| - no such path exists
|
| ...and render the citation accordingly (e.g. in different
| colors), as sketched below
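|
| A rough sketch of that trust lookup, with a made-up in-memory
| trust graph standing in for whatever store a browser would
| actually keep:
|
|     from collections import deque
|
|     # Toy trust graph: who directly trusts whom.
|     trust = {"me": {"alice"}, "alice": {"bob"}, "bob": set()}
|
|     def trust_level(reader, author):
|         direct = trust.get(reader, set())
|         if author in direct:
|             return "direct"          # e.g. render green
|         seen, queue = {reader} | direct, deque(direct)
|         while queue:
|             person = queue.popleft()
|             if person == author:
|                 return "transitive"  # e.g. render yellow
|             for nxt in trust.get(person, set()) - seen:
|                 seen.add(nxt)
|                 queue.append(nxt)
|         return "none"                # e.g. render red
|
|     print(trust_level("me", "bob"))      # transitive
|     print(trust_level("me", "mallory"))  # none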
| mcguire wrote:
| And that the cited document actually exists and says what
| it's claimed to say.
| __MatrixMan__ wrote:
| Agreed.
|
| Existence is easy, just filter untrusted citations.
| Presumably authors you trust won't let AI's use their keys
| to sign nonsense.
|
| Claim portability is harder but I think we'd get a lot out
| of a system where the citation connects the sentence (or
| datum) in the referenced article to the point where it's
| relevant in the referring article so that is easier for a
| human to check relevance.
| scarface74 wrote:
| And this doesn't seem like it's a hard problem to solve
|
| 1. Recognize that the user is asking about sports scores. This
| is something that your average dumb assistant can do.
|
| 2. Create an "intent" with a well formatted defined structure.
| If ChatGPT can take my requirements and spit out working Python
| code, how hard could this be?
|
| 3. Delegate the information to another module that can call an
| existing API, just like Siri, Alexa, or Google Assistant (a
| rough sketch of these steps follows below).
|
| Btw, when I asked Siri, "who won the Super Bowl in 2024", it
| replied that "there are no Super Bowls in 2024" and quoted the
| score from last night and said who won "in 2023".
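|
| A rough sketch of those three steps, with the regex, intent
| shape, and both backends all made up for illustration:
|
|     import re
|
|     def handle(query, scores_api, llm):
|         # 1. Recognize a sports-score question.
|         m = re.search(r"who won the (.+?) in (\d{4})",
|                       query.lower())
|         if m:
|             # 2. Build a structured intent.
|             intent = {"type": "sports_score",
|                       "event": m.group(1),
|                       "year": int(m.group(2))}
|             # 3. Delegate to a module that calls a real API.
|             answer = scores_api(intent)
|             return answer or "That game hasn't been played yet."
|         # Everything else falls through to the model.
|         return llm(query)
|
|     fake_api = lambda i: ("Chiefs 38, Eagles 35"
|                           if i["year"] == 2023 else None)
|     fake_llm = lambda q: "(model-generated answer)"
|     print(handle("Who won the Super Bowl in 2024?",
|                  fake_api, fake_llm))
|     # -> That game hasn't been played yet.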
| wharfjumper wrote:
| Is there an AI blockchain yet?
| megaman821 wrote:
| Maybe it is fine in beta, but in post-beta they should not use AI
| for every search query. The key is going to be figuring out when
| the AI is adding value, especially since even running the AI for
| a query is 10x more expensive than a normal search. It may be
| hard to figure out where to apply AI though. If a user asks
| "whats the weather?", no need for AI. If a user asks "I am going
| to wear a sweater and some pants, is that appropriate for today's
| weather?", now you might need AI.
| gardenhedge wrote:
| Microsoft just absolutely sucks at things.
|
| I was using Bing Maps earlier and it had shops in the wrong
| location. Like it would give you directions to the wrong
| location. The correct one would be another 30-40 minute walk from
| the destination it said.
|
| It also showed a cafe near me which caught my interest. I zoomed
| in further and thought "I've never seen that there". Clicking on
| it brought me to a different location in the map... a place in
| Italy!
| Sparkyte wrote:
| Can any AI be trusted outside of its realm of data? I mean, it is
| only a product of the data it takes in. Plus it isn't really
| _finger quotes_ AI. It's just a large data library with some neat
| query language where it tries to assemble the best information
| not by choice but by probability.
|
| Real AI makes choices not on probability but in accordance with
| self-preservation, emotions, and experience. It would also have
| the ability to re-evaluate information and the above.
| lopkeny12ko wrote:
| "Traditional" Google searches can give you wildly inaccurate
| information too. It's up to the user to vet the sources and think
| critically to distinguish what's accurate or not. Bing's new
| chatbot is no different.
|
| I hope this small but very vocal group of people does not
| compromise progress of AI development. It feels much like the
| traditional media lobbyists when the Internet and world wide web
| was first taking off.
| capitalsigma wrote:
| These models are very impressive, but the issue (imo) is that
| lay people without an ML background see how plausibly-human the
| output is and infer that there must be some plausibly-human
| intelligence behind it that has some plausibly-human learning
| mechanism -- if your new hire at work made the kinds of
| mistakes that ChatGPT does, you'd expect them to be up to speed
| in a couple of weeks. The issue is that ChatGPT really isn't
| human-like, and removing inaccurate output isn't just a
| question of correcting it a few times -- its learning process
| is truly different and it doesn't understand things the way we do.
| itamarst wrote:
| These AI systems are like a spell checker that hallucinates new
| words: did you mean to type "gnorkler"?
|
| At least Google (when not using the summarization "feature")
| doesn't invent new stuff on its own.
| methodical wrote:
| Traditional Google searches are a take-it-or-leave-it
| situation. The result depends on your interpretation of the
| sources Google provides, and therefore you are expecting the
| possibility of a source being misleading or inaccurate.
|
| On the other hand, I don't expect to be told an inaccurate &
| misleading answer from somebody who I was told to ask the
| question to, and who doesn't provide sources.
|
| To conflate the expectations of traditional search results with
| the output of a supposedly helpful chat bot is wildly
| inappropriate.
| whimsicalism wrote:
| They've built a much larger anti-tech coalition in the
| subsequent years.
| weberer wrote:
| There's also the instance of the Bing chatbot insisting that the
| current year is 2022 and being EXTREMELY passive-aggressive when
| corrected.
|
| https://libreddit.strongthany.cc/r/bing/comments/110eagl/the...
| darknavi wrote:
| > I'm sorry, but you can't help me believe you.
| basseed wrote:
| lol not a chance that's real
| weberer wrote:
| Look up "Tay AI" if you missed it the first time around.
| aliqot wrote:
| I wonder how much the upspeak way of typing affects this. People
| (even the author) often end declarations with question marks.
| Does this have any influence on the way the LLM parses the
| prompt?
| neilv wrote:
| What would be nice is for Microsoft to get hit by a barrage of
| lawsuits, MS to be ridiculed in the press and punished on Wall
| Street, and vindication of Google's more responsible introduction
| of AI methods over the years.
|
| There will still be startups doing reckless things, but large,
| established companies that can immediately have bigger impact
| also have a lot more to lose.
| csours wrote:
| AI is dreaming and hallucinating electric sheep
| EGreg wrote:
| ChatGPT, can we trust it?
|
| https://m.youtube.com/watch?v=_nl0bwDNVPw
| rpastuszak wrote:
| How do we educate "non-technical" people about the issues with
| LLMs hallucinating responses? I feel like there's a big incentive
| for investors and businesses to keep people misinformed (not
| unlike with ads, privacy or crypto).
|
| Have you found a good, succinct and not too technical way of
| explaining this to, say, your non-techie family members?
| jiggyjace wrote:
| Ehhh I found this article to be quite inauthentic about the
| performance of Bing AI compared to how I have used it. The
| article didn't even share its prompts, except for the last one
| about Avatar and today's date (which I couldn't replicate myself,
| I kept getting correct information). I'm not trying to prove that
| Bing AI is always correct, but compare it to traditional search,
| Siri, or Alexa and it's like comparing a home run hitter that
| sometimes hits foul balls to a 3 year old that barely knows how
| to pick up the baseball bat.
| pphysch wrote:
| The article is primarily demonstrating significant errors in an
| official Bing demo, not some contrived queries.
| jnsaff2 wrote:
| The main article is based on the Microsoft demo. So the prompts
| are by them and not some clickbait hacking.
| sixtram wrote:
| I've posted this into another thread as well, from Sam Altman,
| CEO of OpenAI, two months ago, on his Twitter feed:
|
| "ChatGPT is incredibly limited, but good enough at some things to
| create a misleading impression of greatness. it's a mistake to be
| relying on it for anything important right now. [...] fun
| creative inspiration; great! reliance for factual queries; not
| such a good idea." (Sam Altman)
| joe_the_user wrote:
| ChatGPT as a system involves an unreliable LLM chatbot and a
| series of corrections efficient enough to give the impression
| of reliability for many fields and these together _feel like
| the future_ - enough to get a "code red" from Google.
|
| It's worth remembering that back in the day, Google succeeded not
| by exact indexing but by having the highest quality results for
| each term - and they used existing resources as well as human
| workers to get these (along with pagerank).
|
| What you have is a hybrid system and one whose filter is
| continuously updated. But it's a very complicated machine and
| going from something seemingly working to something satisfying
| the multitude of purposes modern search satisfies is going to
| be a huge and hugely expensive project.
|
| https://gizmodo.com/openai-chatgpt-ai-chat-bot-1850001021
| [deleted]
| dpflan wrote:
| It feels deeply ironic and cynical that MSFT touts putting
| ChatGPT everywhere, in essentially the business document
| platform. Are users going to ask about company facts, get
| hallucinations, and put those hallucinations into business
| documents, which then compound ChatGPT's tendency to
| hallucinate?
| burkaman wrote:
| But in interviews about the Bing partnership, Sam has been
| saying that while ChatGPT was a bad tech demo, Bing Chat is
| using a better model with way better features that everyone
| should be using. He's been talking about how great it is that
| it cites its references, integrates the latest data, etc. I'm
| specifically thinking of the New York Times' Hard Fork podcast
| he was on (https://www.nytimes.com/2023/02/10/podcasts/bings-
| revenge-an...), but I suspect he's been saying the same things
| to everyone. He's been marketing Bing Chat as a significant
| improvement ready for mass usage, when it really seems like
| it's basically just ChatGPT with search results auto-included
| in the prompt.
| hackernewds wrote:
| wonder what he has to say about the humanlike responses here
|
| https://www.reddit.com/r/bing/comments/110eagl/the_customer_.
| ..
|
| I would rather an AI chat _not_ act human
| danparsonson wrote:
| Wow.... we've created artificial cognitive dissonance!
| brap wrote:
| Up until ChatGPT became all the rage, Sam has been pushing a
| crypto scam called Worldcoin, which aims to scan everyone's
| eyeballs(??) and somehow pay everyone in the world a living
| wage(???) without creating any value. This while allegedly
| exploiting people from 3rd world countries.
|
| https://www.technologyreview.com/2022/04/06/1048981/worldcoi.
| ..
|
| So, as much as I am impressed by the tech of ChatGPT, I don't
| consider him to be a very credible person.
| heywherelogingo wrote:
| No AI can be trusted - the A stands for Artificial.
| malshe wrote:
| Someone posted on Twitter that chatGPT is like economists -
| occasionally right but super confident that they are always right
| [deleted]
| Waterluvian wrote:
| I absolutely love these new tools. But I'm also convinced that
| we're going through an era of trying to mis-apply them. "These
| new tools are so shiny! Quick! Find a way to _MONETIZE_!!!! "
|
| I hope we don't throw the baby out with the bathwater when all is
| said and done. These AIs are incredibly powerful given the
| correct use cases.
| moomoo11 wrote:
| Since GPT always needs to be "up-to-date", and search usually
| requires near real-time accuracy, there needs to be some sort of
| reconciliation on queries so that if the query seems to be asking
| for something real-time, it will leverage search results to
| improve the response ad hoc.
|
| Or.. it should let us know the "last index date" so we the users
| can make a determination if we want to ask a knowledge based
| question or a more real-time question.
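|
| A hypothetical sketch of that reconciliation step; the cutoff
| date, the time-sensitivity heuristic, and both backends are
| invented for illustration:
|
|     CUTOFF = "2021-09"  # made-up training cutoff
|     TIME_WORDS = ("today", "latest", "current", "2023", "2024")
|
|     def answer(query, search, llm):
|         # Looks time-sensitive: pull fresh results into the prompt.
|         if any(w in query.lower() for w in TIME_WORDS):
|             snippets = search(query)
|             prompt = ("Using only these search results:\n"
|                       + snippets + "\nAnswer: " + query)
|             return llm(prompt)
|         # Otherwise answer from the model, note the index date.
|         return llm(query) + " (indexed up to " + CUTOFF + ")"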
| matthews2 wrote:
| Bing AI "solves" this by shoving search results into the
| prompt.
| chatterhead wrote:
| [dead]
| shanebellone wrote:
| *AI Can't Be Trusted
| jamesfisher wrote:
| This would be a good post, if only I could read any of those
| images on mobile. Substack, fix your damned user-scalable=0! Even
| clicking on the image doesn't provide any way of zooming in on
| it. Do they do any usability testing?
| EchoReflection wrote:
| Srsly? Micro$oft can't be trusted? Next someone will say that
| water is wet!
| impoppy wrote:
| It is not Bing that cannot be trusted, but LLMs in general. They
| are so good at imitating, I don't think any human being will ever
| be able to imitate stuff as well as those AIs do, but they
| understand nothing. They lack the concept of the information
| itself; they are only good at presenting information.
| HankB99 wrote:
| > I am shocked that the Bing team created this pre-recorded demo
| filled with inaccurate information, and confidently presented it
| to the world as if it were good.
|
| Perhaps MS had their AI produce the demo. Isn't one of the issues
| with this sort of thing how "confidently" the process produces
| wrong information?
___________________________________________________________________
(page generated 2023-02-13 23:01 UTC)